CN111340180B - Adversarial example generation method and apparatus for a specified label, electronic device and medium - Google Patents


Info

Publication number
CN111340180B
CN111340180B (application CN202010084790.1A)
Authority
CN
China
Prior art keywords
label
disturbance
attack
original image
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010084790.1A
Other languages
Chinese (zh)
Other versions
CN111340180A (en)
Inventor
乔鹏
林扬飞
窦勇
姜晶菲
李荣春
牛新
苏华友
潘衡岳
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202010084790.1A
Publication of CN111340180A
Application granted
Publication of CN111340180B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches


Abstract

The application provides a method and an apparatus for generating an adversarial example for a specified label, together with an electronic device and a computer-readable medium. The method comprises the following steps: inputting an original image sample into a preset multi-label classification network to obtain the prediction score of each label for multi-label classification of the original image sample; extracting the prediction score corresponding to the specified label from the prediction scores of all labels; generating a first attack perturbation with the momentum iterative fast gradient sign method (MI-FGSM) according to the prediction score corresponding to the specified label; clipping the first attack perturbation with the gradient-weighted class activation mapping (Grad-CAM) method to obtain a second attack perturbation; and superimposing the second attack perturbation on the original image sample to generate an adversarial example corresponding to the specified label. With this scheme, adversarial examples can be generated selectively for a specified label and used for data augmentation of the multi-label classification network, thereby improving the classification capability of the multi-label classification model.

Description

Adversarial example generation method and apparatus for a specified label, electronic device and medium
Technical Field
The present application relates to the field of image recognition, and in particular to a method and an apparatus for generating an adversarial example for a specified label, an electronic device, and a computer-readable medium.
Background
In recent years, Convolutional Neural Networks (CNNs) have achieved breakthrough progress in many application fields of computer vision, such as image classification, object detection, and semantic segmentation. In 2014, Szegedy et al. found that adding a small perturbation to an image can cause a CNN to misclassify it; this finding attracted widespread attention, and such perturbed samples are called adversarial examples.
Adversarial attacks take various forms. According to how much is known about the structure and parameters of the CNN model, they can be divided into white-box and black-box attacks. In a white-box attack, the attacker knows the structure and parameters of the neural network and can mount a targeted attack accordingly. White-box attacks are comparatively easy to carry out, but the white-box assumption is hard to satisfy under real-world conditions. In a black-box attack, the attacker does not know the specific details of the neural network, e.g. knows only the structure of the network but not its parameters, or knows neither. Black-box attacks can be further divided into probing and non-probing attacks. In a probing attack, the attacker can feed inputs to the neural network, observe the outputs, and attack the network based on the observed input-output behavior. In a non-probing attack, the outputs of the network cannot be observed; the attack is generally carried out by attacking a generic surrogate model or by exploiting the transferability of adversarial perturbations.
Methods for attacking a multi-label classification network fall into two categories. The first is the non-specified-label attack, which only requires the attack to make the classification model misclassify some label. The second is the specified-label attack, i.e. making the classification model misclassify a specified label while keeping the other labels correctly classified. Because adversarial examples for specified-label attacks are comparatively scarce, the improvement of the classification capability of multi-label models is constrained. Therefore, for multi-label classification networks, how to generate adversarial examples for a specified label is a technical problem to be solved in this field.
Disclosure of Invention
The application aims to provide a method and an apparatus for generating adversarial examples for a specified label, an electronic device, and a computer-readable medium.
A first aspect of the application provides a method for generating an adversarial example for a specified label, comprising the following steps:
inputting an original image sample into a preset multi-label classification network to obtain the prediction score of each label for multi-label classification of the original image sample;
extracting the prediction score corresponding to the specified label from the prediction scores of all the labels;
generating a first attack perturbation with the momentum iterative fast gradient sign method (MI-FGSM) according to the prediction score corresponding to the specified label;
clipping the first attack perturbation with the gradient-weighted class activation mapping (Grad-CAM) method to obtain a second attack perturbation;
and superimposing the second attack perturbation on the original image sample to generate an adversarial example corresponding to the specified label.
In some embodiments of the present application, the preset multi-label classification network comprises a feature-extraction backbone network, a fully connected layer, and an activation function connected in sequence.
In some embodiments of the present application, the generating of the first attack perturbation with the momentum iterative fast gradient sign method (MI-FGSM) according to the prediction score corresponding to the specified label comprises:
step S1, setting the MI-FGSM iteration parameters, which comprise the momentum, the perturbation range ε, the iteration count iter_num, and the iteration step size α;
step S2, computing the partial derivative grad of the prediction score corresponding to the specified label with respect to the original image sample to initialize the gradient;
step S3, updating the attack perturbation according to the formula noise = momentum × noise + α × grad, and clipping the updated attack perturbation to the perturbation range [-ε, +ε];
step S4, repeating step S3 for iter_num iterations to obtain the first attack perturbation.
In some embodiments of the present application, the clipping of the first attack perturbation with the gradient-weighted class activation mapping (Grad-CAM) method to obtain the second attack perturbation comprises:
differentiating the loss function corresponding to the specified label with respect to the feature maps of the original image sample in each feature layer of the preset multi-label classification network, and computing a weighted sum of the feature maps to obtain a target feature map;
upsampling the target feature map to the same scale as the original image sample by bilinear interpolation to obtain the Grad-CAM;
and performing noise clipping on the first attack perturbation according to the Grad-CAM to generate the second attack perturbation.
A second aspect of the present application provides an apparatus for generating an adversarial example for a specified label, comprising:
a prediction module, configured to input an original image sample into a preset multi-label classification network to obtain the prediction score of each label for multi-label classification of the original image sample;
an extraction module, configured to extract the prediction score corresponding to the specified label from the prediction scores of all the labels;
a gradient iteration module, configured to generate a first attack perturbation with the momentum iterative fast gradient sign method (MI-FGSM) according to the prediction score corresponding to the specified label;
a clipping module, configured to clip the first attack perturbation with the gradient-weighted class activation mapping (Grad-CAM) method to obtain a second attack perturbation;
and a generation module, configured to superimpose the second attack perturbation on the original image sample to generate an adversarial example corresponding to the specified label.
In some embodiments of the present application, the preset multi-label classification network comprises a feature-extraction backbone network, a fully connected layer, and an activation function connected in sequence.
In some embodiments of the present application, the gradient iteration module is specifically configured to:
S1, set the MI-FGSM iteration parameters, including the momentum, the perturbation range ε, the iteration count iter_num, and the iteration step size α;
S2, compute the partial derivative grad of the prediction score corresponding to the specified label with respect to the original image sample to initialize the gradient;
S3, update the attack perturbation according to the formula noise = momentum × noise + α × grad, and clip the updated attack perturbation to the perturbation range [-ε, +ε];
S4, repeat S3 for iter_num iterations to obtain the first attack perturbation.
In some embodiments of the present application, the clipping module is specifically configured to:
differentiate the loss function corresponding to the specified label with respect to the feature maps of the original image sample in each feature layer of the preset multi-label classification network, and compute a weighted sum of the feature maps to obtain a target feature map;
upsample the target feature map to the same scale as the original image sample by bilinear interpolation to obtain the Grad-CAM;
and perform noise clipping on the first attack perturbation according to the Grad-CAM to generate the second attack perturbation.
A third aspect of the present application provides an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; when executing the computer program, the processor performs the method of the first aspect of the application.
A fourth aspect of the present application provides a computer readable medium having computer readable instructions stored thereon which are executable by a processor to implement the method of the first aspect of the present application.
Compared with the prior art, the method, apparatus, electronic device, and medium for generating an adversarial example for a specified label provided by the application can selectively generate adversarial examples for a specified label. After passing through the multi-label classification network, such an adversarial example suppresses the label of the specified class while preserving the original labels and without producing new labels for the image. The scheme confines the attack perturbation to the image region of the specified class, narrowing the footprint of the perturbation and making the attack harder for humans or machines to perceive. In addition, the generated adversarial examples can be used for data augmentation of the multi-label classification network, compensating for the loss of classification precision caused by the imbalanced distribution of sample labels over the whole sample space of a multi-class dataset, and thereby improving the classification capability of the multi-label classification model.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 illustrates a flow diagram of a method for generating an adversarial example for a specified label according to some embodiments of the present application;
FIG. 2 illustrates a schematic diagram of an apparatus for generating an adversarial example for a specified label provided by some embodiments of the present application;
FIG. 3 illustrates a schematic diagram of an electronic device provided by some embodiments of the present application;
FIG. 4 illustrates a schematic diagram of a computer-readable medium provided by some embodiments of the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It is to be noted that, unless otherwise specified, technical or scientific terms used herein shall have the ordinary meaning as understood by those skilled in the art to which this application belongs.
In addition, the terms "first" and "second", etc. are used to distinguish different objects, rather than to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
The embodiments of the application provide a method and an apparatus for generating an adversarial example for a specified label, an electronic device, and a computer-readable medium, which are described below with reference to the accompanying drawings.
Referring to FIG. 1, which illustrates a flowchart of the method for generating an adversarial example for a specified label provided in some embodiments of the present application, the method may include the following steps:
step S101: inputting an original image sample into a preset multi-label classification network to obtain a prediction score value of each label for multi-label classification of the original image sample.
In some embodiments, the preset multi-label classification network may comprise a feature-extraction backbone network, a fully connected layer, and an activation function connected in sequence.
Specifically, the feature-extraction backbone network may be a CNN-based Inception V3 network used for feature extraction, whose parameters may be pre-trained on the ImageNet image dataset. The fully connected layer receives the features extracted by the backbone network, combines them, and applies a sigmoid activation function to obtain the prediction score of each label, i.e. the score used to classify that label. The score lies in the interval [0, 1]: a label with a score greater than or equal to 0.5 is classified as a positive sample, and a label with a score less than 0.5 as a negative sample. The parameters of the fully connected layer can be trained on the VOC2007 and COCO2014 multi-label image datasets.
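As a minimal illustration of the 0.5 thresholding rule just described (the function name and toy scores are hypothetical, not from the patent), the positive/negative split can be sketched as:

```python
import numpy as np

def split_labels(scores, threshold=0.5):
    """Split per-label prediction scores in [0, 1] into positive and
    negative label index lists, per the 0.5 threshold described above."""
    scores = np.asarray(scores)
    positive = np.flatnonzero(scores >= threshold).tolist()
    negative = np.flatnonzero(scores < threshold).tolist()
    return positive, negative

# labels 0 and 2 reach the threshold, labels 1 and 3 do not
pos, neg = split_labels([0.91, 0.12, 0.50, 0.33])
```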
The generation of the prediction scores of the labels proceeds as follows:
Step 1.1: Inception V3 extracts feature layers from the input image. The Mixed_7c feature layer of Inception V3 is extracted; for each image its dimensions are 8 × 8 × 2048, and it is used later to compute the Grad-CAM map. The PreLogits layer of Inception V3 is then extracted as the network's output feature layer; each feature vector has dimension 2048.
Step 1.2: After the Inception V3 backbone convolutional network produces the 2048-dimensional feature vector, the vector is fed into the fully connected layer. If the total number of classes in the image dataset is C, the fully connected layer consists of parameters weights and bias of dimensions 2048 × C and C respectively, and the formula logits = PreLogits × weights + bias yields the unnormalized log-probability (logits) value. That is, forward propagation of the original image sample through the preset multi-label classification network produces the logits value.
Step 1.3: The logits value is passed through a sigmoid activation layer to obtain the C-dimensional image prediction score vector.
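Steps 1.2 and 1.3 above can be sketched with NumPy. This is an illustrative sketch only: the random values and the toy class count C = 3 are assumptions, while the 2048-dimensional PreLogits feature and the logits = PreLogits × weights + bias formula follow the text:

```python
import numpy as np

def predict_scores(prelogits, weights, bias):
    """Steps 1.2-1.3: logits = PreLogits x weights + bias, then sigmoid.

    prelogits: (2048,) feature vector from the backbone;
    weights: (2048, C); bias: (C,) for C classes.
    """
    logits = prelogits @ weights + bias      # unnormalized log-probabilities
    return 1.0 / (1.0 + np.exp(-logits))     # per-label scores in [0, 1]

# toy shapes with C = 3 classes, just to illustrate the data flow
rng = np.random.default_rng(0)
scores = predict_scores(rng.standard_normal(2048),
                        0.01 * rng.standard_normal((2048, 3)),
                        np.zeros(3))
```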
Step S102: extracting the prediction score corresponding to the specified label from the prediction scores of the labels.
Step S103: generating a first attack perturbation with the momentum iterative fast gradient sign method (MI-FGSM) according to the prediction score corresponding to the specified label.
Specifically, after the original image sample has passed through the preset multi-label classification network to obtain the prediction score of each label, the network has in effect mapped the image's m-dimensional vector (for an image of length L and width H, m = L × H) into an n-dimensional label space. Taking the partial derivatives of the n-dimensional label scores with respect to the m-dimensional vector yields a Jacobian matrix; the row of the Jacobian corresponding to the specified attack label is then selected, and MI-FGSM (Momentum Iterative Fast Gradient Sign Method) is applied to those partial derivative values to generate the attack perturbation.
Further, step S103 may be implemented as:
step S1, setting the MI-FGSM iteration parameters, which comprise the momentum, the perturbation range ε, the iteration count iter_num, and the iteration step size α; for example, momentum = 1, perturbation range ε = 32/255, iteration count iter_num = 20, and iteration step size α = ε/iter_num;
step S2, computing the partial derivative grad of the prediction score corresponding to the specified label with respect to the original image sample to initialize the gradient;
step S3, updating the attack perturbation according to the formula noise = momentum × noise + α × grad, and clipping the updated attack perturbation to the perturbation range; that is, after each iterative update of the attack perturbation, the perturbation is clipped to the range [-ε, +ε];
step S4, repeating step S3 for iter_num iterations to obtain the first attack perturbation.
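The iteration in steps S1 to S4 can be sketched as follows. This is an illustrative NumPy version, not the patent's implementation: grad_fn is a hypothetical callback standing in for the network's backward pass, and recomputing the gradient at the current perturbed input on every iteration follows the usual MI-FGSM practice, which the text does not spell out:

```python
import numpy as np

def mi_fgsm(image, grad_fn, eps=32 / 255, iter_num=20, momentum=1.0):
    """Steps S1-S4: momentum-weighted perturbation update with clipping.

    grad_fn(x) returns the partial derivative of the specified label's
    prediction score with respect to x (the network's backward pass);
    it is a hypothetical stand-in here.
    """
    alpha = eps / iter_num                        # step size from the example
    noise = np.zeros_like(image)
    for _ in range(iter_num):
        grad = grad_fn(image + noise)             # gradient at current sample
        noise = momentum * noise + alpha * grad   # update rule of step S3
        noise = np.clip(noise, -eps, eps)         # keep noise in [-eps, +eps]
    return noise                                  # the first attack perturbation

# toy check: with a constant unit gradient the perturbation grows by alpha
# per step and saturates at eps after iter_num steps
perturb = mi_fgsm(np.zeros(4), lambda x: np.ones(4))
```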
Step S104: clipping the first attack perturbation with the gradient-weighted class activation mapping (Grad-CAM) method to obtain a second attack perturbation.
Further, step S104 may be implemented as:
differentiating the loss function corresponding to the specified label with respect to the feature maps of the original image sample in each feature layer of the preset multi-label classification network, and computing a weighted sum of the feature maps to obtain a target feature map;
upsampling the target feature map to the same scale as the original image sample by bilinear interpolation to obtain the Grad-CAM;
and performing noise clipping on the first attack perturbation according to the Grad-CAM to generate the second attack perturbation.
Specifically, the gradient-weighted class activation map (Grad-CAM) is used to clip the attack perturbation. After the original sample image passes through the multi-class backbone network, the 2048 feature layers A^k (1 ≤ k ≤ 2048), each of size 8 × 8, are obtained just before the fully connected layer. The gradient of the cross entropy cross_entropy[c] of the specified label with respect to each feature layer is computed, and the gradient values are used as weights ω_k (1 ≤ k ≤ 2048) to linearly combine the 2048 feature layers, followed by a ReLU function:

L_Grad-CAM^c = ReLU( Σ_k ω_k × A^k )

Finally, L_Grad-CAM^c is upsampled to the size of the original image by bilinear interpolation to obtain the gradient-weighted class activation map of the image.
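The linear combination and ReLU above can be sketched in NumPy. This is an illustrative sketch; spatially averaging the gradients to obtain each weight ω_k is the standard Grad-CAM choice and an assumption here, since the text only says the gradient values are used as the weights:

```python
import numpy as np

def grad_cam_map(features, grads):
    """Combine feature layers A^k with gradient-derived weights w_k.

    features: (8, 8, K) feature maps A^k from the backbone.
    grads:    (8, 8, K) gradients of cross_entropy[c] w.r.t. the features.
    """
    weights = grads.mean(axis=(0, 1))             # w_k: pooled gradient per channel
    cam = np.tensordot(features, weights, axes=([2], [0]))  # sum_k w_k * A^k
    return np.maximum(cam, 0.0)                   # ReLU keeps positive evidence

# toy example with K = 4 channels of constant features
features = np.ones((8, 8, 4))
grads = np.ones((8, 8, 4)) * np.array([1.0, -1.0, 2.0, 0.0])
cam = grad_cam_map(features, grads)               # 8 x 8 response map
```

The resulting 8 × 8 map would then be upsampled to the input resolution by bilinear interpolation, as the text describes.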
The cross entropy cross_entropy[c] of the specified label is calculated as follows:
Step 1: an attack label c is specified; the image is forward-propagated through the multi-label classification network to obtain the logits value, and the prediction score is obtained by sigmoid activation;
Step 2: from the logits value and the image's ground-truth label value y_true, the cross entropy is computed according to the formula cross_entropy = -[ y_true × log(sigmoid(logits)) + (1 - y_true) × log(1 - sigmoid(logits)) ].
Then the cross entropy cross_entropy[c] corresponding to the specified label c can be selected for backward propagation through the multi-label classification network to obtain the corresponding gradient values.
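The cross-entropy formula above can be checked numerically with a self-contained sketch (function names are illustrative; the leading minus sign, which the standard binary cross entropy carries, makes the loss non-negative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_entropy(logits, y_true):
    """Per-label binary cross entropy from the formula above."""
    p = sigmoid(np.asarray(logits, dtype=float))
    return -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))

# at logits = 0 the score is sigmoid(0) = 0.5, so the loss is -log(0.5) = log 2
loss = cross_entropy([0.0], np.array([1.0]))
```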
In this step, the attack perturbation is clipped with the gradient-weighted class activation map (Grad-CAM), shrinking the perturbation so that it acts only on the specified label and reducing its interference with labels of other classes.
Step S105: superimposing the second attack perturbation on the original image sample to generate an adversarial example corresponding to the specified label.
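Step S105 itself is a simple superposition. In this sketch the final clip to [0, 1] is an added assumption (pixel values are taken to be normalized), since the adversarial example must remain a valid image:

```python
import numpy as np

def make_adversarial(image, second_noise):
    """Superimpose the second attack perturbation on the original sample
    and keep the result a valid image (pixels assumed normalized to [0, 1])."""
    return np.clip(image + second_noise, 0.0, 1.0)

adv = make_adversarial(np.array([0.20, 0.99]), np.array([0.10, 0.10]))
```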
Compared with the prior art, the method for generating an adversarial example for a specified label provided by the embodiments of the application can selectively generate adversarial examples for a specified label. After passing through the multi-label classification network, such an adversarial example suppresses the label of the specified class while preserving the original labels and without producing new labels for the image. The method confines the attack perturbation to the image region of the specified class, narrowing the footprint of the perturbation and making the attack harder for humans or machines to perceive. In addition, the generated adversarial examples can be used for data augmentation of the multi-label classification network, compensating for the loss of classification precision caused by the imbalanced distribution of sample labels over the whole sample space of a multi-class dataset, and thereby improving the classification capability of the multi-label classification model.
The above embodiments provide a method for generating an adversarial example for a specified label; a corresponding apparatus is also provided. The apparatus provided by the embodiments of the application can carry out the method described above and may be implemented in software, hardware, or a combination of the two. For example, the apparatus may comprise integrated or separate functional modules or units that perform the corresponding steps of the method. Referring to FIG. 2, which shows a schematic diagram of an apparatus for generating an adversarial example for a specified label according to some embodiments of the present disclosure. Since the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for the relevant details, refer to the corresponding descriptions of the method embodiments. The apparatus embodiments described below are merely illustrative.
As shown in FIG. 2, the apparatus 10 for generating an adversarial example for a specified label may include:
the prediction module 101 is configured to input an original image sample into a preset multi-label classification network, and obtain a prediction score value of each label for performing multi-label classification on the original image sample;
an extracting module 102, configured to extract a prediction score value corresponding to a specified tag from the prediction score values of the tags;
the gradient iteration module 103 is configured to generate a first attack disturbance by using a momentum fast gradient iteration MI-FGSM method according to the prediction score value corresponding to the specified tag;
the cutting module 104 is configured to cut the first attack disturbance by using a gradient weight class response map Grad-CAM method to obtain a second attack disturbance;
a generating module 105, configured to superimpose the second attack disturbance on the original image sample, and generate a countermeasure sample corresponding to the specified label.
In some embodiments of the present application, the preset multi-label classification network comprises a feature-extraction backbone network, a fully connected layer, and an activation function connected in sequence.
In some embodiments of the present application, the gradient iteration module 103 is specifically configured to:
step S1, set the MI-FGSM iteration parameters, which comprise the momentum, the perturbation range ε, the iteration count iter_num, and the iteration step size α;
step S2, compute the partial derivative grad of the prediction score corresponding to the specified label with respect to the original image sample to initialize the gradient;
step S3, update the attack perturbation according to the formula noise = momentum × noise + α × grad, and clip the updated attack perturbation to the perturbation range [-ε, +ε];
step S4, repeat step S3 for iter_num iterations to obtain the first attack perturbation.
In some embodiments of the present application, the clipping module 104 is specifically configured to:
differentiate the loss function corresponding to the specified label with respect to the feature maps of the original image sample in each feature layer of the preset multi-label classification network, and compute a weighted sum of the feature maps to obtain a target feature map;
upsample the target feature map to the same scale as the original image sample by bilinear interpolation to obtain the Grad-CAM;
and perform noise clipping on the first attack perturbation according to the Grad-CAM to generate the second attack perturbation.
The apparatus 10 for generating an adversarial example for a specified label provided by the embodiments of the application has the same beneficial effects as the method provided by the foregoing embodiments.
The embodiments of the application further provide an electronic device corresponding to the method for generating an adversarial example for a specified label provided in the foregoing embodiments; please refer to FIG. 3, which shows a schematic diagram of an electronic device provided by some embodiments of the present application. As shown in FIG. 3, the electronic device 20 includes a processor 200, a memory 201, a bus 202, and a communication interface 203; the processor 200, the communication interface 203, and the memory 201 are connected through the bus 202. The memory 201 stores a computer program executable on the processor 200, and when executing the computer program, the processor 200 performs the method for generating an adversarial example for a specified label provided in any of the foregoing embodiments.
The memory 201 may include a high-speed Random Access Memory (RAM) and may further include a non-volatile memory, such as at least one disk memory. The communication connection between a network element of the system and at least one other network element is realized through at least one communication interface 203 (which may be wired or wireless), using the Internet, a wide area network, a local area network, a metropolitan area network, or the like.
Bus 202 can be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The memory 201 is used for storing a program, and the processor 200 executes the program after receiving an execution instruction, and the countermeasure sample generation method for the specific tag disclosed in any embodiment of the present application can be applied to the processor 200, or implemented by the processor 200.
The processor 200 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 200. The processor 200 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 201, and the processor 200 reads the information in the memory 201 and completes the steps of the method in combination with its hardware.
The electronic equipment provided by the embodiment of the application and the method for generating the confrontation sample of the designated label provided by the embodiment of the application have the same beneficial effects as the method adopted, operated or realized by the electronic equipment.
Referring to fig. 4, a computer readable storage medium is shown as an optical disc 30, on which a computer program (i.e., a program product) is stored, and when the computer program is executed by a processor, the computer program executes the method for generating the countermeasure sample of the designated tag according to any of the embodiments.
It should be noted that examples of the computer-readable storage medium may also include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory, or other optical and magnetic storage media, which are not described in detail herein.
The computer-readable storage medium provided by the above embodiments of the present application and the countermeasure sample generation method for the specific tag provided by the embodiments of the present application have the same advantages as the method adopted, run or implemented by the application program stored in the computer-readable storage medium.
It should be noted that the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present disclosure, and the present disclosure should be construed as being covered by the claims and the specification.

Claims (8)

1. A countermeasure sample generation method for a specified label, comprising:
inputting an original image sample into a preset multi-label classification network to obtain a prediction score value of each label for multi-label classification of the original image sample;
extracting a prediction score value corresponding to the specified label from the prediction score values of all the labels;
generating a first attack disturbance by using the momentum iterative fast gradient sign method (MI-FGSM) according to the prediction score value corresponding to the specified label;
clipping the first attack disturbance by using the gradient-weighted class activation mapping (Grad-CAM) method to obtain a second attack disturbance;
superimposing the second attack disturbance on the original image sample to generate a countermeasure sample corresponding to the specified label;
wherein the generating a first attack disturbance by using the momentum iterative fast gradient sign method (MI-FGSM) according to the prediction score value corresponding to the specified label comprises:
step S1, setting MI-FGSM iteration parameters, the iteration parameters including a momentum, a disturbance value range epsilon, an iteration step number iter_num, and an iteration step length alpha;
step S2, calculating a partial derivative grad of the prediction score value corresponding to the specified label with respect to the original image sample to initialize the gradient;
step S3, updating the attack disturbance according to the attack disturbance formula noise = momentum × noise + α × grad, and clipping the updated attack disturbance according to the disturbance value range epsilon;
step S4, repeating step S3 for iter_num steps to obtain the first attack disturbance.
2. The method of claim 1, wherein the preset multi-label classification network comprises: a feature extraction backbone network, a fully connected layer, and an activation function connected in sequence.
3. The method according to claim 1, wherein the clipping the first attack disturbance by using the Grad-CAM method to obtain the second attack disturbance comprises:
differentiating the loss function corresponding to the specified label with respect to the feature maps of the original image sample at each feature layer of the preset multi-label classification network, and computing a weighted sum of the resulting gradients to obtain a target feature map;
restoring the target feature map to the same scale as the original image sample by bilinear interpolation to obtain the Grad-CAM;
and performing noise clipping on the first attack disturbance according to the Grad-CAM to generate the second attack disturbance.
4. A countermeasure sample generation apparatus for a specified label, comprising:
the prediction module is used for inputting an original image sample into a preset multi-label classification network to obtain a prediction score value of each label for multi-label classification of the original image sample;
the extraction module is used for extracting a prediction score value corresponding to the specified label from the prediction score values of all the labels;
the gradient iteration module is used for generating a first attack disturbance by using the momentum iterative fast gradient sign method (MI-FGSM) according to the prediction score value corresponding to the specified label;
the clipping module is used for clipping the first attack disturbance by using the gradient-weighted class activation mapping (Grad-CAM) method to obtain a second attack disturbance;
the generating module is used for superimposing the second attack disturbance on the original image sample to generate a countermeasure sample corresponding to the specified label;
the gradient iteration module is specifically configured to:
S1, setting MI-FGSM iteration parameters, the iteration parameters including a momentum, a disturbance value range epsilon, an iteration step number iter_num, and an iteration step length alpha;
S2, calculating a partial derivative grad of the prediction score value corresponding to the specified label with respect to the original image sample to initialize the gradient;
S3, updating the attack disturbance according to the attack disturbance formula noise = momentum × noise + α × grad, and clipping the updated attack disturbance according to the disturbance value range epsilon;
S4, repeating S3 for iter_num steps to obtain the first attack disturbance.
5. The apparatus of claim 4, wherein the preset multi-label classification network comprises: a feature extraction backbone network, a fully connected layer, and an activation function connected in sequence.
6. The apparatus of claim 4, wherein the clipping module is specifically configured to:
differentiate the loss function corresponding to the specified label with respect to the feature maps of the original image sample at each feature layer of the preset multi-label classification network, and compute a weighted sum of the resulting gradients to obtain a target feature map;
restore the target feature map to the same scale as the original image sample by bilinear interpolation to obtain the Grad-CAM;
and perform noise clipping on the first attack disturbance according to the Grad-CAM to generate the second attack disturbance.
7. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor executes the computer program to implement the method according to any of claims 1 to 3.
8. A computer readable medium having computer readable instructions stored thereon which are executable by a processor to implement the method of any one of claims 1 to 3.
CN202010084790.1A 2020-02-10 2020-02-10 Countermeasure sample generation method and device for designated label, electronic equipment and medium Active CN111340180B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010084790.1A CN111340180B (en) 2020-02-10 2020-02-10 Countermeasure sample generation method and device for designated label, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010084790.1A CN111340180B (en) 2020-02-10 2020-02-10 Countermeasure sample generation method and device for designated label, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN111340180A CN111340180A (en) 2020-06-26
CN111340180B true CN111340180B (en) 2021-10-08

Family

ID=71183394

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010084790.1A Active CN111340180B (en) 2020-02-10 2020-02-10 Countermeasure sample generation method and device for designated label, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN111340180B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111783982B (en) * 2020-06-30 2024-06-04 平安国际智慧城市科技股份有限公司 Method, device, equipment and medium for acquiring attack sample
CN112836716B (en) * 2020-08-24 2022-05-20 西安交通大学 Adversarial sample detection method guided by interpretable regions
CN112215292B (en) * 2020-10-19 2022-03-29 电子科技大学 Image countermeasure sample generation device and method based on mobility
CN112270700B (en) * 2020-10-30 2022-06-28 浙江大学 Attack judgment method capable of interpreting algorithm by using deep neural network
CN112329929B (en) * 2021-01-04 2021-04-13 北京智源人工智能研究院 Countermeasure sample generation method and device based on proxy model
CN112818774B (en) * 2021-01-20 2024-08-23 中国银联股份有限公司 Living body detection method and device
CN113487545A (en) * 2021-06-24 2021-10-08 广州玖的数码科技有限公司 Method for generating perturbed images for pose estimation deep neural networks
CN113537494B (en) * 2021-07-23 2022-11-11 江南大学 Image countermeasure sample generation method based on black box scene
CN114036503B (en) * 2021-10-28 2024-04-30 广州大学 Migration attack method and device, electronic equipment and storage medium
CN114937180A (en) * 2022-03-30 2022-08-23 北京百度网讯科技有限公司 Method and device for generating countermeasure sample and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296692A (en) * 2016-08-11 2017-01-04 深圳市未来媒体技术研究院 Image saliency detection method based on adversarial networks
CN108257116A (en) * 2017-12-30 2018-07-06 清华大学 Method for generating adversarial images
CN108615048A (en) * 2018-04-04 2018-10-02 浙江工业大学 Defense method for image classifiers against adversarial attacks based on perturbation evolution
CN109272031A (en) * 2018-09-05 2019-01-25 宽凳(北京)科技有限公司 Training sample generation method and device, equipment, and medium
CN109948658A (en) * 2019-02-25 2019-06-28 浙江工业大学 Adversarial attack defense method based on feature map attention mechanism and application
CN110245598A (en) * 2019-06-06 2019-09-17 北京瑞莱智慧科技有限公司 Adversarial sample generation method, device, medium, and computing device
CN110741388A (en) * 2019-08-14 2020-01-31 东莞理工学院 Adversarial sample detection method and device, computing device, and computer storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11443178B2 (en) * 2017-12-15 2022-09-13 International Business Machines Corporation Deep neural network hardening framework
CN108446765A (en) * 2018-02-11 2018-08-24 浙江工业大学 Multi-model composite defense method against adversarial attacks for deep learning
CN110516812A (en) * 2019-07-19 2019-11-29 南京航空航天大学 AI model privacy protection method using adversarial samples to resist membership inference attacks

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296692A (en) * 2016-08-11 2017-01-04 深圳市未来媒体技术研究院 Image saliency detection method based on adversarial networks
CN108257116A (en) * 2017-12-30 2018-07-06 清华大学 Method for generating adversarial images
CN108615048A (en) * 2018-04-04 2018-10-02 浙江工业大学 Defense method for image classifiers against adversarial attacks based on perturbation evolution
CN109272031A (en) * 2018-09-05 2019-01-25 宽凳(北京)科技有限公司 Training sample generation method and device, equipment, and medium
CN109948658A (en) * 2019-02-25 2019-06-28 浙江工业大学 Adversarial attack defense method based on feature map attention mechanism and application
CN110245598A (en) * 2019-06-06 2019-09-17 北京瑞莱智慧科技有限公司 Adversarial sample generation method, device, medium, and computing device
CN110741388A (en) * 2019-08-14 2020-01-31 东莞理工学院 Adversarial sample detection method and device, computing device, and computer storage medium

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
ADVERSARIAL EXAMPLES IN THE PHYSICAL WORLD;Alexey Kurakin 等;《arXiv》;20170214;第1-14页 *
Boosting Adversarial Attacks with Momentum;Yinpeng Dong 等;《arXiv》;20180323;第1-12页 *
Grad-CAM:Visual Explanations from Deep Networks via Gradient-based Localization;Ramprasaath R. Selvaraju 等;《2017 IEEE International Conference on Computer Vision》;20171225;第618-626页 *
Learning to Confuse: Generating Training Time Adversarial Data with Auto-Encoder;Ji Feng 等;《arXiv》;20190523;第1-17页 *
The Limitations of Deep Learning in Adversarial Settings;Nicolas Papernot 等;《arXiv》;20151125;第1-16页 *
Research on Attack Behavior Prediction Based on Recurrent Neural Networks; 吕少华; China Master's Theses Full-text Database, Information Science and Technology; 20200115; Vol. 2020, No. 01; pp. I139-68 *

Also Published As

Publication number Publication date
CN111340180A (en) 2020-06-26

Similar Documents

Publication Publication Date Title
CN111340180B (en) Countermeasure sample generation method and device for designated label, electronic equipment and medium
Son et al. Urie: Universal image enhancement for visual recognition in the wild
CN110378837B (en) Target detection method and device based on fish-eye camera and storage medium
CN112686304B (en) Target detection method and device based on attention mechanism and multi-scale feature fusion and storage medium
CN114549913B (en) Semantic segmentation method and device, computer equipment and storage medium
CN112434618B (en) Video target detection method, storage medium and device based on sparse foreground priori
CN110826457B (en) Vehicle detection method and device under complex scene
CN113822328A (en) Image classification method for defending against sample attack, terminal device and storage medium
CN112348116B (en) Target detection method and device using space context and computer equipment
CN113674374B (en) Chinese text image generation method and device based on generation type countermeasure network
Li et al. Robust deep neural networks for road extraction from remote sensing images
CN111274981A (en) Target detection network construction method and device and target detection method
CN114861842B (en) Few-sample target detection method and device and electronic equipment
CN113051983B (en) Method for training field crop disease recognition model and field crop disease recognition
CN114821823B (en) Image processing, training of human face anti-counterfeiting model and living body detection method and device
CN114299358A (en) Image quality evaluation method and device, electronic equipment and machine-readable storage medium
Yin et al. Adversarial attack, defense, and applications with deep learning frameworks
CN112749702B (en) Image recognition method, device, terminal and storage medium
CN113177546A (en) Target detection method based on sparse attention module
CN115222940B (en) Semantic segmentation method, system, device and storage medium
CN115083001B (en) Anti-patch generation method and device based on image sensitive position positioning
CN114240991B (en) Instance segmentation method of RGB image
CN117437684B (en) Image recognition method and device based on corrected attention
Wu et al. Receptive field pyramid network for object detection
CN117423116B (en) Training method of text detection model, text detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant