CN116109886A - Adversarial attack method and system using class activation maps - Google Patents

Adversarial attack method and system using class activation maps

Info

Publication number
CN116109886A
CN116109886A CN202310094149.XA
Authority
CN
China
Prior art keywords
attack
round
disturbance
challenge
cam
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310094149.XA
Other languages
Chinese (zh)
Inventor
张寒萌
姜雪
刘兴钊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202310094149.XA priority Critical patent/CN116109886A/en
Publication of CN116109886A publication Critical patent/CN116109886A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/778 - Active pattern-learning, e.g. online learning of image or video features
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of deep learning and provides an adversarial attack method using class activation maps, comprising the following steps. S1: loading a trained deep learning model for generating adversarial perturbations, setting the number of attack rounds, and reading the original image data to be attacked. S2: attacking the original image data for the set number of rounds, wherein the perturbation added to the original image data in each round is obtained by weighting the initial perturbation, computed from the adversarial example generated in the previous round, with the CAM map of that example. S3: after the set number of attack rounds, generating the final adversarial example. By using the class activation map (Classification Activation Map, CAM) to limit the attack intensity in different regions of the image, the perturbation amount is reduced without affecting the attack success rate, and the method can be combined with any gradient-based attack method.

Description

Adversarial attack method and system using class activation maps
Technical Field
The invention relates to the technical field of deep learning, and in particular to an adversarial attack method and system using class activation maps.
Background
With the rapid development of deep learning, deep neural networks have been widely used in the field of computer vision in recent years. At the same time, deep learning also raises some security concerns. Researchers have found that deep neural networks are vulnerable to adversarial examples: when certain deliberately crafted minor perturbations are added to an input, the resulting adversarial example can cause the model to classify incorrectly. This way of generating adversarial examples is called an adversarial attack.
Adversarial attacks can be classified into white-box attacks and black-box attacks. In a white-box attack, the attacker has access to all information about the target model, such as its network architecture and parameters, whereas in a black-box attack the attacker cannot obtain this information. The attack method provided by the invention targets the white-box scenario. In a white-box attack, two important indicators reflect the performance of an attack method: the attack success rate and the perturbation amount. The attack success rate is the percentage of adversarial examples that are misclassified among all adversarial examples. The perturbation amount is the distance between the original image and the adversarial example under an l_p norm; the common norms are l_2, the square root of the sum of squares of the perturbation elements (for image data, a smaller l_2 norm means the adversarial example is harder for the human eye to distinguish from the original), and l_∞, the maximum absolute value of the perturbation elements. The higher the attack success rate and the lower the perturbation amount, the stronger the attack method.
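Written out explicitly (a standard formulation, not quoted from the patent), with δ = x^adv - x denoting the perturbation added to the original image, the two measures are:

```latex
\|\delta\|_{2} = \Big( \sum_{i} \delta_{i}^{2} \Big)^{1/2},
\qquad
\|\delta\|_{\infty} = \max_{i} \, |\delta_{i}| .
```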
Researchers have proposed a series of white-box attack methods, which fall into two main categories: gradient-based attacks and optimization-based attacks. The former constrain the perturbation amount within a given budget and aim for the highest possible attack success rate; representative methods include FGSM, PGD, MIM and TIM. The latter optimize the perturbation amount to be as small as possible under the constraint that the generated adversarial example is still misclassified.
Compared with optimization-based methods, gradient-based adversarial attack methods are more widely used in practice because of their higher computation speed, but their drawback is that the perturbation amount of the adversarial example is large. Existing gradient-based attack methods, including FGSM, PGD, MIM and TIM, add perturbations to the image globally, i.e., every pixel value of the image may change. However, this approach has the following problem: the image is altered excessively, and the perturbation amount, especially under the l_2 norm, is large, so the concealment is poor and the perturbation is easily perceived by the human eye.
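As an illustration of the global nature of such perturbations, the following is a minimal FGSM-style sketch (not part of the original disclosure), assuming a PyTorch classifier, a differentiable loss, and an arbitrary ε value:

```python
import torch

def fgsm_attack(model, loss_fn, x, y, eps=8 / 255):
    """Single-step FGSM: every pixel is shifted by eps along the sign of the gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    # sign(grad) is nonzero at almost every pixel, so the whole image is altered
    # and the l2 perturbation amount grows with the image size.
    x_adv = (x_adv + eps * x_adv.grad.sign()).clamp(0, 1)
    return x_adv.detach()
```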
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide an adversarial attack method and system using class activation maps, in which the attack intensity in different regions of an image is limited by means of the class activation map (Classification Activation Map, CAM), so that the perturbation amount is reduced without affecting the attack success rate. The method can be combined with any gradient-based attack method.
The above object of the present invention is achieved by the following technical solutions:
a method of counterattack using class activation graphs, comprising the steps of:
s1: loading a trained deep learning module for generating disturbance resistance, setting attack times of attack resistance, and reading original image data to be attacked;
s2: carrying out attack on the original image data according to the attack times, wherein the disturbance added to each round of the original image data is obtained by carrying out weighting operation on the initial disturbance and the CAM (CAM) graph calculated according to the countermeasure sample generated in the previous round;
s3: after the number of attacks, a final challenge sample is generated.
Further, step S1 also includes:
selecting any gradient attack method, including FGSM, PGD, MIM and TIM;
and setting the maximum number of iterations according to the type of the selected gradient attack method.
Further, step S2 also includes: the adversarial example generated in each round is obtained by adding the current round's perturbation to the adversarial example generated in the previous round. Specifically:
let the original image data be x, the perturbation added in round t be p_t, and the adversarial example generated in round t be x_t^adv. The adversarial example generated in round t+1 is
x_{t+1}^adv = x_t^adv + p_{t+1},
where x_0^adv = x, i.e., the adversarial example input to the first round is the original image data.
Further, step S2 also includes: computing the initial perturbation of the current round from the adversarial example generated in the previous round. Specifically:
the adversarial example generated in the previous round is input into the deep learning model, and the gradient of the deep learning model with respect to that adversarial example is computed by back-propagation; this gradient gives the initial perturbation of the current round.
Further, when the gradient attack method is PGD, perturbations are added to the image iteratively along the direction of increasing gradient. Specifically:
let the original image be x, its class be y, the model parameters be θ, and the loss function be L(θ, x, y); the gradient of the loss with respect to the original image is ∇_x L(θ, x, y).
Let the coefficient α denote the limit on the perturbation of each round. The initial perturbation is obtained by passing the gradient through the sign function and multiplying by α:
p̃_{t+1} = α · sign(∇_{x_t^adv} L(θ, x_t^adv, y)),
where p̃_{t+1} is the initial perturbation of round t+1 and ∇_{x_t^adv} L(θ, x_t^adv, y) is the gradient with respect to the adversarial example generated in the previous round.
Further, step S2 also includes: computing the CAM map of the current round from the adversarial example generated in the previous round. Specifically:
the CAM map of the previous round's adversarial example is an image of the same size as the input image data;
in the CAM map, pixel values distinguish how much different regions contribute to the specified class: the larger the pixel value, the higher the saliency score and the greater the contribution of the corresponding region to the prediction of the specified class. In round t+1, the CAM map computed from the adversarial example of the previous round is denoted C_{t+1}.
Further, in step S2, the perturbation added to the original image data in each round is obtained by weighting the initial perturbation, computed from the adversarial example generated in the previous round, with the CAM map. Specifically:
the input is the initial perturbation p̃_{t+1}, the weight is the CAM map C_{t+1}, and the output is the weighted perturbation p_{t+1}. The initial perturbation is directly multiplied element-wise with the CAM map, which amounts to weighting the initial perturbation pixel by pixel;
for the pixel at position (i, j) in the image, the perturbation value at (i, j) in round t+1 equals the initial perturbation at (i, j) multiplied by the score at the corresponding position (i, j) in this round's CAM map, namely
p_{t+1}(i, j) = p̃_{t+1}(i, j) · C_{t+1}(i, j).
The adversarial example finally generated in each round is
x_{t+1}^adv = x_t^adv + p̃_{t+1} ⊙ C_{t+1},
where ⊙ denotes the Hadamard product, i.e., multiplication of elements at corresponding positions.
After attacking for the set number of rounds, x_{max_iter}^adv is obtained as the final adversarial example.
An adversarial attack system for performing the adversarial attack method using class activation maps described above, comprising:
an attack preparation module, configured to load the trained deep learning model for generating adversarial perturbations, set the number of attack rounds, and read the original image data to be attacked;
an adversarial attack module, configured to attack the original image data for the set number of rounds, wherein the perturbation added to the original image data in each round is obtained by weighting the initial perturbation, computed from the adversarial example generated in the previous round, with the CAM map; and
a final sample generation module, configured to generate the final adversarial example after the set number of attack rounds.
A computer device comprising a memory and one or more processors, the memory having stored therein computer code which, when executed by the one or more processors, causes the one or more processors to perform a method as described above.
A computer readable storage medium storing computer code which, when executed, performs a method as described above.
Compared with the prior art, the invention has at least one of the following beneficial effects:
(1) An adversarial attack method using class activation maps is provided, comprising the following steps. S1: loading a trained deep learning model for generating adversarial perturbations, setting the number of attack rounds, and reading the original image data to be attacked. S2: attacking the original image data for the set number of rounds, wherein the perturbation added in each round is obtained by weighting the initial perturbation, computed from the adversarial example generated in the previous round, with the CAM map. S3: after the set number of attack rounds, generating the final adversarial example. In this technical solution, the class activation map CAM is used to limit the attack intensity in different regions of the image, so that the perturbation amount can be reduced without affecting the attack success rate.
(2) The adversarial attack method using class activation maps provided by the invention can be combined with all gradient-based attack methods, including single-step attack methods and iterative attack methods.
Drawings
FIG. 1 is a schematic view of an original aerial image of the class "river" according to the present invention;
FIG. 2 is a schematic view of the CAM map for the class "river" according to the present invention;
FIG. 3 is a schematic view of an original image of the class "beach" according to the present invention;
FIG. 4 is a schematic view of the CAM map for the class "beach" according to the present invention;
FIG. 5 is a flowchart of the adversarial attack method using class activation maps according to the present invention;
FIG. 6 is a detailed flowchart of the adversarial attack method using class activation maps according to the present invention;
FIG. 7 is an overall block diagram of the adversarial attack system using class activation maps according to the present invention.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The gradient-based adversarial attack method is more widely used in practice because of its faster computation speed, but its disadvantage is that the perturbation amount of the adversarial example is large. The invention therefore aims to design a new gradient-based attack method that reduces the perturbation amount compared with existing methods.
With the development of the theory of deep learning interpretability, researchers have proposed the class activation map (Classification Activation Map, CAM), which uses feature visualization to analyze the decision mechanism of a model. Studies have found that, for a given class, a deep neural network makes its decision based on certain regions of the image rather than the entire image. As shown in FIG. 1 to FIG. 4, two images from the AID dataset (a large-scale aerial image dataset) and the corresponding class activation maps CAM are selected. As can be seen from FIG. 1 and FIG. 2, the image class is river and, as the class activation map CAM shows, the model pays more attention to the river channel region; as can be seen from FIG. 3 and FIG. 4, the image class is beach and the model pays more attention to the coastline region.
Therefore, the invention makes full use of this characteristic of deep neural networks and uses the class activation map CAM as a weight on the perturbation, so that the perturbation is added in the regions the model attends to most, thereby reducing the perturbation amount while maintaining the attack success rate. The invention can be used as a plug-and-play module and combined with any gradient-based method.
The invention is described below through specific embodiments.
first embodiment
As shown in FIG. 5 and FIG. 6, this embodiment provides an adversarial attack method using class activation maps, comprising the following steps:
S1: loading a trained deep learning model for generating adversarial perturbations, setting the number of attack rounds, and reading the original image data to be attacked.
Specifically, in this embodiment, a trained deep learning model for generating the subsequent adversarial perturbations is first loaded. The deep learning model can be any kind of model trained with existing techniques for this purpose; since it is not the core of the invention, it is not described in detail in this embodiment.
In addition, this embodiment does not limit the gradient attack method used; the technical point of the invention can be combined with any gradient-based method. The gradient attack method can be any one of FGSM, PGD, MIM and TIM. The maximum number of iterations (i.e., the number of attack rounds) is set according to the type of the selected gradient attack method. For example, FGSM is a single-step attack method, so its maximum number of iterations is 1; PGD, MIM and TIM are iterative attack methods, whose maximum number of iterations is set according to actual needs. Taking PGD as an example in this embodiment, the maximum number of iterations max_iter can be set to 5.
The original image data is then read, either as a single image or as a batch of images.
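A minimal sketch of step S1 in PyTorch is shown below; the ResNet-18 backbone, the file names, and the parameter values are illustrative assumptions rather than requirements of the invention:

```python
import torch
from PIL import Image
from torchvision import models, transforms

# S1: load a trained model, set the attack parameters, read the image to attack.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet18(num_classes=30)  # hypothetical backbone; AID has 30 scene classes
model.load_state_dict(torch.load("aid_resnet18.pt", map_location=device))  # hypothetical weights
model = model.to(device).eval()

max_iter = 5     # number of attack rounds (PGD-style iterative attack, as in this embodiment)
alpha = 2 / 255  # per-round perturbation limit (illustrative value)

preprocess = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
x = preprocess(Image.open("aerial_river.jpg")).unsqueeze(0).to(device)  # hypothetical image file
y = torch.tensor([0], device=device)  # index of the true class, assumed known
```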
S2: attacking the original image data for the set number of rounds, wherein the perturbation added to the original image data in each round is obtained by weighting the initial perturbation, computed from the adversarial example generated in the previous round, with the CAM map.
Specifically, in the attack part, max_iter rounds of attack are performed on the original image data. Counting from 1, the starting round is 1 and the ending round is max_iter. Let the original image data be x, the perturbation added in round t be p_t, and the adversarial example generated in round t be x_t^adv. The adversarial example of each round is generated by adding the current round's perturbation to the adversarial example of the previous round, so the adversarial example generated in round t+1 is
x_{t+1}^adv = x_t^adv + p_{t+1},
where x_0^adv = x, i.e., the adversarial example input to the first round is the original image data.
For each attack round, the initial perturbation of the current round is first computed from the adversarial example generated in the previous round. Here, the initial perturbation refers to the perturbation obtained directly from the gradient attack method, as opposed to the weighted perturbation. It is computed as follows: the adversarial example generated in the previous round is input into the deep learning model, and the gradient of the deep learning model with respect to that adversarial example is computed by back-propagation, giving the initial perturbation of the current round.
Taking the gradient attack method PGD as an example, perturbations are added to the image iteratively along the direction of increasing gradient. Specifically: let the original image be x, its class be y, the model parameters be θ, and the loss function be L(θ, x, y); the gradient of the loss with respect to the original image is ∇_x L(θ, x, y).
Let the coefficient α denote the limit on the perturbation of each round. The initial perturbation is obtained by passing the gradient through the sign function and multiplying by α:
p̃_{t+1} = α · sign(∇_{x_t^adv} L(θ, x_t^adv, y)),
where p̃_{t+1} is the initial perturbation of round t+1 and ∇_{x_t^adv} L(θ, x_t^adv, y) is the gradient with respect to the adversarial example generated in the previous round.
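A sketch of this gradient step, assuming a PyTorch model and a cross-entropy loss (the loss choice is an assumption; the patent only requires a differentiable loss L(θ, x, y)):

```python
import torch
import torch.nn.functional as F

def initial_perturbation(model, x_adv, y, alpha):
    """Initial (unweighted) perturbation for the next round: alpha times the sign of the
    loss gradient with respect to the previous round's adversarial example."""
    x_adv = x_adv.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (alpha * grad.sign()).detach()
```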
The CAM map of the current round is then computed from the adversarial example generated in the previous round. Specifically: the CAM map of the previous round's adversarial example is an image of the same size as the input image data. In the CAM map, pixel values distinguish how much different regions contribute to the specified class: the larger the pixel value, the higher the saliency score and the greater the contribution of the corresponding region to the prediction of the specified class; the pixel values lie in the range [0, 1]. There is a family of CAM algorithms, such as CAM, Grad-CAM and Grad-CAM++. Because the original CAM requires the fully connected layer of the network to be replaced by a global average pooling (GAP) layer, the model must be retrained whenever its structure does not match, which makes CAM inconvenient in practice; therefore, later refinements of CAM, such as Grad-CAM and Grad-CAM++, are more commonly used. The specific algorithms are not described in detail here. In round t+1, the CAM map generated from the adversarial example of the previous round is denoted C_{t+1}.
For every round of an iterative attack except the first, the CAM map of the previous round's adversarial example is computed. In the first round of an iterative attack, and in a single-step attack, the CAM map of the original image is computed.
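Since the patent does not prescribe a particular CAM algorithm, the sketch below computes a Grad-CAM style map by hand for a CNN whose last convolutional layer is passed in as `conv_layer`; the use of forward/backward hooks and the normalization of the scores to [0, 1] are implementation assumptions:

```python
import torch
import torch.nn.functional as F

def grad_cam_map(model, conv_layer, x, target_class):
    """Grad-CAM style saliency map, normalized to [0, 1] and resized to the input resolution."""
    feats, grads = [], []
    fwd = conv_layer.register_forward_hook(lambda m, i, o: feats.append(o))
    bwd = conv_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    try:
        x = x.clone().detach().requires_grad_(True)
        score = model(x)[0, target_class]  # logit of the specified class
        score.backward()
    finally:
        fwd.remove()
        bwd.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)            # channel weights: spatial mean of gradients
    cam = F.relu((weights * feats[0]).sum(dim=1, keepdim=True))  # weighted channel sum, negative evidence removed
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
    return ((cam - cam.min()) / (cam.max() - cam.min() + 1e-8)).detach()  # saliency scores in [0, 1]
```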
The following is the key step of the invention, namely the weighting of the initial adversarial perturbation. The input is the initial perturbation p̃_{t+1}, the weight is the CAM map C_{t+1}, and the output is the weighted perturbation p_{t+1}. The initial adversarial perturbation and the CAM map both have the same size as the original image, so they have the same size as each other; the initial perturbation is directly multiplied element-wise with the CAM map, which amounts to weighting the initial perturbation pixel by pixel.
For the pixel at position (i, j) in the image, the perturbation value at (i, j) in round t+1 equals the initial perturbation at (i, j) multiplied by the score at the corresponding position (i, j) in this round's CAM map, namely
p_{t+1}(i, j) = p̃_{t+1}(i, j) · C_{t+1}(i, j).
The adversarial example finally generated in each round is
x_{t+1}^adv = x_t^adv + p̃_{t+1} ⊙ C_{t+1},
where ⊙ denotes the Hadamard product, i.e., multiplication of elements at corresponding positions.
After attacking for the set number of rounds, x_{max_iter}^adv is obtained.
Furthermore, it should be noted that the attack may be terminated early during the iteration: if the attack has already succeeded before round max_iter is reached, the iteration can be stopped.
In another embodiment, the CAM map can be binarized and then used as a mask for the adversarial perturbation. The difference from this embodiment is that no binarization is performed here; the floating-point values are retained, so the weighting is finer-grained.
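For comparison, the binarized variant mentioned above could be sketched as follows (the 0.5 threshold is an arbitrary assumption; `cam` is the floating-point map in [0, 1]):

```python
import torch

def binarize_cam(cam: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Hard 0/1 mask variant of the CAM weights; this embodiment instead keeps the float values."""
    return (cam > threshold).float()
```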
S3: after the set number of attack rounds, the final adversarial example x_{max_iter}^adv is generated.
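Putting steps S1 to S3 together, a minimal sketch of the CAM-weighted iterative attack might look as follows; `initial_perturbation` and `grad_cam_map` are the hypothetical helpers sketched above, the clamp to [0, 1] is an implementation detail not spelled out in the patent, and the early exit follows the note on terminating once the attack succeeds:

```python
import torch

def cam_weighted_pgd(model, conv_layer, x, y, alpha=2 / 255, max_iter=5):
    """Steps S2-S3: iteratively add CAM-weighted gradient-sign perturbations,
    stopping early once the model misclassifies the adversarial example."""
    x_adv = x.clone().detach()
    for _ in range(max_iter):
        p_init = initial_perturbation(model, x_adv, y, alpha)   # gradient-sign step (see sketch above)
        cam = grad_cam_map(model, conv_layer, x_adv, y.item())  # saliency weights in [0, 1]
        x_adv = (x_adv + p_init * cam).clamp(0, 1).detach()     # Hadamard-weighted update
        with torch.no_grad():
            if model(x_adv).argmax(dim=1).item() != y.item():   # attack already succeeded
                break
    return x_adv
```

For the ResNet-18 example above, `conv_layer` could be `model.layer4[-1]` (an assumption about the backbone), so the final adversarial example would be obtained as `x_adv = cam_weighted_pgd(model, model.layer4[-1], x, y, alpha, max_iter)`.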
Second embodiment
As shown in FIG. 7, this embodiment provides an adversarial attack system for performing the adversarial attack method using class activation maps of the first embodiment, comprising:
an attack preparation module 1, configured to load the trained deep learning model for generating adversarial perturbations, set the number of attack rounds, and read the original image data to be attacked;
an adversarial attack module 2, configured to attack the original image data for the set number of rounds, wherein the perturbation added to the original image data in each round is obtained by weighting the initial perturbation, computed from the adversarial example generated in the previous round, with the CAM map; and
a final sample generation module 3, configured to generate the final adversarial example after the set number of attack rounds.
A computer readable storage medium storing computer code which, when executed, performs the method described above. Those of ordinary skill in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing related hardware; the program may be stored in a computer readable storage medium, and the storage medium may include: read-only memory (ROM), random access memory (RAM), magnetic disk, optical disk, and the like.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples; all technical solutions falling under the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations that may occur to those skilled in the art without departing from the principles of the present invention are also intended to be within the scope of the present invention.
The technical features of the above-described embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as there is no contradiction between them, such combinations should be considered within the scope of this description.
It should be noted that the above embodiments can be freely combined as needed. The foregoing is merely a preferred embodiment of the present invention, and modifications and adaptations may be made by those skilled in the art without departing from the principles of the present invention; such modifications are also to be regarded as within the scope of the present invention.

Claims (10)

1. An adversarial attack method using class activation maps, comprising the following steps:
S1: loading a trained deep learning model for generating adversarial perturbations, setting the number of attack rounds, and reading the original image data to be attacked;
S2: attacking the original image data for the set number of rounds, wherein the perturbation added to the original image data in each round is obtained by weighting the initial perturbation, computed from the adversarial example generated in the previous round, with the CAM map of that example;
S3: after the set number of attack rounds, generating the final adversarial example.
2. The adversarial attack method using class activation maps according to claim 1, wherein step S1 further comprises:
selecting any gradient attack method, including FGSM, PGD, MIM and TIM;
and setting the maximum number of iterations according to the type of the selected gradient attack method.
3. The adversarial attack method using class activation maps according to claim 1, wherein step S2 further comprises: the adversarial example generated in each round is obtained by adding the current round's perturbation to the adversarial example generated in the previous round. Specifically:
let the original image data be x, the perturbation added in round t be p_t, and the adversarial example generated in round t be x_t^adv; the adversarial example generated in round t+1 is
x_{t+1}^adv = x_t^adv + p_{t+1},
where x_0^adv = x, i.e., the adversarial example input to the first round is the original image data.
4. The adversarial attack method using class activation maps according to claim 1, wherein step S2 further comprises: computing the initial perturbation of the current round from the adversarial example generated in the previous round. Specifically:
the adversarial example generated in the previous round is input into the deep learning model, and the gradient of the deep learning model with respect to that adversarial example is computed by back-propagation, giving the initial perturbation of the current round.
5. The adversarial attack method using class activation maps according to claim 4, further comprising:
when the gradient attack method is PGD, adding perturbations to the image iteratively along the direction of increasing gradient. Specifically:
let the original image be x, its class be y, the model parameters be θ, and the loss function be L(θ, x, y); the gradient of the loss with respect to the original image is ∇_x L(θ, x, y);
let the coefficient α denote the limit on the perturbation of each round; the initial perturbation is obtained by passing the gradient through the sign function and multiplying by α:
p̃_{t+1} = α · sign(∇_{x_t^adv} L(θ, x_t^adv, y)),
where p̃_{t+1} is the initial perturbation of round t+1 and ∇_{x_t^adv} L(θ, x_t^adv, y) is the gradient with respect to the adversarial example generated in the previous round.
6. The adversarial attack method using class activation maps according to claim 5, wherein step S2 further comprises: computing the CAM map of the current round from the adversarial example generated in the previous round. Specifically:
the CAM map of the previous round's adversarial example is an image of the same size as the input image data;
in the CAM map, pixel values distinguish how much different regions contribute to the specified class: the larger the pixel value, the higher the saliency score and the greater the contribution of the corresponding region to the prediction of the specified class; in round t+1, the CAM map computed from the adversarial example of the previous round is denoted C_{t+1}.
7. The adversarial attack method using class activation maps according to claim 6, wherein in step S2, the perturbation added to the original image data in each round is obtained by weighting the initial perturbation, computed from the adversarial example generated in the previous round, with the CAM map. Specifically:
the input is the initial perturbation p̃_{t+1}, the weight is the CAM map C_{t+1}, and the output is the weighted perturbation p_{t+1}; the initial perturbation is directly multiplied element-wise with the CAM map, which amounts to weighting the initial perturbation pixel by pixel;
for the pixel at position (i, j) in the image, the perturbation value at (i, j) in round t+1 equals the initial perturbation at (i, j) multiplied by the score at the corresponding position (i, j) in this round's CAM map, namely
p_{t+1}(i, j) = p̃_{t+1}(i, j) · C_{t+1}(i, j);
the adversarial example finally generated in each round is
x_{t+1}^adv = x_t^adv + p̃_{t+1} ⊙ C_{t+1},
where ⊙ denotes the Hadamard product, i.e., multiplication of elements at corresponding positions;
after attacking for the set number of rounds, x_{max_iter}^adv is obtained as the final adversarial example.
8. An adversarial attack system for executing the adversarial attack method using class activation maps according to any one of claims 1 to 7, comprising:
an attack preparation module, configured to load the trained deep learning model for generating adversarial perturbations, set the number of attack rounds, and read the original image data to be attacked;
an adversarial attack module, configured to attack the original image data for the set number of rounds, wherein the perturbation added to the original image data in each round is obtained by weighting the initial perturbation, computed from the adversarial example generated in the previous round, with the CAM map; and
a final sample generation module, configured to generate the final adversarial example after the set number of attack rounds.
9. A computer device comprising a memory and one or more processors, the memory having stored therein computer code that, when executed by the one or more processors, causes the one or more processors to perform the method of any of claims 1-7.
10. A computer readable storage medium storing computer code which, when executed, performs the method of any one of claims 1 to 7.
CN202310094149.XA 2023-02-10 2023-02-10 Adversarial attack method and system using class activation maps Pending CN116109886A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310094149.XA CN116109886A (en) Adversarial attack method and system using class activation maps

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310094149.XA CN116109886A (en) Adversarial attack method and system using class activation maps

Publications (1)

Publication Number Publication Date
CN116109886A true CN116109886A (en) 2023-05-12

Family

ID=86261286

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310094149.XA Pending CN116109886A (en) Adversarial attack method and system using class activation maps

Country Status (1)

Country Link
CN (1) CN116109886A (en)

Similar Documents

Publication Publication Date Title
CN110222831A (en) Robustness appraisal procedure, device and the storage medium of deep learning model
CN112883874B (en) Active defense method aiming at deep face tampering
Li et al. Black-box attack against handwritten signature verification with region-restricted adversarial perturbations
Ye et al. Detection defense against adversarial attacks with saliency map
CN113627543A (en) Anti-attack detection method
Chen et al. Patch selection denoiser: An effective approach defending against one-pixel attacks
CN112528675A (en) Confrontation sample defense algorithm based on local disturbance
CN116071797B (en) Sparse face comparison countermeasure sample generation method based on self-encoder
CN116109886A (en) Adversarial attack method and system using class activation maps
CN116188439A (en) False face-changing image detection method and device based on identity recognition probability distribution
CN114638356A (en) Static weight guided deep neural network back door detection method and system
CN115620100A (en) Active learning-based neural network black box attack method
CN113486736A (en) Black box anti-attack method based on active subspace and low-rank evolution strategy
Zhang et al. Certified defense against patch attacks via mask-guided randomized smoothing
CN111723864A (en) Method and device for performing countermeasure training by using internet pictures based on active learning
CN117786682B (en) Physical challenge attack resisting method, device, equipment and medium based on enhanced framework
Mao et al. Object-free backdoor attack and defense on semantic segmentation
Jellali et al. Data Augmentation for Convolutional Neural Network DeepFake Image Detection
Yang et al. Network traffic threat feature recognition based on a convolutional neural network
Wang et al. The Security Threat of Adversarial Samples to Deep Learning Networks
Luo et al. Defective Convolutional Networks
Liu et al. Adversarial examples generated from sample subspace
Yang et al. LpAdvGAN: Noise Optimization Based Adversarial Network Generation Adversarial Example
Zhu et al. Adversarial Example Defense via Perturbation Grading Strategy
CN117786682A (en) Physical challenge attack resisting method, device, equipment and medium based on enhanced framework

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination