CN111737691B - Method and device for generating adversarial samples - Google Patents

Info

Publication number
CN111737691B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010725498.3A
Other languages
Chinese (zh)
Other versions
CN111737691A (en)
Inventor
傅驰林
黄启印
周俊
张晓露
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010725498.3A
Publication of CN111737691A
Application granted
Publication of CN111737691B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation

Abstract

The embodiments of this specification provide a method and a device for generating an adversarial sample. The method comprises: obtaining the current adversarial sample to be strengthened in the current round of iteration; performing a preset geometric deformation on the current adversarial sample a first number of times, in the direction that reduces a target loss function, to obtain a deformed image; performing a pixel-by-pixel update on the deformed image a second number of times to obtain a first adversarial sample; performing the pixel-by-pixel update on the current adversarial sample a third number of times to obtain a second adversarial sample; determining whichever of the first and second adversarial samples has the smaller loss value as the updated adversarial sample; when the stop-iteration condition is satisfied, taking the updated adversarial sample as the final adversarial sample; and when the stop-iteration condition is not satisfied, performing the next round of iteration based on the updated adversarial sample. The generated adversarial sample is thereby more aggressive, which allows defenses to be designed against it in a targeted way.

Description

Method and device for generating adversarial samples
Technical Field
One or more embodiments of this specification relate to the field of computers, and more particularly to a method and apparatus for generating an adversarial sample.
Background
With the large-scale application of image recognition models, attacks against them emerge one after another, and research must keep pace so that potential attack techniques are discovered and defended against before they cause harm. Among the many attack methods, the adversarial attack is a novel and highly aggressive one. An adversarial sample is obtained by deliberately adding interference to an input sample, and it causes the image recognition model to give an erroneous output with high confidence.
In prior-art methods for generating adversarial samples, a tiny high-frequency perturbation is usually added to the original image. Such adversarial samples are easily blocked by filtering-based defenses and are therefore not very aggressive.
Improved solutions are therefore desired that can generate more aggressive adversarial samples, so that defenses can be designed against them in a targeted way.
Disclosure of Invention
One or more embodiments of this specification describe a method and an apparatus for generating an adversarial sample, which make the generated adversarial sample more aggressive so that targeted defenses can be built against it.
In a first aspect, a method for generating an adversarial sample is provided, the method comprising:
obtaining a current adversarial sample to be strengthened in the current round of iteration, wherein the current adversarial sample is an image used to attack a target image recognition model;
performing a preset geometric deformation on the current adversarial sample a first number of times, in the direction that reduces a target loss function, to obtain a deformed image; performing a pixel-by-pixel update on the deformed image a second number of times to obtain a first adversarial sample; wherein the pixel-by-pixel update aims to reduce the target loss function, and the target loss function is determined from the similarity between the adversarial sample, as recognized by the target image recognition model, and a target image;
performing the pixel-by-pixel update on the current adversarial sample a third number of times to obtain a second adversarial sample;
determining a first loss value of the target loss function corresponding to the first adversarial sample and a second loss value of the target loss function corresponding to the second adversarial sample; determining whichever of the first adversarial sample and the second adversarial sample has the smaller loss value as an updated adversarial sample;
when a stop-iteration condition is satisfied, taking the updated adversarial sample as the final adversarial sample;
when the stop-iteration condition is not satisfied, performing the next round of iteration based on the updated adversarial sample.
In one possible implementation, the current round of iteration is the first round, and the current adversarial sample is the attacker's original image; or,
the current round of iteration is not the first round, and the current adversarial sample is the updated adversarial sample generated in the previous round of iteration.
In one possible implementation, the preset geometric deformation comprises:
detecting initial key points in the current adversarial sample to obtain first coordinates of the initial key points;
computing the gradient of the target loss function with respect to the initial key points, in the direction that reduces the target loss function;
moving the initial key points along the gradient direction by a first step size to obtain second coordinates of the initial key points after the expected geometric deformation;
solving a transformation matrix from the first coordinates and the second coordinates of the initial key points;
transforming the current adversarial sample according to the transformation matrix.
Further, the current adversarial sample is a face image, and the initial key points are key points describing the eyebrows, eyes, nose, mouth or face contour.
In one possible implementation, the pixel-by-pixel update comprises:
computing, for each pixel of the image, the gradient in the direction that reduces the target loss function;
and updating each pixel according to a second step size and the gradient, subject to a constraint condition.
In one possible implementation, the attack is a non-targeted attack, the target image is the attacker's original image, and the target loss function is positively correlated with the similarity; or the attack is a targeted attack, the target image is not the attacker's original image, and the target loss function is negatively correlated with the similarity.
In one possible implementation, Gaussian noise is added to the first adversarial sample and the second adversarial sample.
In one possible implementation, the stop-iteration condition comprises: the current number of iterations reaches a preset number; or the smaller of the first loss value and the second loss value is less than a preset value.
In one possible implementation, the method further comprises:
training a detection model with the final adversarial sample, the detection model being used to classify an input image as a normal sample or an adversarial sample.
Further, the method further comprises:
inputting an image to be recognized into the trained detection model;
and when the detection model outputs that the image to be recognized is a normal sample, inputting the image to be recognized into the target image recognition model, which performs image recognition.
Further, the method further comprises:
when the detection model outputs that the image to be recognized is an adversarial sample, reviewing the image manually.
In a second aspect, an apparatus for generating an adversarial sample is provided, the apparatus comprising:
an acquisition unit, configured to obtain a current adversarial sample to be strengthened in the current round of iteration, wherein the current adversarial sample is an image used to attack a target image recognition model;
a first branch unit, configured to perform a preset geometric deformation a first number of times on the current adversarial sample obtained by the acquisition unit, in the direction that reduces a target loss function, to obtain a deformed image, and to perform a pixel-by-pixel update on the deformed image a second number of times to obtain a first adversarial sample; wherein the pixel-by-pixel update aims to reduce the target loss function, and the target loss function is determined from the similarity between the adversarial sample, as recognized by the target image recognition model, and a target image;
a second branch unit, configured to perform the pixel-by-pixel update a third number of times on the current adversarial sample obtained by the acquisition unit to obtain a second adversarial sample;
a single-round determination unit, configured to determine a first loss value of the target loss function corresponding to the first adversarial sample obtained by the first branch unit and a second loss value of the target loss function corresponding to the second adversarial sample obtained by the second branch unit, and to determine whichever of the first adversarial sample and the second adversarial sample has the smaller loss value as an updated adversarial sample;
a final determination unit, configured to take the updated adversarial sample determined by the single-round determination unit as the final adversarial sample when a stop-iteration condition is satisfied;
and an iteration triggering unit, configured to perform the next round of iteration based on the updated adversarial sample when the stop-iteration condition is not satisfied.
In a third aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
In a fourth aspect, there is provided a computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements the method of the first aspect.
According to the method and apparatus provided by the embodiments of this specification, the current adversarial sample to be strengthened in the current round of iteration is first obtained, the current adversarial sample being an image used to attack the target image recognition model. Two branch processing flows are then executed: in one branch, geometric deformation is performed on the current adversarial sample first, followed by the pixel-by-pixel update; in the other branch, the pixel-by-pixel update is performed on the current adversarial sample directly. At each iteration, the result of the branch that reduces the target loss function more is selected, either as the final adversarial sample or as the updated adversarial sample for the next iteration. The embodiments thus combine geometric deformation with pixel-by-pixel updating. Because the change that geometric deformation makes to the image is not a small, high-frequency perturbation, defense algorithms that remove noise by high-frequency filtering and the like cannot block adversarial samples obtained through geometric deformation. In addition, adaptively selecting the branch result during iteration gives the final adversarial sample a better attack effect against defense algorithms. The generated adversarial sample is therefore more aggressive, which enables targeted defense.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed herein;
FIG. 2 shows a flowchart of a method for generating an adversarial sample according to one embodiment;
FIG. 3 shows a schematic diagram of a geometric deformation according to one embodiment;
FIG. 4 shows an overall flowchart for generating an adversarial sample according to one embodiment;
FIG. 5 shows a schematic block diagram of an apparatus for generating an adversarial sample according to one embodiment.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in this specification. The implementation scenario involves the generation of adversarial samples. Referring to FIG. 1, an image recognition model is used to classify an input image. The original image belongs to category A, and an adversarial sample is obtained by adding interference to the original image. Because the interference is small, it is not perceived by the human eye, and to a human observer the adversarial sample still belongs to category A; however, when the adversarial sample is input to the image recognition model, the recognition result is category B. This type of attack, which causes the model to give a false output with high confidence by deliberately adding interference to the input sample, is known as an adversarial attack.
It will be appreciated that the effectiveness of an adversarial attack depends on the adversarial samples generated, and that defenses can generally be tailored to adversarial samples generated in different ways.
The embodiments of this specification are applicable to white-box attacks and to black-box attacks. In a white-box attack, the adversarial sample is generated with the training set and the structure and parameters of the target image recognition model known. In a black-box attack, the adversarial sample is generated with the structure and parameters of the target image recognition model unknown.
The embodiments of this specification are applicable to non-targeted attacks and to targeted attacks. A non-targeted attack causes the target image recognition model to recognize the input image as any category other than its true category; the criterion for success is that the similarity between the adversarial sample, as recognized by the target image recognition model, and the original image is smaller than a threshold a. For example, if the true category of the input image is apple, the attack succeeds as long as the recognition result is not apple, whether it is pear, peach, mango, or something else; likewise, if the true identity of the input image is person A, the attack succeeds as long as the recognition result is any other person. A targeted attack causes the target image recognition model to recognize the input image as a specified target category; the criterion for success is that the similarity between the adversarial sample, as recognized by the target image recognition model, and the target image is greater than a threshold b. For example, if the true category of the input image is apple and the target category is mango, the attack succeeds only if the recognition result is mango, and fails if the recognition result is pear; likewise, if the true identity is person A and the target identity is person B, the attack succeeds only if the recognition result is person B, and fails otherwise.
It will be appreciated that, in an adversarial attack, the input image of the target image recognition model is an adversarial sample, generated by adding interference to the original image. The target image recognition model may be a face recognition model; the embodiments of this specification often use a face recognition model as an example, but the target image recognition model is not limited to this. Face recognition models are commonly used for face-scan payment, access control systems and the like. If a face recognition model is attacked, user privacy is easily leaked and economic loss easily caused, so attacks of this type need to be discovered in advance and defended against in a targeted way.
FIG. 2 shows a flowchart of a method for generating an adversarial sample according to one embodiment. Based on the implementation scenario shown in FIG. 1, the method can generate an adversarial sample for a target image recognition model from an original image. As shown in FIG. 2, the method of this embodiment comprises the following steps:
step 21, obtaining a current adversarial sample to be strengthened in the current round of iteration, wherein the current adversarial sample is an image used to attack the target image recognition model;
step 22, performing a preset geometric deformation on the current adversarial sample a first number of times, in the direction that reduces a target loss function, to obtain a deformed image; performing a pixel-by-pixel update on the deformed image a second number of times to obtain a first adversarial sample; wherein the pixel-by-pixel update aims to reduce the target loss function, and the target loss function is determined from the similarity between the adversarial sample, as recognized by the target image recognition model, and a target image;
step 23, performing the pixel-by-pixel update on the current adversarial sample a third number of times to obtain a second adversarial sample;
step 24, determining a first loss value of the target loss function corresponding to the first adversarial sample and a second loss value of the target loss function corresponding to the second adversarial sample; determining whichever of the first adversarial sample and the second adversarial sample has the smaller loss value as an updated adversarial sample;
step 25, when the stop-iteration condition is satisfied, taking the updated adversarial sample as the final adversarial sample;
step 26, when the stop-iteration condition is not satisfied, performing the next round of iteration based on the updated adversarial sample.
The specific execution of each step is described below.
First, in step 21, the current adversarial sample to be strengthened in the current round of iteration is obtained, the current adversarial sample being an image used to attack the target image recognition model. It will be appreciated that the current round may be any one of multiple rounds of iteration, and that the attack effect of the adversarial sample is strengthened gradually over the rounds.
In one example, the current round of iteration is the first round, and the current adversarial sample is the attacker's original image; or the current round of iteration is not the first round, and the current adversarial sample is the updated adversarial sample generated in the previous round of iteration.
It will be appreciated that, when the target image recognition model is a face recognition model, the attacker's original image may be a face image.
Then, in step 22, a preset geometric deformation is performed on the current adversarial sample a first number of times, in the direction that reduces the target loss function, to obtain a deformed image; a pixel-by-pixel update is performed on the deformed image a second number of times to obtain a first adversarial sample; the pixel-by-pixel update aims to reduce the target loss function, and the target loss function is determined from the similarity between the adversarial sample, as recognized by the target image recognition model, and a target image. It will be appreciated that the first adversarial sample is obtained by combining geometric deformation with pixel-by-pixel updating.
In one example, the preset geometric deformation comprises:
detecting initial key points in the current adversarial sample to obtain first coordinates of the initial key points;
computing the gradient of the target loss function with respect to the initial key points, in the direction that reduces the target loss function;
moving the initial key points along the gradient direction by a first step size to obtain second coordinates of the initial key points after the expected geometric deformation;
solving a transformation matrix from the first coordinates and the second coordinates of the initial key points;
transforming the current adversarial sample according to the transformation matrix.
Further, the current adversarial sample is a face image, and the initial key points are key points describing the eyebrows, eyes, nose, mouth or face contour. It will be appreciated that, when the current adversarial sample is not a face image, the initial key points may be chosen as needed; for example, pixels on the contour of an object in the current adversarial sample may be taken as the initial key points.
FIG. 3 shows a schematic diagram of a geometric deformation according to one embodiment. Referring to FIG. 3, a face key point detection model is first used to detect the face key points $P = \{p_1, \dots, p_k\}$ in the adversarial sample, where $p_i = (u_i, v_i)$ and $u$, $v$ are coordinates on the adversarial sample picture. Let $p_i^{adv} = (u_i^{adv}, v_i^{adv})$ be the coordinates of the key point corresponding to $p_i$ after the adversarial sample is geometrically deformed, and let $M$ be the transformation matrix of the geometric deformation, so that $P^{adv} = MP$. Before each geometric deformation, set $P^{adv} = P$. The gradient of the loss function $L$ with respect to $P^{adv}$ is computed, and $P^{adv}$ is moved by a step of length $\eta$ along the gradient direction that reduces $L$, giving the expected coordinates of the face key points after geometric deformation:
$$P^{adv} \leftarrow P^{adv} - \eta \, \nabla_{P^{adv}} L$$
Finally, the transformation matrix $M$ is solved from the key points $P$ before deformation and the key points $P^{adv}$ after deformation: $M = P^{adv} P^{-1}$. The whole adversarial sample picture $x$ is transformed according to $M$ to obtain a new adversarial sample, $x^{adv} = Mx$.
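The key point step and the solution of $M$ described above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the patented implementation: the function names are hypothetical, the gradient is taken numerically against a toy loss rather than through a recognition model, and $M$ is fitted as a 2x3 affine matrix by least squares (a stand-in for $M = P^{adv} P^{-1}$, since $P$ is generally not square).

```python
import numpy as np

def numerical_gradient(loss_fn, points, h=1e-5):
    """Central-difference gradient of loss_fn w.r.t. every key point coordinate."""
    grad = np.zeros_like(points, dtype=float)
    for idx in np.ndindex(points.shape):
        step = np.zeros_like(points, dtype=float)
        step[idx] = h
        grad[idx] = (loss_fn(points + step) - loss_fn(points - step)) / (2 * h)
    return grad

def solve_transform(p, p_adv):
    """Least-squares 2x3 affine matrix M mapping [u, v, 1] to the moved key points."""
    p_h = np.hstack([p, np.ones((p.shape[0], 1))])      # (k, 3) homogeneous coordinates
    m_t, *_ = np.linalg.lstsq(p_h, p_adv, rcond=None)   # solves p_h @ M.T ~= p_adv
    return m_t.T

def deform_keypoints(p, loss_fn, eta):
    """Move key points one step of size eta against the gradient, then fit M."""
    p_adv = p - eta * numerical_gradient(loss_fn, p)
    return p_adv, solve_transform(p, p_adv)

# Toy loss (hypothetical): squared distance to a layout shifted by (2, -1).
keypoints = np.array([[10.0, 10.0], [30.0, 10.0], [20.0, 25.0], [20.0, 35.0]])
toy_loss = lambda pts: float(np.sum((pts - (keypoints + [2.0, -1.0])) ** 2))
moved, M = deform_keypoints(keypoints, toy_loss, eta=0.25)
```

With this toy loss every key point moves by the same offset, so the fitted matrix is close to a pure translation. Applying $M$ to the whole picture would additionally need an image warping routine, which is left out here.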
In one example, the pixel-by-pixel update comprises:
solving the gradient of each pixel point of the image in the direction of reducing the target loss function;
and updating each pixel point under the condition of meeting the constraint condition according to the second step length and the gradient.
It will be appreciated that the above constraint limits how much the picture may be modified, so that the pixel-by-pixel update produces only small, high-frequency perturbations. For example, the gradient of the loss function $L$ is taken directly with respect to the adversarial sample picture $x$, and the adversarial sample is then updated according to the step size $\alpha$ and the infinity-norm ($L_\infty$) constraint $\epsilon$ to obtain the updated adversarial sample $x^{adv}$:
$$x^{adv} = x - \alpha \, \nabla_x L$$
where the constraint is
$$\text{s.t.} \quad \lVert x^{adv} - x \rVert_\infty \le \epsilon$$
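One step of the pixel-by-pixel update can be sketched as a gradient step followed by a projection back into the $L_\infty$ ball. The sketch below is illustrative only: the name `pixel_update`, the reference image `x_ref` at which the ball is centred, and the pixel range [0, 1] are assumptions, and the plain gradient is used where an implementation might instead use its sign.

```python
import numpy as np

def pixel_update(x, x_ref, grad, alpha, eps):
    """One loss-reducing step on the pixels, kept within eps of x_ref in L_inf norm."""
    x_new = x - alpha * grad                            # step against the gradient
    x_new = np.clip(x_new, x_ref - eps, x_ref + eps)    # enforce the L_inf constraint
    return np.clip(x_new, 0.0, 1.0)                     # assumed valid pixel range
```

For example, with `x = x_ref = [[0.5, 0.5]]`, a gradient of `[[-0.5, 0.5]]`, `alpha = 0.5` and `eps = 0.1`, the unconstrained step would land at `[[0.75, 0.25]]`, and the projection pulls it back to within 0.1 of the reference.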
In one example, the attack is a non-targeted attack, the target image is the attacker's original image, and the target loss function is positively correlated with the similarity; that is, the smaller the similarity between the adversarial sample, as recognized by the target image recognition model, and the original image, the smaller the target loss function. Alternatively, the attack is a targeted attack, the target image is not the attacker's original image, and the target loss function is negatively correlated with the similarity; that is, the greater the similarity between the adversarial sample, as recognized by the target image recognition model, and the target image, the smaller the target loss function.
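The two sign conventions can be made concrete with a small sketch. The source only requires that the loss be positively or negatively correlated with the similarity; taking the similarity to be the cosine similarity between feature vectors output by the recognition model, and the loss to be that similarity or its negative, is an assumption made here for illustration, as are the function names.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def target_loss(adv_feature, ref_feature, targeted):
    """Loss to minimise: +similarity when non-targeted, -similarity when targeted."""
    sim = cosine_similarity(adv_feature, ref_feature)
    return -sim if targeted else sim
```

In the non-targeted case `ref_feature` would come from the attacker's original image, so minimising the loss pushes the adversarial sample away from it; in the targeted case it would come from the target image, so minimising the loss pulls the adversarial sample toward it.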
Then, in step 23, the pixel-by-pixel update is performed on the current adversarial sample a third number of times to obtain a second adversarial sample. It will be appreciated that the second adversarial sample is obtained by pixel-by-pixel updating alone.
In the embodiments of this specification, step 22 and step 23 are two branch processing flows, both of which include the pixel-by-pixel update. The basic processing of the pixel-by-pixel update is the same in both, but the details may differ slightly; for example, the step size of the pixel-by-pixel update may differ between step 22 and step 23.
In one example, Gaussian noise is added to the first adversarial sample and the second adversarial sample. Adding Gaussian noise improves the generalization ability of the adversarial samples and strengthens the black-box attack effect.
In step 24, a first loss value of the target loss function corresponding to the first adversarial sample and a second loss value of the target loss function corresponding to the second adversarial sample are determined, and whichever of the first adversarial sample and the second adversarial sample has the smaller loss value is determined as the updated adversarial sample. It will be appreciated that steps 22 and 23 are two branch processing flows, and step 24 selects the better of their two results.
Finally, in step 25, when the stop-iteration condition is satisfied, the updated adversarial sample is taken as the final adversarial sample. It will be appreciated that the final adversarial sample is regarded as one with a high probability of attack success and can be used to attack the target image recognition model.
In one example, the stop-iteration condition comprises: the current number of iterations reaches a preset number; or the smaller of the first loss value and the second loss value is less than a preset value.
In step 26, when the stop-iteration condition is not satisfied, the next round of iteration is performed based on the updated adversarial sample. It will be appreciated that the updated adversarial sample obtained in the current round serves as the current adversarial sample in the next round.
In this embodiment, after the final adversarial sample is generated, it may be used to train a detection model that classifies an input image as a normal sample or an adversarial sample.
Further, the method further comprises:
inputting an image to be recognized into the trained detection model;
and when the detection model outputs that the image to be recognized is a normal sample, inputting the image to be recognized into the target image recognition model, which performs image recognition.
Further, the method further comprises:
when the detection model outputs that the image to be recognized is an adversarial sample, reviewing the image manually. It will be appreciated that if the detection model outputs that the image to be recognized is an adversarial sample, the image is likely an adversarial sample attacking the target image recognition model, so recognition by the target image recognition model should be refused, or the image should be reviewed manually.
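The screening flow described above amounts to a simple gate in front of the recognition model. The sketch below assumes three hypothetical callables (`detector`, `recognizer`, `manual_review`) and a string label `"adversarial"`; none of these names come from the source.

```python
def recognize_with_screening(image, detector, recognizer, manual_review):
    """Route an image to the recognition model only if the detector deems it normal."""
    if detector(image) == "adversarial":
        # Suspected attack: do not run the recognition model, hand off to a person.
        return manual_review(image)
    return recognizer(image)
```

Keeping the detector as a separate stage means the recognition model itself does not need to be retrained when new adversarial samples are generated; only the detection model is updated.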
According to the method provided by the embodiments of this specification, the current adversarial sample to be strengthened in the current round of iteration is first obtained, the current adversarial sample being an image used to attack the target image recognition model. Two branch processing flows are then executed: in one branch, geometric deformation is performed on the current adversarial sample first, followed by the pixel-by-pixel update; in the other branch, the pixel-by-pixel update is performed on the current adversarial sample directly. At each iteration, the result of the branch that reduces the target loss function more is selected, either as the final adversarial sample or as the updated adversarial sample for the next iteration. The embodiments thus combine geometric deformation with pixel-by-pixel updating. Because the change that geometric deformation makes to the image is not a small, high-frequency perturbation, defense algorithms that remove noise by high-frequency filtering and the like cannot block adversarial samples obtained through geometric deformation. In addition, adaptively selecting the branch result during iteration gives the final adversarial sample a better attack effect against defense algorithms. The generated adversarial sample is therefore more aggressive, which enables targeted defense.
FIG. 4 illustrates an overall flow diagram for generating an adversarial sample according to one embodiment. Referring to FIG. 4, the overall process flow is divided into two branches: one branch performs geometric deformation first and then pixel-by-pixel updating, and the other performs pixel-by-pixel updating directly. A heuristic algorithm selects, at each iteration, the result of the branch that reduces the loss function more. The method mainly comprises the following steps. After m geometric deformations are performed, n pixel-by-pixel updates with step size α are performed to obtain an adversarial sample $x^{1}_{adv}$. Separately, k pixel-by-pixel updates with step size β are performed to obtain an adversarial sample $x^{2}_{adv}$. Noise $r \sim N(0, 1)$ is added to $x^{1}_{adv}$ and $x^{2}_{adv}$ respectively: $x_{adv} = x_{adv} + r$. $x^{1}_{adv}$ and $x^{2}_{adv}$ are substituted into the loss function L to calculate the loss values L1 and L2. If L1 < L2, the current adversarial sample is updated with $x^{1}_{adv}$; otherwise it is updated with $x^{2}_{adv}$. These steps are repeated until a stop condition is met: the number of iterations is greater than or equal to N, or the value of the loss function L is less than or equal to $L_t$.
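The two-branch flow of FIG. 4 can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the patented implementation: the image is a flat list of floats, and `loss`, `deform`, and `pixel_update` are hypothetical callables supplied by the caller. The parameter names `max_rounds`, `loss_threshold`, and `noise_std` stand in for N, $L_t$, and the standard deviation of the noise r.

```python
import random
from typing import Callable, List

Image = List[float]  # flattened pixel values; a stand-in for a real image array


def generate_adversarial(
    x0: Image,
    loss: Callable[[Image], float],
    deform: Callable[[Image], Image],
    pixel_update: Callable[[Image, float], Image],
    m: int = 1, n: int = 2, k: int = 3,
    alpha: float = 0.02, beta: float = 0.013,
    max_rounds: int = 50, loss_threshold: float = 0.0,
    noise_std: float = 1.0, seed: int = 0,
) -> Image:
    """In every round, keep the branch result with the lower loss."""
    rng = random.Random(seed)
    x = list(x0)
    for _round in range(max_rounds):
        # Branch 1: m geometric deformations, then n pixel updates with step alpha.
        x1 = list(x)
        for _ in range(m):
            x1 = deform(x1)
        for _ in range(n):
            x1 = pixel_update(x1, alpha)
        # Branch 2: k pixel updates with step beta, no deformation.
        x2 = list(x)
        for _ in range(k):
            x2 = pixel_update(x2, beta)
        # Gaussian noise r is added to both candidates before scoring.
        x1 = [v + rng.gauss(0.0, noise_std) for v in x1]
        x2 = [v + rng.gauss(0.0, noise_std) for v in x2]
        l1, l2 = loss(x1), loss(x2)
        x, best = (x1, l1) if l1 < l2 else (x2, l2)
        if best <= loss_threshold:  # loss-based stop condition
            break
    return x


# Toy stand-ins (hypothetical): quadratic loss, gradient step, cyclic shift as "deformation".
def toy_loss(x: Image) -> float:
    return sum(v * v for v in x)


def toy_update(x: Image, step: float) -> Image:
    return [v - step * 2.0 * v for v in x]


def toy_deform(x: Image) -> Image:
    return x[1:] + x[:1]


start = [1.0, -2.0, 0.5]
result = generate_adversarial(start, toy_loss, toy_deform, toy_update,
                              max_rounds=5, loss_threshold=-1.0, noise_std=0.0)
```

In the toy run the noise is switched off so that the result is deterministic; a real attack would plug in a face-recognition loss, a key-point warp, and a constrained pixel step.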
In the embodiments of this specification, each parameter may be set in advance. As an example, m and n are set depending on whether the generated adversarial sample is dominated by geometric deformation, in which case the facial features are deformed slightly more but the overall color changes less, or by pixel-by-pixel updating, in which case the facial features are deformed slightly less but the overall color changes more. When geometric deformation dominates, m = 2 and n = 1. When pixel-by-pixel updating dominates, m = 1 and n = 2. Correspondingly, k may be 3.
α and β are the magnitudes of the pixel-value change at each step and depend on the specific scenario. If k is greater than n, β may be set slightly smaller than α, so that the overall pixel-value changes in the two branches are substantially consistent.
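One simple way to keep the two branches roughly balanced is to give them the same total step budget, that is, n·α = k·β. The text does not prescribe this rule; the helper below is a hypothetical sketch of that assumption.

```python
def matched_beta(alpha: float, n: int, k: int) -> float:
    """Return beta such that k * beta == n * alpha (equal total step budget)."""
    if n <= 0 or k <= 0:
        raise ValueError("n and k must be positive")
    return alpha * n / k


# Example values: n = 2 updates in the deformation branch, k = 3 in the direct branch.
beta = matched_beta(alpha=0.03, n=2, k=3)
```

With k greater than n, the resulting β is smaller than α, which matches the guidance above.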
N and $L_t$ may be empirical parameters set based on historical cases in which the attack succeeded.
In addition, the adversarial sample in the embodiments of this specification is a broad concept: it includes an adversarial sample generated by perturbing the whole image, and also an adversarial patch generated by applying a perturbation to a specific region.
In the embodiments of this specification, slight geometric deformation of the facial features is added on top of pixel-by-pixel updating, so that the generated adversarial sample breaks through the bottleneck of relying only on small high-frequency perturbations while remaining difficult for humans to perceive. Through an adaptive framework, whether to add geometric deformation of the face is decided at each iterative update, and the deformation is discarded when its result is worse than that of direct pixel-by-pixel updating. Therefore, compared with common adversarial sample generation methods, the adversarial samples generated by the embodiments of this specification are more aggressive against common adversarial defense methods. In addition, because the deformation modifies the face through semantic features, the adversarial sample transfers better in black-box attacks.
According to an embodiment of another aspect, an apparatus for generating an adversarial sample is further provided, and the apparatus is used for executing the method for generating an adversarial sample provided by the embodiments of this specification. FIG. 5 shows a schematic block diagram of an apparatus for generating an adversarial sample according to an embodiment. As shown in FIG. 5, the apparatus 500 includes:
an obtaining unit 51, configured to obtain a current adversarial sample to be strengthened in the current iteration round, where the current adversarial sample is an image that attacks the target image recognition model;
a first branch unit 52, configured to perform, in the direction in which the target loss function decreases, a first number of preset geometric deformations on the current adversarial sample obtained by the obtaining unit 51, so as to obtain a deformed image; and to perform a second number of pixel-by-pixel updates on the deformed image to obtain a first adversarial sample; wherein the pixel-by-pixel update aims to reduce the target loss function, and the target loss function is determined from a similarity, as identified by the target image recognition model, between the adversarial sample and a target image;
a second branch unit 53, configured to perform a third number of the pixel-by-pixel updates on the current adversarial sample obtained by the obtaining unit 51, so as to obtain a second adversarial sample;
a single-round determining unit 54, configured to determine a first loss value of the target loss function corresponding to the first adversarial sample obtained by the first branch unit 52, and a second loss value of the target loss function corresponding to the second adversarial sample obtained by the second branch unit 53; and to determine, of the first adversarial sample and the second adversarial sample, the one with the smaller loss value as an updated adversarial sample;
a final determining unit 55, configured to take the updated adversarial sample determined by the single-round determining unit 54 as a final adversarial sample when a stop iteration condition is satisfied;
and an iteration triggering unit 56, configured to perform a next iteration round based on the updated adversarial sample when the stop iteration condition is not satisfied.
Optionally, as an embodiment, the current iteration round is the first round, and the current adversarial sample is an original image of an attacker; or,
the current iteration round is not the first round, and the current adversarial sample is the updated adversarial sample generated in the previous iteration round.
Optionally, as an embodiment, the preset geometric deformation includes:
detecting initial key points in the current adversarial sample to obtain first coordinates of the initial key points;
computing the gradient with respect to the initial key points in the direction that reduces the target loss function;
moving the initial key points along the gradient direction by a first step size to obtain second coordinates of the initial key points after the expected geometric deformation;
solving a transformation matrix from the first coordinates and the second coordinates of the initial key points;
transforming the current adversarial sample according to the transformation matrix.
Further, the current adversarial sample is a face image, and the initial key points are key points that describe the eyebrows, eyes, nose, mouth, or face contour.
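The key-point steps above can be illustrated with a small pure-Python sketch. It assumes that the transformation matrix is a 2x3 affine matrix fitted by least squares, which the text does not specify, and it takes the key-point gradients as given; in practice they would come from back-propagating the loss through a differentiable warp. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def shift_keypoints(points: Sequence[Point], grads: Sequence[Point], step: float) -> List[Point]:
    """Move each key point one step against its loss gradient (loss-reducing direction)."""
    return [(x - step * gx, y - step * gy) for (x, y), (gx, gy) in zip(points, grads)]


def solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("key points are collinear; affine fit is not unique")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_affine(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Least-squares 2x3 affine matrix A such that A @ [x, y, 1] approximates [x', y']."""
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (P^T P) a = P^T q, solved once per output coordinate.
    ptp = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    matrix = []
    for axis in range(2):
        ptq = [sum(r[i] * d[axis] for r, d in zip(rows, dst)) for i in range(3)]
        matrix.append(solve3(ptp, ptq))
    return matrix


def apply_affine(a: List[List[float]], p: Point) -> Point:
    """Map one point through the 2x3 affine matrix."""
    x, y = p
    return (a[0][0] * x + a[0][1] * y + a[0][2], a[1][0] * x + a[1][1] * y + a[1][2])


# Hypothetical key points and gradients: every point is pushed by the same offset.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
grads = [(-1.0, 0.5)] * 4
moved = shift_keypoints(points, grads, step=0.2)
matrix = fit_affine(points, moved)
```

Because all four points move by the same offset here, the fitted matrix reduces to a pure translation. Warping the image itself with the matrix is omitted; an image library's affine warp would normally handle that step.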
Optionally, as an embodiment, the pixel-by-pixel update includes:
computing the gradient of each pixel of the image in the direction that reduces the target loss function;
and updating each pixel according to a second step size and the gradient, subject to a constraint condition.
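A single pixel-by-pixel update might look like the sketch below. The text only says that the update follows the gradient with a given step size under a constraint; the use of the gradient sign and of a maximum-deviation bound `eps` around the original image are assumptions borrowed from common iterative gradient attacks, and the gradient values are supplied by the caller.

```python
from typing import List, Sequence


def pixel_update(image: Sequence[float], grad: Sequence[float], original: Sequence[float],
                 step: float, eps: float, lo: float = 0.0, hi: float = 1.0) -> List[float]:
    """One signed-gradient descent step, kept within eps of the original and within [lo, hi]."""
    out = []
    for v, g, o in zip(image, grad, original):
        sign = (g > 0) - (g < 0)             # -1, 0, or +1
        v = v - step * sign                  # move against the gradient to reduce the loss
        v = max(o - eps, min(o + eps, v))    # assumed constraint: stay close to the original
        out.append(max(lo, min(hi, v)))      # keep a valid pixel range
    return out


# Hypothetical pixel values and gradients.
base = [0.5, 0.5, 0.98]
updated = pixel_update(base, grad=[2.0, -3.0, -1.0], original=base, step=0.05, eps=0.03)
```

Here the step of 0.05 exceeds the bound of 0.03, so the first two pixels are clipped to the bound and the third is clipped to the upper end of the pixel range.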
Optionally, as an embodiment, the attack is an untargeted attack, the target image is the original image of the attacker, and the target loss function is positively correlated with the similarity; or the attack is a targeted attack, the target image is not the original image of the attacker, and the target loss function is negatively correlated with the similarity.
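The sign convention can be made concrete with a small sketch. It assumes that the similarity is the cosine similarity between feature vectors produced by the recognition model and that the loss is simply the similarity or its negation; the text only requires positive or negative correlation, so both choices are illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (na * nb)


def target_loss(adv_feat: Sequence[float], ref_feat: Sequence[float], targeted: bool) -> float:
    """Untargeted: loss rises with similarity. Targeted: loss falls with similarity."""
    sim = cosine_similarity(adv_feat, ref_feat)
    return -sim if targeted else sim


same = [1.0, 0.0]
untargeted_loss = target_loss(same, same, targeted=False)
targeted_loss = target_loss(same, same, targeted=True)
```

Minimizing the untargeted loss pushes the sample away from the attacker's own identity, while minimizing the targeted loss pulls it toward the chosen target image.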
Optionally, as an embodiment, Gaussian noise is added to the first adversarial sample and the second adversarial sample.
Optionally, as an embodiment, the stop iteration condition includes: the current number of iterations reaches a preset number; or the smaller of the first loss value and the second loss value is less than a preset value.
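The stop condition is a simple disjunction, sketched below with hypothetical parameter names (`max_rounds` for the preset number of iterations and `loss_threshold` for the preset loss value).

```python
def should_stop(round_index: int, loss1: float, loss2: float,
                max_rounds: int, loss_threshold: float) -> bool:
    """Stop when the round budget is used up or the better branch is below the threshold."""
    return round_index >= max_rounds or min(loss1, loss2) < loss_threshold


stop_on_budget = should_stop(10, 0.9, 0.8, max_rounds=10, loss_threshold=0.1)
stop_on_loss = should_stop(3, 0.05, 0.8, max_rounds=10, loss_threshold=0.1)
keep_going = should_stop(3, 0.5, 0.8, max_rounds=10, loss_threshold=0.1)
```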
Optionally, as an embodiment, the apparatus further includes:
a training unit, configured to train a detection model using the final adversarial sample obtained by the final determining unit 55, where the detection model is used to classify an input image as a normal sample or an adversarial sample.
Further, the apparatus further comprises:
a detection unit, configured to input an image to be recognized into the trained detection model;
and a first recognition unit, configured to input the image to be recognized into the target image recognition model when the detection model outputs that the image to be recognized is a normal sample, so that the target image recognition model performs image recognition.
Further, the apparatus further comprises:
and a second recognition unit, configured to have the image recognized manually when the detection model outputs that the image to be recognized is an adversarial sample.
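The way the detection model gates the recognition model can be sketched as a small routing function. The `detector`, `recognizer`, and `manual_review` callables and the label strings are hypothetical placeholders, not interfaces defined by this specification.

```python
from typing import Any, Callable, Tuple


def route_image(image: Any,
                detector: Callable[[Any], str],
                recognizer: Callable[[Any], Any],
                manual_review: Callable[[Any], Any]) -> Tuple[str, Any]:
    """Send the image to the recognition model only if the detector labels it normal."""
    if detector(image) == "normal":
        return ("model", recognizer(image))
    return ("manual", manual_review(image))


# Toy stand-ins: flag any image whose largest value exceeds 1.0 as adversarial.
def toy_detector(img):
    return "normal" if max(img) <= 1.0 else "adversarial"


def toy_recognizer(img):
    return "identity-A"


def toy_review(img):
    return "queued-for-review"


normal_route = route_image([0.2, 0.7], toy_detector, toy_recognizer, toy_review)
flagged_route = route_image([0.2, 1.7], toy_detector, toy_recognizer, toy_review)
```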
With the apparatus provided in the embodiments of this specification, the obtaining unit 51 first obtains a current adversarial sample to be strengthened in the current iteration round, the current adversarial sample being an image that attacks a target image recognition model. The first branch unit 52 and the second branch unit 53 then execute two branch processing flows respectively: in one branch, geometric deformation is performed on the current adversarial sample and is followed by pixel-by-pixel updating; in the other branch, pixel-by-pixel updating is performed on the current adversarial sample directly. In each iteration, the single-round determining unit 54 selects the result of the branch that reduces the target loss function more; the final determining unit 55 takes this result as the final adversarial sample, or the iteration triggering unit 56 takes it as the updated adversarial sample for the next iteration. As can be seen from the above, the embodiments of this specification combine geometric deformation with pixel-by-pixel updating. Because the change that geometric deformation makes to an image is not a small, high-frequency perturbation, a defense algorithm that removes noise through high-frequency filtering or similar means cannot block an adversarial sample obtained through geometric deformation. In addition, the adaptive selection of the branch result during iteration gives the final adversarial sample a better attack effect against defense algorithms, so the generated adversarial samples are more aggressive and targeted defense can be carried out against them.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2.
According to an embodiment of yet another aspect, there is also provided a computing device comprising a memory having stored therein executable code, and a processor that, when executing the executable code, implements the method described in connection with fig. 2.
Those skilled in the art will recognize that, in one or more of the examples described above, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
The above embodiments further describe in detail the objects, technical solutions, and advantages of the present invention. It should be understood that the above are only exemplary embodiments of the present invention and are not intended to limit its scope of protection; any modification, equivalent substitution, or improvement made on the basis of the technical solutions of the present invention shall fall within the scope of protection of the present invention.

Claims (24)

1. A method for generating an adversarial sample, the method comprising:
obtaining a current adversarial sample to be strengthened in the current iteration round, wherein the current adversarial sample is an image that attacks a target image recognition model;
performing, in the direction that reduces a target loss function, a first number of preset geometric deformations on the current adversarial sample to obtain a deformed image; performing a second number of pixel-by-pixel updates on the deformed image to obtain a first adversarial sample; wherein the pixel-by-pixel update aims to reduce the target loss function, and the target loss function is determined from a similarity, as identified by the target image recognition model, between the adversarial sample and a target image; the change that the pixel-by-pixel update makes to the image is a high-frequency perturbation, and the change that the preset geometric deformation makes to the image is not a high-frequency perturbation;
performing a third number of the pixel-by-pixel updates on the current adversarial sample to obtain a second adversarial sample;
determining a first loss value of the target loss function corresponding to the first adversarial sample and a second loss value of the target loss function corresponding to the second adversarial sample; determining, of the first adversarial sample and the second adversarial sample, the one with the smaller loss value as an updated adversarial sample;
when a stop iteration condition is satisfied, taking the updated adversarial sample as a final adversarial sample;
when the stop iteration condition is not satisfied, performing a next iteration round based on the updated adversarial sample.
2. The method of claim 1, wherein the current iteration round is the first round, and the current adversarial sample is an original image of an attacker; or,
the current iteration round is not the first round, and the current adversarial sample is the updated adversarial sample generated in the previous iteration round.
3. The method of claim 1, wherein the preset geometric deformation comprises:
detecting initial key points in the current adversarial sample to obtain first coordinates of the initial key points;
computing the gradient with respect to the initial key points in the direction that reduces the target loss function;
moving the initial key points along the gradient direction by a first step size to obtain second coordinates of the initial key points after the expected geometric deformation;
solving a transformation matrix from the first coordinates and the second coordinates of the initial key points;
transforming the current adversarial sample according to the transformation matrix.
4. The method of claim 3, wherein the current adversarial sample is a face image and the initial key points are key points that describe the eyebrows, eyes, nose, mouth, or face contour.
5. The method of claim 1, wherein the pixel-by-pixel update comprises:
computing the gradient of each pixel of the image in the direction that reduces the target loss function;
and updating each pixel according to a second step size and the gradient, subject to a constraint condition.
6. The method of claim 1, wherein the attack is an untargeted attack, the target image is the original image of the attacker, and the target loss function is positively correlated with the similarity; or the attack is a targeted attack, the target image is not the original image of the attacker, and the target loss function is negatively correlated with the similarity.
7. The method of claim 1, wherein Gaussian noise is added to the first adversarial sample and the second adversarial sample.
8. The method of claim 1, wherein the stop iteration condition comprises: the current number of iterations reaches a preset number; or the smaller of the first loss value and the second loss value is less than a preset value.
9. The method of claim 1, wherein the method further comprises:
training a detection model with the final adversarial sample, the detection model being used to classify an input image as a normal sample or an adversarial sample.
10. The method of claim 9, wherein the method further comprises:
inputting an image to be recognized into the trained detection model;
and when the detection model outputs that the image to be recognized is a normal sample, inputting the image to be recognized into the target image recognition model, so that the target image recognition model performs image recognition.
11. The method of claim 10, wherein the method further comprises:
and when the detection model outputs that the image to be recognized is an adversarial sample, having the image recognized manually.
12. An apparatus for generating an adversarial sample, the apparatus comprising:
an obtaining unit, configured to obtain a current adversarial sample to be strengthened in the current iteration round, wherein the current adversarial sample is an image that attacks a target image recognition model;
a first branch unit, configured to perform, in the direction that reduces a target loss function, a first number of preset geometric deformations on the current adversarial sample obtained by the obtaining unit to obtain a deformed image; and to perform a second number of pixel-by-pixel updates on the deformed image to obtain a first adversarial sample; wherein the pixel-by-pixel update aims to reduce the target loss function, and the target loss function is determined from a similarity, as identified by the target image recognition model, between the adversarial sample and a target image; the change that the pixel-by-pixel update makes to the image is a high-frequency perturbation, and the change that the preset geometric deformation makes to the image is not a high-frequency perturbation;
a second branch unit, configured to perform a third number of the pixel-by-pixel updates on the current adversarial sample obtained by the obtaining unit to obtain a second adversarial sample;
a single-round determining unit, configured to determine a first loss value of the target loss function corresponding to the first adversarial sample obtained by the first branch unit, and a second loss value of the target loss function corresponding to the second adversarial sample obtained by the second branch unit; and to determine, of the first adversarial sample and the second adversarial sample, the one with the smaller loss value as an updated adversarial sample;
a final determining unit, configured to take the updated adversarial sample determined by the single-round determining unit as a final adversarial sample when a stop iteration condition is satisfied;
and an iteration triggering unit, configured to perform a next iteration round based on the updated adversarial sample when the stop iteration condition is not satisfied.
13. The apparatus of claim 12, wherein the current iteration round is the first round, and the current adversarial sample is an original image of an attacker; or,
the current iteration round is not the first round, and the current adversarial sample is the updated adversarial sample generated in the previous iteration round.
14. The apparatus of claim 12, wherein the preset geometric deformation comprises:
detecting initial key points in the current adversarial sample to obtain first coordinates of the initial key points;
computing the gradient with respect to the initial key points in the direction that reduces the target loss function;
moving the initial key points along the gradient direction by a first step size to obtain second coordinates of the initial key points after the expected geometric deformation;
solving a transformation matrix from the first coordinates and the second coordinates of the initial key points;
transforming the current adversarial sample according to the transformation matrix.
15. The apparatus of claim 14, wherein the current adversarial sample is a face image and the initial key points are key points that describe the eyebrows, eyes, nose, mouth, or face contour.
16. The apparatus of claim 12, wherein the pixel-by-pixel update comprises:
computing the gradient of each pixel of the image in the direction that reduces the target loss function;
and updating each pixel according to a second step size and the gradient, subject to a constraint condition.
17. The apparatus of claim 12, wherein the attack is an untargeted attack, the target image is the original image of the attacker, and the target loss function is positively correlated with the similarity; or the attack is a targeted attack, the target image is not the original image of the attacker, and the target loss function is negatively correlated with the similarity.
18. The apparatus of claim 12, wherein Gaussian noise is added to the first adversarial sample and the second adversarial sample.
19. The apparatus of claim 12, wherein the stop iteration condition comprises: the current number of iterations reaches a preset number; or the smaller of the first loss value and the second loss value is less than a preset value.
20. The apparatus of claim 12, wherein the apparatus further comprises:
a training unit, configured to train a detection model using the final adversarial sample obtained by the final determining unit, the detection model being used to classify an input image as a normal sample or an adversarial sample.
21. The apparatus of claim 20, wherein the apparatus further comprises:
a detection unit, configured to input an image to be recognized into the trained detection model;
and a first recognition unit, configured to input the image to be recognized into the target image recognition model when the detection model outputs that the image to be recognized is a normal sample, so that the target image recognition model performs image recognition.
22. The apparatus of claim 21, wherein the apparatus further comprises:
a second recognition unit, configured to have the image recognized manually when the detection model outputs that the image to be recognized is an adversarial sample.
23. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-11.
24. A computing device comprising a memory having stored therein executable code and a processor that, when executing the executable code, implements the method of any of claims 1-11.
CN202010725498.3A 2020-07-24 2020-07-24 Method and device for generating confrontation sample Active CN111737691B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010725498.3A CN111737691B (en) 2020-07-24 2020-07-24 Method and device for generating confrontation sample

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010725498.3A CN111737691B (en) 2020-07-24 2020-07-24 Method and device for generating confrontation sample

Publications (2)

Publication Number Publication Date
CN111737691A CN111737691A (en) 2020-10-02
CN111737691B true CN111737691B (en) 2021-02-23

Family

ID=72657694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010725498.3A Active CN111737691B (en) 2020-07-24 2020-07-24 Method and device for generating confrontation sample

Country Status (1)

Country Link
CN (1) CN111737691B (en)

Also Published As

Publication number Publication date
CN111737691A (en) 2020-10-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant