WO2024041346A1 - Method and apparatus for generating facial recognition adversarial sample, and storage medium - Google Patents

Method and apparatus for generating facial recognition adversarial sample, and storage medium Download PDF

Info

Publication number
WO2024041346A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
attack
attack mask
feature vector
face
Prior art date
Application number
PCT/CN2023/111034
Other languages
French (fr)
Chinese (zh)
Inventor
王镜茹
Original Assignee
BOE Technology Group Co., Ltd. (京东方科技集团股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOE Technology Group Co., Ltd. (京东方科技集团股份有限公司)
Publication of WO2024041346A1 publication Critical patent/WO2024041346A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection

Definitions

  • Embodiments of the present disclosure relate to but are not limited to the technical field of face recognition, and in particular, to a method and device for generating face recognition adversarial samples, and a storage medium.
  • Face recognition is a biometric technology that performs identity recognition based on people's facial feature information. It is usually implemented based on deep learning models to extract facial features. However, deep learning models are vulnerable to attacks called “adversarial examples,” which can cause the model to produce incorrect predictions by adding small perturbations that are imperceptible to the human eye.
  • Embodiments of the present disclosure provide a method for generating face recognition adversarial samples, including:
  • Preprocess the first image and the second image and generate multiple candidate attack mask images, where the first image and the second image both contain human face areas;
  • Use each of the candidate attack mask images and the second image to perform K iterative modification attacks on the first image, calculate the similarity score between the modified image obtained after the K iterative modification attacks and the second image, and generate a final attack mask image according to the similarity scores corresponding to the multiple candidate attack mask images, where the candidate attack mask images and the final attack mask image are both used to specify the modification region when an iterative modification attack is performed on the first image, and the modification regions specified by different candidate attack mask images are different;
  • Use the final attack mask image and the second image to perform M iterative modification attacks on the first image to obtain a face recognition adversarial sample, where M and K are both positive integers greater than 1, and M>K.
  • Embodiments of the present disclosure also provide a device for generating face recognition adversarial samples, including a memory and a processor coupled to the memory, where the processor is configured to execute, based on instructions stored in the memory, the steps of the method for generating face recognition adversarial samples according to any embodiment of the present disclosure.
  • An embodiment of the present disclosure also provides a storage medium on which a computer program is stored; when the program is executed by a processor, the method for generating face recognition adversarial samples as described in any embodiment of the present disclosure is implemented.
  • Figure 1 is a schematic flowchart of a method for generating face recognition adversarial samples according to an exemplary embodiment of the present disclosure
  • Figure 2 is a schematic diagram of multiple alternative attack mask images generated by an exemplary embodiment of the present disclosure
  • Figure 3 is a schematic flowchart of an iterative modification attack according to an exemplary embodiment of the present disclosure
  • Figure 4 is a schematic diagram of a method for correcting the final attack mask image according to an exemplary embodiment of the present disclosure
  • Figure 5 is a flowchart of another method for generating face recognition adversarial samples according to an exemplary embodiment of the present disclosure.
  • Figure 6 is a schematic diagram of the generation effect of face recognition adversarial samples according to an exemplary embodiment of the present disclosure
  • Figure 7 is a schematic structural diagram of a device for generating face recognition adversarial samples according to an exemplary embodiment of the present disclosure.
  • the scale of the drawings in the present disclosure can be used as a reference in actual processes, but is not limited thereto. The width-to-length ratio of a channel, the thickness and spacing of each film layer, and the width and spacing of each signal line can be adjusted according to actual needs, and the number of pixels in a display panel and the number of sub-pixels in each pixel are not limited to the numbers shown in the figures. The figures described in the present disclosure are only structural schematic diagrams, and embodiments of the present disclosure are not limited to the shapes or numerical values shown in the figures.
  • an embodiment of the present disclosure provides a method for generating face recognition adversarial samples, which includes the following steps:
  • Step 101 Preprocess the first image and the second image, and generate multiple candidate attack mask images, where the first image and the second image both contain human face areas;
  • Step 102: Use each candidate attack mask image and the second image to perform K iterative modification attacks on the first image, calculate the similarity score between the modified image obtained after the K iterative modification attacks and the second image, and generate a final attack mask image according to the similarity scores corresponding to the multiple candidate attack mask images, where the candidate attack mask images and the final attack mask image are both used to specify the modification region when an iterative modification attack is performed on the first image, and the modification regions specified by different candidate attack mask images are different;
  • Step 103 Use the final attack mask image and the second image to perform M iterative modification attacks on the first image to obtain face recognition adversarial samples, where M and K are both positive integers greater than 1, and M>K.
  • the face recognition adversarial sample generation method generates multiple candidate attack mask images; uses each candidate attack mask image together with the second image to perform K iterative modification attacks on the first image; calculates the similarity score between the modified image obtained after the K iterative modification attacks and the second image; generates a final attack mask image according to the similarity scores corresponding to the multiple candidate attack mask images; and uses the final attack mask image and the second image to perform M iterative modification attacks on the first image, where M>K>1, to obtain a face recognition adversarial sample. This reduces the modified area of the face image and the visual difference before and after modification, achieving the goal of "deceiving" the face recognition system while modifying the face image as little as possible, which on the one hand can prevent face image abuse, and on the other hand can help improve the robustness of deep neural network models.
  • preprocessing the first image and the second image in step 101 includes:
  • the aligned first image and the second image are both face images
  • preprocessing an image includes the following operations: accurately calibrating the position and size of the face in the image through a face detection model; obtaining the key point coordinates of the face through a face key point detection model; adjusting the angle of the face according to the key point coordinates to align the face; and adjusting the aligned face image to a preset size.
  • the preset size may be 112 pixels × 112 pixels.
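The alignment steps above (rotate so the face is level, then resize to the preset size) can be sketched as follows. This is a minimal NumPy sketch: the face detection and key-point models are assumed to be external, so it starts from given eye coordinates, and the function name `align_and_resize` and the nearest-neighbor resampling are illustrative choices, not from the patent (a real pipeline would typically use OpenCV or dlib warps).

```python
import numpy as np

def align_and_resize(img, left_eye, right_eye, out_size=112):
    """Rotate the face so the eyes are horizontal, then resize to out_size x out_size.

    img: HxW grayscale numpy array.
    left_eye, right_eye: (x, y) key points, assumed to come from an external
    face key-point detection model (hypothetical here).
    """
    h, w = img.shape
    # Angle of the eye line; rotating by this angle levels the eyes.
    angle = np.arctan2(right_eye[1] - left_eye[1], right_eye[0] - left_eye[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    cos, sin = np.cos(angle), np.sin(angle)
    # Inverse mapping: for each output pixel, find the source pixel before rotation.
    src_x = cos * (xs - cx) - sin * (ys - cy) + cx
    src_y = sin * (xs - cx) + cos * (ys - cy) + cy
    src_x = np.clip(np.rint(src_x), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(src_y), 0, h - 1).astype(int)
    rotated = img[src_y, src_x]
    # Nearest-neighbor resize to the preset size (112x112 in the embodiment).
    iy = np.arange(out_size) * h // out_size
    ix = np.arange(out_size) * w // out_size
    return rotated[np.ix_(iy, ix)]
```

When the eyes are already level the rotation is the identity and only the resize is applied.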
  • multiple candidate attack mask images are generated in step 101, including:
  • the starting position coordinates include a horizontal starting position coordinate along a first direction and a longitudinal starting position coordinate along a second direction; the ending position coordinates include a horizontal ending position coordinate along the first direction and a longitudinal ending position coordinate along the second direction; the step length includes a horizontal step length along the first direction and a longitudinal step length along the second direction; and the first direction intersects the second direction;
  • first direction and the second direction may be perpendicular to each other.
  • the window shape may be a rectangular shape
  • the window size may include a window width along a first direction and a window height along a second direction.
  • the modification region specified by the second candidate attack mask image is the rectangular region from [startx+stepx, starty] to [startx+stepx+winw, starty+winh], ..., the modification region specified by the a-th candidate attack mask image is the rectangular region from [endx, starty] to [endx+winw, starty+winh], the modification region specified by the (a+1)-th candidate attack mask image is the rectangular region from [startx, starty+stepy] to [startx+winw, starty+stepy+winh], ..., and the modification region specified by the (a*b)-th candidate attack mask image is the rectangular region from [endx, endy] to [endx+winw, endy+winh].
  • the size of the modification region specified by each candidate attack mask image is winw*winh, but the modification regions specified by different candidate attack mask images are different.
  • in each candidate attack mask image, the coordinates of the upper-left position are the starting position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the upper-left corner vertex of the black rectangular block), and the coordinates of the lower-right position are the ending position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the lower-right corner vertex of the black rectangular block).
  • the size of the modified area specified by each candidate attack mask image is 13 pixels * 10 pixels.
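The sliding-window mask grid described above can be sketched as follows. The window size 13×10 matches the embodiment; the start/end/step values and the function name `candidate_masks` are illustrative assumptions, not values from the patent.

```python
import numpy as np

def candidate_masks(img_h=112, img_w=112,
                    startx=0, starty=0, endx=99, endy=102,
                    stepx=3, stepy=3, winw=13, winh=10):
    """Generate candidate attack mask images on a sliding-window grid.

    Each mask is 1 inside its winw x winh modification region and 0 elsewhere;
    different candidates specify different (shifted) regions of the same size.
    """
    masks = []
    for y in range(starty, endy + 1, stepy):       # second (vertical) direction
        for x in range(startx, endx + 1, stepx):   # first (horizontal) direction
            m = np.zeros((img_h, img_w), dtype=np.uint8)
            m[y:y + winh, x:x + winw] = 1          # the specified modification region
            masks.append(m)
    return masks

masks = candidate_masks()
```

With these illustrative parameters the grid yields a = 34 horizontal positions and b = 35 vertical positions, i.e. a*b = 1190 candidate masks, each covering exactly winw*winh = 130 pixels.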
  • the first image and the second image may be different images belonging to the same identity.
  • the goal of the face recognition adversarial attack is to modify the first image so that the similarity score between the face feature vector extracted from it by the face recognition model and that of another face image under the same identity (i.e., the second image) is as low as possible.
  • an iterative modification attack is performed on the first image in step 102 or step 103, including the following steps: input the first image (or the modified first image) and the second image into the face recognition model to obtain a first face feature vector and a second face feature vector; backpropagate the value of the loss function between the first face feature vector and the second face feature vector, and modify the first image so that the value of the loss function between the first face feature vector and the second face feature vector increases, where the modified region is the modification region specified by the candidate attack mask image or the final attack mask image.
  • in this case, the loss function of the face recognition model is -1 * the loss function between the first face feature vector and the second face feature vector.
  • the modification area is the modification area specified by the alternative attack mask image used in this iterative modification attack; in step 103, the first image is modified in an iterative modification attack.
  • the modified area is the modified area specified by the final attack mask image.
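The masked iterative attack can be sketched as follows. This is a toy sketch, not the patent's implementation: a random linear map `Wm` stands in for the deep face recognition model so the gradient of the MSE loss can be written analytically instead of obtained by framework backpropagation, and all names (`embed`, `iterative_attack`) and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W_ = 8                          # tiny image size for illustration
D = 4                               # embedding dimension
Wm = rng.normal(size=(D, H * W_))   # toy linear stand-in for the face model

def embed(x):
    """Face feature vector of image x under the toy model."""
    return Wm @ x.ravel()

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def iterative_attack(x1, x2, mask, steps=30, lr=0.01, dodge=True):
    """Perform `steps` iterative modification attacks on x1: each step moves the
    masked pixels of x1 along the (analytic) gradient of the embedding MSE,
    increasing it for a same-identity (dodging) attack, decreasing otherwise."""
    x1 = x1.copy()
    f2 = embed(x2)                               # second face feature vector
    for _ in range(steps):
        f1 = embed(x1)                           # first face feature vector
        g = (2.0 / D) * (Wm.T @ (f1 - f2)).reshape(H, W_)
        x1 += (lr if dodge else -lr) * g * mask  # modify only the masked region
    return x1

x1 = rng.uniform(size=(H, W_))
x2 = rng.uniform(size=(H, W_))
mask = np.zeros((H, W_))
mask[2:5, 2:5] = 1.0                # modification region from one candidate mask
adv = iterative_attack(x1, x2, mask)
```

Pixels outside the mask are untouched, while the masked pixels drive the embedding MSE up (dodging) or down (impersonation).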
  • when performing the first modification attack in the K iterative modification attacks or the M iterative modification attacks, the first image and the second image are respectively input into the face recognition model; when performing a non-first modification attack, the first image modified by the previous modification attack and the second image are respectively input into the face recognition model to obtain the first face feature vector corresponding to the first image and the second face feature vector corresponding to the second image.
  • there may be one or more face recognition models.
  • when there are multiple face recognition models, an iterative modification attack is performed on the first image in step 102 or step 103, including the following steps: input the first image (or the modified first image) and the second image into each face recognition model to obtain the first face feature vector and the second face feature vector corresponding to each face recognition model; calculate the value of the loss function between the first face feature vector and the second face feature vector corresponding to each face recognition model; take a weighted average of the loss function values corresponding to the multiple face recognition models and backpropagate; and modify the first image so that the value of the weighted-average loss function increases, where the modified region is the modification region specified by the candidate attack mask image or the final attack mask image.
  • the loss function between the first face feature vector and the second face feature vector may be calculated using the mean squared error (MSE) loss function.
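The weighted-average ensemble loss over several models can be sketched as follows. Again a hedged toy sketch: small random linear maps stand in for the face recognition models, the weights are illustrative, and the gradient is written analytically where a framework would backpropagate.

```python
import numpy as np

rng = np.random.default_rng(1)
D, P = 4, 16                                           # embedding dim, pixels
models = [rng.normal(size=(D, P)) for _ in range(3)]   # toy linear "models"
weights = np.array([0.5, 0.3, 0.2])                    # illustrative weights

def ensemble_loss_and_grad(x1, x2):
    """Weighted-average MSE loss over several models, plus its gradient w.r.t. x1.

    A framework would obtain the gradient by backpropagation; for these toy
    linear models it is analytic: d/dx1 MSE(Mx1, Mx2) = (2/D) M^T (Mx1 - Mx2).
    """
    total_loss, total_grad = 0.0, np.zeros(P)
    for w, M in zip(weights, models):
        f1, f2 = M @ x1, M @ x2
        total_loss += w * np.mean((f1 - f2) ** 2)
        total_grad += w * (2.0 / D) * (M.T @ (f1 - f2))
    return total_loss, total_grad

x1, x2 = rng.uniform(size=P), rng.uniform(size=P)
loss, grad = ensemble_loss_and_grad(x1, x2)
```

A single gradient step on this combined loss moves the image against (or toward) all models at once, which is what makes the ensemble attack transfer better than attacking one model.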
  • a final attack mask image is generated based on the similarity scores corresponding to the multiple candidate attack mask images, including: ranking the candidate attack mask images by similarity score, and synthesizing the modification regions specified by the N candidate attack mask images with the lowest similarity scores to obtain the modification region specified by the final attack mask image.
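The selection and synthesis step can be sketched as follows (a toy NumPy sketch; the function name `final_mask` and the stripe-shaped test masks are illustrative, not from the patent). For a same-identity (dodging) attack the N lowest-scoring candidates are kept; for a different-identity (impersonation) attack, the N highest.

```python
import numpy as np

def final_mask(masks, scores, n=5, impersonate=False):
    """Synthesize the final attack mask from the N best-scoring candidates.

    scores[i] is the similarity between the image attacked with masks[i] and
    the second image; for dodging lower is better, for impersonation higher.
    """
    order = np.argsort(scores)                 # ascending similarity
    best = order[-n:] if impersonate else order[:n]
    out = np.zeros_like(masks[0])
    for i in best:
        out |= masks[i]                        # union of the selected regions
    return out

masks = [np.zeros((6, 6), dtype=np.uint8) for _ in range(4)]
for i, m in enumerate(masks):
    m[i, :] = 1                                # row-stripe toy masks
scores = [0.9, 0.1, 0.4, 0.2]
fm = final_mask(masks, scores, n=2)            # dodging: keep the two lowest
```

The union of the selected rectangles is exactly the "synthesized" modification region of the final attack mask image.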
  • the first image and the second image may be different images belonging to different identities.
  • the goal of the face recognition adversarial attack is to modify the first image so that the similarity score between the face feature vector extracted from it by the face recognition model and that of a face image under a different identity (i.e., the second image) is as high as possible.
  • an iterative modification attack is performed on the first image in step 102 or step 103 in the same manner as described above, except that the first image is modified so that the value of the loss function decreases; in this case, the loss function of the face recognition model is the loss function between the first face feature vector and the second face feature vector.
  • there may be one or more face recognition models.
  • when there are multiple face recognition models, an iterative modification attack is performed on the first image in step 102 or step 103, including: input the first image (or the modified first image) and the second image into each face recognition model to obtain the first face feature vector and the second face feature vector corresponding to each face recognition model; calculate the value of the loss function between the first face feature vector and the second face feature vector corresponding to each face recognition model; take a weighted average of the loss function values corresponding to the multiple face recognition models and backpropagate; and modify the first image so that the value of the weighted-average loss function decreases, where the modified region is the modification region specified by the candidate attack mask image or the final attack mask image.
  • a final attack mask image is generated based on the similarity scores corresponding to the multiple candidate attack mask images, including: ranking the candidate attack mask images by similarity score, and synthesizing the modification regions specified by the N candidate attack mask images with the highest similarity scores to obtain the modification region specified by the final attack mask image.
  • N may range from 3 to 8.
  • N can be 5 or 6.
  • the method may further include:
  • each stepped-shaped area is divided into a plurality of rectangular areas.
  • the face recognition adversarial sample generation method provided by the embodiments of the present disclosure corrects the modification region specified by the final attack mask image (that is, divides it into multiple rectangular regions), which further reduces the modified area of the original image and reduces the visual difference before and after modification.
  • M is between 120 and 180, and K is between 20 and 40.
  • M is 150 and K is 30.
  • the method for generating face recognition adversarial samples includes the following steps:
  • attacks on face recognition models can be divided into two types: the first is to modify P1 so that its similarity score with the feature vectors extracted by the model from other face images under the same identity is as low as possible; the second is to modify P1 so that its similarity score with the feature vectors extracted by the model from face images under different identities is as high as possible.
  • in each candidate attack mask image, the coordinates of the upper-left position are the starting position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the upper-left corner vertex of the black rectangular block), and the coordinates of the lower-right position are the ending position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the lower-right corner vertex of the black rectangular block).
  • the first image is P1 and the second image is P2.
  • the two images P1 and P2 are input into the face recognition model, face feature vectors are extracted respectively, and the MSE Loss between the extracted face feature vectors is calculated.
  • the parameters of the model are fixed and remain unchanged.
  • the P1 image is modified through backpropagation to make the MSE Loss continue to increase.
  • the modified region of the P1 image is the modification region specified by the candidate attack mask image; the gradient descent method is used for iteration, and the loss function is -1 * MSE Loss, so that the MSE Loss increases when gradient descent is performed.
  • one gradient descent step completes one iteration.
  • after K iterations, the similarity score between the final modified image corresponding to each candidate attack mask image and the P2 image is calculated.
  • the N candidate attack mask images with the lowest similarity scores are selected (for example, N can be 5 or 6).
  • the regions corresponding to these N candidate attack mask images are the top N regions where the attack is most likely to succeed.
  • the first image is P1 and the second image is P3.
  • the two images P1 and P3 are input into the face recognition model, face feature vectors are extracted respectively, and the MSE Loss between the extracted face feature vectors is calculated.
  • the parameters of the model are fixed and remain unchanged.
  • the P1 image is modified through backpropagation to continuously reduce the MSE Loss.
  • the modified region of the P1 image is the modification region specified by the candidate attack mask image; the gradient descent method is used for iteration, and the loss function is the MSE Loss, so that the MSE Loss decreases when gradient descent is performed.
  • one gradient descent step completes one iteration.
  • after K iterations, the similarity score between the final modified image corresponding to each candidate attack mask image and the P3 image is calculated.
  • the N candidate attack mask images with the highest similarity scores are selected (for example, N can be 5 or 6).
  • the regions corresponding to these N candidate attack mask images are the top N regions where the attack is most likely to succeed.
  • the embodiment of the present disclosure scans the generated mask pixel by pixel along the horizontal direction; when the mask heights in the vertical direction differ between two horizontally adjacent pixels, the mask region is cut off at that position to obtain the final attack mask image, as shown by the corrected attack mask image in Figure 4.
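One way to implement this correction is sketched below: scan column by column, cut wherever the vertical mask pattern changes between adjacent columns, and emit each vertical run as a rectangle. The function name `split_into_rectangles` and the (top, left, height, width) output format are illustrative assumptions; the sketch preserves the mask's pixel coverage exactly while decomposing stepped shapes into rectangles.

```python
import numpy as np

def split_into_rectangles(mask):
    """Scan the mask along the horizontal direction; whenever the vertical
    pattern changes between two adjacent columns, cut the region there.
    Returns a list of (top, left, height, width) rectangles covering the mask."""
    h, w = mask.shape
    rects = []
    x = 0
    while x < w:
        x2 = x
        # extend while adjacent columns have an identical vertical pattern
        while x2 + 1 < w and np.array_equal(mask[:, x2 + 1], mask[:, x]):
            x2 += 1
        col = mask[:, x]
        y = 0
        while y < h:                       # each vertical run -> one rectangle
            if col[y]:
                y2 = y
                while y2 + 1 < h and col[y2 + 1]:
                    y2 += 1
                rects.append((y, x, y2 - y + 1, x2 - x + 1))
                y = y2 + 1
            else:
                y += 1
        x = x2 + 1
    return rects
```

A stair-shaped region thus becomes two or more rectangles whose union equals the original mask.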
  • this disclosed embodiment uses the fast gradient sign method (FGSM) attack to modify the original face image.
  • in a white-box environment, FGSM calculates the derivative of the model with respect to the input data, uses the sign function to obtain the gradient direction, and multiplies it by a step size to obtain the perturbation amount. Adding this perturbation to the original input yields the face recognition adversarial sample under the FGSM attack; this adversarial sample has a high probability of making the model classify incorrectly, thereby achieving the purpose of the attack.
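A single FGSM step can be sketched as follows. As before, this is a hedged toy sketch: a random linear map stands in for the white-box face recognition model so the gradient is analytic, and `fgsm_step` with its masked variant is an illustrative name, not the patent's code.

```python
import numpy as np

rng = np.random.default_rng(2)
D, P = 4, 16
M = rng.normal(size=(D, P))               # toy linear "model" (white box)

def fgsm_step(x1, x2, eps=0.03, mask=None):
    """One FGSM step: perturb x1 by eps * sign(gradient of the embedding MSE).

    Stepping along +sign(grad) increases the loss (dodging); the optional mask
    restricts the perturbation to the attack mask's modification region.
    """
    f1, f2 = M @ x1, M @ x2
    grad = (2.0 / D) * (M.T @ (f1 - f2))  # analytic gradient of MSE w.r.t. x1
    delta = eps * np.sign(grad)           # gradient direction times step size
    if mask is not None:
        delta = delta * mask
    return x1 + delta

x1, x2 = rng.uniform(size=P), rng.uniform(size=P)
adv = fgsm_step(x1, x2)
```

Each pixel changes by at most eps, which is what keeps the perturbation visually small.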
  • the final attack effect is shown in Figure 6.
  • the image on the left side of Figure 6 is the original image; the middle image is the modified image, i.e., the generated face recognition adversarial sample; and the image on the right side is the confusing image of a different identity.
  • the visual difference between the modified image and the original image is not obvious, and the similarity score between the adversarial sample and the confusing image of the different identity can reach more than 0.3 in the face recognition model.
  • Embodiments of the present disclosure provide a method for generating face recognition adversarial samples. By using a sliding window to perform block-wise retrieval over the entire face image, the regions of the face image that are easier to attack are found, thereby generating an optimal attack mask image; the final attack image (i.e., the face recognition adversarial sample) is then generated based on the optimal attack mask image. Under the premise of changing only a small region of the face image, the face image can "deceive" the recognition system, thereby effectively preventing face images uploaded to the cloud from being abused.
  • Embodiments of the present disclosure also provide a device for generating face recognition adversarial samples, including a memory and a processor coupled to the memory, where the processor is configured to execute, based on instructions stored in the memory, the steps of the method for generating face recognition adversarial samples as described in any embodiment of the present disclosure.
  • the device for generating face recognition adversarial samples may include: a processor 710, a memory 720, and a bus system 730.
  • the processor 710 and the memory 720 are connected through the bus system 730, and the memory 720 is used to store instructions.
  • the processor 710 is used to execute the instructions stored in the memory 720 to: preprocess the first image and the second image and generate a plurality of candidate attack mask images, where the first image and the second image both include face regions; use each of the candidate attack mask images and the second image to perform K iterative modification attacks on the first image; calculate the similarity score between the modified image obtained after the K iterative modification attacks and the second image; and generate a final attack mask image based on the similarity scores corresponding to the multiple candidate attack mask images, where the candidate attack mask images and the final attack mask image are both used to specify the modification region when an iterative modification attack is performed on the first image.
  • the processor 710 can be a central processing unit (Central Processing Unit, CPU).
  • the processor 710 can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • Memory 720 may include read-only memory and random access memory and provides instructions and data to processor 710 .
  • a portion of memory 720 may also include non-volatile random access memory.
  • memory 720 may also store device type information.
  • bus system 730 may also include a power bus, a control bus, a status signal bus, etc.
  • the various buses are labeled as bus system 730 in FIG. 7 .
  • the processing performed by the processing device may be completed by instructions in the form of hardware integrated logic circuits or software in the processor 710 . That is to say, the method steps of the embodiments of the present disclosure may be implemented by a hardware processor, or may be executed by a combination of hardware and software modules in the processor.
  • Software modules can be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media.
  • the storage medium is located in the memory 720.
  • the processor 710 reads the information in the memory 720 and completes the steps of the above method in combination with its hardware. To avoid repetition, it will not be described in detail here.
  • An embodiment of the present disclosure also provides a storage medium on which a computer program is stored.
  • the program is executed by a processor, the method for generating face recognition adversarial samples as described in any embodiment of the present disclosure is implemented.
  • various aspects of the method for generating face recognition adversarial samples provided by the present disclosure can also be implemented in the form of a program product, which includes program code; when the program product is run on a computer device, the program code is used to cause the computer device to execute the steps in the method for generating face recognition adversarial samples according to the various exemplary embodiments of the present disclosure described above in this specification.
  • the program product may take the form of any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but is not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Collating Specific Patterns (AREA)
  • Image Processing (AREA)

Abstract

A method and apparatus for generating a facial recognition adversarial sample, and a storage medium. The method comprises: preprocessing a first image and a second image, and generating a plurality of alternative attack mask images, wherein the first image and the second image each include a facial region; performing K iterative modification attacks on the first image by using each alternative attack mask image and the second image, calculating a similarity score between a modified image, which is obtained after the K iterative modification attacks, and the second image, and generating a final attack mask image according to similarity scores corresponding to the plurality of alternative attack mask images, wherein the alternative attack mask images and the final attack mask image are all used for specifying modification regions when the iterative modification attacks are performed on the first image, and the modification regions specified by different alternative attack mask images are different; and performing M iterative modification attacks on the first image by using the final attack mask image and the second image, so as to obtain a facial recognition adversarial sample, wherein M and K are both positive integers greater than 1, and M > K.

Description

人脸识别对抗样本的生成方法及装置、存储介质Face recognition adversarial sample generation method, device and storage medium
本申请要求于2022年8月23日提交中国专利局、申请号为202211013503.3、发明名称为“人脸识别对抗样本的生成方法及装置、存储介质”的中国专利申请的优先权,其内容应理解为通过引用的方式并入本申请中。This application requests the priority of the Chinese patent application submitted to the China Patent Office on August 23, 2022, with the application number 202211013503.3 and the invention title "Method and device for generating adversarial samples for face recognition, and storage medium", and its content should be understood are incorporated by reference into this application.
技术领域Technical field
本公开实施例涉及但不限于人脸识别技术领域,尤其涉及一种人脸识别对抗样本的生成方法及装置、存储介质。Embodiments of the present disclosure relate to but are not limited to the technical field of face recognition, and in particular, to a method and device for generating face recognition adversarial samples, and a storage medium.
Background
Face recognition is a biometric technology that identifies a person based on facial feature information, and is usually implemented by extracting facial features with a deep learning model. However, deep learning models are vulnerable to "adversarial samples": by adding small perturbations that are imperceptible to the human eye, an adversarial sample can cause a model to produce incorrect predictions.
The phenomenon of adversarial samples reveals security vulnerabilities of deep learning models. Studying how to generate high-quality adversarial samples quickly and in large quantities is therefore particularly important for defending against adversarial attacks.
Summary
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the protection scope of the claims.
An embodiment of the present disclosure provides a method for generating a face recognition adversarial sample, including:
preprocessing a first image and a second image, and generating a plurality of candidate attack mask images, wherein the first image and the second image each contain a face region;
performing K iterative modification attacks on the first image by using each of the candidate attack mask images together with the second image, calculating a similarity score between a modified image obtained after the K iterative modification attacks and the second image, and generating a final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images, wherein the candidate attack mask images and the final attack mask image are each used to specify a modification region when an iterative modification attack is performed on the first image, and different candidate attack mask images specify different modification regions;
performing M iterative modification attacks on the first image by using the final attack mask image together with the second image, to obtain a face recognition adversarial sample, wherein M and K are both positive integers greater than 1, and M > K.
An embodiment of the present disclosure further provides an apparatus for generating a face recognition adversarial sample, including a memory and a processor coupled to the memory, wherein the processor is configured to execute, based on instructions stored in the memory, the steps of the method for generating a face recognition adversarial sample according to any embodiment of the present disclosure.
An embodiment of the present disclosure further provides a storage medium storing a computer program which, when executed by a processor, implements the method for generating a face recognition adversarial sample according to any embodiment of the present disclosure.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.
Brief Description of the Drawings
The drawings are provided for an understanding of the technical solution of the present disclosure and constitute a part of the specification. Together with the embodiments of the present disclosure, they serve to explain the technical solution of the present disclosure, and do not limit it.
Figure 1 is a schematic flowchart of a method for generating a face recognition adversarial sample according to an exemplary embodiment of the present disclosure;
Figure 2 is a schematic diagram of a plurality of generated candidate attack mask images according to an exemplary embodiment of the present disclosure;
Figure 3 is a schematic flowchart of one iterative modification attack according to an exemplary embodiment of the present disclosure;
Figure 4 is a schematic diagram of a method for correcting a final attack mask image according to an exemplary embodiment of the present disclosure;
Figure 5 is a schematic flowchart of another method for generating a face recognition adversarial sample according to an exemplary embodiment of the present disclosure;
Figure 6 is a schematic diagram of the generation effect of a face recognition adversarial sample according to an exemplary embodiment of the present disclosure;
Figure 7 is a schematic structural diagram of an apparatus for generating a face recognition adversarial sample according to an exemplary embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions and advantages of the present disclosure clearer, embodiments of the present disclosure are described in detail below with reference to the drawings. The embodiments may be implemented in many different forms. Those of ordinary skill in the art will readily appreciate that the manner and content may be transformed into various forms without departing from the spirit and scope of the present disclosure. Therefore, the present disclosure should not be construed as being limited to the content described in the following embodiments. Provided there is no conflict, the embodiments of the present disclosure and the features therein may be combined with one another arbitrarily.
The scale of the drawings in the present disclosure may be used as a reference in an actual process, but is not limited thereto. For example, the width-to-length ratio of a channel, the thickness and spacing of film layers, and the width and spacing of signal lines may be adjusted according to actual needs. The number of pixels in a display panel and the number of sub-pixels in each pixel are also not limited to the numbers shown in the figures. The drawings described in the present disclosure are only schematic structural diagrams, and the modes of the present disclosure are not limited to the shapes, numerical values, or the like shown in the drawings.
Ordinal numerals such as "first", "second" and "third" in this specification are used to avoid confusion between constituent elements, and are not intended to limit quantity.
As shown in Figure 1, an embodiment of the present disclosure provides a method for generating a face recognition adversarial sample, including the following steps.
Step 101: preprocess a first image and a second image, and generate a plurality of candidate attack mask images, where the first image and the second image each contain a face region.
Step 102: perform K iterative modification attacks on the first image by using each candidate attack mask image together with the second image, calculate the similarity score between the modified image obtained after the K iterative modification attacks and the second image, and generate a final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images, where the candidate attack mask images and the final attack mask image are each used to specify the modification region when an iterative modification attack is performed on the first image, and different candidate attack mask images specify different modification regions.
Step 103: perform M iterative modification attacks on the first image by using the final attack mask image together with the second image, to obtain a face recognition adversarial sample, where M and K are both positive integers greater than 1, and M > K.
In the method for generating a face recognition adversarial sample provided by the embodiments of the present disclosure, a plurality of candidate attack mask images are generated; each candidate attack mask image is used together with the second image to perform K iterative modification attacks on the first image, and the similarity score between the modified image obtained after the K iterative modification attacks and the second image is calculated; a final attack mask image is generated according to the similarity scores corresponding to the plurality of candidate attack mask images; and the final attack mask image is used together with the second image to perform M iterative modification attacks on the first image, where M > K > 1, to obtain a face recognition adversarial sample. This reduces the modified area of the face image and the visual difference before and after modification, so that the face image is modified as little as possible while still being able to "fool" a face recognition system. On the one hand this can prevent face abuse, and on the other hand it can help improve the robustness of deep neural network models.
In some exemplary implementations, preprocessing the first image and the second image in step 101 includes:
using a face detection model to extract the face regions in the first image and the second image respectively;
using a facial key point detection model to extract the facial key points in the first image and the second image respectively;
performing an affine transformation based on the extracted facial key points of the first image and the second image, to obtain an aligned first image and an aligned second image, both of which are face images; and
adjusting the aligned first image and second image to a preset size.
In the embodiments of the present disclosure, preprocessing an image includes the following operations: accurately locating the position and size of the face in the image by using a face detection model; obtaining the coordinates of the facial key points by using a facial key point detection model; adjusting the angle of the face according to the key point coordinates so that the face is aligned; and adjusting the aligned face image to a preset size. Illustratively, the preset size may be 112 pixels × 112 pixels.
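The alignment operation above is commonly implemented by estimating a similarity transform from the detected key points to a fixed template and then warping the image with it. The sketch below shows only the transform estimation using NumPy; the 5-point template coordinates, the test values, and the least-squares (Umeyama-style) estimator are illustrative assumptions, not details taken from this disclosure.

```python
import numpy as np

def estimate_similarity_transform(src, dst):
    """Least-squares similarity transform (scale + rotation + translation)
    mapping src landmark points onto dst landmark points."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - src_mean, dst - dst_mean
    cov = dst_c.T @ src_c / len(src)            # cross-covariance, dst vs. src
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U @ Vt))          # guard against reflections
    D = np.diag([1.0, d])
    R = U @ D @ Vt                              # optimal rotation
    scale = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = dst_mean - scale * R @ src_mean
    M = np.zeros((2, 3))
    M[:, :2] = scale * R
    M[:, 2] = t
    return M                                    # 2x3 affine matrix

# Hypothetical 5-point template for a 112x112 aligned face (illustrative values):
# left eye, right eye, nose tip, left and right mouth corners.
TEMPLATE_112 = np.array([[38.3, 51.7], [73.5, 51.5],
                         [56.0, 71.7], [41.5, 92.4], [70.7, 92.2]])

detected = TEMPLATE_112 * 1.8 + np.array([40.0, 25.0])  # a larger, shifted face
M = estimate_similarity_transform(detected, TEMPLATE_112)
aligned = (M[:, :2] @ detected.T).T + M[:, 2]
print(np.abs(aligned - TEMPLATE_112).max())     # near zero for an exact similarity
```

The returned 2×3 matrix has the row layout expected by common image-warping routines (for example, OpenCV's warpAffine), so the same estimate can be used to rectify the cropped face region to the preset size.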
In some exemplary implementations, generating a plurality of candidate attack mask images in step 101 includes:
setting start position coordinates, a window size, step sizes and end position coordinates according to the preprocessed image size, where the start position coordinates include a horizontal start position coordinate along a first direction and a vertical start position coordinate along a second direction, the end position coordinates include a horizontal end position coordinate along the first direction and a vertical end position coordinate along the second direction, the step sizes include a horizontal step size along the first direction and a vertical step size along the second direction, and the first direction intersects the second direction; and
generating a plurality of candidate attack mask images according to the set start position coordinates, window size, step sizes and end position coordinates, where the plurality of candidate attack mask images include a*b images, a = (horizontal end position coordinate - horizontal start position coordinate)/horizontal step size, b = (vertical end position coordinate - vertical start position coordinate)/vertical step size, and the size of the modification region specified by each candidate attack mask image is equal to the window size.
In some exemplary implementations, the first direction and the second direction may be perpendicular to each other.
In some exemplary implementations, the window may be rectangular, and the window size may include a window width along the first direction and a window height along the second direction.
In some exemplary implementations, the start position coordinates are [startx, starty], where the horizontal start position coordinate is startx and the vertical start position coordinate is starty; the window size is [winw, winh], where the window width is winw and the window height is winh; the horizontal step size is stepx and the vertical step size is stepy; and the end position coordinates are [endx, endy], where the horizontal end position coordinate is endx and the vertical end position coordinate is endy. Then a = (endx - startx)/stepx and b = (endy - starty)/stepy. The modification region specified by the first candidate attack mask image is the rectangular region from [startx, starty] to [startx+winw, starty+winh]; the modification region specified by the second candidate attack mask image is the rectangular region from [startx+stepx, starty] to [startx+stepx+winw, starty+winh]; ...; the modification region specified by the a-th candidate attack mask image is the rectangular region from [endx, starty] to [endx+winw, starty+winh]; the modification region specified by the (a+1)-th candidate attack mask image is the rectangular region from [startx, starty+stepy] to [startx+winw, starty+stepy+winh]; ...; and the modification region specified by the (a*b)-th candidate attack mask image is the rectangular region from [endx, endy] to [endx+winw, endy+winh]. The modification region specified by each candidate attack mask image has the same size, winw*winh, but different candidate attack mask images specify different modification regions.
Illustratively, taking startx=20, starty=20, winw=13, winh=10, stepx=8, stepy=5, endx=92 and endy=95, the generated candidate attack mask images are shown in Figure 2. In each candidate attack mask image, the coordinates of the upper-left position are the start position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the upper-left vertex of the black rectangular block), and the coordinates of the lower-right position are the end position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the lower-right vertex of the black rectangular block). The modification region specified by each candidate attack mask image is 13 pixels × 10 pixels. Each row includes (92-20)/8 = 9 candidate attack mask images and each column includes (95-20)/5 = 15 candidate attack mask images, so 9 × 15 = 135 candidate attack mask images are generated in total.
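The sliding-window enumeration described above can be sketched as follows. Representing each candidate attack mask image as a binary array whose 1-valued region marks the modification region is an implementation assumption; only the window arithmetic comes from the text.

```python
import numpy as np

def candidate_masks(startx, starty, winw, winh, stepx, stepy, endx, endy, size=112):
    """Enumerate sliding-window candidate attack mask images.
    Each mask is 1 inside its winw x winh modification region, 0 elsewhere."""
    masks = []
    for y in range(starty, endy, stepy):      # b = (endy - starty) / stepy rows
        for x in range(startx, endx, stepx):  # a = (endx - startx) / stepx per row
            m = np.zeros((size, size), dtype=np.uint8)
            m[y:y + winh, x:x + winw] = 1     # rows indexed by y, columns by x
            masks.append(m)
    return masks

# Parameters from the example above
masks = candidate_masks(20, 20, 13, 10, 8, 5, 92, 95)
print(len(masks))            # 9 columns x 15 rows = 135 candidates
print(int(masks[0].sum()))   # 13 * 10 = 130 pixels per modification region
```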
In some exemplary implementations, the first image and the second image may be different images belonging to the same identity. In this case, the goal of the face recognition adversarial attack is to modify the first image so that the similarity score between its face feature vector and that of another face image of the same identity (i.e., the second image), both extracted by a face recognition model, is as low as possible.
In some exemplary implementations, when the first image and the second image are different images belonging to the same identity, as shown in Figure 3, one iterative modification attack on the first image in step 102 or step 103 includes the following steps:
inputting the first image (or the modified first image) and the second image into a face recognition model respectively, to obtain a first face feature vector corresponding to the first image and a second face feature vector corresponding to the second image; calculating the value of a loss function between the first face feature vector and the second face feature vector and performing backpropagation; and modifying the first image so that the value of the loss function between the first face feature vector and the second face feature vector increases, where the modification region is the one specified by the candidate attack mask image or the final attack mask image, and the loss function of the face recognition model is -1 * the loss function between the first face feature vector and the second face feature vector.
In the embodiments of the present disclosure, when K iterative modification attacks are performed on the first image in step 102, the modification region is the one specified by the candidate attack mask image used in the current iterative modification attack; when M iterative modification attacks are performed on the first image in step 103, the modification region is the one specified by the final attack mask image. In the first modification attack of the K or M iterative modification attacks, the first image and the second image are input into the face recognition model respectively, to obtain the first face feature vector corresponding to the first image and the second face feature vector corresponding to the second image. In each subsequent modification attack, the first image as modified by the previous modification attack and the second image are input into the face recognition model respectively, to obtain the first face feature vector corresponding to the first image and the second face feature vector corresponding to the second image.
In the embodiments of the present disclosure, there may be one or more face recognition models. When there are multiple face recognition models and the first image and the second image are different images belonging to the same identity, one iterative modification attack on the first image in step 102 or step 103 includes the following steps: inputting the first image (or the modified first image) and the second image into each face recognition model respectively, to obtain the first face feature vector and the second face feature vector corresponding to each face recognition model; calculating the value of the loss function between the first face feature vector and the second face feature vector corresponding to each face recognition model; computing a weighted average of the loss function values corresponding to the multiple face recognition models and performing backpropagation; and modifying the first image so that the value of the weighted-average loss function increases, where the modification region is the one specified by the candidate attack mask image or the final attack mask image.
In some exemplary implementations, the loss function between the first face feature vector and the second face feature vector may be calculated using the mean squared error (MSE) loss.
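A minimal sketch of one masked modification attack for the same-identity case is given below. A random linear map stands in for the face recognition model so that the MSE gradient can be written in closed form; in practice the gradient would come from backpropagation through the real model. The image sizes, step rule (a signed gradient step), and clipping range are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 112 * 112)) * 0.01   # toy linear "face recognition model"

def embed(img):
    return W @ img.ravel()

def mse(a, b):
    return float(((a - b) ** 2).mean())

def masked_ascent_step(x1, x2, mask, lr=0.01):
    """One iterative modification attack (same-identity case): raise
    MSE(embed(x1), embed(x2)), editing x1 only where mask == 1."""
    diff = embed(x1) - embed(x2)                          # dMSE/d(embedding) = 2*diff/n
    grad = (W.T @ (2.0 * diff / diff.size)).reshape(x1.shape)
    return np.clip(x1 + lr * np.sign(grad) * mask, 0.0, 1.0)

x1 = rng.random((112, 112))                               # "first image"
x2 = rng.random((112, 112))                               # "second image", same identity
mask = np.zeros_like(x1)
mask[20:30, 20:33] = 1.0                                  # one 13x10 candidate region

before = mse(embed(x1), embed(x2))
x1_adv = masked_ascent_step(x1, x2, mask)
after = mse(embed(x1_adv), embed(x2))
print(after > before)                                     # the loss rises
print(float(np.abs((x1_adv - x1) * (1 - mask)).max()))    # 0.0: untouched outside mask
```

For the different-identity case described later, the same step is used with the sign of the update reversed, so that the loss between the two feature vectors decreases instead.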
In some exemplary implementations, when the first image and the second image are different images belonging to the same identity, generating the final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images in step 102 includes:
sorting the similarity scores corresponding to the plurality of candidate attack mask images;
selecting the N top-ranked candidate attack mask images, i.e., the N candidate attack mask images with the lowest similarity scores; and
combining the modification regions specified by the N top-ranked candidate attack mask images to obtain the modification region specified by the final attack mask image.
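The selection-and-merge step can be sketched as below. The similarity scores are made up for illustration, and taking the pixel-wise union of the selected regions is an assumed way of "combining" them; the disclosure does not fix the combination operation.

```python
import numpy as np

def merge_top_candidates(masks, scores, n=5, lower_is_better=True):
    """Pick the n candidate masks whose K-step attack worked best and take
    the union of their modification regions as the final attack mask.
    For the same-identity attack lower similarity is better; pass
    lower_is_better=False for the different-identity attack."""
    order = np.argsort(scores)
    picked = order[:n] if lower_is_better else order[-n:]
    final = np.zeros_like(masks[0])
    for i in picked:
        final = np.maximum(final, masks[i])     # union of modification regions
    return final, picked

# Toy example: 4 one-pixel masks with made-up similarity scores
masks = [np.zeros((4, 4), dtype=np.uint8) for _ in range(4)]
for i, m in enumerate(masks):
    m[i, i] = 1
scores = [0.9, 0.2, 0.5, 0.1]
final, picked = merge_top_candidates(masks, scores, n=2)
print(sorted(picked.tolist()))   # [1, 3]: the two lowest-similarity candidates
print(int(final.sum()))          # 2 pixels in the merged region
```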
In other exemplary implementations, the first image and the second image may be different images belonging to different identities. In this case, the goal of the face recognition adversarial attack is to modify the first image so that the similarity score between its face feature vector and that of a face image of a different identity (i.e., the second image), both extracted by a face recognition model, is as high as possible.
In some exemplary implementations, when the first image and the second image are different images belonging to different identities, as shown in Figure 3, one iterative modification attack on the first image in step 102 or step 103 includes:
inputting the first image (or the modified first image) and the second image into a face recognition model respectively, to obtain a first face feature vector corresponding to the first image and a second face feature vector corresponding to the second image; calculating the value of a loss function between the first face feature vector and the second face feature vector and performing backpropagation; and modifying the first image so that the value of the loss function between the first face feature vector and the second face feature vector decreases, where the modification region is the one specified by the candidate attack mask image or the final attack mask image, and the loss function of the face recognition model is the loss function between the first face feature vector and the second face feature vector.
In the embodiments of the present disclosure, there may be one or more face recognition models. When there are multiple face recognition models and the first image and the second image are different images belonging to different identities, one iterative modification attack on the first image in step 102 or step 103 includes: inputting the first image (or the modified first image) and the second image into each face recognition model respectively, to obtain the first face feature vector and the second face feature vector corresponding to each face recognition model; calculating the value of the loss function between the first face feature vector and the second face feature vector corresponding to each face recognition model; computing a weighted average of the loss function values corresponding to the multiple face recognition models and performing backpropagation; and modifying the first image so that the value of the weighted-average loss function decreases, where the modification region is the one specified by the candidate attack mask image or the final attack mask image.
In some exemplary implementations, when the first image and the second image are different images belonging to different identities, generating the final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images in step 102 includes:
sorting the similarity scores corresponding to the plurality of candidate attack mask images;
selecting the N bottom-ranked candidate attack mask images, i.e., the N candidate attack mask images with the highest similarity scores; and
combining the modification regions specified by the N bottom-ranked candidate attack mask images to obtain the modification region specified by the final attack mask image.
In some exemplary implementations, N may be between 3 and 8. Illustratively, N may be 5 or 6.
In some exemplary implementations, as shown in Figure 4, the method may further include:
detecting whether the modification region specified by the final attack mask image includes one or more staircase-shaped regions; and
when one or more staircase-shaped regions are included, dividing each staircase-shaped region into a plurality of rectangular regions.
In the method for generating a face recognition adversarial sample provided by the embodiments of the present disclosure, correcting the modification region specified by the final attack mask image (i.e., dividing it into a plurality of rectangular regions) further reduces the modified area of the original image and the visual difference before and after modification.
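One simple way to divide a staircase-shaped region into rectangles is to group consecutive identical rows of the binary mask, emitting one rectangle per contiguous run of 1s whenever the row pattern changes. This row-run decomposition is an illustrative choice; the disclosure does not fix a particular splitting algorithm.

```python
import numpy as np

def split_into_rectangles(mask):
    """Split a binary mask into axis-aligned rectangles by grouping
    consecutive identical rows; a staircase region yields one rectangle
    per 'step'. Returns (top, left, bottom, right), bottom/right exclusive."""
    rects = []
    prev_row, top = None, 0
    h = mask.shape[0]
    for y in range(h + 1):
        row = tuple(mask[y]) if y < h else None
        if row != prev_row:
            if prev_row is not None and any(prev_row):
                x = 0
                while x < len(prev_row):          # one rect per run of 1s
                    if prev_row[x]:
                        x0 = x
                        while x < len(prev_row) and prev_row[x]:
                            x += 1
                        rects.append((top, x0, y, x))
                    else:
                        x += 1
            prev_row, top = row, y
    return rects

# A two-step staircase: a 2x4 block with a 2x2 block attached below it
m = np.zeros((4, 6), dtype=np.uint8)
m[0:2, 0:4] = 1
m[2:4, 0:2] = 1
print(split_into_rectangles(m))   # two rectangles, one per step
```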
In some exemplary implementations, M is between 120 and 180 and K is between 20 and 40. Illustratively, M is 150 and K is 30.
In some exemplary implementations, as shown in Figure 5, the method for generating a face recognition adversarial sample provided by the embodiments of the present disclosure includes the following steps.
1) For a given face image, denoted P1, assume its identity is I1. A face detection model is used to crop the face region out of P1; after a facial key point detection model is used to extract the key points, an affine transformation is used to rectify the cropped face region into a standard image of a preset size (illustratively, 112 pixels × 112 pixels).
In general, attacks on a face recognition model fall into two types. The first is to modify P1 so that the similarity score between the feature vector the model extracts from it and the feature vectors extracted from other face images of the same identity is as low as possible; the second is to modify P1 so that the similarity score between its feature vector and those extracted from face images of different identities is as high as possible.
Taking the first attack type as an example, another face image, denoted P2, is given, and its identity is assumed to be I1, the same as P1. A face detection model is used to crop the face region out of P2; after a facial key point detection model is used to extract the key points, an affine transformation is used to rectify the cropped face region into a standard image of the preset size (illustratively, 112 pixels × 112 pixels).
Taking the second attack type as an example, another face image, denoted P3, is given, and its identity is assumed to be I2, different from P1. A face detection model is used to crop the face region out of P3; after a facial key point detection model is used to extract the key points, an affine transformation is used to rectify the cropped face region into a standard image of the preset size (illustratively, 112 pixels × 112 pixels).
2) Starting from the position [startx, starty], a retrieval block is slid with a window size of [winw, winh] and step sizes of [stepx, stepy] until the end sliding position [endx, endy], to generate a plurality of candidate attack mask images.
Illustratively, taking startx=20, starty=20, winw=13, winh=10, stepx=8, stepy=5, endx=92 and endy=95, the generated candidate attack mask images are shown in Figure 2. In each candidate attack mask image, the coordinates of the upper-left position are the start position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the upper-left vertex of the black rectangular block), and the coordinates of the lower-right position are the end position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the lower-right vertex of the black rectangular block). Each row includes (92-20)/8 = 9 candidate attack mask images and each column includes (95-20)/5 = 15 candidate attack mask images, so 9 × 15 = 135 candidate attack mask images are generated in total.
3) Using each candidate attack mask image, K iterative modification attacks are performed on image P1; based on the results obtained after the K iterative modification attacks, the regions of P1 that are easier to attack successfully are determined.
Taking the first attack type as an example, as shown in Figure 3, the first image is P1 and the second image is P2. Both images are fed into the face recognition model to extract face feature vectors, and the MSE loss between the extracted feature vectors is computed. With the model parameters frozen, P1 is modified through backpropagation so that the MSE loss keeps rising, where the region of P1 being modified is the region specified by the candidate attack mask image. Gradient descent is used for iteration with the loss function -1*MSE Loss, so that performing gradient descent makes the MSE loss rise; one gradient descent step completes one iteration. For each candidate attack mask image produced in step 2), this procedure is run for K iterations (for example, K=30). The similarity between the final modified image corresponding to each candidate attack mask image and image P2 is computed, and the N candidate attack mask images with the lowest similarity scores are selected (for example, N may be 5 or 6); the regions corresponding to these N candidates are the top N regions that are easier to attack successfully.
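The iterative modification attack, for either attack type, can be sketched with a toy stand-in for the face recognition model: a fixed linear map whose MSE-loss gradient is computed analytically instead of by backpropagation. All dimensions and names here are illustrative, not the disclosure's:

```python
import numpy as np

rng = np.random.default_rng(0)
D, F = 16, 4                       # toy flattened-image and feature dimensions
W = rng.normal(size=(F, D))        # frozen "face recognition model"
extract = lambda x: W @ x          # face feature vector

def iterative_attack(x1, feat2, mask, k=30, lr=0.01, dissimilar=True):
    """K gradient steps on the masked pixels of x1.
    dissimilar=True  -> loss = -1 * MSE, descent pushes the MSE UP   (attack type 1)
    dissimilar=False -> loss =      MSE, descent pushes the MSE DOWN (attack type 2)"""
    x = x1.copy()
    sign = -1.0 if dissimilar else 1.0
    for _ in range(k):
        residual = extract(x) - feat2
        grad = sign * (2.0 / F) * (W.T @ residual)   # analytic d(loss)/dx
        x -= lr * grad * mask                        # modify only the mask region
    return x

mse = lambda a, b: float(np.mean((a - b) ** 2))
x1, x2 = rng.normal(size=D), rng.normal(size=D)
mask = np.zeros(D)
mask[:6] = 1                                         # attack only six "pixels"
adv = iterative_attack(x1, extract(x2), mask, dissimilar=True)
print(mse(extract(adv), extract(x2)) > mse(extract(x1), extract(x2)))  # True
```

The real procedure runs a loop like this once per candidate mask (K=30 in the example), scores each result against P2 or P3, and later reuses the same loop with the final mask for M iterations.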
Taking the second attack type as an example, as shown in Figure 3, the first image is P1 and the second image is P3. Both images are fed into the face recognition model to extract face feature vectors, and the MSE loss between the extracted feature vectors is computed. With the model parameters frozen, P1 is modified through backpropagation so that the MSE loss keeps falling, where the region of P1 being modified is the region specified by the candidate attack mask image. Gradient descent is used for iteration with the loss function MSE Loss, so that performing gradient descent makes the MSE loss fall; one gradient descent step completes one iteration. For each candidate attack mask image produced in step 2), this procedure is run for K iterations (for example, K=30). The similarity between the final modified image corresponding to each candidate attack mask image and image P3 is computed, and the N candidate attack mask images with the highest similarity scores are selected (for example, N may be 5 or 6); the regions corresponding to these N candidates are the top N regions that are easier to attack successfully.
After the top N easier-to-attack regions are obtained, they are merged to generate one final attack mask image. Because the sliding windows used to generate the candidate attack mask images overlap, the directly merged final attack mask image may have a stepped shape, which degrades the visual quality of the final attack image. To further reduce the area of the modified region, the disclosed embodiment scans the generated mask pixel by pixel along the horizontal direction; wherever the vertical mask height differs between two horizontally adjacent pixels, the mask region is cut at that position, yielding the final attack mask image, i.e., the corrected attack mask image shown in Figure 4.
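One way to read this cutting step (and claim 8's division of each stepped region into rectangles) is: scan the merged mask column by column and cut wherever the vertical extent changes between adjacent pixel columns. A sketch under that assumption, which additionally assumes each column of the mask is one contiguous run:

```python
import numpy as np

def split_into_rectangles(mask):
    """Cut a (possibly stepped) binary mask at every horizontal position
    where the vertical extent changes between adjacent pixel columns,
    yielding a list of (top, left, bottom, right) rectangles."""
    cols = [tuple(np.flatnonzero(mask[:, x])) for x in range(mask.shape[1])]
    rects, x0 = [], None
    for x, col in enumerate(cols + [()]):          # sentinel flushes the last run
        if x0 is not None and col != cols[x0]:
            ys = cols[x0]
            rects.append((int(ys[0]), x0, int(ys[-1]), x - 1))
            x0 = None
        if x0 is None and col:
            x0 = x
    return rects

# two overlapping windows form a step; the cut yields axis-aligned rectangles
m = np.zeros((8, 8), dtype=np.uint8)
m[1:4, 1:5] = 1
m[2:6, 3:7] = 1
rects = split_into_rectangles(m)
print(rects)  # [(1, 1, 3, 2), (1, 3, 5, 4), (2, 5, 5, 6)]
```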
4) After the final attack mask image is obtained, an iterative modification attack procedure similar to that described in step 3) is applied to the original face image P1 for M iterations (for example, M=150), generating the final face attack image (i.e., the face recognition adversarial sample).
The disclosed embodiment modifies the original face image with the fast gradient sign method (FGSM). In a white-box setting, FGSM computes the derivative of the model with respect to the input data, takes the sign of the gradient to obtain its direction, and multiplies it by the step size to obtain the perturbation. Adding this perturbation to the original input yields the face recognition adversarial sample under the FGSM attack; with high probability this adversarial sample causes the model to classify incorrectly, achieving the goal of the attack.
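A single FGSM-style update, restricted to the mask region, might look as follows. This is a generic sketch: the gradient would come from backpropagating the loss through the frozen recognition model, which is not shown, and all values below are toy inputs:

```python
import numpy as np

def fgsm_step(x, grad, eps=0.01, mask=None):
    """Perturb x by eps along the sign of the loss gradient; if a binary
    attack mask is given, only the masked pixels are modified."""
    perturb = eps * np.sign(grad)
    if mask is not None:
        perturb = perturb * mask
    return np.clip(x + perturb, 0.0, 1.0)   # stay in the valid pixel range

x = np.full(4, 0.5)                          # four toy pixels
grad = np.array([1.0, -2.0, 0.0, 3.0])       # pretend backprop output
mask = np.array([1.0, 1.0, 0.0, 1.0])
print(fgsm_step(x, grad, mask=mask))         # [0.51 0.49 0.5  0.51]
```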
Taking the second attack type as an example, the final attack effect is shown in Figure 6: the left image is the original image, the middle image is the modified image (i.e., the generated face recognition adversarial sample), and the right image is the confusion image with a different identity. The visual difference between the modified image and the original image is not obvious, and the similarity score between the modified image and the different-identity confusion image in the face recognition model can exceed 0.3.
Embodiments of the present disclosure provide a method for generating face recognition adversarial samples. A sliding window is used to perform block-wise retrieval over the whole face image to find the regions that are easier to attack successfully, from which an optimal attack mask image is produced; the final attack image (i.e., the face recognition adversarial sample) is then generated from the optimal attack mask image. With only a small area modified, the face image can "deceive" the recognition system, effectively preventing face images uploaded to the cloud from being misused.
Embodiments of the present disclosure also provide an apparatus for generating face recognition adversarial samples, including a memory and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the steps of the method for generating face recognition adversarial samples according to any embodiment of the present disclosure.
As shown in Figure 7, in one example, the apparatus for generating face recognition adversarial samples may include a processor 710, a memory 720, and a bus system 730, where the processor 710 and the memory 720 are connected through the bus system 730, the memory 720 is configured to store instructions, and the processor 710 is configured to execute the instructions stored in the memory 720 so as to: preprocess a first image and a second image and generate multiple candidate attack mask images, where both the first image and the second image contain a face region; using each candidate attack mask image together with the second image, perform K iterative modification attacks on the first image, compute the similarity score between the modified image obtained after the K iterative modification attacks and the second image, and generate a final attack mask image according to the similarity scores corresponding to the multiple candidate attack mask images, where both the candidate attack mask images and the final attack mask image are used to specify the region modified during an iterative modification attack on the first image, and different candidate attack mask images specify different modification regions; and, using the final attack mask image together with the second image, perform M iterative modification attacks on the first image to obtain a face recognition adversarial sample, where M and K are both positive integers greater than 1 and M>K.
It should be understood that the processor 710 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 720 may include read-only memory and random access memory, and provides instructions and data to the processor 710. A portion of the memory 720 may also include non-volatile random access memory. For example, the memory 720 may also store device type information.
In addition to a data bus, the bus system 730 may also include a power bus, a control bus, a status signal bus, and the like. For clarity, however, the various buses are all labeled as the bus system 730 in Figure 7.
During implementation, the processing performed by the processing device may be completed by integrated logic circuits of hardware in the processor 710 or by instructions in the form of software. That is, the method steps of the embodiments of the present disclosure may be carried out by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in a storage medium such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 720; the processor 710 reads the information in the memory 720 and completes the steps of the above method in combination with its hardware. To avoid repetition, this is not described in detail here.
Embodiments of the present disclosure also provide a storage medium on which a computer program is stored; when the program is executed by a processor, the method for generating face recognition adversarial samples according to any embodiment of the present disclosure is implemented.
In some possible implementations, various aspects of the method for generating face recognition adversarial samples provided by the present disclosure may also be implemented in the form of a program product including program code; when the program product runs on a computer device, the program code causes the computer device to execute the steps of the method for generating face recognition adversarial samples according to the various exemplary implementations of the present disclosure described above. For example, the computer device may execute the method for generating face recognition adversarial samples described in the embodiments of the present disclosure.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The drawings of the present disclosure relate only to the structures involved in the present disclosure; other structures may follow conventional designs. Where there is no conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with one another to obtain new embodiments.
Those of ordinary skill in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present disclosure without departing from the spirit and scope of the technical solutions of the present disclosure, and such modifications and substitutions shall all fall within the scope of the claims of the present disclosure.

Claims (13)

  1. A method for generating a face recognition adversarial sample, comprising:
    preprocessing a first image and a second image, and generating a plurality of candidate attack mask images, wherein the first image and the second image both contain a face region;
    using each of the candidate attack mask images together with the second image, performing K iterative modification attacks on the first image, computing a similarity score between the modified image obtained after the K iterative modification attacks and the second image, and generating a final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images, wherein the candidate attack mask images and the final attack mask image are both used to specify a modification region when an iterative modification attack is performed on the first image, and different candidate attack mask images specify different modification regions; and
    using the final attack mask image together with the second image, performing M iterative modification attacks on the first image to obtain a face recognition adversarial sample, wherein M and K are both positive integers greater than 1, and M>K.
  2. The generation method according to claim 1, wherein the first image and the second image are different images belonging to the same identity.
  3. The generation method according to claim 2, wherein performing one iterative modification attack on the first image comprises:
    inputting the first image, or the modified first image, and the second image respectively into a face recognition model to obtain a first face feature vector corresponding to the first image and a second face feature vector corresponding to the second image; computing the value of a loss function between the first face feature vector and the second face feature vector and performing backpropagation; and modifying the first image so that the value of the loss function between the first face feature vector and the second face feature vector rises, wherein the modification region is the modification region specified by the candidate attack mask image or the final attack mask image, and the loss function of the face recognition model is -1 times the loss function between the first face feature vector and the second face feature vector.
  4. The generation method according to claim 2, wherein generating the final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images comprises:
    sorting the similarity scores corresponding to the plurality of candidate attack mask images;
    selecting the N candidate attack mask images ranked first in the similarity score ordering; and
    combining the modification regions specified by the N candidate attack mask images ranked first to obtain the modification region specified by the final attack mask image.
  5. The generation method according to claim 1, wherein the first image and the second image are different images belonging to different identities.
  6. The generation method according to claim 5, wherein performing one iterative modification attack on the first image comprises:
    inputting the first image, or the modified first image, and the second image respectively into a face recognition model to obtain a first face feature vector corresponding to the first image and a second face feature vector corresponding to the second image; computing the value of a loss function between the first face feature vector and the second face feature vector and performing backpropagation; and modifying the first image so that the value of the loss function between the first face feature vector and the second face feature vector falls, wherein the modification region is the modification region specified by the candidate attack mask image or the final attack mask image, and the loss function of the face recognition model is the loss function between the first face feature vector and the second face feature vector.
  7. The generation method according to claim 5, wherein generating the final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images comprises:
    sorting the similarity scores corresponding to the plurality of candidate attack mask images;
    selecting the N candidate attack mask images ranked last in the similarity score ordering; and
    combining the modification regions specified by the N candidate attack mask images ranked last to obtain the modification region specified by the final attack mask image.
  8. The generation method according to claim 1, further comprising:
    detecting whether the modification region specified by the final attack mask image includes one or more stepped regions; and
    when one or more stepped regions are included, dividing each stepped region into a plurality of rectangular regions.
  9. The generation method according to claim 1, wherein generating the plurality of candidate attack mask images comprises:
    setting, according to the preprocessed image size, start position coordinates, a window size, step sizes, and end position coordinates, wherein the start position coordinates include a horizontal start position coordinate along a first direction and a vertical start position coordinate along a second direction, the end position coordinates include a horizontal end position coordinate along the first direction and a vertical end position coordinate along the second direction, the step sizes include a horizontal step size along the first direction and a vertical step size along the second direction, and the first direction intersects the second direction; and
    generating the plurality of candidate attack mask images according to the set start position coordinates, window size, horizontal step size, vertical step size, and end position coordinates, wherein the plurality of candidate attack mask images include a*b images, a=(horizontal end position coordinate - horizontal start position coordinate)/horizontal step size, b=(vertical end position coordinate - vertical start position coordinate)/vertical step size, and the size of the modification region specified by each candidate attack mask image equals the window size.
  10. The generation method according to claim 1, wherein preprocessing the first image and the second image comprises:
    using a face detection model to extract the face regions in the first image and the second image respectively;
    using a facial keypoint detection model to extract the facial keypoints in the first image and the second image respectively;
    performing an affine transformation on the extracted facial keypoints in the first image and the second image to obtain an aligned first image and an aligned second image, both of which are face images; and
    adjusting the aligned first image and second image to a preset size.
  11. The generation method according to claim 1, wherein M is between 120 and 180, and K is between 20 and 40.
  12. An apparatus for generating a face recognition adversarial sample, comprising a memory and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the steps of the method for generating a face recognition adversarial sample according to any one of claims 1 to 11.
  13. A storage medium on which a computer program is stored, wherein when the program is executed by a processor, the method for generating a face recognition adversarial sample according to any one of claims 1 to 11 is implemented.
PCT/CN2023/111034 2022-08-23 2023-08-03 Method and apparatus for generating facial recognition adversarial sample, and storage medium WO2024041346A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211013503.3 2022-08-23
CN202211013503.3A CN115995104A (en) 2022-08-23 2022-08-23 Face recognition countermeasure sample generation method and device and storage medium

Publications (1)

Publication Number Publication Date
WO2024041346A1 true WO2024041346A1 (en) 2024-02-29

Family

ID=85990948

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/111034 WO2024041346A1 (en) 2022-08-23 2023-08-03 Method and apparatus for generating facial recognition adversarial sample, and storage medium

Country Status (2)

Country Link
CN (1) CN115995104A (en)
WO (1) WO2024041346A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118135640A (en) * 2024-05-06 2024-06-04 南京信息工程大学 Method for defending face image attack based on recessive noise

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115995104A (en) * 2022-08-23 2023-04-21 京东方科技集团股份有限公司 Face recognition countermeasure sample generation method and device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150279043A1 (en) * 2014-03-28 2015-10-01 Sony Corporation Imaging system with depth estimation mechanism and method of operation thereof
CN111444516A (en) * 2020-03-23 2020-07-24 华南理工大学 Sensitivity-based deep reinforcement learning intelligent agent attack method
CN111626925A (en) * 2020-07-24 2020-09-04 支付宝(杭州)信息技术有限公司 Method and device for generating counterwork patch
CN112287973A (en) * 2020-09-28 2021-01-29 北京航空航天大学 Digital image countermeasure sample defense method based on truncated singular value and pixel interpolation
CN114297730A (en) * 2021-12-31 2022-04-08 北京瑞莱智慧科技有限公司 Countermeasure image generation method, device and storage medium
CN115995104A (en) * 2022-08-23 2023-04-21 京东方科技集团股份有限公司 Face recognition countermeasure sample generation method and device and storage medium


Also Published As

Publication number Publication date
CN115995104A (en) 2023-04-21

Similar Documents

Publication Publication Date Title
WO2024041346A1 (en) Method and apparatus for generating facial recognition adversarial sample, and storage medium
US11062123B2 (en) Method, terminal, and storage medium for tracking facial critical area
US10755120B2 (en) End-to-end lightweight method and apparatus for license plate recognition
CN112488064B (en) Face tracking method, system, terminal and storage medium
WO2021026805A1 (en) Adversarial example detection method and apparatus, computing device, and computer storage medium
US20230177695A1 (en) Instance segmentation method and system for enhanced image, and device and medium
KR102435365B1 (en) Certificate recognition method and apparatus, electronic device, computer readable storage medium
CN112668483B (en) Single-target person tracking method integrating pedestrian re-identification and face detection
WO2020238374A1 (en) Method, apparatus, and device for facial key point detection, and storage medium
CN111401521B (en) Neural network model training method and device, and image recognition method and device
JP2019117577A (en) Program, learning processing method, learning model, data structure, learning device and object recognition device
US20200134385A1 (en) Deep learning model used for image recognition and training apparatus of the model and method thereof
CN111461113B (en) Large-angle license plate detection method based on deformed plane object detection network
Zhu et al. Improving robustness of facial landmark detection by defending against adversarial attacks
CN113822278A (en) License plate recognition method for unlimited scene
WO2021147437A1 (en) Identity card edge detection method, device, and storage medium
Ranftl et al. Face tracking using optical flow
CN113435264A (en) Face recognition attack resisting method and device based on black box substitution model searching
CN109241878B (en) Lip positioning-based facial feature positioning method and system
CN114220108A (en) Text recognition method, readable storage medium and text recognition device for natural scene
CN114283448A (en) Child sitting posture reminding method and system based on head posture estimation
CN114333038B (en) Training method of object recognition model, object recognition method, device and equipment
CN115497097A (en) Inclined Chinese character click verification code identification method
CN111881732B (en) SVM (support vector machine) -based face quality evaluation method
WO2020155484A1 (en) Character recognition method and device based on support vector machine, and computer device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23856436

Country of ref document: EP

Kind code of ref document: A1