WO2024041346A1 - Method and apparatus for generating facial recognition adversarial sample, and storage medium - Google Patents

Method and apparatus for generating facial recognition adversarial sample, and storage medium Download PDF

Info

Publication number
WO2024041346A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
attack
attack mask
feature vector
face
Prior art date
Application number
PCT/CN2023/111034
Other languages
French (fr)
Chinese (zh)
Inventor
王镜茹
Original Assignee
BOE Technology Group Co., Ltd. (京东方科技集团股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOE Technology Group Co., Ltd. (京东方科技集团股份有限公司)
Publication of WO2024041346A1 publication Critical patent/WO2024041346A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection

Definitions

  • Embodiments of the present disclosure relate to but are not limited to the technical field of face recognition, and in particular, to a method and device for generating face recognition adversarial samples, and a storage medium.
  • Face recognition is a biometric technology that performs identity recognition based on people's facial feature information. It is usually implemented based on deep learning models to extract facial features. However, deep learning models are vulnerable to attacks called “adversarial examples,” which can cause the model to produce incorrect predictions by adding small perturbations that are imperceptible to the human eye.
  • Embodiments of the present disclosure provide a method for generating face recognition adversarial samples, including:
  • Preprocess the first image and the second image and generate multiple candidate attack mask images, where the first image and the second image both contain human face areas;
  • Use each of the candidate attack mask images and the second image to perform K iterative modification attacks on the first image, calculate the similarity score between the modified image obtained after the K iterative modification attacks and the second image, and generate a final attack mask image according to the similarity scores corresponding to the multiple candidate attack mask images, where the candidate attack mask images and the final attack mask image are both used to specify the modification region when an iterative modification attack is performed on the first image, and the modification regions specified by different candidate attack mask images are different;
  • Use the final attack mask image and the second image to perform M iterative modification attacks on the first image to obtain a face recognition adversarial sample, where M and K are both positive integers greater than 1, and M>K.
  • Embodiments of the present disclosure also provide a device for generating face recognition adversarial samples, including a memory and a processor coupled to the memory, where the processor is configured to execute, based on instructions stored in the memory, the steps of the method for generating face recognition adversarial samples according to any embodiment of the present disclosure.
  • An embodiment of the present disclosure also provides a storage medium on which a computer program is stored; when the program is executed by a processor, the method for generating face recognition adversarial samples as described in any embodiment of the present disclosure is implemented.
  • Figure 1 is a schematic flowchart of a method for generating face recognition adversarial samples according to an exemplary embodiment of the present disclosure
  • Figure 2 is a schematic diagram of multiple alternative attack mask images generated by an exemplary embodiment of the present disclosure
  • Figure 3 is a schematic flowchart of an iterative modification attack according to an exemplary embodiment of the present disclosure
  • Figure 4 is a schematic diagram of a method for correcting the final attack mask image according to an exemplary embodiment of the present disclosure
  • Figure 5 is a flowchart of another method for generating face recognition adversarial samples according to an exemplary embodiment of the present disclosure.
  • Figure 6 is a schematic diagram of the generation effect of face recognition adversarial samples according to an exemplary embodiment of the present disclosure
  • Figure 7 is a schematic structural diagram of a device for generating face recognition adversarial samples according to an exemplary embodiment of the present disclosure.
  • the scale of the drawings in the present disclosure can be used as a reference in actual processes, but is not limited thereto. The width-to-length ratio of a channel, the thickness and spacing of each film layer, and the width and spacing of each signal line can be adjusted according to actual needs, and the number of pixels in a display panel and the number of sub-pixels in each pixel are not limited to the numbers shown in the figures. The figures described in the present disclosure are only structural schematic diagrams, and embodiments of the present disclosure are not limited to the shapes or numerical values shown in the figures.
  • an embodiment of the present disclosure provides a method for generating face recognition adversarial samples, which includes the following steps:
  • Step 101 Preprocess the first image and the second image, and generate multiple candidate attack mask images, where the first image and the second image both contain human face areas;
  • Step 102: Use each candidate attack mask image and the second image to perform K iterative modification attacks on the first image, calculate the similarity score between the modified image obtained after the K iterative modification attacks and the second image, and generate a final attack mask image according to the similarity scores corresponding to the multiple candidate attack mask images, where the candidate attack mask images and the final attack mask image are both used to specify the modification region when an iterative modification attack is performed on the first image, and the modification regions specified by different candidate attack mask images are different;
  • Step 103 Use the final attack mask image and the second image to perform M iterative modification attacks on the first image to obtain face recognition adversarial samples, where M and K are both positive integers greater than 1, and M>K.
  • the face recognition adversarial sample generation method generates multiple candidate attack mask images; uses each candidate attack mask image together with the second image to perform K iterative modification attacks on the first image; calculates the similarity score between the modified image obtained after the K iterative modification attacks and the second image; generates a final attack mask image according to the similarity scores corresponding to the multiple candidate attack mask images; and uses the final attack mask image and the second image to perform M iterative modification attacks on the first image, where M>K>1, to obtain a face recognition adversarial sample. This reduces the modified area of the face image and the visual difference before and after modification, achieving the goal of "deceiving" the face recognition system while modifying the face image as little as possible, which on the one hand can prevent face image abuse, and on the other hand can help improve the robustness of deep neural network models.
  • preprocessing the first image and the second image in step 101 includes:
  • the aligned first image and the second image are both face images
  • preprocessing an image includes the following operations: accurately calibrating the position and size of the face in the image through a face detection model; obtaining the key point coordinates of the face through a face key point detection model; adjusting the angle of the face according to the key point coordinates to align the face; and adjusting the aligned face image to a preset size.
  • the preset size may be 112 pixels × 112 pixels.
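The alignment steps above (rotate so the face is level, then resize to the preset size) can be sketched as follows. This is a minimal NumPy sketch: the face detection and key-point models are assumed to be external, so it starts from given eye coordinates, and the function name `align_and_resize` and the nearest-neighbor resampling are illustrative choices, not from the patent (a real pipeline would typically use OpenCV or dlib warps).

```python
import numpy as np

def align_and_resize(img, left_eye, right_eye, out_size=112):
    """Rotate the face so the eyes are horizontal, then resize to out_size x out_size.

    img: HxW grayscale numpy array.
    left_eye, right_eye: (x, y) key points, assumed to come from an external
    face key-point detection model (hypothetical here).
    """
    h, w = img.shape
    # Angle of the eye line; rotating by this angle levels the eyes.
    angle = np.arctan2(right_eye[1] - left_eye[1], right_eye[0] - left_eye[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    cos, sin = np.cos(angle), np.sin(angle)
    # Inverse mapping: for each output pixel, find the source pixel before rotation.
    src_x = cos * (xs - cx) - sin * (ys - cy) + cx
    src_y = sin * (xs - cx) + cos * (ys - cy) + cy
    src_x = np.clip(np.rint(src_x), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(src_y), 0, h - 1).astype(int)
    rotated = img[src_y, src_x]
    # Nearest-neighbor resize to the preset size (112x112 in the embodiment).
    iy = np.arange(out_size) * h // out_size
    ix = np.arange(out_size) * w // out_size
    return rotated[np.ix_(iy, ix)]
```

When the eyes are already level the rotation is the identity and only the resize is applied.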
  • multiple candidate attack mask images are generated in step 101, including:
  • the starting position coordinates include a horizontal starting position coordinate along a first direction and a longitudinal starting position coordinate along a second direction; the ending position coordinates include a horizontal ending position coordinate along the first direction and a longitudinal ending position coordinate along the second direction; the step length includes a horizontal step length along the first direction and a longitudinal step length along the second direction; and the first direction intersects the second direction;
  • first direction and the second direction may be perpendicular to each other.
  • the window shape may be a rectangular shape
  • the window size may include a window width along a first direction and a window height along a second direction.
  • the modification region specified by the second candidate attack mask image is the rectangular region from [startx+stepx, starty] to [startx+stepx+winw, starty+winh], ..., the modification region specified by the a-th candidate attack mask image is the rectangular region from [endx, starty] to [endx+winw, starty+winh], the modification region specified by the (a+1)-th candidate attack mask image is the rectangular region from [startx, starty+stepy] to [startx+winw, starty+stepy+winh], ..., and the modification region specified by the (a*b)-th candidate attack mask image is the rectangular region from [endx, endy] to [endx+winw, endy+winh].
  • the size of the modification region specified by each candidate attack mask image is winw*winh, but the modification regions specified by different candidate attack mask images are different.
  • in each candidate attack mask image, the coordinates of the upper-left position are the starting position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the upper-left corner vertex of the black rectangular block), and the coordinates of the lower-right position are the ending position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the lower-right corner vertex of the black rectangular block).
  • the size of the modified area specified by each candidate attack mask image is 13 pixels * 10 pixels.
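The sliding-window mask grid described above can be sketched as follows. The window size 13×10 matches the embodiment; the start/end/step values and the function name `candidate_masks` are illustrative assumptions, not values from the patent.

```python
import numpy as np

def candidate_masks(img_h=112, img_w=112,
                    startx=0, starty=0, endx=99, endy=102,
                    stepx=3, stepy=3, winw=13, winh=10):
    """Generate candidate attack mask images on a sliding-window grid.

    Each mask is 1 inside its winw x winh modification region and 0 elsewhere;
    different candidates specify different (shifted) regions of the same size.
    """
    masks = []
    for y in range(starty, endy + 1, stepy):       # second (vertical) direction
        for x in range(startx, endx + 1, stepx):   # first (horizontal) direction
            m = np.zeros((img_h, img_w), dtype=np.uint8)
            m[y:y + winh, x:x + winw] = 1          # the specified modification region
            masks.append(m)
    return masks

masks = candidate_masks()
```

With these illustrative parameters the grid yields a = 34 horizontal positions and b = 35 vertical positions, i.e. a*b = 1190 candidate masks, each covering exactly winw*winh = 130 pixels.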
  • the first image and the second image may be different images belonging to the same identity.
  • the goal of the face recognition adversarial attack is to modify the first image so that the similarity score between the face feature vector extracted from it by the face recognition model and that of another face image under the same identity (i.e., the second image) is as low as possible.
  • an iterative modification attack is performed on the first image in step 102 or step 103, including the following steps: input the first image (or the modified first image) and the second image into the face recognition model to obtain a first face feature vector and a second face feature vector; backpropagate the value of the loss function between the first face feature vector and the second face feature vector, and modify the first image so that the value of the loss function between the first face feature vector and the second face feature vector increases, where the modified region is the modification region specified by the candidate attack mask image or the final attack mask image.
  • in this case, the loss function of the face recognition model is -1 * the loss function between the first face feature vector and the second face feature vector.
  • the modification area is the modification area specified by the alternative attack mask image used in this iterative modification attack; in step 103, the first image is modified in an iterative modification attack.
  • the modified area is the modified area specified by the final attack mask image.
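The masked iterative attack can be sketched as follows. This is a toy sketch, not the patent's implementation: a random linear map `Wm` stands in for the deep face recognition model so the gradient of the MSE loss can be written analytically instead of obtained by framework backpropagation, and all names (`embed`, `iterative_attack`) and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
H = W_ = 8                          # tiny image size for illustration
D = 4                               # embedding dimension
Wm = rng.normal(size=(D, H * W_))   # toy linear stand-in for the face model

def embed(x):
    """Face feature vector of image x under the toy model."""
    return Wm @ x.ravel()

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def iterative_attack(x1, x2, mask, steps=30, lr=0.01, dodge=True):
    """Perform `steps` iterative modification attacks on x1: each step moves the
    masked pixels of x1 along the (analytic) gradient of the embedding MSE,
    increasing it for a same-identity (dodging) attack, decreasing otherwise."""
    x1 = x1.copy()
    f2 = embed(x2)                               # second face feature vector
    for _ in range(steps):
        f1 = embed(x1)                           # first face feature vector
        g = (2.0 / D) * (Wm.T @ (f1 - f2)).reshape(H, W_)
        x1 += (lr if dodge else -lr) * g * mask  # modify only the masked region
    return x1

x1 = rng.uniform(size=(H, W_))
x2 = rng.uniform(size=(H, W_))
mask = np.zeros((H, W_))
mask[2:5, 2:5] = 1.0                # modification region from one candidate mask
adv = iterative_attack(x1, x2, mask)
```

Pixels outside the mask are untouched, while the masked pixels drive the embedding MSE up (dodging) or down (impersonation).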
  • when performing the first modification attack in the K iterative modification attacks or the M iterative modification attacks, the first image and the second image are respectively input into the face recognition model; when performing a non-first modification attack, the first image modified by the previous modification attack and the second image are respectively input into the face recognition model to obtain the first face feature vector corresponding to the first image and the second face feature vector corresponding to the second image.
  • there may be one or more face recognition models.
  • when there are multiple face recognition models, an iterative modification attack is performed on the first image in step 102 or step 103, including the following steps: input the first image (or the modified first image) and the second image into each face recognition model to obtain the first face feature vector and the second face feature vector corresponding to each face recognition model; calculate the value of the loss function between the first face feature vector and the second face feature vector corresponding to each face recognition model; take a weighted average of the loss function values corresponding to the multiple face recognition models and backpropagate; and modify the first image so that the value of the weighted-average loss function increases, where the modified region is the modification region specified by the candidate attack mask image or the final attack mask image.
  • the loss function between the first face feature vector and the second face feature vector may be calculated using the mean squared error (MSE) loss function.
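The weighted-average ensemble loss over several models can be sketched as follows. Again a hedged toy sketch: small random linear maps stand in for the face recognition models, the weights are illustrative, and the gradient is written analytically where a framework would backpropagate.

```python
import numpy as np

rng = np.random.default_rng(1)
D, P = 4, 16                                           # embedding dim, pixels
models = [rng.normal(size=(D, P)) for _ in range(3)]   # toy linear "models"
weights = np.array([0.5, 0.3, 0.2])                    # illustrative weights

def ensemble_loss_and_grad(x1, x2):
    """Weighted-average MSE loss over several models, plus its gradient w.r.t. x1.

    A framework would obtain the gradient by backpropagation; for these toy
    linear models it is analytic: d/dx1 MSE(Mx1, Mx2) = (2/D) M^T (Mx1 - Mx2).
    """
    total_loss, total_grad = 0.0, np.zeros(P)
    for w, M in zip(weights, models):
        f1, f2 = M @ x1, M @ x2
        total_loss += w * np.mean((f1 - f2) ** 2)
        total_grad += w * (2.0 / D) * (M.T @ (f1 - f2))
    return total_loss, total_grad

x1, x2 = rng.uniform(size=P), rng.uniform(size=P)
loss, grad = ensemble_loss_and_grad(x1, x2)
```

A single gradient step on this combined loss moves the image against (or toward) all models at once, which is what makes the ensemble attack transfer better than attacking one model.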
  • a final attack mask image is generated based on the similarity scores corresponding to the multiple candidate attack mask images, including: ranking the candidate attack mask images by similarity score, and synthesizing the modification regions specified by the N candidate attack mask images with the lowest similarity scores to obtain the modification region specified by the final attack mask image.
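The selection and synthesis step can be sketched as follows (a toy NumPy sketch; the function name `final_mask` and the stripe-shaped test masks are illustrative, not from the patent). For a same-identity (dodging) attack the N lowest-scoring candidates are kept; for a different-identity (impersonation) attack, the N highest.

```python
import numpy as np

def final_mask(masks, scores, n=5, impersonate=False):
    """Synthesize the final attack mask from the N best-scoring candidates.

    scores[i] is the similarity between the image attacked with masks[i] and
    the second image; for dodging lower is better, for impersonation higher.
    """
    order = np.argsort(scores)                 # ascending similarity
    best = order[-n:] if impersonate else order[:n]
    out = np.zeros_like(masks[0])
    for i in best:
        out |= masks[i]                        # union of the selected regions
    return out

masks = [np.zeros((6, 6), dtype=np.uint8) for _ in range(4)]
for i, m in enumerate(masks):
    m[i, :] = 1                                # row-stripe toy masks
scores = [0.9, 0.1, 0.4, 0.2]
fm = final_mask(masks, scores, n=2)            # dodging: keep the two lowest
```

The union of the selected rectangles is exactly the "synthesized" modification region of the final attack mask image.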
  • the first image and the second image may be different images belonging to different identities.
  • the goal of the face recognition adversarial attack is to modify the first image so that the similarity score between the face feature vector extracted from it by the face recognition model and that of a face image under a different identity (i.e., the second image) is as high as possible.
  • an iterative modification attack is performed on the first image in step 102 or step 103 in the same manner as described above, except that the first image is modified so that the value of the loss function decreases; in this case, the loss function of the face recognition model is the loss function between the first face feature vector and the second face feature vector.
  • there may be one or more face recognition models.
  • when there are multiple face recognition models, an iterative modification attack is performed on the first image in step 102 or step 103, including: input the first image (or the modified first image) and the second image into each face recognition model to obtain the first face feature vector and the second face feature vector corresponding to each face recognition model; calculate the value of the loss function between the first face feature vector and the second face feature vector corresponding to each face recognition model; take a weighted average of the loss function values corresponding to the multiple face recognition models and backpropagate; and modify the first image so that the value of the weighted-average loss function decreases, where the modified region is the modification region specified by the candidate attack mask image or the final attack mask image.
  • a final attack mask image is generated based on the similarity scores corresponding to the multiple candidate attack mask images, including: ranking the candidate attack mask images by similarity score, and synthesizing the modification regions specified by the N candidate attack mask images with the highest similarity scores to obtain the modification region specified by the final attack mask image.
  • N may range from 3 to 8.
  • N can be 5 or 6.
  • the method may further include:
  • each stepped-shaped area is divided into a plurality of rectangular areas.
  • the face recognition adversarial sample generation method provided by the embodiments of the present disclosure corrects the modification region specified by the final attack mask image (that is, divides it into multiple rectangular regions), which further reduces the modified area of the original image and reduces the visual difference before and after modification.
  • M is between 120 and 180, and K is between 20 and 40.
  • M is 150 and K is 30.
  • the method for generating face recognition adversarial samples includes the following steps:
  • attacks on face recognition models can be divided into two types: the first is to modify P1 so that its similarity score with the feature vectors extracted by the model from other face images under the same identity is as low as possible; the second is to modify P1 so that its similarity score with the feature vectors extracted by the model from face images under different identities is as high as possible.
  • in each candidate attack mask image, the coordinates of the upper-left position are the starting position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the upper-left corner vertex of the black rectangular block), and the coordinates of the lower-right position are the ending position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the lower-right corner vertex of the black rectangular block).
  • the first image is P1 and the second image is P2.
  • the two images P1 and P2 are input into the face recognition model, face feature vectors are extracted respectively, and the MSE Loss between the extracted face feature vectors is calculated.
  • the parameters of the model are fixed and remain unchanged.
  • the P1 image is modified through backpropagation to make the MSE Loss continue to increase.
  • the modified region of the P1 image is the modification region specified by the candidate attack mask image; the gradient descent method is used for iteration, and the loss function is -1 * MSE Loss, so that the MSE Loss increases when gradient descent is performed.
  • one gradient descent step completes one iteration.
  • after K iterations, the similarity score between the final modified image corresponding to each candidate attack mask image and the P2 image is calculated.
  • the N candidate attack mask images with the lowest similarity scores are selected (for example, N can be 5 or 6).
  • the regions corresponding to these N candidate attack mask images are the top N regions where the attack is most likely to succeed.
  • the first image is P1 and the second image is P3.
  • the two images P1 and P3 are input into the face recognition model, face feature vectors are extracted respectively, and the MSE Loss between the extracted face feature vectors is calculated.
  • the parameters of the model are fixed and remain unchanged.
  • the P1 image is modified through backpropagation to continuously reduce the MSE Loss.
  • the modified region of the P1 image is the modification region specified by the candidate attack mask image; the gradient descent method is used for iteration, and the loss function is the MSE Loss, so that the MSE Loss decreases when gradient descent is performed.
  • one gradient descent step completes one iteration.
  • after K iterations, the similarity score between the final modified image corresponding to each candidate attack mask image and the P3 image is calculated.
  • the N candidate attack mask images with the highest similarity scores are selected (for example, N can be 5 or 6).
  • the regions corresponding to these N candidate attack mask images are the top N regions where the attack is most likely to succeed.
  • the embodiment of the present disclosure scans the generated mask pixel by pixel along the horizontal direction; when the mask heights in the vertical direction differ between two horizontally adjacent pixels, the mask region is cut off at that position to obtain the final attack mask image, as shown by the corrected attack mask image in Figure 4.
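One way to implement this correction is sketched below: scan column by column, cut wherever the vertical mask pattern changes between adjacent columns, and emit each vertical run as a rectangle. The function name `split_into_rectangles` and the (top, left, height, width) output format are illustrative assumptions; the sketch preserves the mask's pixel coverage exactly while decomposing stepped shapes into rectangles.

```python
import numpy as np

def split_into_rectangles(mask):
    """Scan the mask along the horizontal direction; whenever the vertical
    pattern changes between two adjacent columns, cut the region there.
    Returns a list of (top, left, height, width) rectangles covering the mask."""
    h, w = mask.shape
    rects = []
    x = 0
    while x < w:
        x2 = x
        # extend while adjacent columns have an identical vertical pattern
        while x2 + 1 < w and np.array_equal(mask[:, x2 + 1], mask[:, x]):
            x2 += 1
        col = mask[:, x]
        y = 0
        while y < h:                       # each vertical run -> one rectangle
            if col[y]:
                y2 = y
                while y2 + 1 < h and col[y2 + 1]:
                    y2 += 1
                rects.append((y, x, y2 - y + 1, x2 - x + 1))
                y = y2 + 1
            else:
                y += 1
        x = x2 + 1
    return rects
```

A stair-shaped region thus becomes two or more rectangles whose union equals the original mask.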
  • this disclosed embodiment uses the fast gradient sign method (FGSM) attack to modify the original face image.
  • in a white-box environment, FGSM calculates the derivative of the model with respect to the input data, uses the sign function to obtain the gradient direction, and multiplies it by a step size to obtain the perturbation amount. Adding this perturbation to the original input yields the face recognition adversarial sample under the FGSM attack; this adversarial sample has a high probability of making the model classify incorrectly, thereby achieving the purpose of the attack.
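A single FGSM step can be sketched as follows. As before, this is a hedged toy sketch: a random linear map stands in for the white-box face recognition model so the gradient is analytic, and `fgsm_step` with its masked variant is an illustrative name, not the patent's code.

```python
import numpy as np

rng = np.random.default_rng(2)
D, P = 4, 16
M = rng.normal(size=(D, P))               # toy linear "model" (white box)

def fgsm_step(x1, x2, eps=0.03, mask=None):
    """One FGSM step: perturb x1 by eps * sign(gradient of the embedding MSE).

    Stepping along +sign(grad) increases the loss (dodging); the optional mask
    restricts the perturbation to the attack mask's modification region.
    """
    f1, f2 = M @ x1, M @ x2
    grad = (2.0 / D) * (M.T @ (f1 - f2))  # analytic gradient of MSE w.r.t. x1
    delta = eps * np.sign(grad)           # gradient direction times step size
    if mask is not None:
        delta = delta * mask
    return x1 + delta

x1, x2 = rng.uniform(size=P), rng.uniform(size=P)
adv = fgsm_step(x1, x2)
```

Each pixel changes by at most eps, which is what keeps the perturbation visually small.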
  • the final attack effect is shown in Figure 6.
  • the image on the left side of Figure 6 is the original image; the middle image is the modified image, i.e., the generated face recognition adversarial sample; and the image on the right side is the confusing image of a different identity.
  • the visual difference between the modified image and the original image is not obvious, and the similarity score between the adversarial sample and the confusing image of the different identity can reach more than 0.3 in the face recognition model.
  • Embodiments of the present disclosure provide a method for generating face recognition adversarial samples. By using a sliding window to perform block-wise retrieval over the entire face image, the regions of the face image that are easier to attack are found, thereby generating an optimal attack mask image; the final attack image (i.e., the face recognition adversarial sample) is then generated based on the optimal attack mask image. Under the premise of changing only a small region of the face image, the face image can "deceive" the recognition system, thereby effectively preventing face images uploaded to the cloud from being abused.
  • Embodiments of the present disclosure also provide a device for generating face recognition adversarial samples, including a memory and a processor coupled to the memory, where the processor is configured to execute, based on instructions stored in the memory, the steps of the method for generating face recognition adversarial samples as described in any embodiment of the present disclosure.
  • the device for generating face recognition adversarial samples may include: a processor 710, a memory 720, and a bus system 730.
  • the processor 710 and the memory 720 are connected through the bus system 730, and the memory 720 is used to store instructions.
  • the processor 710 is used to execute the instructions stored in the memory 720 to: preprocess the first image and the second image and generate a plurality of candidate attack mask images, where the first image and the second image both include face regions; use each of the candidate attack mask images and the second image to perform K iterative modification attacks on the first image; calculate the similarity score between the modified image obtained after the K iterative modification attacks and the second image; and generate a final attack mask image based on the similarity scores corresponding to the multiple candidate attack mask images, where the candidate attack mask images and the final attack mask image are both used to specify the modification region when an iterative modification attack is performed on the first image.
  • the processor 710 can be a central processing unit (Central Processing Unit, CPU).
  • the processor 710 can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • Memory 720 may include read-only memory and random access memory and provides instructions and data to processor 710 .
  • a portion of memory 720 may also include non-volatile random access memory.
  • memory 720 may also store device type information.
  • bus system 730 may also include a power bus, a control bus, a status signal bus, etc.
  • the various buses are labeled as bus system 730 in FIG. 7 .
  • the processing performed by the processing device may be completed by instructions in the form of hardware integrated logic circuits or software in the processor 710 . That is to say, the method steps of the embodiments of the present disclosure may be implemented by a hardware processor, or may be executed by a combination of hardware and software modules in the processor.
  • Software modules can be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media.
  • the storage medium is located in the memory 720.
  • the processor 710 reads the information in the memory 720 and completes the steps of the above method in combination with its hardware. To avoid repetition, it will not be described in detail here.
  • An embodiment of the present disclosure also provides a storage medium on which a computer program is stored.
  • the program is executed by a processor, the method for generating face recognition adversarial samples as described in any embodiment of the present disclosure is implemented.
  • various aspects of the method for generating face recognition adversarial samples provided by the present disclosure can also be implemented in the form of a program product, which includes program code; when the program product is run on a computer device, the program code is used to cause the computer device to execute the steps in the method for generating face recognition adversarial samples according to the various exemplary embodiments of the present disclosure described above in this specification.
  • the program product may take the form of any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but is not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Collating Specific Patterns (AREA)
  • Image Processing (AREA)

Abstract

A method and apparatus for generating a facial recognition adversarial sample, and a storage medium. The method comprises: preprocessing a first image and a second image, and generating a plurality of alternative attack mask images, wherein the first image and the second image each include a facial region; performing K iterative modification attacks on the first image by using each alternative attack mask image and the second image, calculating a similarity score between a modified image, which is obtained after the K iterative modification attacks, and the second image, and generating a final attack mask image according to similarity scores corresponding to the plurality of alternative attack mask images, wherein the alternative attack mask images and the final attack mask image are all used for specifying modification regions when the iterative modification attacks are performed on the first image, and the modification regions specified by different alternative attack mask images are different; and performing M iterative modification attacks on the first image by using the final attack mask image and the second image, so as to obtain a facial recognition adversarial sample, wherein M and K are both positive integers greater than 1, and M > K.

Description

人脸识别对抗样本的生成方法及装置、存储介质Face recognition adversarial sample generation method, device and storage medium
本申请要求于2022年8月23日提交中国专利局、申请号为202211013503.3、发明名称为“人脸识别对抗样本的生成方法及装置、存储介质”的中国专利申请的优先权,其内容应理解为通过引用的方式并入本申请中。This application requests the priority of the Chinese patent application submitted to the China Patent Office on August 23, 2022, with the application number 202211013503.3 and the invention title "Method and device for generating adversarial samples for face recognition, and storage medium", and its content should be understood are incorporated by reference into this application.
技术领域Technical field
本公开实施例涉及但不限于人脸识别技术领域,尤其涉及一种人脸识别对抗样本的生成方法及装置、存储介质。Embodiments of the present disclosure relate to but are not limited to the technical field of face recognition, and in particular, to a method and device for generating face recognition adversarial samples, and a storage medium.
Background
Face recognition is a biometric technology that identifies a person based on facial feature information, and is usually implemented by extracting facial features with a deep learning model. However, deep learning models are vulnerable to "adversarial samples": by adding small perturbations that are imperceptible to the human eye, an adversarial sample can cause a model to produce incorrect predictions.
The phenomenon of adversarial samples reveals security vulnerabilities of deep learning models. Studying how to generate high-quality adversarial samples quickly and in large quantities is therefore particularly important for defending against adversarial attacks.
Summary
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the protection scope of the claims.
An embodiment of the present disclosure provides a method for generating a face recognition adversarial sample, including:
preprocessing a first image and a second image, and generating a plurality of candidate attack mask images, wherein the first image and the second image each contain a face region;
performing K iterative modification attacks on the first image by using each of the candidate attack mask images together with the second image, calculating a similarity score between a modified image obtained after the K iterative modification attacks and the second image, and generating a final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images, wherein the candidate attack mask images and the final attack mask image are each used to specify a modification region when an iterative modification attack is performed on the first image, and different candidate attack mask images specify different modification regions;
performing M iterative modification attacks on the first image by using the final attack mask image together with the second image, to obtain a face recognition adversarial sample, wherein M and K are both positive integers greater than 1, and M > K.
An embodiment of the present disclosure further provides an apparatus for generating a face recognition adversarial sample, including a memory and a processor coupled to the memory, wherein the processor is configured to execute, based on instructions stored in the memory, the steps of the method for generating a face recognition adversarial sample according to any embodiment of the present disclosure.
An embodiment of the present disclosure further provides a storage medium storing a computer program which, when executed by a processor, implements the method for generating a face recognition adversarial sample according to any embodiment of the present disclosure.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.
Brief Description of the Drawings
The drawings are provided for an understanding of the technical solution of the present disclosure and constitute a part of the specification. Together with the embodiments of the present disclosure, they serve to explain the technical solution of the present disclosure, and do not limit it.
Figure 1 is a schematic flowchart of a method for generating a face recognition adversarial sample according to an exemplary embodiment of the present disclosure;
Figure 2 is a schematic diagram of a plurality of generated candidate attack mask images according to an exemplary embodiment of the present disclosure;
Figure 3 is a schematic flowchart of one iterative modification attack according to an exemplary embodiment of the present disclosure;
Figure 4 is a schematic diagram of a method for correcting a final attack mask image according to an exemplary embodiment of the present disclosure;
Figure 5 is a schematic flowchart of another method for generating a face recognition adversarial sample according to an exemplary embodiment of the present disclosure;
Figure 6 is a schematic diagram of the generation effect of a face recognition adversarial sample according to an exemplary embodiment of the present disclosure;
Figure 7 is a schematic structural diagram of an apparatus for generating a face recognition adversarial sample according to an exemplary embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions and advantages of the present disclosure clearer, embodiments of the present disclosure are described in detail below with reference to the drawings. The embodiments may be implemented in many different forms. Those of ordinary skill in the art will readily appreciate that the manner and content may be transformed into various forms without departing from the spirit and scope of the present disclosure. Therefore, the present disclosure should not be construed as being limited to the content described in the following embodiments. Provided there is no conflict, the embodiments of the present disclosure and the features therein may be combined with one another arbitrarily.
The scale of the drawings in the present disclosure may be used as a reference in an actual process, but is not limited thereto. For example, the width-to-length ratio of a channel, the thickness and spacing of film layers, and the width and spacing of signal lines may be adjusted according to actual needs. The number of pixels in a display panel and the number of sub-pixels in each pixel are also not limited to the numbers shown in the figures. The drawings described in the present disclosure are only schematic structural diagrams, and the modes of the present disclosure are not limited to the shapes, numerical values, or the like shown in the drawings.
Ordinal numerals such as "first", "second" and "third" in this specification are used to avoid confusion between constituent elements, and are not intended to limit quantity.
As shown in Figure 1, an embodiment of the present disclosure provides a method for generating a face recognition adversarial sample, including the following steps.
Step 101: preprocess a first image and a second image, and generate a plurality of candidate attack mask images, where the first image and the second image each contain a face region.
Step 102: perform K iterative modification attacks on the first image by using each candidate attack mask image together with the second image, calculate the similarity score between the modified image obtained after the K iterative modification attacks and the second image, and generate a final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images, where the candidate attack mask images and the final attack mask image are each used to specify the modification region when an iterative modification attack is performed on the first image, and different candidate attack mask images specify different modification regions.
Step 103: perform M iterative modification attacks on the first image by using the final attack mask image together with the second image, to obtain a face recognition adversarial sample, where M and K are both positive integers greater than 1, and M > K.
In the method for generating a face recognition adversarial sample provided by the embodiments of the present disclosure, a plurality of candidate attack mask images are generated; each candidate attack mask image is used together with the second image to perform K iterative modification attacks on the first image, and the similarity score between the modified image obtained after the K iterative modification attacks and the second image is calculated; a final attack mask image is generated according to the similarity scores corresponding to the plurality of candidate attack mask images; and the final attack mask image is used together with the second image to perform M iterative modification attacks on the first image, where M > K > 1, to obtain a face recognition adversarial sample. This reduces the modified area of the face image and the visual difference before and after modification, so that the face image is modified as little as possible while still being able to "fool" a face recognition system. On the one hand this can prevent face abuse, and on the other hand it can help improve the robustness of deep neural network models.
In some exemplary implementations, preprocessing the first image and the second image in step 101 includes:
using a face detection model to extract the face regions in the first image and the second image respectively;
using a facial key point detection model to extract the facial key points in the first image and the second image respectively;
performing an affine transformation based on the extracted facial key points of the first image and the second image, to obtain an aligned first image and an aligned second image, both of which are face images; and
adjusting the aligned first image and second image to a preset size.
In the embodiments of the present disclosure, preprocessing an image includes the following operations: accurately locating the position and size of the face in the image by using a face detection model; obtaining the coordinates of the facial key points by using a facial key point detection model; adjusting the angle of the face according to the key point coordinates so that the face is aligned; and adjusting the aligned face image to a preset size. Illustratively, the preset size may be 112 pixels × 112 pixels.
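The alignment operation above is commonly implemented by estimating a similarity transform from the detected key points to a fixed template and then warping the image with it. The sketch below shows only the transform estimation using NumPy; the 5-point template coordinates, the test values, and the least-squares (Umeyama-style) estimator are illustrative assumptions, not details taken from this disclosure.

```python
import numpy as np

def estimate_similarity_transform(src, dst):
    """Least-squares similarity transform (scale + rotation + translation)
    mapping src landmark points onto dst landmark points."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - src_mean, dst - dst_mean
    cov = dst_c.T @ src_c / len(src)            # cross-covariance, dst vs. src
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U @ Vt))          # guard against reflections
    D = np.diag([1.0, d])
    R = U @ D @ Vt                              # optimal rotation
    scale = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = dst_mean - scale * R @ src_mean
    M = np.zeros((2, 3))
    M[:, :2] = scale * R
    M[:, 2] = t
    return M                                    # 2x3 affine matrix

# Hypothetical 5-point template for a 112x112 aligned face (illustrative values):
# left eye, right eye, nose tip, left and right mouth corners.
TEMPLATE_112 = np.array([[38.3, 51.7], [73.5, 51.5],
                         [56.0, 71.7], [41.5, 92.4], [70.7, 92.2]])

detected = TEMPLATE_112 * 1.8 + np.array([40.0, 25.0])  # a larger, shifted face
M = estimate_similarity_transform(detected, TEMPLATE_112)
aligned = (M[:, :2] @ detected.T).T + M[:, 2]
print(np.abs(aligned - TEMPLATE_112).max())     # near zero for an exact similarity
```

The returned 2×3 matrix has the row layout expected by common image-warping routines (for example, OpenCV's warpAffine), so the same estimate can be used to rectify the cropped face region to the preset size.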
In some exemplary implementations, generating a plurality of candidate attack mask images in step 101 includes:
setting start position coordinates, a window size, step sizes and end position coordinates according to the preprocessed image size, where the start position coordinates include a horizontal start position coordinate along a first direction and a vertical start position coordinate along a second direction, the end position coordinates include a horizontal end position coordinate along the first direction and a vertical end position coordinate along the second direction, the step sizes include a horizontal step size along the first direction and a vertical step size along the second direction, and the first direction intersects the second direction; and
generating a plurality of candidate attack mask images according to the set start position coordinates, window size, step sizes and end position coordinates, where the plurality of candidate attack mask images include a*b images, a = (horizontal end position coordinate - horizontal start position coordinate)/horizontal step size, b = (vertical end position coordinate - vertical start position coordinate)/vertical step size, and the size of the modification region specified by each candidate attack mask image is equal to the window size.
In some exemplary implementations, the first direction and the second direction may be perpendicular to each other.
In some exemplary implementations, the window may be rectangular, and the window size may include a window width along the first direction and a window height along the second direction.
In some exemplary implementations, the start position coordinates are [startx, starty], where the horizontal start position coordinate is startx and the vertical start position coordinate is starty; the window size is [winw, winh], where the window width is winw and the window height is winh; the horizontal step size is stepx and the vertical step size is stepy; and the end position coordinates are [endx, endy], where the horizontal end position coordinate is endx and the vertical end position coordinate is endy. Then a = (endx - startx)/stepx and b = (endy - starty)/stepy. The modification region specified by the first candidate attack mask image is the rectangular region from [startx, starty] to [startx+winw, starty+winh]; the modification region specified by the second candidate attack mask image is the rectangular region from [startx+stepx, starty] to [startx+stepx+winw, starty+winh]; ...; the modification region specified by the a-th candidate attack mask image is the rectangular region from [endx, starty] to [endx+winw, starty+winh]; the modification region specified by the (a+1)-th candidate attack mask image is the rectangular region from [startx, starty+stepy] to [startx+winw, starty+stepy+winh]; ...; and the modification region specified by the (a*b)-th candidate attack mask image is the rectangular region from [endx, endy] to [endx+winw, endy+winh]. The modification region specified by each candidate attack mask image has the same size, winw*winh, but different candidate attack mask images specify different modification regions.
Illustratively, taking startx=20, starty=20, winw=13, winh=10, stepx=8, stepy=5, endx=92 and endy=95, the generated candidate attack mask images are shown in Figure 2. In each candidate attack mask image, the coordinates of the upper-left position are the start position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the upper-left vertex of the black rectangular block), and the coordinates of the lower-right position are the end position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the lower-right vertex of the black rectangular block). The modification region specified by each candidate attack mask image is 13 pixels × 10 pixels. Each row includes (92-20)/8 = 9 candidate attack mask images and each column includes (95-20)/5 = 15 candidate attack mask images, so 9 × 15 = 135 candidate attack mask images are generated in total.
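The sliding-window enumeration described above can be sketched as follows. Representing each candidate attack mask image as a binary array whose 1-valued region marks the modification region is an implementation assumption; only the window arithmetic comes from the text.

```python
import numpy as np

def candidate_masks(startx, starty, winw, winh, stepx, stepy, endx, endy, size=112):
    """Enumerate sliding-window candidate attack mask images.
    Each mask is 1 inside its winw x winh modification region, 0 elsewhere."""
    masks = []
    for y in range(starty, endy, stepy):      # b = (endy - starty) / stepy rows
        for x in range(startx, endx, stepx):  # a = (endx - startx) / stepx per row
            m = np.zeros((size, size), dtype=np.uint8)
            m[y:y + winh, x:x + winw] = 1     # rows indexed by y, columns by x
            masks.append(m)
    return masks

# Parameters from the example above
masks = candidate_masks(20, 20, 13, 10, 8, 5, 92, 95)
print(len(masks))            # 9 columns x 15 rows = 135 candidates
print(int(masks[0].sum()))   # 13 * 10 = 130 pixels per modification region
```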
In some exemplary implementations, the first image and the second image may be different images belonging to the same identity. In this case, the goal of the face recognition adversarial attack is to modify the first image so that the similarity score between its face feature vector and that of another face image of the same identity (i.e., the second image), both extracted by a face recognition model, is as low as possible.
In some exemplary implementations, when the first image and the second image are different images belonging to the same identity, as shown in Figure 3, one iterative modification attack on the first image in step 102 or step 103 includes the following steps:
inputting the first image (or the modified first image) and the second image into a face recognition model respectively, to obtain a first face feature vector corresponding to the first image and a second face feature vector corresponding to the second image; calculating the value of a loss function between the first face feature vector and the second face feature vector and performing backpropagation; and modifying the first image so that the value of the loss function between the first face feature vector and the second face feature vector increases, where the modification region is the one specified by the candidate attack mask image or the final attack mask image, and the loss function of the face recognition model is -1 * the loss function between the first face feature vector and the second face feature vector.
In the embodiments of the present disclosure, when K iterative modification attacks are performed on the first image in step 102, the modification region is the one specified by the candidate attack mask image used in the current iterative modification attack; when M iterative modification attacks are performed on the first image in step 103, the modification region is the one specified by the final attack mask image. In the first modification attack of the K or M iterative modification attacks, the first image and the second image are input into the face recognition model respectively, to obtain the first face feature vector corresponding to the first image and the second face feature vector corresponding to the second image. In each subsequent modification attack, the first image as modified by the previous modification attack and the second image are input into the face recognition model respectively, to obtain the first face feature vector corresponding to the first image and the second face feature vector corresponding to the second image.
In the embodiments of the present disclosure, there may be one or more face recognition models. When there are multiple face recognition models and the first image and the second image are different images belonging to the same identity, one iterative modification attack on the first image in step 102 or step 103 includes the following steps: inputting the first image (or the modified first image) and the second image into each face recognition model respectively, to obtain the first face feature vector and the second face feature vector corresponding to each face recognition model; calculating the value of the loss function between the first face feature vector and the second face feature vector corresponding to each face recognition model; computing a weighted average of the loss function values corresponding to the multiple face recognition models and performing backpropagation; and modifying the first image so that the value of the weighted-average loss function increases, where the modification region is the one specified by the candidate attack mask image or the final attack mask image.
In some exemplary implementations, the loss function between the first face feature vector and the second face feature vector may be calculated using the mean squared error (MSE) loss.
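A minimal sketch of one masked modification attack for the same-identity case is given below. A random linear map stands in for the face recognition model so that the MSE gradient can be written in closed form; in practice the gradient would come from backpropagation through the real model. The image sizes, step rule (a signed gradient step), and clipping range are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 112 * 112)) * 0.01   # toy linear "face recognition model"

def embed(img):
    return W @ img.ravel()

def mse(a, b):
    return float(((a - b) ** 2).mean())

def masked_ascent_step(x1, x2, mask, lr=0.01):
    """One iterative modification attack (same-identity case): raise
    MSE(embed(x1), embed(x2)), editing x1 only where mask == 1."""
    diff = embed(x1) - embed(x2)                          # dMSE/d(embedding) = 2*diff/n
    grad = (W.T @ (2.0 * diff / diff.size)).reshape(x1.shape)
    return np.clip(x1 + lr * np.sign(grad) * mask, 0.0, 1.0)

x1 = rng.random((112, 112))                               # "first image"
x2 = rng.random((112, 112))                               # "second image", same identity
mask = np.zeros_like(x1)
mask[20:30, 20:33] = 1.0                                  # one 13x10 candidate region

before = mse(embed(x1), embed(x2))
x1_adv = masked_ascent_step(x1, x2, mask)
after = mse(embed(x1_adv), embed(x2))
print(after > before)                                     # the loss rises
print(float(np.abs((x1_adv - x1) * (1 - mask)).max()))    # 0.0: untouched outside mask
```

For the different-identity case described later, the same step is used with the sign of the update reversed, so that the loss between the two feature vectors decreases instead.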
In some exemplary implementations, when the first image and the second image are different images belonging to the same identity, generating the final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images in step 102 includes:
sorting the similarity scores corresponding to the plurality of candidate attack mask images;
selecting the N top-ranked candidate attack mask images, i.e., the N candidate attack mask images with the lowest similarity scores; and
combining the modification regions specified by the N top-ranked candidate attack mask images to obtain the modification region specified by the final attack mask image.
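The selection-and-merge step can be sketched as below. The similarity scores are made up for illustration, and taking the pixel-wise union of the selected regions is an assumed way of "combining" them; the disclosure does not fix the combination operation.

```python
import numpy as np

def merge_top_candidates(masks, scores, n=5, lower_is_better=True):
    """Pick the n candidate masks whose K-step attack worked best and take
    the union of their modification regions as the final attack mask.
    For the same-identity attack lower similarity is better; pass
    lower_is_better=False for the different-identity attack."""
    order = np.argsort(scores)
    picked = order[:n] if lower_is_better else order[-n:]
    final = np.zeros_like(masks[0])
    for i in picked:
        final = np.maximum(final, masks[i])     # union of modification regions
    return final, picked

# Toy example: 4 one-pixel masks with made-up similarity scores
masks = [np.zeros((4, 4), dtype=np.uint8) for _ in range(4)]
for i, m in enumerate(masks):
    m[i, i] = 1
scores = [0.9, 0.2, 0.5, 0.1]
final, picked = merge_top_candidates(masks, scores, n=2)
print(sorted(picked.tolist()))   # [1, 3]: the two lowest-similarity candidates
print(int(final.sum()))          # 2 pixels in the merged region
```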
In other exemplary implementations, the first image and the second image may be different images belonging to different identities. In this case, the goal of the face recognition adversarial attack is to modify the first image so that the similarity score between its face feature vector and that of a face image of a different identity (i.e., the second image), both extracted by a face recognition model, is as high as possible.
In some exemplary implementations, when the first image and the second image are different images belonging to different identities, as shown in Figure 3, one iterative modification attack on the first image in step 102 or step 103 includes:
inputting the first image (or the modified first image) and the second image into a face recognition model respectively, to obtain a first face feature vector corresponding to the first image and a second face feature vector corresponding to the second image; calculating the value of a loss function between the first face feature vector and the second face feature vector and performing backpropagation; and modifying the first image so that the value of the loss function between the first face feature vector and the second face feature vector decreases, where the modification region is the one specified by the candidate attack mask image or the final attack mask image, and the loss function of the face recognition model is the loss function between the first face feature vector and the second face feature vector.
In the embodiments of the present disclosure, there may be one or more face recognition models. When there are multiple face recognition models and the first image and the second image are different images belonging to different identities, one iterative modification attack on the first image in step 102 or step 103 includes: inputting the first image (or the modified first image) and the second image into each face recognition model respectively, to obtain the first face feature vector and the second face feature vector corresponding to each face recognition model; calculating the value of the loss function between the first face feature vector and the second face feature vector corresponding to each face recognition model; computing a weighted average of the loss function values corresponding to the multiple face recognition models and performing backpropagation; and modifying the first image so that the value of the weighted-average loss function decreases, where the modification region is the one specified by the candidate attack mask image or the final attack mask image.
In some exemplary implementations, when the first image and the second image are different images belonging to different identities, generating the final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images in step 102 includes:
sorting the similarity scores corresponding to the plurality of candidate attack mask images;
selecting the N bottom-ranked candidate attack mask images, i.e., the N candidate attack mask images with the highest similarity scores; and
combining the modification regions specified by the N bottom-ranked candidate attack mask images to obtain the modification region specified by the final attack mask image.
In some exemplary implementations, N may be between 3 and 8. Illustratively, N may be 5 or 6.
In some exemplary implementations, as shown in Figure 4, the method may further include:
detecting whether the modification region specified by the final attack mask image includes one or more staircase-shaped regions; and
when one or more staircase-shaped regions are included, dividing each staircase-shaped region into a plurality of rectangular regions.
In the method for generating a face recognition adversarial sample provided by the embodiments of the present disclosure, correcting the modification region specified by the final attack mask image (i.e., dividing it into a plurality of rectangular regions) further reduces the modified area of the original image and the visual difference before and after modification.
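One simple way to divide a staircase-shaped region into rectangles is to group consecutive identical rows of the binary mask, emitting one rectangle per contiguous run of 1s whenever the row pattern changes. This row-run decomposition is an illustrative choice; the disclosure does not fix a particular splitting algorithm.

```python
import numpy as np

def split_into_rectangles(mask):
    """Split a binary mask into axis-aligned rectangles by grouping
    consecutive identical rows; a staircase region yields one rectangle
    per 'step'. Returns (top, left, bottom, right), bottom/right exclusive."""
    rects = []
    prev_row, top = None, 0
    h = mask.shape[0]
    for y in range(h + 1):
        row = tuple(mask[y]) if y < h else None
        if row != prev_row:
            if prev_row is not None and any(prev_row):
                x = 0
                while x < len(prev_row):          # one rect per run of 1s
                    if prev_row[x]:
                        x0 = x
                        while x < len(prev_row) and prev_row[x]:
                            x += 1
                        rects.append((top, x0, y, x))
                    else:
                        x += 1
            prev_row, top = row, y
    return rects

# A two-step staircase: a 2x4 block with a 2x2 block attached below it
m = np.zeros((4, 6), dtype=np.uint8)
m[0:2, 0:4] = 1
m[2:4, 0:2] = 1
print(split_into_rectangles(m))   # two rectangles, one per step
```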
In some exemplary implementations, M is between 120 and 180 and K is between 20 and 40. Illustratively, M is 150 and K is 30.
In some exemplary implementations, as shown in Figure 5, the method for generating a face recognition adversarial sample provided by the embodiments of the present disclosure includes the following steps.
1) For a given face image, denoted P1, assume its identity is I1. A face detection model is used to crop the face region out of P1; after a facial key point detection model is used to extract the key points, an affine transformation is used to rectify the cropped face region into a standard image of a preset size (illustratively, 112 pixels × 112 pixels).
In general, attacks on a face recognition model fall into two types. The first is to modify P1 so that the similarity score between the feature vector the model extracts from it and the feature vectors extracted from other face images of the same identity is as low as possible; the second is to modify P1 so that the similarity score between its feature vector and those extracted from face images of different identities is as high as possible.
Taking the first attack type as an example, another face image, denoted P2, is given, and its identity is assumed to be I1, the same as P1. A face detection model is used to crop the face region out of P2; after a facial key point detection model is used to extract the key points, an affine transformation is used to rectify the cropped face region into a standard image of the preset size (illustratively, 112 pixels × 112 pixels).
Taking the second attack type as an example, another face image, denoted P3, is given, and its identity is assumed to be I2, different from P1. A face detection model is used to crop the face region out of P3; after a facial key point detection model is used to extract the key points, an affine transformation is used to rectify the cropped face region into a standard image of the preset size (illustratively, 112 pixels × 112 pixels).
2) Starting from the position [startx, starty], a retrieval block is slid with a window size of [winw, winh] and step sizes of [stepx, stepy] until the end sliding position [endx, endy], to generate a plurality of candidate attack mask images.
Illustratively, taking startx=20, starty=20, winw=13, winh=10, stepx=8, stepy=5, endx=92 and endy=95, the generated candidate attack mask images are shown in Figure 2. In each candidate attack mask image, the coordinates of the upper-left position are the start position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the upper-left vertex of the black rectangular block), and the coordinates of the lower-right position are the end position coordinates of the region specified by that candidate attack mask image (i.e., the coordinates of the lower-right vertex of the black rectangular block). Each row includes (92-20)/8 = 9 candidate attack mask images and each column includes (95-20)/5 = 15 candidate attack mask images, so 9 × 15 = 135 candidate attack mask images are generated in total.
3) Using each candidate attack mask image, K iterative modification attacks are performed on image P1; based on the results obtained after the K iterative modification attacks, the regions of P1 that are easier to attack successfully are determined.
Taking the first attack type as an example, as shown in Figure 3, the first image is P1 and the second image is P2. Both images are fed into the face recognition model to extract face feature vectors, and the MSE loss between the extracted feature vectors is computed. With the model parameters frozen, P1 is modified through backpropagation so that the MSE loss keeps rising, where the region of P1 being modified is the region specified by the candidate attack mask image. Gradient descent is used for iteration with the loss function -1*MSE Loss, so that performing gradient descent makes the MSE loss rise; one gradient descent step completes one iteration. For each candidate attack mask image produced in step 2), this procedure is run for K iterations (for example, K=30). The similarity between the final modified image corresponding to each candidate attack mask image and image P2 is computed, and the N candidate attack mask images with the lowest similarity scores are selected (for example, N may be 5 or 6); the regions corresponding to these N candidates are the top N regions that are easier to attack successfully.
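The iterative modification attack, for either attack type, can be sketched with a toy stand-in for the face recognition model: a fixed linear map whose MSE-loss gradient is computed analytically instead of by backpropagation. All dimensions and names here are illustrative, not the disclosure's:

```python
import numpy as np

rng = np.random.default_rng(0)
D, F = 16, 4                       # toy flattened-image and feature dimensions
W = rng.normal(size=(F, D))        # frozen "face recognition model"
extract = lambda x: W @ x          # face feature vector

def iterative_attack(x1, feat2, mask, k=30, lr=0.01, dissimilar=True):
    """K gradient steps on the masked pixels of x1.
    dissimilar=True  -> loss = -1 * MSE, descent pushes the MSE UP   (attack type 1)
    dissimilar=False -> loss =      MSE, descent pushes the MSE DOWN (attack type 2)"""
    x = x1.copy()
    sign = -1.0 if dissimilar else 1.0
    for _ in range(k):
        residual = extract(x) - feat2
        grad = sign * (2.0 / F) * (W.T @ residual)   # analytic d(loss)/dx
        x -= lr * grad * mask                        # modify only the mask region
    return x

mse = lambda a, b: float(np.mean((a - b) ** 2))
x1, x2 = rng.normal(size=D), rng.normal(size=D)
mask = np.zeros(D)
mask[:6] = 1                                         # attack only six "pixels"
adv = iterative_attack(x1, extract(x2), mask, dissimilar=True)
print(mse(extract(adv), extract(x2)) > mse(extract(x1), extract(x2)))  # True
```

The real procedure runs a loop like this once per candidate mask (K=30 in the example), scores each result against P2 or P3, and later reuses the same loop with the final mask for M iterations.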
Taking the second attack type as an example, as shown in Figure 3, the first image is P1 and the second image is P3. Both images are fed into the face recognition model to extract face feature vectors, and the MSE loss between the extracted feature vectors is computed. With the model parameters frozen, P1 is modified through backpropagation so that the MSE loss keeps falling, where the region of P1 being modified is the region specified by the candidate attack mask image. Gradient descent is used for iteration with the loss function MSE Loss, so that performing gradient descent makes the MSE loss fall; one gradient descent step completes one iteration. For each candidate attack mask image produced in step 2), this procedure is run for K iterations (for example, K=30). The similarity between the final modified image corresponding to each candidate attack mask image and image P3 is computed, and the N candidate attack mask images with the highest similarity scores are selected (for example, N may be 5 or 6); the regions corresponding to these N candidates are the top N regions that are easier to attack successfully.
After the top N easier-to-attack regions are obtained, they are merged to generate one final attack mask image. Because the sliding windows used to generate the candidate attack mask images overlap, the directly merged final attack mask image may have a stepped shape, which degrades the visual quality of the final attack image. To further reduce the area of the modified region, the disclosed embodiment scans the generated mask pixel by pixel along the horizontal direction; wherever the vertical mask height differs between two horizontally adjacent pixels, the mask region is cut at that position, yielding the final attack mask image, i.e., the corrected attack mask image shown in Figure 4.
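One way to read this cutting step (and claim 8's division of each stepped region into rectangles) is: scan the merged mask column by column and cut wherever the vertical extent changes between adjacent pixel columns. A sketch under that assumption, which additionally assumes each column of the mask is one contiguous run:

```python
import numpy as np

def split_into_rectangles(mask):
    """Cut a (possibly stepped) binary mask at every horizontal position
    where the vertical extent changes between adjacent pixel columns,
    yielding a list of (top, left, bottom, right) rectangles."""
    cols = [tuple(np.flatnonzero(mask[:, x])) for x in range(mask.shape[1])]
    rects, x0 = [], None
    for x, col in enumerate(cols + [()]):          # sentinel flushes the last run
        if x0 is not None and col != cols[x0]:
            ys = cols[x0]
            rects.append((int(ys[0]), x0, int(ys[-1]), x - 1))
            x0 = None
        if x0 is None and col:
            x0 = x
    return rects

# two overlapping windows form a step; the cut yields axis-aligned rectangles
m = np.zeros((8, 8), dtype=np.uint8)
m[1:4, 1:5] = 1
m[2:6, 3:7] = 1
rects = split_into_rectangles(m)
print(rects)  # [(1, 1, 3, 2), (1, 3, 5, 4), (2, 5, 5, 6)]
```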
4) After the final attack mask image is obtained, an iterative modification attack procedure similar to that described in step 3) is applied to the original face image P1 for M iterations (for example, M=150), generating the final face attack image (i.e., the face recognition adversarial sample).
The disclosed embodiment modifies the original face image with the fast gradient sign method (FGSM). In a white-box setting, FGSM computes the derivative of the model with respect to the input data, takes the sign of the gradient to obtain its direction, and multiplies it by the step size to obtain the perturbation. Adding this perturbation to the original input yields the face recognition adversarial sample under the FGSM attack; with high probability this adversarial sample causes the model to classify incorrectly, achieving the goal of the attack.
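A single FGSM-style update, restricted to the mask region, might look as follows. This is a generic sketch: the gradient would come from backpropagating the loss through the frozen recognition model, which is not shown, and all values below are toy inputs:

```python
import numpy as np

def fgsm_step(x, grad, eps=0.01, mask=None):
    """Perturb x by eps along the sign of the loss gradient; if a binary
    attack mask is given, only the masked pixels are modified."""
    perturb = eps * np.sign(grad)
    if mask is not None:
        perturb = perturb * mask
    return np.clip(x + perturb, 0.0, 1.0)   # stay in the valid pixel range

x = np.full(4, 0.5)                          # four toy pixels
grad = np.array([1.0, -2.0, 0.0, 3.0])       # pretend backprop output
mask = np.array([1.0, 1.0, 0.0, 1.0])
print(fgsm_step(x, grad, mask=mask))         # [0.51 0.49 0.5  0.51]
```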
Taking the second attack type as an example, the final attack effect is shown in Figure 6: the left image is the original image, the middle image is the modified image (i.e., the generated face recognition adversarial sample), and the right image is the confusion image with a different identity. The visual difference between the modified image and the original image is not obvious, and the similarity score between the modified image and the different-identity confusion image in the face recognition model can exceed 0.3.
Embodiments of the present disclosure provide a method for generating face recognition adversarial samples. A sliding window is used to perform block-wise retrieval over the whole face image to find the regions that are easier to attack successfully, from which an optimal attack mask image is produced; the final attack image (i.e., the face recognition adversarial sample) is then generated from the optimal attack mask image. With only a small area modified, the face image can "deceive" the recognition system, effectively preventing face images uploaded to the cloud from being misused.
Embodiments of the present disclosure also provide an apparatus for generating face recognition adversarial samples, including a memory and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the steps of the method for generating face recognition adversarial samples according to any embodiment of the present disclosure.
As shown in Figure 7, in one example, the apparatus for generating face recognition adversarial samples may include a processor 710, a memory 720, and a bus system 730, where the processor 710 and the memory 720 are connected through the bus system 730, the memory 720 is configured to store instructions, and the processor 710 is configured to execute the instructions stored in the memory 720 so as to: preprocess a first image and a second image and generate multiple candidate attack mask images, where both the first image and the second image contain a face region; using each candidate attack mask image together with the second image, perform K iterative modification attacks on the first image, compute the similarity score between the modified image obtained after the K iterative modification attacks and the second image, and generate a final attack mask image according to the similarity scores corresponding to the multiple candidate attack mask images, where both the candidate attack mask images and the final attack mask image are used to specify the region modified during an iterative modification attack on the first image, and different candidate attack mask images specify different modification regions; and, using the final attack mask image together with the second image, perform M iterative modification attacks on the first image to obtain a face recognition adversarial sample, where M and K are both positive integers greater than 1 and M>K.
It should be understood that the processor 710 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 720 may include read-only memory and random access memory, and provides instructions and data to the processor 710. A portion of the memory 720 may also include non-volatile random access memory. For example, the memory 720 may also store device type information.
In addition to a data bus, the bus system 730 may also include a power bus, a control bus, a status signal bus, and the like. For clarity, however, the various buses are all labeled as the bus system 730 in Figure 7.
During implementation, the processing performed by the processing device may be completed by integrated logic circuits of hardware in the processor 710 or by instructions in the form of software. That is, the method steps of the embodiments of the present disclosure may be carried out by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may be located in a storage medium such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 720; the processor 710 reads the information in the memory 720 and completes the steps of the above method in combination with its hardware. To avoid repetition, this is not described in detail here.
Embodiments of the present disclosure also provide a storage medium on which a computer program is stored; when the program is executed by a processor, the method for generating face recognition adversarial samples according to any embodiment of the present disclosure is implemented.
In some possible implementations, various aspects of the method for generating face recognition adversarial samples provided by the present disclosure may also be implemented in the form of a program product including program code; when the program product runs on a computer device, the program code causes the computer device to execute the steps of the method for generating face recognition adversarial samples according to the various exemplary implementations of the present disclosure described above. For example, the computer device may execute the method for generating face recognition adversarial samples described in the embodiments of the present disclosure.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more conductors, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The drawings of the present disclosure relate only to the structures involved in the present disclosure; other structures may follow conventional designs. Where there is no conflict, the embodiments of the present disclosure and the features in the embodiments may be combined with one another to obtain new embodiments.
Those of ordinary skill in the art should understand that modifications or equivalent substitutions may be made to the technical solutions of the present disclosure without departing from the spirit and scope of the technical solutions of the present disclosure, and such modifications and substitutions shall all fall within the scope of the claims of the present disclosure.

Claims (13)

  1. A method for generating a face recognition adversarial sample, comprising:
    preprocessing a first image and a second image, and generating a plurality of candidate attack mask images, wherein the first image and the second image both contain a face region;
    using each of the candidate attack mask images together with the second image, performing K iterative modification attacks on the first image, computing a similarity score between the modified image obtained after the K iterative modification attacks and the second image, and generating a final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images, wherein the candidate attack mask images and the final attack mask image are both used to specify a modification region when an iterative modification attack is performed on the first image, and different candidate attack mask images specify different modification regions; and
    using the final attack mask image together with the second image, performing M iterative modification attacks on the first image to obtain a face recognition adversarial sample, wherein M and K are both positive integers greater than 1, and M>K.
  2. The generation method according to claim 1, wherein the first image and the second image are different images belonging to the same identity.
  3. The generation method according to claim 2, wherein performing one iterative modification attack on the first image comprises:
    inputting the first image, or the modified first image, and the second image respectively into a face recognition model to obtain a first face feature vector corresponding to the first image and a second face feature vector corresponding to the second image; computing the value of a loss function between the first face feature vector and the second face feature vector and performing backpropagation; and modifying the first image so that the value of the loss function between the first face feature vector and the second face feature vector rises, wherein the modification region is the modification region specified by the candidate attack mask image or the final attack mask image, and the loss function of the face recognition model is -1 times the loss function between the first face feature vector and the second face feature vector.
  4. The generation method according to claim 2, wherein generating the final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images comprises:
    sorting the similarity scores corresponding to the plurality of candidate attack mask images;
    selecting the N candidate attack mask images ranked first in the similarity score ordering; and
    combining the modification regions specified by the N candidate attack mask images ranked first to obtain the modification region specified by the final attack mask image.
  5. The generation method according to claim 1, wherein the first image and the second image are different images belonging to different identities.
  6. The generation method according to claim 5, wherein performing one iterative modification attack on the first image comprises:
    inputting the first image, or the modified first image, and the second image respectively into a face recognition model to obtain a first face feature vector corresponding to the first image and a second face feature vector corresponding to the second image; computing the value of a loss function between the first face feature vector and the second face feature vector and performing backpropagation; and modifying the first image so that the value of the loss function between the first face feature vector and the second face feature vector falls, wherein the modification region is the modification region specified by the candidate attack mask image or the final attack mask image, and the loss function of the face recognition model is the loss function between the first face feature vector and the second face feature vector.
  7. The generation method according to claim 5, wherein generating the final attack mask image according to the similarity scores corresponding to the plurality of candidate attack mask images comprises:
    sorting the similarity scores corresponding to the plurality of candidate attack mask images;
    selecting the N candidate attack mask images ranked last in the similarity score ordering; and
    combining the modification regions specified by the N candidate attack mask images ranked last to obtain the modification region specified by the final attack mask image.
  8. The generation method according to claim 1, further comprising:
    detecting whether the modification region specified by the final attack mask image includes one or more stepped regions; and
    when one or more stepped regions are included, dividing each stepped region into a plurality of rectangular regions.
  9. The generation method according to claim 1, wherein generating the plurality of candidate attack mask images comprises:
    setting, according to the preprocessed image size, start position coordinates, a window size, step sizes, and end position coordinates, wherein the start position coordinates include a horizontal start position coordinate along a first direction and a vertical start position coordinate along a second direction, the end position coordinates include a horizontal end position coordinate along the first direction and a vertical end position coordinate along the second direction, the step sizes include a horizontal step size along the first direction and a vertical step size along the second direction, and the first direction intersects the second direction; and
    generating the plurality of candidate attack mask images according to the set start position coordinates, window size, horizontal step size, vertical step size, and end position coordinates, wherein the plurality of candidate attack mask images include a*b images, a=(horizontal end position coordinate - horizontal start position coordinate)/horizontal step size, b=(vertical end position coordinate - vertical start position coordinate)/vertical step size, and the size of the modification region specified by each candidate attack mask image equals the window size.
  10. The generation method according to claim 1, wherein preprocessing the first image and the second image comprises:
    using a face detection model to extract the face regions in the first image and the second image respectively;
    using a facial keypoint detection model to extract the facial keypoints in the first image and the second image respectively;
    performing an affine transformation on the extracted facial keypoints in the first image and the second image to obtain an aligned first image and an aligned second image, both of which are face images; and
    adjusting the aligned first image and second image to a preset size.
  11. The generation method according to claim 1, wherein M is between 120 and 180, and K is between 20 and 40.
  12. An apparatus for generating a face recognition adversarial sample, comprising a memory and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the steps of the method for generating a face recognition adversarial sample according to any one of claims 1 to 11.
  13. A storage medium on which a computer program is stored, wherein when the program is executed by a processor, the method for generating a face recognition adversarial sample according to any one of claims 1 to 11 is implemented.
PCT/CN2023/111034 2022-08-23 2023-08-03 Method and apparatus for generating facial recognition adversarial sample, and storage medium WO2024041346A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211013503.3 2022-08-23
CN202211013503.3A CN115995104A (en) 2022-08-23 2022-08-23 Face recognition countermeasure sample generation method and device and storage medium

Publications (1)

Publication Number Publication Date
WO2024041346A1 true WO2024041346A1 (en) 2024-02-29

Family

ID=85990948

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/111034 WO2024041346A1 (en) 2022-08-23 2023-08-03 Method and apparatus for generating facial recognition adversarial sample, and storage medium

Country Status (2)

Country Link
CN (1) CN115995104A (en)
WO (1) WO2024041346A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118135640A (en) * 2024-05-06 2024-06-04 南京信息工程大学 Method for defending face image attack based on recessive noise

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115995104A (en) * 2022-08-23 2023-04-21 京东方科技集团股份有限公司 Face recognition countermeasure sample generation method and device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150279043A1 (en) * 2014-03-28 2015-10-01 Sony Corporation Imaging system with depth estimation mechanism and method of operation thereof
CN111444516A (en) * 2020-03-23 2020-07-24 华南理工大学 Sensitivity-based deep reinforcement learning intelligent agent attack method
CN111626925A (en) * 2020-07-24 2020-09-04 支付宝(杭州)信息技术有限公司 Method and device for generating counterwork patch
CN112287973A (en) * 2020-09-28 2021-01-29 北京航空航天大学 Digital image countermeasure sample defense method based on truncated singular value and pixel interpolation
CN114297730A (en) * 2021-12-31 2022-04-08 北京瑞莱智慧科技有限公司 Countermeasure image generation method, device and storage medium
CN115995104A (en) * 2022-08-23 2023-04-21 京东方科技集团股份有限公司 Face recognition countermeasure sample generation method and device and storage medium


Also Published As

Publication number Publication date
CN115995104A (en) 2023-04-21

Similar Documents

Publication Publication Date Title
WO2024041346A1 (en) Method and apparatus for generating facial recognition adversarial sample, and storage medium
US11062123B2 (en) Method, terminal, and storage medium for tracking facial critical area
US10755120B2 (en) End-to-end lightweight method and apparatus for license plate recognition
CN112488064B (en) Face tracking method, system, terminal and storage medium
WO2021026805A1 (en) Adversarial example detection method and apparatus, computing device, and computer storage medium
US20230177695A1 (en) Instance segmentation method and system for enhanced image, and device and medium
KR102435365B1 (en) Certificate recognition method and apparatus, electronic device, computer readable storage medium
CN112668483B (en) Single-target person tracking method integrating pedestrian re-identification and face detection
WO2020238374A1 (en) Method, apparatus, and device for facial key point detection, and storage medium
CN111401521B (en) Neural network model training method and device, and image recognition method and device
JP2019117577A (en) Program, learning processing method, learning model, data structure, learning device and object recognition device
US20200134385A1 (en) Deep learning model used for image recognition and training apparatus of the model and method thereof
CN111461113B (en) Large-angle license plate detection method based on deformed plane object detection network
Zhu et al. Improving robustness of facial landmark detection by defending against adversarial attacks
CN113822278A (en) License plate recognition method for unlimited scene
WO2021147437A1 (en) Identity card edge detection method, device, and storage medium
Ranftl et al. Face tracking using optical flow
CN113435264A (en) Face recognition attack resisting method and device based on black box substitution model searching
CN109241878B (en) Lip positioning-based facial feature positioning method and system
CN114220108A (en) Text recognition method, readable storage medium and text recognition device for natural scene
CN114283448A (en) Child sitting posture reminding method and system based on head posture estimation
CN114333038B (en) Training method of object recognition model, object recognition method, device and equipment
CN115497097A (en) Inclined Chinese character click verification code identification method
CN111881732B (en) SVM (support vector machine) -based face quality evaluation method
WO2020155484A1 (en) Character recognition method and device based on support vector machine, and computer device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23856436

Country of ref document: EP

Kind code of ref document: A1