CN117636430A

CN117636430A - Hidden face attack countermeasure method and system based on countermeasure semantic mask

Info

Publication number: CN117636430A
Application number: CN202311610474.3A
Authority: CN
Inventors: 刘德成; 苏麒蘐; 王楠楠; 彭春蕾; 高新波
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2023-11-28
Filing date: 2023-11-28
Publication date: 2024-03-01

Abstract

The invention provides a hidden face attack countermeasure method and system based on a countermeasure semantic mask. The method comprises: inputting an original image into a pre-trained fake detection model to obtain a class activation map; inputting the class activation map into a pre-trained semantic segmentation model and obtaining a plurality of selected areas according to a preset segmentation task; querying the selected pixel points of the plurality of selected areas in the original image and determining an attack area according to the number and size of the selected pixel points; inputting the original image into the pre-trained semantic segmentation model and generating a corresponding semantic mask according to the attack area; adding noise to the original image along the gradient ascent direction and obtaining a final countermeasure sample after a preset number of iterations; and calculating a constrained countermeasure image from the final countermeasure sample, the original image and the corresponding semantic mask. Because the perturbation in the final constrained countermeasure image is confined to a preset range, the generation quality and transferability of the countermeasure image are improved.

Description

Hidden face attack countermeasure method and system based on countermeasure semantic mask

Technical Field

The invention belongs to the technical field of attack countermeasures (adversarial attacks), and particularly relates to a hidden face attack countermeasure method and system based on a countermeasure semantic mask.

Background

In recent years, face forgery technology has attracted increasing attention; in some scenarios it is used for face makeup, face editing and the like. With the rapid development of AI models, face forgery detection models have also been developed, so that a forged image must not only deceive the human eye but also pass the forgery detection model.

Most existing face attack countermeasure methods add global noise to the face image, and the samples generated in this way tend to have poor transferability: a sample generated against one model often loses its perturbing effect when it is examined by another model. Other methods attack by adding patches to the image, that is, by applying local noise, and these methods often cause a large loss of image quality.

Therefore, samples generated by existing face attack countermeasure methods have poor transferability and poor generation quality.

Disclosure of Invention

In order to solve the problems in the prior art, the invention provides a hidden face attack countermeasure method and a hidden face attack countermeasure system based on a countermeasure semantic mask.

The technical problems to be solved by the invention are realized by the following technical scheme:

In a first aspect, the present invention provides a hidden face attack countermeasure method based on a countermeasure semantic mask, including:

inputting an original image into a pre-trained fake detection model to obtain a class activation map;

inputting the class activation map into a pre-trained semantic segmentation model, and obtaining a plurality of selected areas according to a preset segmentation task;

querying the selected pixel points of the plurality of selected areas in the original image, and determining an attack area according to the number and size of the selected pixel points;

inputting the original image into the pre-trained semantic segmentation model, and generating a corresponding semantic mask according to the attack area;

adding noise to the original image along the gradient ascent direction, and obtaining a final countermeasure sample after a preset number of iterations;

and calculating a constrained countermeasure image from the final countermeasure sample, the original image and the corresponding semantic mask.

Optionally, inputting the class activation map into a pre-trained semantic segmentation model, and obtaining a plurality of selected areas according to a preset segmentation task, includes:

inputting the class activation map into the pre-trained semantic segmentation model to obtain an intermediate image matrix; the intermediate image matrix consists of segmentation task labels;

and selecting the matrix areas corresponding to the preset segmentation task from the intermediate image matrix to obtain the plurality of selected areas.

Optionally, querying the selected pixel points of the plurality of selected areas in the original image, and determining the attack area according to the number and size of the selected pixel points, includes:

acquiring, for each of the plurality of selected areas, the number of selected pixel points that the area covers in the original image;

calculating the ratio of the number of selected pixel points to the total number of pixel points of the original image to obtain a pixel ratio;

and determining each selected area whose pixel ratio is larger than a first preset value as an attack area.

Optionally, adding noise to the original image along the gradient ascent direction, and obtaining a final countermeasure sample after a preset number of iterations, includes:

S1, calculating a norm distance between the original image and the current-round countermeasure sample, and taking the partial derivative of the norm distance to obtain the current-round gradient value; the initial countermeasure sample is obtained by adding a second preset value to the original image;

S2, determining noise according to a sign function and the current-round gradient value, and adding the noise to the current-round countermeasure sample to obtain the next-round countermeasure sample;

S3, incrementing the round number by 1, and taking the result of S2 as the current-round countermeasure sample;

S4, repeating S1 to S3 until the preset number of iterations is reached, to obtain the final countermeasure sample.

Optionally, calculating a constrained countermeasure image from the final countermeasure sample, the original image and the corresponding semantic mask includes:

subtracting the original image from the final countermeasure sample to obtain global noise;

multiplying the global noise by the corresponding semantic mask to obtain mask noise;

and adding the mask noise to the original image to obtain the constrained countermeasure image.

Optionally, inputting the original image into a pre-trained fake detection model to obtain a class activation map includes:

inputting the original image into the pre-trained fake detection model to obtain a prediction class;

calculating the class activation feature of the original image under the prediction class; the class activation feature is generated by a convolution layer of the pre-trained fake detection model;

upsampling the class activation feature to obtain an upsampled image;

and superposing the upsampled image on the original image to obtain the class activation map.

Optionally, determining noise according to the sign function and the current-round gradient value, and adding the noise to the current-round countermeasure sample to obtain the next-round countermeasure sample, includes:

inputting the current-round gradient value $g_t$ into the sign function, and multiplying the result by a noise coefficient $\alpha$ of preset intensity to obtain the noise;

adding the noise to the current-round countermeasure sample $x_t$ to obtain the next-round countermeasure sample $x_{t+1}$:

$$x_{t+1} = x_t + \alpha \cdot \operatorname{sign}(g_t)$$

where $x_t$ denotes the current-round countermeasure sample, $\alpha$ denotes the noise coefficient, and $\operatorname{sign}(\cdot)$ denotes the sign function.

Optionally, after calculating the constrained countermeasure image from the final countermeasure sample, the original image and the corresponding semantic mask, the hidden face attack countermeasure method based on the countermeasure semantic mask further includes:

inputting the constrained countermeasure image as a training sample into the pre-trained fake detection model so as to update the parameters of the pre-trained fake detection model.

In a second aspect, the present invention provides a hidden face attack countermeasure system based on a countermeasure semantic mask, including: the system comprises a processor, a storage medium and a bus, wherein the storage medium stores machine-readable instructions executable by the processor, and when the hidden face attack countermeasure system based on the countermeasure semantic mask runs, the processor and the storage medium are communicated through the bus, and the processor executes the machine-readable instructions to execute the steps of the method of the first aspect.

In a third aspect, the present invention provides a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the first aspect described above.

The invention provides a hidden face attack countermeasure method and system based on a countermeasure semantic mask. The method comprises: inputting an original image into a pre-trained fake detection model to obtain a class activation map; inputting the class activation map into a pre-trained semantic segmentation model and obtaining a plurality of selected areas according to a preset segmentation task; querying the selected pixel points of the plurality of selected areas in the original image and determining an attack area according to the number and size of the selected pixel points; inputting the original image into the pre-trained semantic segmentation model and generating a corresponding semantic mask according to the attack area; adding noise to the original image along the gradient ascent direction and obtaining a final countermeasure sample after a preset number of iterations; and calculating a constrained countermeasure image from the final countermeasure sample, the original image and the corresponding semantic mask. In the invention, the attack area is determined from the number and size of the selected pixel points and is then used to generate the semantic mask, so that the perturbation in the final constrained countermeasure image is confined to a preset range, which improves the generation quality and transferability of the countermeasure image.

The present invention will be described in further detail with reference to the accompanying drawings and examples.

Drawings

Fig. 1 is a schematic flow chart of a hidden face attack countermeasure method based on a countermeasure semantic mask according to an embodiment of the present invention;

Fig. 2 is a flow chart of a hidden face attack countermeasure method based on a countermeasure semantic mask according to another embodiment of the present invention;

Fig. 3 is a schematic diagram of the overall framework of the semantic-mask-based hidden face attack countermeasure according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of a hidden face attack countermeasure system based on a countermeasure semantic mask according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to specific examples, but embodiments of the present invention are not limited thereto.

In order to improve the generation quality and the transferability of countermeasure images, the embodiment of the invention provides a hidden face attack countermeasure method based on a countermeasure semantic mask. Fig. 1 is a flow chart of this method according to an embodiment of the present invention. As shown in Fig. 1, the method includes:

S101, inputting an original image into a pre-trained fake detection model to obtain a class activation map;

S102, inputting the class activation map into a pre-trained semantic segmentation model, and obtaining a plurality of selected areas according to a preset segmentation task;

S103, querying the selected pixel points of the plurality of selected areas in the original image, and determining an attack area according to the number and size of the selected pixel points;

S104, inputting the original image into the pre-trained semantic segmentation model, and generating a corresponding semantic mask according to the attack area;

S105, adding noise to the original image along the gradient ascent direction, and obtaining a final countermeasure sample after a preset number of iterations;

It should be noted that, in the embodiment of the present invention, the final countermeasure sample is the original image with global noise added.

S106, calculating a constrained countermeasure image from the final countermeasure sample, the original image and the corresponding semantic mask.

The hidden face attack countermeasure method based on the countermeasure semantic mask provided by the embodiment of the invention comprises: inputting an original image into a pre-trained fake detection model to obtain a class activation map; inputting the class activation map into a pre-trained semantic segmentation model and obtaining a plurality of selected areas according to a preset segmentation task; querying the selected pixel points of the plurality of selected areas in the original image and determining an attack area according to the number and size of the selected pixel points; inputting the original image into the pre-trained semantic segmentation model and generating a corresponding semantic mask according to the attack area; adding noise to the original image along the gradient ascent direction and obtaining a final countermeasure sample after a preset number of iterations; and calculating a constrained countermeasure image from the final countermeasure sample, the original image and the corresponding semantic mask. In the embodiment of the invention, the attack area is determined from the number and size of the selected pixel points and is then used to generate the semantic mask, so that the perturbation in the final constrained countermeasure image is confined to a preset range, which improves the generation quality and transferability of the countermeasure image.

In the embodiment of the present invention, the generation process of the pre-trained fake detection model may specifically include:

1. Dataset preprocessing

The DFDC (Deepfake Detection Challenge) dataset is used. It contains 472 GB of data comprising 119197 face videos, of which 100000 are fake face videos and 19197 are real recorded videos, the content of the latter being more realistic. Each video lasts 10 seconds, with a frame rate between 15 fps and 30 fps and a resolution between 320×240 and 3840×2160. A pre-trained RetinaFace detection model detects and locates the face in each frame of a video; the face area of the frame is extracted and stored as a 320×320×3 image, and all face images extracted from the same video are stored in a folder with the same name as the video and used as the training set. Test set data is available from the official DFDC website, and the test set can be generated through the same processing.

2. Fake detection model training

The fake detection model may be XceptionNet, ResNet-50, EfficientNet-B0 or EfficientNet-B4. Because the original models perform multi-class classification, the output dimension of the last fully connected layer is changed to 2 for fake detection training, so that the model outputs a real-or-fake decision for the input image: 0 denotes fake and 1 denotes real. A dataset reading function determines, from a csv file that stores the dataset information, whether each image extracted from a video is real, and assigns the corresponding label. The model parameters are then trained with the SGD optimizer, and the learning rate is adjusted with the StepLR scheduler. The total number of training epochs may be set to 50, with 64 samples read per batch and 4 GPUs used for acceleration. The model is validated after every epoch, with accuracy and loss reported on the test set; the parameters of each epoch are saved, the best accuracy is recorded, and the parameters with the highest accuracy are retained.
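As a rough illustration of the step-wise learning-rate adjustment mentioned above, the sketch below reproduces the decay rule of a StepLR-style scheduler, which multiplies the rate by a factor every fixed number of epochs. Only the 50 training epochs come from the text; `BASE_LR`, `STEP_SIZE` and `GAMMA` are hypothetical values chosen for the example.

```python
def step_lr(base_lr: float, epoch: int, step_size: int, gamma: float) -> float:
    """Learning rate at `epoch` under step decay: base_lr * gamma ** (epoch // step_size)."""
    return base_lr * gamma ** (epoch // step_size)


TOTAL_EPOCHS = 50      # stated in the text

# Hypothetical hyperparameters (not given in the source).
BASE_LR = 0.01
STEP_SIZE = 10
GAMMA = 0.1

schedule = [step_lr(BASE_LR, e, STEP_SIZE, GAMMA) for e in range(TOTAL_EPOCHS)]

if __name__ == "__main__":
    for epoch in (0, 9, 10, 49):
        print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6g}")
```

With these assumed values the rate stays constant within each block of ten epochs and drops by a factor of ten at epochs 10, 20, 30 and 40.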

3. Fake detection model verification

Images extracted from the videos are selected manually, their authenticity information is recorded in a csv file, the dataset is read through the dataset reading function, and the detection accuracy of the model is calculated.

Fig. 2 is a flow chart of a hidden face attack countermeasure method based on a countermeasure semantic mask according to another embodiment of the present invention. As shown in Fig. 2, S102 may specifically include:

S1021, inputting the class activation map into the pre-trained semantic segmentation model to obtain an intermediate image matrix; the intermediate image matrix consists of segmentation task labels;

S1022, selecting the matrix areas corresponding to the preset segmentation task from the intermediate image matrix to obtain the plurality of selected areas.

Optionally, S103 may specifically include:

To describe the determination of the attack area concretely, the embodiment of the invention takes the FaceParseNet50 model as an example of the pre-trained semantic segmentation model.

After the class activation map y is input into the FaceParseNet50 model, an intermediate image matrix is obtained. Its entries are 13 label numbers (0 to 12) corresponding to the nose, eyes, mouth, eyebrows, hair, skin and so on. From this matrix, the number of pixels that each of these areas occupies in the original image, that is, the number of selected pixel points, can be obtained, and the ratio of the number of selected pixel points to the total number of pixels in the original image is recorded as CAM.
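The ratio described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the patented implementation: the label ids in `LABEL_NAMES`, the toy label map and the threshold value are all hypothetical, since the source states only that 13 labels (0 to 12) exist and that a first preset value is used as the cut-off.

```python
import numpy as np

# Hypothetical label ids; the source only states that labels 0-12 exist.
LABEL_NAMES = {1: "skin", 2: "nose", 3: "eyes", 4: "eyebrow", 5: "hair"}


def region_ratios(label_map: np.ndarray, label_names: dict) -> dict:
    """Per named region: (pixels with that label) / (total pixels in the image)."""
    total = label_map.size
    return {
        name: int(np.count_nonzero(label_map == label)) / total
        for label, name in label_names.items()
    }


def select_regions(ratios: dict, threshold: float) -> list:
    """Keep regions whose ratio exceeds the first preset value (`threshold`)."""
    return sorted(name for name, r in ratios.items() if r > threshold)


# Toy 4x5 label map standing in for the intermediate image matrix.
label_map = np.array([
    [0, 1, 1, 1, 0],
    [1, 3, 1, 3, 1],
    [1, 1, 2, 1, 1],
    [5, 1, 2, 1, 4],
])
ratios = region_ratios(label_map, LABEL_NAMES)
selected = select_regions(ratios, threshold=0.08)
```

On this toy input, skin covers 12 of 20 pixels, nose and eyes 2 each, and eyebrow and hair 1 each, so a threshold of 0.08 keeps skin, nose and eyes.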

The larger the CAM value of a selected area, the greater its influence on the attack result. By comparing CAM values, the five selected areas with the largest CAM values are chosen: skin, nose, eyebrow, eyes and hair. The semantic segmentation model is then used to attack each selected area separately and to attack combinations of areas, and the attack success rate and the visual quality metric MAE are compared. The attack area finally selected is the combination of nose, eyebrow and eyes. Table 1 gives the evaluation of the selected face areas provided by the embodiment of the present invention.

Table 1 Evaluation of selected regions of the face

Selected area	Attack success rate (%)	MAE	CAM
Skin	65.46	0.0354	0.031
Nose	45.53	0.0022	0.016
Eyebrow	51.64	0.0006	0.004
Eyes	23.15	0.0008	0.009
Hair	30.74	0.0526	0.001
Eyes + nose + eyebrow	85.33	0.0075	N/A

It should be noted that MAE (mean absolute error) in Table 1 is a visual quality index: the smaller the MAE, the smaller the quality loss of the image relative to the original image.

After the final attack area is determined, the semantic mask is generated as follows: the original image $x$ is input into the pre-trained semantic segmentation model, the attack area is selected as nose + eyebrow + eyes, the pixel values inside the attack area are set to 1 and all others to 0, and the corresponding semantic mask $S_x$ is thereby generated.
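A minimal sketch of this mask generation is given below, assuming the segmentation model has already produced an integer label map for the original image. The label ids in `ATTACK_LABELS` are hypothetical placeholders for nose, eyes and eyebrow; the real ids depend on the segmentation model.

```python
import numpy as np


def semantic_mask(label_map: np.ndarray, attack_labels) -> np.ndarray:
    """Binary mask: 1 where the pixel's label belongs to the attack area, else 0."""
    return np.isin(label_map, list(attack_labels)).astype(np.float32)


# Hypothetical label ids for nose, eyes and eyebrow.
ATTACK_LABELS = (2, 3, 4)

# Toy label map standing in for the segmentation output of the original image.
label_map = np.array([
    [0, 4, 0, 4],
    [1, 3, 1, 3],
    [1, 1, 2, 1],
])
S_x = semantic_mask(label_map, ATTACK_LABELS)
```

For a colour image of shape (H, W, 3), the same mask can be applied to every channel by broadcasting it as `S_x[..., None]`.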

Optionally, S105 may specifically include:

It should be noted that, in the embodiment of the present invention, the initial countermeasure sample is obtained by adding the second preset value to the original image. Specifically, when the original image is $x$ and the initial countermeasure sample is $x_0$, $x_0$ may be written as $x + 0.00001$. The second preset value can be adjusted flexibly according to actual conditions, and the present embodiment does not limit it.

Optionally, S106 may specifically include:

Optionally, S101 may specifically include:

It should be noted that, in the embodiment of the present invention, for a given original image $x$, the prediction class of the pre-trained fake detection model is $c_x$:

$$c_x = \arg\max_{c} \, p(c \mid x)$$

where $c$ denotes a class (real, 1, or fake, 0); $p(c \mid x)$ is the predicted probability of class $c$; $\arg\max$ returns the class with the largest predicted probability; and the value of $c_x$ is 1 or 0.

For class $c_x$, the weighted feature $\phi_k^{c_x}(x)$ of the $k$-th channel is:

$$\phi_k^{c_x}(x) = w_k^{c_x} \cdot A_k(x)$$

where $A_k(x)$ is the feature of sample $x$ in the $k$-th channel and $w_k^{c_x}$ is the calculation weight of the feature of the $k$-th channel. The weighted features of all channels of sample $x$ are expressed as the class activation feature $\Phi_x$:

$$\Phi_x = \left[\phi_1^{c_x}(x),\, \phi_2^{c_x}(x),\, \ldots,\, \phi_K^{c_x}(x)\right]^{T}$$

where $k$ ranges from 1 to $K$, $K$ is a positive integer, and $T$ denotes matrix transposition.

The class activation feature is upsampled to obtain an upsampled image.

In this embodiment, the class activation map $y$ may be expressed as:

$$y = x + U(\Phi_x)$$

where $U(\cdot)$ denotes the upsampling operation.
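The chain from prediction class to class activation map can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure of the embodiment: the weighted channel features are summed over channels before upsampling (as in standard class activation mapping), upsampling is nearest-neighbour, the image is grayscale, and the feature maps and weights are made-up toy values.

```python
import numpy as np


def predicted_class(probs: np.ndarray) -> int:
    """Prediction class: the index with the largest predicted probability."""
    return int(np.argmax(probs))


def class_activation_map(image, features, weights, scale):
    """Overlay an upsampled, channel-weighted feature map on a grayscale image.

    image:    (H, W) array
    features: (K, h, w) array, one feature map per channel
    weights:  (K,) array, per-channel weights for the predicted class
    scale:    integer factor with H == h * scale and W == w * scale
    """
    weighted = weights[:, None, None] * features         # weighted feature per channel
    heat = weighted.sum(axis=0)                           # assumption: sum over channels
    upsampled = np.kron(heat, np.ones((scale, scale)))    # nearest-neighbour upsampling
    return image + upsampled                              # superpose on the original image


probs = np.array([0.2, 0.8])                # p(fake | x), p(real | x)
c_x = predicted_class(probs)
features = np.array([[[1.0, 0.0], [0.0, 0.0]],
                     [[0.0, 0.0], [0.0, 2.0]]])
weights = np.array([0.5, 0.25])
image = np.zeros((4, 4))
y = class_activation_map(image, features, weights, scale=2)
```

In a real pipeline the feature maps would come from a convolution layer of the fake detection model, and a smoother interpolation would normally replace the block replication used here.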

The noise is added to the current-round countermeasure sample $x_t$ to obtain the next-round countermeasure sample $x_{t+1}$:

$$x_{t+1} = x_t + \alpha \cdot \operatorname{sign}(g_t)$$

In addition, in the embodiment of the invention, the norm distance $\delta(x, x_t)$ between the original image and the current-round countermeasure sample can be expressed as:

$$\delta(x, x_t) = \left\lVert \Phi_x - \Phi_{x_t} \right\rVert_2$$

where $\delta(\cdot)$ denotes the $\ell_2$ norm distance between the class activation features $\Phi_x$ and $\Phi_{x_t}$ corresponding to $x$ and $x_t$. Taking the partial derivative of $\delta(x, x_t)$ then gives the gradient value $g_t$ of the $t$-th round:

$$g_t = \frac{\partial\, \delta(x, x_t)}{\partial x_t}$$

where $\partial$ denotes taking the partial derivative.
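To make the gradient of the feature distance concrete, the sketch below replaces the class activation feature with a toy linear map `W @ x`, which is an assumption made purely so that the gradient has a closed form. It computes the L2 feature distance between the original image and the current-round sample, its analytic gradient with respect to the current-round sample, and a finite-difference check.

```python
import numpy as np

# Toy linear feature extractor (hypothetical stand-in for the real features).
W = np.array([[1.0, -2.0, 0.5],
              [0.0, 1.0, 3.0]])


def features(x: np.ndarray) -> np.ndarray:
    return W @ x


def distance(x: np.ndarray, x_t: np.ndarray) -> float:
    """L2 distance between the features of x and of x_t."""
    return float(np.linalg.norm(features(x) - features(x_t)))


def distance_grad(x: np.ndarray, x_t: np.ndarray) -> np.ndarray:
    """Analytic gradient of the distance with respect to x_t (valid when distance > 0)."""
    diff = features(x_t) - features(x)
    return W.T @ diff / np.linalg.norm(diff)


x = np.array([0.2, 0.5, 0.8])
x_t = x + 1e-5                      # initial sample: original plus a small offset
g_t = distance_grad(x, x_t)

# Central finite-difference check of the analytic gradient.
h = 1e-7
numeric = np.array([
    (distance(x, x_t + h * e) - distance(x, x_t - h * e)) / (2 * h)
    for e in np.eye(3)
])
```

The small initial offset matters here: if the sample were exactly equal to the original image, the distance would be zero and its gradient undefined, which is one plausible reason for starting from the original image plus a small preset value. With a real network the gradient would be obtained by automatic differentiation instead of a closed form.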

Further, in the embodiment of the present invention, a maximum perturbation range $\varepsilon$ can be set, and the next-round countermeasure sample $x_{t+1}$ is strictly limited to this range:

$$x_{t+1} = \operatorname{clip}(x_{t+1},\, x - \varepsilon,\, x + \varepsilon)$$

where $\operatorname{clip}(\cdot)$ denotes range projection, so that the minimum value of $x_{t+1}$ is not less than $x - \varepsilon$ and the maximum value is not greater than $x + \varepsilon$.
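Putting the sign update and the range projection together, the iteration can be sketched as below. This is a minimal illustration under assumptions: the gradient comes from a toy linear feature map rather than a real fake detection model, and the values of `alpha`, `eps` and `steps` are hypothetical.

```python
import numpy as np


def iterative_attack(x, grad_fn, alpha, eps, steps, init_offset=1e-5):
    """Iterative sign-gradient ascent kept inside the box [x - eps, x + eps].

    grad_fn(x, x_t) must return the gradient of the feature distance w.r.t. x_t.
    """
    x_t = x + init_offset                        # initial sample: x plus a small preset value
    for _ in range(steps):
        g_t = grad_fn(x, x_t)                    # gradient of the current round
        x_t = x_t + alpha * np.sign(g_t)         # step along the sign of the gradient
        x_t = np.clip(x_t, x - eps, x + eps)     # projection onto the perturbation range
    return x_t


# Toy linear feature map (an assumption for illustration only).
W = np.array([[1.0, -2.0, 0.5],
              [0.0, 1.0, 3.0]])


def grad_fn(x, x_t):
    diff = W @ x_t - W @ x
    return W.T @ diff / np.linalg.norm(diff)


x = np.array([0.2, 0.5, 0.8])
x_adv = iterative_attack(x, grad_fn, alpha=0.01, eps=0.03, steps=10)
```

In this toy setting the sign of the gradient stays the same in every round, so the sample reaches the boundary of the perturbation range after a few steps and the projection holds it there.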

The global noise can be expressed as:

$$\mathrm{Noise} = x' - x$$

where $x'$ denotes the final countermeasure sample.

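The constrained countermeasure image $\hat{x}$ is expressed as:

$$\hat{x} = x + \mathrm{Noise} \odot S_x$$

where $\odot$ denotes element-wise multiplication, so that the noise is applied only at pixels where the mask value is greater than 0.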
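The masking step amounts to three array operations: take the difference between the final sample and the original image, multiply it element-wise by the semantic mask, and add the result back to the original image. The sketch below uses made-up pixel values and a made-up mask purely for illustration.

```python
import numpy as np


def constrained_image(x, x_final, mask):
    """Return x + (x_final - x) * mask, keeping the noise only where mask == 1."""
    noise = x_final - x             # global noise
    masked_noise = noise * mask     # element-wise product with the semantic mask
    return x + masked_noise


x = np.full((2, 3), 0.5)
x_final = x + np.array([[0.02, -0.01, 0.03],
                        [-0.02, 0.01, 0.00]])
S_x = np.array([[1, 0, 1],
                [0, 0, 1]])
x_hat = constrained_image(x, x_final, S_x)
```

Pixels outside the mask are returned unchanged, which is what confines the perturbation to the selected facial areas.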

To illustrate clearly the hidden face attack countermeasure method based on the countermeasure semantic mask provided by the embodiment of the present invention, Fig. 3 shows the overall framework of the method. The operation flow shown in Fig. 3 includes: inputting the original image into the fake detection model to obtain the class activation map; inputting the original image into the semantic segmentation model to obtain a region segmentation image; evaluating and selecting feature region weights according to the class activation map to obtain the semantic mask; iteratively updating the original image by adding small noise to obtain the final countermeasure sample; subtracting the original image from the final countermeasure sample to obtain the global noise; combining the global noise with the semantic mask to generate the mask noise; and adding the mask noise to the original image to generate the constrained countermeasure image.

To verify the hidden face attack countermeasure method based on the countermeasure semantic mask provided by the embodiment of the invention, its attack effect is evaluated on a public face database. The DFDC dataset contains 472 GB of data comprising 119197 face videos, of which 100000 are fake face videos and 19197 are videos of real people, whose content is more realistic. Each video in the dataset lasts 10 seconds, with a frame rate from 15 to 30 frames per second and a resolution from 320×240 to 3840×2160.

Table 2 success rate (ASR) comparison of black box attacks and white box attacks between different models on DFDC datasets

As the data in Table 2 show, ASMA (the method of the invention) attacks more strongly than countermeasure algorithms such as FGSM and PGD. Its white-box attack success rate is the highest: for example, when a constrained countermeasure image generated on XceptionNet is used to attack XceptionNet, the success rate is 0.1% higher than that of PGD and 43.29% higher than that of FGSM, even though the attacked area is reduced. For black-box attacks, constrained countermeasure images generated on ResNet50 achieve a success rate 17.06% higher than PGD when attacking XceptionNet, 26.79% higher when attacking EfficientNet-B0, and 9.45% higher when attacking EfficientNet-B4, which verifies black-box transferability. The table also shows that the success rate of a countermeasure image can vary widely across models: for countermeasure images generated on EfficientNet-B4, PGD achieves a success rate of 21.14% against XceptionNet and 6.94% against ResNet50, whereas the success rate of ASMA-generated constrained countermeasure images varies much less across models than that of other methods. In addition, constrained countermeasure images generated on XceptionNet have a low success rate against EfficientNet-B0 and EfficientNet-B4, and those generated on EfficientNet-B0 and EfficientNet-B4 are likewise difficult to transfer to XceptionNet. This is attributed to the obvious structural differences between the EfficientNet models and XceptionNet, which limit transfer between the two kinds of network.

Experimental results show that adding a small perturbation to the key forgery areas of the image clearly improves the attack success rate against the XceptionNet, ResNet, EfficientNet-B0 and EfficientNet-B4 models.
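The source does not spell out how the attack success rate (ASR) is computed. One common definition, assumed in the sketch below, is the percentage of samples that the detector classifies correctly before the attack and incorrectly after it. The labels and predictions are toy values, using the 0 = fake, 1 = real convention of the detection model.

```python
import numpy as np


def attack_success_rate(true_labels, clean_preds, adv_preds) -> float:
    """Percentage of correctly classified clean samples that are misclassified
    after the attack (an assumed definition, not taken from the source)."""
    true_labels = np.asarray(true_labels)
    clean_ok = np.asarray(clean_preds) == true_labels
    fooled = clean_ok & (np.asarray(adv_preds) != true_labels)
    return float(100.0 * fooled.sum() / clean_ok.sum())


# Toy data: 0 = fake, 1 = real.
true_labels = [0, 0, 0, 1, 1]
clean_preds = [0, 0, 1, 1, 1]
adv_preds = [1, 0, 1, 0, 1]
asr = attack_success_rate(true_labels, clean_preds, adv_preds)
```

For a white-box figure the predictions come from the model used to generate the images; for a black-box figure they come from a different model evaluated on the same images. The function assumes at least one clean sample is classified correctly.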

Table 3 Quality assessment of constrained countermeasure images generated on ResNet50

Countermeasure images were generated on ResNet50 using FGSM, BIM, PGD, C&W, DeepFool, ASMA without region constraint (W) and ASMA with region constraint, and were evaluated with the image quality metrics MSE, MAE, PSNR and SSIM. MSE and MAE reflect the difference between the original image and the countermeasure image. As the data in Table 3 show, without constraining the attack area the quality of the images generated by ASMA is almost the same as that of the other methods, and once the region constraint is added the quality loss is greatly reduced. In terms of the structural similarity SSIM, the constrained countermeasure images generated by ASMA are closer to the original image. At the same attack strength, the constrained countermeasure images generated by ASMA retain higher quality and better concealment than those of the other algorithms, while preserving the transferability of the countermeasure images.
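Three of the four quality metrics named above have short closed forms and can be sketched directly. The example assumes images scaled to the range [0, 1] and uses a toy perturbation; SSIM is omitted because it needs windowed local statistics and is usually taken from an image processing library.

```python
import numpy as np


def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared error between two images."""
    return float(np.mean((a - b) ** 2))


def mae(a: np.ndarray, b: np.ndarray) -> float:
    """Mean absolute error between two images."""
    return float(np.mean(np.abs(a - b)))


def psnr(a: np.ndarray, b: np.ndarray, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_value]."""
    err = mse(a, b)
    return float("inf") if err == 0 else float(10.0 * np.log10(max_value ** 2 / err))


original = np.full((4, 4), 0.5)
perturbed = original.copy()
perturbed[0, :] += 0.04            # perturb a single row only

quality = {
    "MSE": mse(original, perturbed),
    "MAE": mae(original, perturbed),
    "PSNR": psnr(original, perturbed),
}
```

Because only a quarter of the pixels are perturbed, MSE and MAE are a quarter of what a perturbation of the same size over the whole image would give, which mirrors the effect of restricting the noise to a masked area.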

The method provided by the embodiment of the invention can be applied to electronic equipment. Specifically, the electronic device may be a desktop computer, a portable computer, an intelligent mobile terminal, a server, or the like. The electronic device is not limited herein; any electronic device capable of implementing the present invention falls within the scope of the present invention.

Based on the same inventive concept, the embodiment of the present invention further provides a hidden face attack countermeasure system based on a countermeasure semantic mask. Fig. 4 is a schematic diagram of this system, which includes a processor 710, a storage medium 720 and a bus 730. The storage medium 720 stores machine-readable instructions executable by the processor 710. When the system runs, the processor 710 and the storage medium 720 communicate over the bus 730, and the processor 710 executes the machine-readable instructions to perform the steps of the above method embodiments. The specific implementation and technical effects are similar and are not repeated here.

Based on the same inventive concept, the embodiment of the invention further provides a storage medium on which a computer program is stored; when executed by a processor, the computer program performs the steps of the hidden face attack countermeasure method based on the countermeasure semantic mask described above.

The storage medium may include random access Memory (Random Access Memory, RAM) or may include Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the storage medium may be at least one storage device located remotely from the processor.

The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

It should be noted that the terms "first," "second," and the like are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the disclosed embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with aspects of the present disclosure.

In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples" means that a particular feature or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Further, those skilled in the art can join and combine the different embodiments or examples described in this specification.

Although the present application has been described herein with respect to various embodiments, other variations of the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed application, from a review of the figures and the disclosure. In the description of the present invention, the word "comprising" does not exclude other elements or steps, "a" or "an" does not exclude a plurality, and "a plurality" means two or more, unless specifically defined otherwise. Moreover, some measures are described in mutually different embodiments, but this does not mean that these measures cannot be combined to produce a good effect.

The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.

Claims

1. A hidden face attack countermeasure method based on a countermeasure semantic mask, comprising:

inputting the original image into a pre-trained fake detection model to obtain a class activation diagram;

inputting the class activation diagram into a pre-trained semantic segmentation model, and obtaining a plurality of selected areas according to a preset segmentation task;

querying selected pixel points of the plurality of selected areas in the original image, and determining an attack area according to the number and the size of the selected pixel points;

inputting the original image into the pre-trained semantic segmentation model, and generating a corresponding semantic mask according to the attack region;

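adding noise to the original image along the gradient rising direction, and obtaining a final countermeasure sample after a preset number of iterations;

and calculating a constrained countermeasure image by using the final countermeasure sample, the original image and the corresponding semantic mask.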

2. The hidden face attack countermeasure method based on a countermeasure semantic mask according to claim 1, wherein the inputting the class activation diagram into a pre-trained semantic segmentation model and obtaining a plurality of selected areas according to a preset segmentation task comprises:

inputting the class activation diagram into the pre-trained semantic segmentation model to obtain an intermediate image matrix; the intermediate image matrix consists of segmentation task labels;

and selecting, from the intermediate image matrix, the matrix areas corresponding to the preset segmentation task to obtain the plurality of selected areas.
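As an illustration only, the selection step of claim 2 can be sketched as picking boolean regions out of a label matrix. The function name `select_regions`, the label ids and the toy matrix below are hypothetical and do not come from the patent; the sketch assumes the intermediate image matrix is an integer array with one segmentation label per pixel.

```python
import numpy as np

def select_regions(label_matrix, task_labels):
    """Return one boolean region per preset segmentation label.

    label_matrix: 2-D integer array, one segmentation label per pixel
                  (stands in for the intermediate image matrix).
    task_labels:  labels belonging to the preset segmentation task
                  (hypothetical ids, chosen only for this example).
    """
    return {label: (label_matrix == label) for label in task_labels}

# Toy 4x4 label matrix; ids 0, 1, 2 are made-up labels.
labels = np.array([
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 0],
    [0, 0, 0, 0],
])
regions = select_regions(labels, task_labels=[1, 2])
```

Each entry of `regions` is a mask with the same shape as the label matrix, so it can be used directly to index pixels of an image of the same size.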

3. The hidden face attack countermeasure method based on a countermeasure semantic mask according to claim 1, wherein the querying selected pixel points of the plurality of selected areas in the original image and determining the attack area according to the number and the size of the selected pixel points comprises:

acquiring, for each of the plurality of selected areas, the number of selected pixel points to which the selected area corresponds in the original image;

and determining a selected area whose pixel ratio is larger than a first preset value as the attack area.
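A minimal sketch of the thresholding in claim 3 follows. The claim does not spell out how the pixel ratio is computed, so this example assumes it is the number of selected pixel points of a region divided by the total number of pixels in the image; that definition, the function name `choose_attack_regions` and the threshold value are assumptions made for illustration.

```python
import numpy as np

def choose_attack_regions(regions, first_preset_value):
    """Keep the regions whose pixel ratio exceeds the threshold.

    regions:            dict mapping a label to a boolean mask
    first_preset_value: threshold on the pixel ratio (assumed here to be
                        region pixels / all image pixels)
    """
    chosen = []
    for label, mask in regions.items():
        ratio = mask.sum() / mask.size  # share of the image covered by this region
        if ratio > first_preset_value:
            chosen.append(label)
    return chosen

# Two toy regions on a 4x4 image: 3 pixels (ratio 0.1875) and 4 pixels (ratio 0.25).
small = np.zeros((4, 4), dtype=bool)
small[0, 1:4] = True
large = np.zeros((4, 4), dtype=bool)
large[1:3, 1:3] = True
attack_labels = choose_attack_regions({"a": small, "b": large}, first_preset_value=0.2)
```

With the hypothetical threshold 0.2, only region "b" qualifies as an attack area.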

4. The hidden face attack countermeasure method based on a countermeasure semantic mask according to claim 1, wherein the adding noise to the original image along the gradient rising direction, and obtaining a final countermeasure sample after a preset number of iterations, comprises:

S1, calculating a norm distance between the original image and a current-round countermeasure sample, and calculating a partial derivative of the norm distance to obtain a current-round gradient value; the initial countermeasure sample is obtained by superposing a second preset value on the original image;

S2, determining the noise according to the sign function and the current-round gradient value, and adding the noise to the current-round countermeasure sample to obtain a next-round countermeasure sample;

S3, adding 1 to the current round number, and taking the result of S2 as the current-round countermeasure sample;

S4, repeating S1 to S3 until the preset number of iterations is reached, to obtain the final countermeasure sample.
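The loop S1 to S4 can be sketched as follows. This is a simplified reading rather than the patented implementation: it assumes the norm distance is the squared L2 distance, whose gradient with respect to the current-round sample is available in closed form, and it treats the second preset value as a scalar offset. The function name and all parameter values are hypothetical.

```python
import numpy as np

def iterate_countermeasure_sample(x, second_preset_value, alpha, num_rounds):
    """Sign-gradient ascent on the distance between x and the current sample.

    x:                   original image as a float array
    second_preset_value: offset superposed on x to form the initial sample
    alpha:               noise coefficient (step size per round)
    num_rounds:          preset number of iterations
    """
    x_t = x + second_preset_value  # initial countermeasure sample
    for _ in range(num_rounds):
        # S1: gradient of the assumed distance ||x_t - x||_2^2 with respect to x_t
        g_t = 2.0 * (x_t - x)
        # S2: noise = alpha * sign(g_t), added along the gradient rising direction
        x_t = x_t + alpha * np.sign(g_t)
        # S3: the result of S2 becomes the current-round sample for the next pass
    return x_t  # S4: final countermeasure sample

x = np.zeros((2, 2))
final_sample = iterate_countermeasure_sample(
    x, second_preset_value=0.01, alpha=0.1, num_rounds=3
)
```

In this toy run every pixel starts slightly above the original, so each round moves it a further `alpha` away. A practical implementation would typically also clip the result to the valid pixel range, which is omitted here.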

5. The method of claim 1, wherein the calculating the constrained countermeasure image using the final countermeasure sample, the original image, and the corresponding semantic mask comprises:

taking the difference between the final countermeasure sample and the original image to obtain global noise;

multiplying the global noise by the corresponding semantic mask to obtain mask noise;

and adding the mask noise to the original image to obtain the constrained countermeasure image.
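The three steps of claim 5 map onto elementwise array operations. The sketch below assumes the semantic mask is a 0/1 array with the same shape as the image and that the global noise is the final sample minus the original; the function name and the toy values are hypothetical.

```python
import numpy as np

def constrain_with_mask(original, final_sample, semantic_mask):
    """Restrict the perturbation to the pixels selected by the semantic mask."""
    global_noise = final_sample - original     # difference over the whole image
    mask_noise = global_noise * semantic_mask  # keep noise only inside the mask
    return original + mask_noise               # constrained countermeasure image

original = np.full((2, 2), 0.5)
final_sample = original + 0.1
semantic_mask = np.array([[1.0, 0.0],
                          [0.0, 1.0]])
constrained = constrain_with_mask(original, final_sample, semantic_mask)
```

Pixels where the mask is 0 keep their original values, so the perturbation stays confined to the attack area.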

6. The hidden face attack countermeasure method based on a countermeasure semantic mask according to claim 1, wherein the inputting the original image into a pre-trained fake detection model to obtain a class activation diagram comprises:

inputting the original image into the pre-trained fake detection model to obtain a prediction class;

calculating class activation features of the original image under the prediction class; the class activation features are generated by a convolution layer of the pre-trained fake detection model;

upsampling the class activation feature to obtain an upsampled image;

7. The method of claim 4, wherein determining the noise according to the sign function and the current-round gradient value, and adding the noise to the current-round countermeasure sample to obtain a next-round countermeasure sample, comprises:

inputting the current-round gradient value $g_t$ into the sign function, and multiplying the result by a noise coefficient $\alpha$ of preset intensity to obtain the noise;

adding the noise to the current-round countermeasure sample $x_t$ to obtain the next-round countermeasure sample $x_{t+1}$:

$x_{t+1} = x_t + \alpha \cdot \mathrm{sign}(g_t)$;

wherein $x_t$ represents the current-round countermeasure sample, $g_t$ represents the current-round gradient value, $\alpha$ represents the noise coefficient, and $\mathrm{sign}(\cdot)$ represents the sign function.

8. The method according to claim 1, wherein after calculating the constrained countermeasure image using the final countermeasure sample, the original image, and the corresponding semantic mask, the method further comprises:

inputting the constrained countermeasure image as a training sample into the pre-trained fake detection model to update parameters of the pre-trained fake detection model.

9. A hidden face attack countermeasure system based on a countermeasure semantic mask, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, wherein, when the hidden face attack countermeasure system based on the countermeasure semantic mask is operating, the processor and the storage medium communicate over the bus, and the processor executes the machine-readable instructions to perform the steps of the method of any of claims 1-8.

10. A storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method according to any of claims 1-8.