CN115984979A - Face forgery detection method and device for unknown adversarial attacks - Google Patents


Info

Publication number: CN115984979A
Application number: CN202310063019.XA
Authority: CN (China)
Prior art keywords: face, image, actual, adversarial, clean
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Inventor
熊荔
王美涵
王柏文
Current Assignee: CETC Information Science Research Institute
Original Assignee: CETC Information Science Research Institute
Application filed by CETC Information Science Research Institute
Priority application: CN202310063019.XA
Publication: CN115984979A (legal status: pending)


Landscapes

  • Image Analysis (AREA)

Abstract

Embodiments of the disclosure provide a face forgery detection method, apparatus, electronic device, and storage medium for unknown adversarial attacks, comprising: acquiring an actual image containing a face; performing face detection on the actual image to obtain an actual face region image, and reconstructing the face region image to obtain an actual reconstructed face image; and inputting the reconstructed face image into a pre-trained adversarial face discrimination network to predict whether the face in the actual image is real or forged. The adversarial face discrimination network is trained on a clean face dataset together with a set of adversarial face images carrying added perturbation noise. By reconstructing the face region image data, the embodiments change the distribution of adversarial perturbations and suppress adversarial noise while preserving face forgery traces; training the discrimination network with the perturbed adversarial face image set improves the robustness of the network model.

Description

Face forgery detection method and device for unknown adversarial attacks
Technical Field
Embodiments of the disclosure belong to the technical field of image processing, and in particular relate to a face forgery detection method and device, electronic device, and storage medium for unknown adversarial attacks.
Background
With the development of deep-forgery technology, face forgery has become increasingly widespread, and technologies for distinguishing real from fake faces have developed alongside it. Forged-face picture identification mainly comprises traditional image forensics, biological feature detection, and spatial feature detection. Methods based on traditional image forensics and on biological features must identify forgeries from manually specified features and regions of interest, while spatial-feature methods extract and judge forgery features from a single picture. The current mainstream approach extracts and identifies features with a deep neural network, and it faces one common problem: it is easily affected by adversarial attacks. An adversarial attack is an interference technique aimed at machine learning algorithms that causes an automatic recognition model to misclassify by adding imperceptible slight perturbations. Adversarial attacks mainly comprise white-box and black-box attacks [1]. A white-box attack exploits the structure and parameters of a specific model, whereas a black-box attack usually cannot obtain detailed model parameters and can only attack via simple input-output mapping information; black-box attacks are the more common case in practical applications.
At present, defense algorithms against black-box attacks mainly comprise data-level and model-level defenses. Data-level defense covers modifying model parameters in the training stage and modifying input data in the testing stage: the former mainly means adversarial training, while the latter mainly means data compression, denoising, and the like. Model-level defense comprises two approaches: modifying the network and adding an additional network. Common ways of modifying the network include altering the loss function or activation function, defensive distillation, and regularization. Defense with an additional network means jointly adversarially training the deep discrimination network with an adversarial-sample generation network to improve model robustness. Adding an additional network is a commonly used defense at present and has the advantage of not depending too heavily on dataset samples; for example, the AdvProp method proposed in "Adversarial Examples Improve Image Recognition" achieves robust network recognition by adding a generation network that uses an auxiliary BN to produce adversarial noise (via PGD) and performing joint defense training.
The prior art has at least the following problems. Mainstream face deep-forgery detection algorithms extract and judge features with deep neural networks; although such network models achieve ideal results on clean pictures, they are easily disrupted by adversarial attacks designed against neural network models, causing recognition to fail. How to defend against adversarial attacks is therefore a key problem in improving detection robustness, because in practical applications the attack faced is usually unknown. Existing defenses mainly work by modifying the data or the model, and they have the following problems: 1) defenses against adversarial attacks achieve good results only for a fixed set of known attack methods and cannot handle the risks brought by unknown attacks; 2) they depend on the parameters of the target model.
Reference [1]: Yi Ping, Wang Kedi, Huang Cheng, et al.
Disclosure of Invention
Embodiments of the disclosure address at least one of the above technical problems in the prior art, and provide a face forgery detection method and apparatus, electronic device, and storage medium for unknown adversarial attacks.
One aspect of the disclosed embodiments provides a face forgery detection method for unknown adversarial attacks. The method comprises the following steps:
acquiring an actual image containing a face;
performing face detection on the actual image to obtain an actual face region image, and reconstructing the face region image to obtain an actual reconstructed face image;
inputting the reconstructed face image into a pre-trained adversarial face discrimination network to predict whether the face in the actual image is real or forged; the adversarial face discrimination network is trained on a clean face dataset and a set of adversarial face images with added perturbation noise.
Optionally, the adversarial face discrimination network is trained through the following steps:
acquiring real and forged face images without added interference, performing face detection on them to obtain a clean face region image dataset, and reconstructing part of the face region images to obtain a reconstructed face image dataset; the clean face region image dataset and the reconstructed face image dataset together form the clean face dataset;
selecting some target clean face images from the clean face dataset and inputting them to a preset universal perturbation noise generation network to generate the adversarial face images;
inputting the clean face images and the adversarial face images into the adversarial face discrimination network to be trained, obtaining the trained adversarial face discrimination network.
Optionally, the adversarial face discrimination network is an EfficientNet network model, and inputting the clean face images and the adversarial face images into the network to be trained comprises:
inputting the clean face images and the adversarial face images into the EfficientNet model respectively for forward propagation to compute losses, the model being provided with a first loss function and a second loss function;
inputting a clean face image into the first loss function L_c(θ, x_c, y) to compute its loss value, where θ denotes the parameters of the EfficientNet model, x_c a clean face image, and y the real/fake label;
inputting an adversarial face image into the second loss function L_a(θ, x_a, y) to compute its loss value, where x_a denotes an adversarial face image;
computing gradients of the clean and adversarial losses and back-propagating them with a min-max optimization procedure, updating the parameters of the EfficientNet model until the final loss function converges, where the final loss function is:
min_θ E_{(x_c, y)~D} [ L(θ, x_c, y) + max_{ε∈S} L(θ, x_c + ε, y) ]

where x_c is a clean face image, y the real/fake label, ε the added perturbation noise, x_c + ε = x_a the adversarial face image, S the allowed perturbation range of the noise, θ the parameters of the EfficientNet network model, D the data distribution, and L(·,·,·) the discrepancy between the training sample's prediction and its label.
Optionally, the universal perturbation noise generation network comprises one convolutional stage and three deconvolution layers. The convolutional stage uses a DenseNet network to convolve the input clean face image into a feature map of size 7 × 7 × 256; after Batch Normalization (BN) and a ReLU activation function, the resulting feature map serves as the input of the next (deconvolution) layer.
The deconvolution layers all use 5 × 5 convolution kernels; each deconvolution layer is followed by Batch Normalization (BN) and a corresponding activation function, and the resulting feature map serves as the input of the next deconvolution layer.
Optionally, performing face detection on the actual image to obtain an actual face region image comprises:
resizing the actual image to a preset size;
inputting the resized image into a MobileNet neural network model for feature extraction and outputting a feature map;
applying a preset anchor-box mechanism to the feature map, screening the anchor boxes with the Soft-NMS algorithm to obtain face detection boxes, and cropping the detection boxes to obtain the actual face region image.
Optionally, reconstructing the actual face region image to obtain an actual reconstructed face image comprises:
blurring the face region image with at least one of Gaussian blur, box blur, Kawase blur, dual blur, bokeh blur, and directional blur;
deblurring the blurred face region image with the wavelet-transformation-based SDWNet network (which uses dilated convolutions) to obtain the actual reconstructed face image.
One aspect of the embodiments of the present disclosure provides a face forgery recognition apparatus for unknown adversarial attacks. The apparatus comprises:
an acquisition module for acquiring an actual image containing a face;
a reconstruction module for performing face detection on the actual image to obtain an actual face region image and reconstructing it to obtain an actual reconstructed face image;
a recognition module for inputting the reconstructed face image into a pre-trained adversarial face discrimination network and predicting whether the face in the actual image is real or forged.
Optionally, the apparatus further comprises a training module configured to:
acquire real and forged face images without added interference, perform face detection on them to obtain a clean face region image dataset, and reconstruct part of the face region images to obtain a reconstructed face image dataset, the two together forming the clean face dataset;
select some target clean face images from the clean face dataset and input them to a preset universal perturbation noise generation network to generate the adversarial face images;
input the clean face images and the adversarial face images into the adversarial face discrimination network to be trained, obtaining the trained network.
One aspect of the disclosed embodiments provides an electronic device, comprising: one or more processors; and
a storage unit for storing one or more programs which, when executed by the one or more processors, enable the one or more processors to implement the method described above.
One aspect of the embodiments provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method described above.
The face forgery detection method for unknown adversarial attacks combines face image data reconstruction with adversarial training using an additional network. First, high-quality reconstruction of the actual face region image data changes the distribution of adversarial perturbations and suppresses adversarial noise while preserving face forgery traces. Second, training the adversarial face discrimination network with an adversarial face image set carrying added perturbation noise improves the robustness of the discrimination network model. The embodiments defend against adversarial attacks at both the data level and the network level, compensating from multiple dimensions for the shortcomings of defending on the data or network side alone.
Drawings
Fig. 1 is a flowchart of a face forgery detection method for unknown adversarial attacks according to an embodiment of the present disclosure;
Fig. 2 is a schematic structural diagram of a face forgery recognition apparatus for unknown adversarial attacks according to an embodiment of the present disclosure.
Detailed Description
To better convey the technical solutions of the embodiments of the present disclosure, the embodiments are described in further detail below with reference to the accompanying drawings and the detailed description.
As shown in Fig. 1, a face forgery detection method S100 for unknown adversarial attacks comprises:
S110, acquiring an actual image containing a face.
Existing image acquisition equipment may be used to obtain, from a video, a picture, or the like, an actual image containing the face to be identified; the actual image may include a forged face.
S120, performing face detection on the actual image to obtain an actual face region image, and reconstructing the face region image to obtain an actual reconstructed face image.
The embodiment adopts an improved MobileNet network, combined with a novel anchor-box mechanism suited to efficient GPU operation on mobile terminals and with the Soft-NMS algorithm, to detect unconstrained faces in the actual image and obtain the actual face region image. Unconstrained face detection here means detecting faces without occlusion, mainly for forged-video detection scenarios. MobileNet is a lightweight convolutional neural network with a compact feature-extraction scheme, so faces in video can be detected in real time. The novel anchor-box mechanism reduces the number of anchor boxes and the computational complexity, accelerating the final operation and allowing deployment on mobile terminals. Other anchor-box screening methods can also be used, but in this embodiment screening with Soft-NMS yields higher accuracy.
Illustratively, performing face detection on the actual image to obtain an actual face region image comprises:
S121, resizing the actual image to a preset size.
As an example, this step normalizes the actual image by scaling it to a uniform size of 128 × 128.
S122, inputting the resized image into a MobileNet neural network model for feature extraction and outputting feature maps.
MobileNet is a lightweight convolutional neural network whose convolutions use far fewer parameters than standard convolution, so feature maps can be extracted from the actual image in real time. The feature map sizes are 8 × 8 and 16 × 16.
Illustratively, the MobileNet neural network model comprises one 2D convolutional layer, five single BlazeBlocks, and six double BlazeBlocks; the 2D convolutional layer uses a 5 × 3 × 24 deformable convolution layer.
The 2D convolutional layer increases the receptive field, and adopting the MobileNet neural network model preserves the features as much as possible.
S123, applying the preset anchor-box mechanism to the feature maps, screening the anchor boxes with the Soft-NMS algorithm to obtain face detection boxes, and cropping the detection boxes to obtain the actual face region images.
As an example, the preset anchor-box mechanism reduces the number of anchor boxes and the computational complexity, accelerating the final operation and allowing deployment on mobile terminals. Screening anchor boxes with Soft-NMS gives higher accuracy in this embodiment, though other screening methods may be used and are not limited here. If several face regions are detected in one actual image, each detected face region is cropped separately to obtain the corresponding face region images.
Illustratively, the anchor-box mechanism sets six anchor boxes per pixel position on the 8 × 8 feature map and two per position on the 16 × 16 feature map, which reduces the number of anchor boxes; since the aspect ratio of a human face varies only within a limited range, the aspect ratio of each anchor box is set to 1:1.
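The anchor-box screening step above can be sketched as follows. This is a minimal NumPy illustration of the Gaussian variant of Soft-NMS, not the patent's actual implementation; the `sigma` and `score_thresh` values are assumptions.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box [x1, y1, x2, y2] and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of discarding boxes."""
    boxes, scores = boxes.astype(float).copy(), scores.astype(float).copy()
    keep = []
    idxs = np.arange(len(scores))
    while len(idxs) > 0:
        top = idxs[np.argmax(scores[idxs])]
        keep.append(top)
        idxs = idxs[idxs != top]
        if len(idxs) == 0:
            break
        ovr = iou(boxes[top], boxes[idxs])
        scores[idxs] *= np.exp(-(ovr ** 2) / sigma)  # Gaussian score decay
        idxs = idxs[scores[idxs] > score_thresh]
    return keep
```

Unlike hard NMS, a heavily overlapping box is not discarded outright; its score is merely decayed, which tends to preserve crowded faces.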
illustratively, the reconstructing the actual face region image to obtain an actual reconstructed face image includes:
and S124, blurring the face region image by adopting at least one of Gaussian blur, frame blur, kawase blur, double blur, shot blur and direction blur.
As an example, the blur radius may be set to 5, or may be adjusted according to actual situations, and is not limited herein. The embodiment of the present disclosure may sequentially perform blurring processing on the face image by using the 6 blurring methods, and the blurring sequence is not limited here. The blurring process can be performed as it is by those skilled in the art. The sizes of the face area images are different, before the face area images are subjected to blurring processing, the sizes of the face area images are subjected to normalization processing, the sizes are uniformly adjusted to 416 x 416, and the face images with the sizes smaller than 416 x 416 are supplemented from rows and columns to two sides respectively by pixel gray values 0.
S125, deblurring the blurred face region image with the wavelet-transformation-based SDWNet network to obtain the actual reconstructed face image.
As an example, the face region image blurred in step S124 is deblurred, restoring the face region image. If an adversarial perturbation has been added to the face region image, blurring and then deblurring it alters the perturbation and thereby suppresses the adversarial noise. The embodiment reconstructs the face region image through blurring and deblurring so as to limit or change the perturbation an adversarial attack imposes on the image and to improve detection accuracy.
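As a minimal sketch of the blurring half of this reconstruction step, the following applies a Gaussian blur (radius 5, as suggested above) to a single-channel image by direct convolution; the deblurring half would be handled by the trained SDWNet model and is not reproduced here. The `sigma` value is an assumption.

```python
import numpy as np

def gaussian_kernel(radius=5, sigma=2.0):
    """2-D Gaussian kernel; radius 5 gives an 11 x 11 window."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()  # normalize so brightness is preserved

def gaussian_blur(img, radius=5, sigma=2.0):
    """Blur a single-channel image by direct convolution (edges replicated)."""
    k = gaussian_kernel(radius, sigma)
    pad = np.pad(img, radius, mode="edge")
    out = np.empty_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = (pad[i:i + 2*radius + 1, j:j + 2*radius + 1] * k).sum()
    return out
```

The low-pass effect is what disrupts high-frequency adversarial noise; the subsequent SDWNet deblurring restores legitimate texture, including faint forgery traces.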
Illustratively, the SDWNet network model training process comprises:
setting the batch size of the training set to 8, and adapting the learning rate from an initial 4 × 10⁻⁴ with a cosine annealing schedule;
taking model parameters pretrained on the GoPro dataset as the initial parameters of the SDWNet model, fine-tuning them, and training for a number of epochs until the SDWNet model is optimal.
As an example, around 100 epochs may be trained. Training the SDWNet model to deblur the face region image helps reconstruct and recover high-frequency texture details, so faint forgery traces in the face region image are well preserved; the SDWNet model is simple in structure, easy to train, and converges quickly. Setting the batch size to 8 helps the SDWNet model converge quickly.
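The cosine-annealing schedule mentioned above can be written as a closed-form function of the training step. The starting value 4 × 10⁻⁴ comes from the text; the floor `lr_min` and the total step count are assumptions for illustration.

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max=4e-4, lr_min=0.0):
    """Cosine-annealed learning rate: starts at lr_max, decays smoothly to lr_min."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / total_steps))
```

The schedule keeps the rate near `lr_max` early on and flattens out near `lr_min` at the end, which suits fine-tuning from GoPro-pretrained weights.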
S130, inputting the actual reconstructed face image into the pre-trained adversarial face discrimination network and predicting whether the face in the actual image is real or forged; the adversarial face discrimination network is trained on a clean face dataset and a set of adversarial face images with added perturbation noise.
It is easy to understand that the prediction output is "real" when the reconstructed face image shows a real face and "fake" when it shows a forged face. Training the adversarial face discrimination network with both a clean face dataset and a perturbed adversarial face image set enhances its robustness.
Illustratively, the adversarial face discrimination network is trained through the following steps:
S131, acquiring real and forged face images without added interference, performing face detection on them to obtain a clean face region image dataset, and reconstructing part of the face region images to obtain a reconstructed face image dataset; the two together form the clean face dataset.
As an example, the face detection and reconstruction of step S120 may be used to derive the reconstructed face image dataset from the clean face region image dataset; reconstructing the clean face region images expands the dataset and increases the generalization of the adversarial face discrimination network.
S132, selecting some target clean face images from the clean face dataset and inputting them into the preset universal perturbation noise generation network to generate the adversarial face images.
As an example, the universal perturbation noise generation network is an additional network alongside the adversarial face discrimination network, intended to add perturbation noise to the target clean face images and generate the adversarial face images while the discrimination network is being trained.
S133, inputting the clean face images and the adversarial face images into the adversarial face discrimination network to be trained, obtaining the trained network.
It can be understood that the real/fake labels of the target clean face images and of the corresponding adversarial face images serve as the supervision targets of the adversarial face discrimination network, which is trained to obtain the trained network.
Illustratively, the adversarial face discrimination network is an EfficientNet network model, adversarially trained with the AdvProp training method, and inputting the clean face images and the adversarial face images into the network to be trained comprises:
inputting the clean face images and the adversarial face images into the EfficientNet model respectively for forward propagation to compute losses, the model being provided with a first loss function and a second loss function;
inputting a clean face image into the first loss function L_c(θ, x_c, y) to compute its loss value, where θ denotes the parameters of the EfficientNet model, x_c a clean face image, and y the real/fake label;
inputting an adversarial face image into the second loss function L_a(θ, x_a, y) to compute its loss value, where x_a denotes an adversarial face image;
computing gradients of the clean and adversarial losses and back-propagating them with a min-max optimization procedure, updating the parameters of the EfficientNet model until the final loss function converges, the final loss function being:
min_θ E_{(x_c, y)~D} [ L(θ, x_c, y) + max_{ε∈S} L(θ, x_c + ε, y) ]

where x_c is a clean face image, y the real/fake label, ε the added perturbation noise, x_c + ε = x_a the adversarial face image, S the allowed perturbation range of the noise, θ the parameters of the EfficientNet network model, D the data distribution, and L(·,·,·) the discrepancy between the training sample's prediction and its label.
The EfficientNet-based adversarial face discrimination network is trained with the AdvProp method: the adversarial face images produced by the added universal perturbation noise generation network train the EfficientNet model and improve its robustness, and the EfficientNet model itself is simple in structure, easy to train, lightweight, widely applicable, and suitable for mobile terminals. Specifically, in AdvProp training the loss of a clean face image is computed with the main BN and the loss of an adversarial face image with the auxiliary BN; the auxiliary BN is removed when the trained discrimination network is used to identify an actual image, which is not elaborated here.
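The min-max objective above can be illustrated on a toy linear model, where the inner maximization over an L∞-bounded perturbation is solved exactly by one signed-gradient (FGSM-style) step. This is a hand-rolled NumPy sketch under those assumptions, not the patent's EfficientNet/AdvProp training code; `eps` plays the role of the perturbation bound S.

```python
import numpy as np

def bce(logit, y):
    """Binary cross-entropy for one logit, in a numerically stable form."""
    return np.logaddexp(0.0, logit) - y * logit

def combined_loss(theta, x_c, y, eps=0.1):
    """Clean loss plus the inner-max adversarial loss of the min-max objective.
    For a linear model the L-inf worst case is exactly x_c + eps*sign(grad_x)."""
    logit_c = theta @ x_c
    loss_clean = bce(logit_c, y)              # L(theta, x_c, y)
    p = 1.0 / (1.0 + np.exp(-logit_c))
    grad_x = (p - y) * theta                  # gradient of the loss w.r.t. the input
    x_a = x_c + eps * np.sign(grad_x)         # x_a = x_c + epsilon
    loss_adv = bce(theta @ x_a, y)            # max over S of L(theta, x_c + eps, y)
    return loss_clean + loss_adv, loss_clean, loss_adv
```

Minimizing the returned sum over `theta` is the outer `min` of the objective; by construction the adversarial term is never smaller than the clean term.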
Illustratively, the universal perturbation noise generation network comprises one convolutional stage and three deconvolution layers. The convolutional stage uses a DenseNet network to convolve the input clean face image into a feature map of size 7 × 7 × 256; after Batch Normalization (BN) and a ReLU activation function, the resulting feature map serves as the input of the first deconvolution layer.
Each deconvolution layer has 5 × 5 convolution kernels and is followed by Batch Normalization (BN) and a corresponding activation function, whose output feature map serves as the input of the next deconvolution layer.
Specifically, as shown in table 1, the convolutional layer uses a DenseNet network to perform downsampling on the input clean face image. The DenseNet network parameters are fewer and the feature connectivity is higher.
The first Conv2DTranspose deconvolution layer comprises 128 convolution kernels of 5 x 5, the output of the convolution layers serves as the input of the first Conv2DTranspose deconvolution layer, an up-sampling deconvolution operation is carried out, characteristics are amplified, a convolution step s is set to be 1, the value of padding is set to be same, a characteristic diagram is obtained after deconvolution processing, the characteristic diagram is processed through Batch Normalization (BN), and a ReLU activation function is carried out to obtain a characteristic mapping diagram which serves as the input of the second Conv2DTranspose deconvolution layer.
The second Conv2DTranspose deconvolution layer comprises 64 convolution kernels of 5 x 5, the output of the first Conv2DTranspose deconvolution layer is used as the input of the second Conv2DTranspose deconvolution layer, the up-sampling deconvolution operation is carried out, the characteristic is amplified, the convolution step length s is set to be 1, the padding value is set to be same, the characteristic map is obtained after the deconvolution processing, the characteristic map is obtained after the Batch Normalization BN processing and the ReLU activation function are carried out, and the characteristic map is used as the input of the third Conv2DTranspose deconvolution layer.
The third Conv2DTranspose deconvolution layer comprises 1 convolution kernel of size 5 × 5. The output of the second Conv2DTranspose deconvolution layer serves as its input, and an upsampling deconvolution operation is performed to enlarge the features, with the convolution stride s set to 2 and the padding value set to same. The feature map obtained after deconvolution is processed by Batch Normalization (BN) and a Tanh activation function; Dropout processing is applied to each deconvolution layer, and finally a face image is output.
TABLE 1 disturbance noise Generation network
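The upsampling behavior of the three deconvolution layers described above can be sketched with a minimal single-channel transposed convolution in NumPy. This is an illustrative stand-in for Keras-style Conv2DTranspose layers, not the embodiment's network; the kernel values and the 7 × 7 input are assumptions.

```python
import numpy as np

def conv2d_transpose(x, kernel, stride):
    """Minimal single-channel transposed convolution with 'same' padding:
    insert (stride - 1) zeros between input pixels, then correlate with
    the kernel, so the output size is input_size * stride."""
    h, w = x.shape
    kh, kw = kernel.shape
    # zero-stuff the input onto a grid `stride` times larger
    up = np.zeros((h * stride, w * stride))
    up[::stride, ::stride] = x
    # 'same'-padded correlation over the zero-stuffed grid
    ph, pw = kh // 2, kw // 2
    padded = np.pad(up, ((ph, ph), (pw, pw)))
    out = np.zeros_like(up)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

x = np.random.rand(7, 7)        # toy 7x7 feature map
k = np.ones((5, 5)) / 25.0      # illustrative 5x5 kernel
y1 = conv2d_transpose(x, k, 1)  # stride 1: spatial size preserved (7x7)
y2 = conv2d_transpose(x, k, 2)  # stride 2: spatial size doubled (14x14)
print(y1.shape, y2.shape)
```

With padding set to same, stride 1 preserves the spatial size while stride 2 doubles it, which is how the final deconvolution layer enlarges the features before the Tanh output.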
In each iteration of the universal disturbance noise generation network, a small batch of clean face images is sampled from the clean face data set of step S132 and input, as target clean face images, to the universal disturbance noise generation network to generate confrontation face images; the number of iterations can be set according to actual conditions.
As an example, the DenseNet network here is designed for this universal disturbance noise generation. Each layer of the universal disturbance noise generation network adopts BN as the normalization method. The disturbance noise generation network adds noise interference to the face image to generate a confrontation face image, and the confrontation face discrimination network, i.e., the EfficientNet network model, is trained with the added confrontation face images to enhance its robustness. While the added universal disturbance noise could be generated by a general method such as FGD, the universal disturbance noise added in the embodiments of the present disclosure is better suited to unknown attacks. Each upsampling layer applies Dropout, which prevents overfitting.
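Training the discrimination network on clean and perturbed batches together, as described above, follows the standard min-max adversarial-training recipe. The following toy NumPy sketch uses logistic regression, with an FGSM-style sign perturbation standing in for the universal disturbance noise generation network; the data, step sizes, and epsilon are assumptions, not the embodiment's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

# toy binary data: the label is decided by the sign of the first feature
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(float)

w = np.zeros(5)        # model parameters (theta in the text)
eps, lr = 0.3, 0.5     # allowed disturbance range S, learning rate

for _ in range(300):
    # inner max: FGSM-style perturbation inside the epsilon-ball
    grad_x = (sigmoid(X @ w) - y)[:, None] * w[None, :]
    X_adv = X + eps * np.sign(grad_x)   # "confrontation" batch x_c + eps
    # outer min: descend on clean loss L_c plus adversarial loss L_a
    for Xb in (X, X_adv):
        grad_w = Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
        w -= lr * grad_w

clean_acc = np.mean((sigmoid(X @ w) > 0.5) == y)
print(round(clean_acc, 2))
```

The two inner updates mirror the first and second loss functions of the embodiment: one gradient step on the clean batch and one on the perturbed batch per iteration.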
The neural networks adopted in each step of the embodiments of the present disclosure involve few parameters and a small amount of computation, and are therefore suitable for devices with limited storage space, such as mobile terminals. The embodiments of the present disclosure provide a face forgery recognition method oriented to unknown counterattacks. When facing an unknown attack, the face image is transformed and reconstructed to change the distribution pattern of the adversarial disturbance while retaining the image features to the greatest extent, so as to avoid destroying forgery traces; the result is then input to the confrontation face discrimination network for recognition. The confrontation face discrimination network is trained with confrontation face image samples generated by the additional universal disturbance noise generation network together with clean face image samples, which enhances the accuracy of face forgery recognition. Performing counterattack-resistant face forgery recognition at both the data level and the network level compensates for the shortcomings of any single method across multiple dimensions.
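The point that transforming and reconstructing the image changes the distribution pattern of the adversarial disturbance can be illustrated numerically: blurring is linear, so it spreads a pixel-wise sign perturbation over its neighbors and shrinks its energy. The following is a minimal NumPy sketch using Gaussian blur only; the embodiment may use any of several blur types followed by SDWNet deblurring, and the image, kernel size, and noise amplitude here are assumptions.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 2-D Gaussian kernel."""
    ax = np.arange(size) - size // 2
    g = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def blur(img, kernel):
    """'Same'-padded correlation, standing in for the blur step of the
    reconstruction stage."""
    p = kernel.shape[0] // 2
    padded = np.pad(img, p, mode="edge")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(
                padded[i:i + kernel.shape[0], j:j + kernel.shape[1]] * kernel
            )
    return out

rng = np.random.default_rng(1)
face = rng.uniform(size=(32, 32))                        # toy "face" image
noise = 0.05 * rng.choice([-1.0, 1.0], size=face.shape)  # sign-type disturbance
blurred = blur(face + noise, gaussian_kernel())
# since blur is linear, the surviving disturbance equals blur(noise):
# its per-pixel sign pattern is destroyed and its energy is reduced
residual = blurred - blur(face, gaussian_kernel())
print(residual.std() < noise.std())
```

The residual disturbance after blurring has a markedly smaller standard deviation than the injected sign noise, which is the "changed distribution pattern" the deblurring network can then work with.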
One aspect of the disclosed embodiments provides a face forgery recognition apparatus facing an unknown countermeasure attack. As shown in fig. 2, the apparatus 100 includes:
an obtaining module 110, configured to obtain an actual image including a human face;
a reconstruction module 120, configured to perform face detection processing on the actual image to obtain an actual face region image, and perform reconstruction processing on the actual face region image to obtain an actual face reconstruction image;
and the recognition module 130 is configured to input the actual face reconstruction image to a pre-trained confrontation face discrimination network, and predict the face authenticity of the actual image.
Specifically, an actual image to be recognized that contains a human face is acquired by existing image acquisition equipment. The reconstruction module 120 performs face detection processing on the actual image using the method of step S120 and reconstructs the detected actual face region image to obtain an actual face reconstruction image. The actual face reconstruction image is input to the recognition module 130, which uses the confrontation face discrimination network trained in step S130 for recognition and outputs the recognition result of the face authenticity of the actual image.
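The wiring of the three modules above can be sketched as plain composition. The detector, reconstructor, and discriminator below are illustrative stubs, not the MobileNet, SDWNet, or EfficientNet models of the embodiment; all names are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class FaceForgeryRecognizer:
    """Mirrors the apparatus: the reconstruction module detects and rebuilds
    the face region, and the recognition module returns a real/fake verdict."""
    detect_face: Callable[[Any], Any]    # actual image -> face region image
    reconstruct: Callable[[Any], Any]    # face region -> reconstructed image
    discriminate: Callable[[Any], bool]  # reconstructed image -> True if real

    def predict(self, actual_image):
        region = self.detect_face(actual_image)
        rebuilt = self.reconstruct(region)
        return self.discriminate(rebuilt)

# stand-in stubs: dict lookup for detection, identity reconstruction, and a
# trivial rule in place of the trained confrontation face discrimination network
recognizer = FaceForgeryRecognizer(
    detect_face=lambda img: img["face"],
    reconstruct=lambda region: region,
    discriminate=lambda face: face == "real",
)
print(recognizer.predict({"face": "real"}))
```

Swapping each stub for the corresponding trained model recovers the acquisition, reconstruction, and recognition flow of apparatus 100.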
The face forgery recognition apparatus for unknown counterattacks reconstructs the detected actual face region image, changing the distribution pattern of the adversarial disturbance while preserving face forgery traces, and trains the confrontation face discrimination network with confrontation face images to improve its robustness. Each module of the apparatus of the embodiments of the present disclosure involves a small amount of computation, occupies little space, and runs fast; the apparatus is suitable for various scenarios and has the characteristics of a wide application range and low cost.
Illustratively, the apparatus further comprises a training module 140 for:
acquiring real face images and forged face images without added interference, performing face detection processing on the real face images and the forged face images to obtain a clean face region image data set, and performing reconstruction processing on part of the face region images to obtain a face reconstruction image data set; wherein the clean face region image data set and the face reconstruction image data set together form the clean face data set;
selecting a part of target clean face images from the clean face data set, inputting the target clean face images to a preset universal disturbance noise generation network, and generating the confrontation face images;
and inputting the clean face image and the confrontation face image into the confrontation face judgment network to be trained for training to obtain the trained confrontation face judgment network.
Specifically, disturbance noise is added to the input target clean face image by the additional universal disturbance noise generation network in the training module to generate a confrontation face image, and the confrontation face discrimination network is trained together with the clean face images, which improves its robustness; this multi-dimensional improvement enhances the recognition effect.
One aspect of the embodiments of the present disclosure provides an electronic device, comprising: one or more processors; and
a storage unit for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described above.
Where the memory and processor are connected by a bus, the bus may comprise any number of interconnected buses and bridges, the buses connecting together one or more of the various circuits of the processor and the memory. The bus may also connect various other circuits such as peripherals, voltage regulators, power management circuits, etc., which are well known in the art, and therefore, will not be described in detail herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor is transmitted over a wireless medium through an antenna, which further receives the data and transmits the data to the processor.
The processor is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And the memory may be used to store data used by the processor in performing operations.
By implementing the face forgery recognition method for unknown counterattacks, the electronic device of this embodiment performs face detection processing and reconstruction on the acquired actual image containing a face; it has high recognition accuracy and can better adapt to various recognition conditions.
An aspect of the embodiments of the present disclosure provides a computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the method described above.
The computer readable medium may be included in the system and the electronic device of the present disclosure, or may exist separately.
The computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, with more specific examples including, but not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, an optical fiber, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof.
A computer readable storage medium may also include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave, including without limitation, electromagnetic signals, optical signals, or any suitable combination thereof.
According to the embodiment of the disclosure, the confrontation face image generated by the additional network is used as the training sample to train the confrontation face judgment network, so that the robustness of the confrontation face judgment network is improved, the recognition accuracy rate is high, and the confrontation face judgment network can better adapt to various recognition conditions.
It is to be understood that the above embodiments are merely exemplary embodiments that are employed to illustrate the principles of the disclosed embodiments, which are not limiting. It will be apparent to those skilled in the art that various changes and modifications can be made in the embodiments of the present disclosure without departing from the spirit and scope of the embodiments of the disclosure, and such changes and modifications are to be considered as within the scope of the embodiments of the disclosure.

Claims (10)

1. A face forgery identification method facing unknown counterattack is characterized in that the method comprises the following steps:
acquiring an actual image containing a human face;
carrying out face detection processing on the actual image to obtain an actual face area image, and carrying out reconstruction processing on the actual face area image to obtain an actual face reconstruction image;
inputting the actual face reconstruction image into a pre-trained confrontation face discrimination network, and predicting the face authenticity of the actual image; the confrontation face discrimination network is obtained by training a clean face data set and a confrontation face image set added with disturbance noise.
2. The method of claim 1, wherein the confrontation face discrimination network is trained by the following steps:
acquiring a real face image without interference and a forged face image, performing face detection processing on the real face image and the forged face image to obtain a clean face region image data set, and performing reconstruction processing on partial face region images to obtain a face reconstruction image data set; wherein the clean face region image dataset and the face reconstruction image dataset together form the clean face dataset;
selecting a part of target clean face images from the clean face data set, inputting the target clean face images to a preset universal disturbance noise generation network, and generating the confrontation face images;
and inputting the clean face image and the confrontation face image into the confrontation face judgment network to be trained for training to obtain the trained confrontation face judgment network.
3. The method of claim 2, wherein the confrontation face discrimination network is an EfficientNet network model, and the inputting the clean face image and the confrontation face image into the confrontation face discrimination network to be trained for training comprises:
respectively inputting the clean face image and the confrontation face image into the EfficientNet network model for forward propagation to calculate loss, wherein the EfficientNet network model is provided with a first loss function and a second loss function;
inputting the clean face image into a first loss function L_c(θ, x_c, y) to calculate its loss value, wherein θ is a parameter of the EfficientNet network model, x_c is the clean face image, and y is a true/false label;
inputting the confrontation face image into a second loss function L_a(θ, x_a, y) to calculate its loss value, wherein x_a is the confrontation face image;
calculating gradient back-propagation of the losses of the clean face image and the confrontation face image by using a min-max optimization algorithm, and updating the parameters of the EfficientNet network model until a final loss function converges, so as to obtain the trained EfficientNet network model, wherein the final loss function is:

min_θ E_{(x_c, y) ∼ D} [ L_c(θ, x_c, y) + max_{ϵ ∈ S} L_a(θ, x_c + ϵ, y) ]

wherein x_c is the clean face image, y is the true/false label, ϵ denotes the added disturbance noise, x_c + ϵ is the confrontation face image, i.e., x_c + ϵ = x_a; S denotes the allowable disturbance range of the disturbance noise; θ is a parameter of the EfficientNet network model; E_{(x_c, y) ∼ D} denotes the expectation over the data distribution D; and the loss terms L_c and L_a represent the difference between the training sample and its label.
4. The method of claim 2, wherein the universal disturbance noise generation network comprises: one convolutional layer and three deconvolution layers, wherein the convolutional layer adopts a DenseNet network and performs convolution processing on the input clean face image to obtain a feature map, the output feature map having a size of 7 × 256; Batch Normalization (BN) processing and a ReLU activation function are applied to obtain a feature mapping map, which serves as the input of the next deconvolution layer;
the convolution kernel size of each deconvolution layer is 5 × 5, and each deconvolution layer is processed by Batch Normalization (BN) and then a corresponding activation function to obtain a feature mapping map, which serves as the input of the next deconvolution layer.
5. The method according to any one of claims 1 to 4, wherein the performing the face detection processing on the actual image to obtain an actual face region image comprises:
adjusting the actual image to be a preset size;
inputting the actual image with the adjusted size into a MobileNet neural network model for feature extraction, and outputting a feature map;
and performing anchor-frame generation on the feature map by adopting a preset anchor frame mechanism, screening the anchor frames by using a Soft-NMS algorithm to obtain a face detection frame, and cropping the face detection frame to obtain the actual face region image.
6. The method according to claim 5, wherein the reconstructing the actual face region image to obtain an actual face reconstructed image comprises:
blurring the face region image by adopting at least one of Gaussian blur, box blur, Kawase blur, dual blur, bokeh blur and directional blur;
and deblurring the blurred face region image by adopting a wavelet-transform-based SDWNet network with linear dilation to obtain the actual face reconstruction image.
7. A face forgery recognition apparatus for unknown attack countermeasure, the apparatus comprising:
the acquisition module is used for acquiring an actual image containing a human face;
the reconstruction module is used for carrying out face detection processing on the actual image to obtain an actual face area image and carrying out reconstruction processing on the actual face area image to obtain an actual face reconstruction image;
and the recognition module is used for inputting the actual face reconstruction image into a pre-trained face confronting judgment network and predicting the face authenticity of the actual image.
8. The apparatus of claim 7, further comprising a training module to:
acquiring real face images and forged face images without added interference, performing face detection processing on the real face images and the forged face images to obtain a clean face region image data set, and performing reconstruction processing on part of the face region images to obtain a face reconstruction image data set; wherein the clean face region image data set and the face reconstruction image data set together form the clean face data set;
selecting a part of target clean face images from the clean face data set, inputting the target clean face images to a preset universal disturbance noise generation network, and generating the confrontation face images;
inputting the clean face image and the confrontation face image into the confrontation face discrimination network to be trained for training to obtain the trained confrontation face discrimination network.
9. An electronic device, comprising: one or more processors;
a storage unit to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1 to 6.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, is able to carry out a method according to any one of claims 1 to 6.
CN202310063019.XA 2023-01-18 2023-01-18 Unknown-countermeasure-attack-oriented face counterfeiting identification method and device Pending CN115984979A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310063019.XA CN115984979A (en) 2023-01-18 2023-01-18 Unknown-countermeasure-attack-oriented face counterfeiting identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310063019.XA CN115984979A (en) 2023-01-18 2023-01-18 Unknown-countermeasure-attack-oriented face counterfeiting identification method and device

Publications (1)

Publication Number Publication Date
CN115984979A true CN115984979A (en) 2023-04-18

Family

ID=85970253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310063019.XA Pending CN115984979A (en) 2023-01-18 2023-01-18 Unknown-countermeasure-attack-oriented face counterfeiting identification method and device

Country Status (1)

Country Link
CN (1) CN115984979A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452923A (en) * 2023-06-16 2023-07-18 安徽大学 Cooperative defense strategy and system for attack resistance
CN116452923B (en) * 2023-06-16 2023-09-01 安徽大学 Cooperative defense strategy and system for attack resistance
CN117218707A (en) * 2023-10-07 2023-12-12 南京信息工程大学 Deep face detection method based on positive disturbance
CN117218707B (en) * 2023-10-07 2024-04-16 南京信息工程大学 Deep face detection method based on positive disturbance

Similar Documents

Publication Publication Date Title
CN113554089B (en) Image classification countermeasure sample defense method and system and data processing terminal
CN113313657B (en) Unsupervised learning method and system for low-illumination image enhancement
CN115984979A (en) Unknown-countermeasure-attack-oriented face counterfeiting identification method and device
CN111028163A (en) Convolution neural network-based combined image denoising and weak light enhancement method
CN112164011B (en) Motion image deblurring method based on self-adaptive residual error and recursive cross attention
CN112446357B (en) SAR automatic target recognition method based on capsule network
CN112991345B (en) Image authenticity detection method and device, computer equipment and storage medium
CN113450290B (en) Low-illumination image enhancement method and system based on image inpainting technology
CN112446835B (en) Image restoration method, image restoration network training method, device and storage medium
CN111915486B (en) Confrontation sample defense method based on image super-resolution reconstruction
CN111814744A (en) Face detection method and device, electronic equipment and computer storage medium
CN116051408B (en) Image depth denoising method based on residual error self-coding
CN114757832A (en) Face super-resolution method and device based on cross convolution attention antagonistic learning
CN116596792B (en) Inland river foggy scene recovery method, system and equipment for intelligent ship
Zheng et al. T-net: Deep stacked scale-iteration network for image dehazing
CN113221388B (en) Method for generating confrontation sample of black box depth model constrained by visual perception disturbance
CN117392036A (en) Low-light image enhancement method based on illumination amplitude
CN116452469B (en) Image defogging processing method and device based on deep learning
Jiang et al. Haze relevant feature attention network for single image dehazing
Kumar et al. Underwater image enhancement using deep learning
CN115358910A (en) Digital watermark attack method and system based on convolutional neural network denoising algorithm
Kehar et al. Efficient Single Image Dehazing Model Using Metaheuristics‐Based Brightness Channel Prior
CN114913607A (en) Finger vein counterfeit detection method based on multi-feature fusion
CN114626042B (en) Face verification attack method and device
CN113569897B (en) Anti-sample defense method for obtaining low-frequency information based on fixed pixel points

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination