CN114663936A - Face counterfeiting detection method based on uncertainty perception level supervision - Google Patents

Face counterfeiting detection method based on uncertainty perception level supervision

Info

Publication number
CN114663936A
CN114663936A (application CN202210167833.1A)
Authority
CN
China
Prior art keywords
network
face image
image sample
level
supervision
Prior art date
Legal status
Pending
Application number
CN202210167833.1A
Other languages
Chinese (zh)
Inventor
鲁继文
周杰
于炳耀
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202210167833.1A priority Critical patent/CN114663936A/en
Publication of CN114663936A publication Critical patent/CN114663936A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The face forgery detection method, device and storage medium based on uncertainty perception level supervision acquire a plurality of different face image samples and extract a feature map of each face image sample through a convolutional neural network and an upsampling layer. The feature map of each face image sample is input into a multi-level supervision network to obtain a prediction result of each level of supervision network, a calculation result corresponding to each level of supervision network is obtained with a preset loss function, and a target multi-level supervision network is obtained by training according to a network gradient descent algorithm and the calculation result corresponding to each level of supervision network. The target multi-level supervision network then predicts a face image to be detected, and whether the face image to be detected is a forged face is judged according to the prediction result. The method provided by the application improves the robustness and generalization of the network and improves the accuracy of the detection result.

Description

Face counterfeiting detection method based on uncertainty perception level supervision
Technical Field
The present application relates to the field of computer vision and machine learning technologies, and in particular, to a method and an apparatus for detecting face forgery based on uncertainty perception level supervision, and a storage medium.
Background
With the continuous development of deep generative models, more and more face editing methods allow a user to arbitrarily edit face attributes and even directly change the identity in a face image. These methods can produce images so realistic that even the human eye cannot reliably distinguish them. Meanwhile, photos and even videos of other people can be easily obtained through highly developed internet technologies and social networks, providing abundant material for face editing methods, so that visual misinformation and abuse of face editing technology may cause a serious trust crisis, for example when a face recognition system is attacked with forged images. Therefore, an effective face forgery detection method is required to detect whether a given face image has been edited.
In the related art, face forgery detection is performed by combining different kinds of additional information, prior knowledge, and convolutional neural network models, for example in texture-based detection methods. However, the related art mainly focuses on the whole image or on local regions within the image, and the mask labels of face data carry data uncertainty, so that robustness and generalization are insufficient and the accuracy of the detection results is reduced.
Disclosure of Invention
The application provides a face forgery detection method and device based on uncertainty perception level supervision and a storage medium, which at least solve the technical problems in the related art of insufficient robustness and generalization and low accuracy of detection results.
An embodiment of a first aspect of the present application provides a face forgery detection method based on uncertainty perception level supervision, including:
acquiring a plurality of different face image samples;
extracting a feature map of each face image sample in the plurality of different face image samples through a convolutional neural network and an upsampling layer;
inputting the feature map of each face image sample into each level of supervision network in a multi-level supervision network for prediction to obtain a prediction result of each level of supervision network, wherein the multi-level supervision network comprises a plurality of supervision networks of different levels;
calculating the prediction result of each level of supervision network by using a preset loss function to obtain a calculation result corresponding to each level of supervision network;
training each level of supervision network in the multi-level supervision network according to a network gradient descent algorithm and a calculation result corresponding to each level of supervision network to obtain a target multi-level supervision network;
and predicting the face image to be detected by using the target multi-level supervision network, and judging whether the face image to be detected is a forged face according to a prediction result.
The embodiment of the second aspect of the present application provides a face forgery detection apparatus based on uncertainty perception level supervision, including:
the acquisition module is used for acquiring a plurality of different face image samples;
the extraction module is used for extracting a feature map of each face image sample in the plurality of different face image samples through a convolutional neural network and an upsampling layer;
the prediction module is used for inputting the feature map of each face image sample into each hierarchy supervision network in a multilevel supervision network to predict to obtain the prediction result of each hierarchy supervision network, wherein the multilevel supervision network comprises a plurality of different levels of supervision networks;
the calculation module is used for calculating the prediction result of each level of supervision network by using a preset loss function to obtain a calculation result corresponding to each level of supervision network;
the training module is used for training each hierarchy supervision network in the multilevel supervision networks according to a network gradient descent algorithm and a calculation result corresponding to each hierarchy supervision network to obtain a target multilevel supervision network;
and the judging module is used for predicting the face image to be detected by utilizing the target multi-level supervision network and judging whether the face image to be detected is a forged face or not according to a prediction result.
An embodiment of the third aspect of the present application provides a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the method according to the first aspect above.
A computer device according to an embodiment of a fourth aspect of the present application includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method according to the first aspect is implemented.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
the application provides a face forgery detection method, a device and a storage medium based on uncertainty perception level supervision, which obtains a plurality of different face image samples, extracts a characteristic diagram of each face image sample in the plurality of different face image samples through a convolution neural network and an upper sampling layer, inputs the characteristic diagram of each face image sample into each level supervision network in a multi-level supervision network for forecasting to obtain a forecasting result of each level supervision network, wherein the multi-level supervision network comprises a plurality of different levels of supervision networks, calculates the forecasting result of each level supervision network by using a preset loss function to obtain a calculating result corresponding to each level supervision network, trains each level supervision network in the multi-level supervision network according to a network gradient descent algorithm and the calculating result corresponding to each level supervision network to obtain a target multi-level supervision network, and predicting the face image to be detected by using a target multi-level surveillance network, and judging whether the face image to be detected is a forged face or not according to a prediction result. The method and the device have the advantages that the whole network structure is assisted through the binary mask label of the face image, the robustness and the generalization of the network are improved through a hierarchical supervision method, meanwhile, the uncertainty of data naturally carried by the mask label is processed through an uncertainty estimation method, the characteristics of the image are effectively extracted through a self-attention transformation network, and therefore the accuracy of a detection result is improved.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a face forgery detection method based on uncertainty perception hierarchical supervision according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a face forgery detection apparatus based on uncertainty perception hierarchical supervision according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
The following describes a face forgery detection method and apparatus based on uncertainty perception hierarchical supervision according to an embodiment of the present application with reference to the drawings.
Example one
Fig. 1 is a schematic flowchart of a face forgery detection method based on uncertainty perception hierarchical supervision according to an embodiment of the present application, and as shown in fig. 1, the method may include:
step 101, obtaining a plurality of different face image samples.
In this embodiment, the number of face image samples obtained each time may be the same; for example, 120 different face image samples may be obtained each time.
And, in the embodiment of the present invention, the face image samples may be obtained in the form of a matrix; for example, one obtained face image sample is I ∈ R^(H×W×3).
And 102, extracting a feature map of each face image sample in a plurality of different face image samples through a convolutional neural network and an upsampling layer.
In an embodiment of the present invention, a method for extracting a feature map of each face image sample in a plurality of different face image samples through a convolutional neural network and an upsampling layer may include the following steps:
step a, extracting an initial feature map of each face image sample in a plurality of different face image samples through a convolutional network.
In the embodiment of the present invention, the initial feature map extracted from one face image sample is denoted F0.
And b, increasing the resolution of the initial feature map of each face image sample by adopting a plurality of continuous up-sampling convolution blocks to obtain the feature map of each face image sample.
Wherein, in the embodiment of the invention, the size of the initial feature map F0 of the face image sample is smaller than the size of the mask label. Based on this, a plurality of consecutive upsampling convolution blocks are adopted to increase the resolution of the initial feature map of each face image sample, so as to obtain the feature map of each face image sample. For example, in the embodiment of the present invention, the resolution of the initial feature map of each face image sample may be increased by using three consecutive upsampling convolution blocks, so as to obtain the feature map F ∈ R^(h×w×c) of each face image sample.
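The feature extraction of step 102 can be illustrated with a short PyTorch sketch. This is a minimal illustration under assumptions: the backbone layers, channel widths and strides below are chosen for the example only, since the embodiment merely requires a convolutional network followed by several (e.g. three) consecutive upsampling convolution blocks.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, stride=1):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class FeatureExtractor(nn.Module):
    """Convolutional backbone + three consecutive upsampling convolution blocks (step 102)."""
    def __init__(self, out_ch=128):
        super().__init__()
        # Backbone: produces the initial feature map F0 at 1/16 of the input resolution.
        self.backbone = nn.Sequential(
            conv_block(3, 32, stride=2),
            conv_block(32, 64, stride=2),
            conv_block(64, 128, stride=2),
            conv_block(128, 256, stride=2),
        )
        # Three upsampling convolution blocks increase the resolution of F0 by 8x.
        self.up = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            conv_block(256, 256),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            conv_block(256, 128),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            conv_block(128, out_ch),
        )

    def forward(self, img):        # img: B x 3 x H x W, i.e. I in R^(H×W×3) per sample
        f0 = self.backbone(img)    # initial feature map F0
        return self.up(f0)         # feature map F in R^(h×w×c)

feat = FeatureExtractor()(torch.randn(2, 3, 256, 256))   # -> torch.Size([2, 128, 128, 128])
```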
And 103, inputting the feature map of each face image sample into each level of supervision network in the multi-level supervision network for prediction to obtain the prediction result of each level of supervision network.
In the embodiment of the invention, the multi-level supervision network comprises a plurality of supervision networks of different levels. Specifically, in the embodiment of the present invention, the multi-level supervision network may include a pixel-level supervision network, a region-level supervision network, and an image-level supervision network.
And the method for inputting the feature map of each face image sample into each level of supervision network in the multi-level supervision network to predict and obtain the prediction result of each level of supervision network can comprise the following steps:
step 1031, inputting the feature map of each face image sample into a pixel-level supervision network for prediction to obtain a pixel-level prediction result of each face image sample.
In the embodiment of the invention, inputting the feature map of each face image sample into the pixel-level supervision network to obtain the pixel-level prediction result of each face image sample comprises predicting the pixel-level prediction result of each face image sample through a plurality of convolutional neural networks.
And, in an embodiment of the present invention, the pixel-level prediction result of each face image sample obtained through step 1031 includes a prediction result of each pixel in each face image sample.
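As an illustration of the pixel-level branch, a small stack of convolution layers mapping the feature map to a per-pixel forgery probability could look as follows; the number of layers and channel widths are assumptions, since the embodiment only states that a plurality of convolutional neural networks are used.

```python
import torch.nn as nn

# Pixel-level supervision head: maps the feature map F (B x c x h x w) to a
# per-pixel prediction (B x 1 x h x w); the sigmoid output is the forgery
# probability of each pixel. Layer count and widths are illustrative only.
pixel_head = nn.Sequential(
    nn.Conv2d(128, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 32, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(32, 1, kernel_size=1),
    nn.Sigmoid(),
)
```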
And step 1032, inputting the feature map of each face image sample into an area-level supervision network for prediction to obtain an area-level prediction result of each face image sample.
In the embodiment of the present invention, the method for inputting the feature map of each face image sample into the area-level supervision network to predict and obtain the area-level prediction result of each face image sample may include the following steps:
step one, obtaining a normalized variance map corresponding to the feature map of each face image sample by using a Sigmoid function.
In the embodiment of the invention, a normalized variance map S(σ) corresponding to the feature map of each face image sample is obtained by using a Sigmoid function.
And step two, obtaining an uncertainty perception feature map of each face image sample based on the normalized variance map of each face image sample.
In the embodiment of the invention, an uncertainty perception characteristic map of each face image sample is obtained through a first formula, wherein the first formula is as follows:
Fu = [F | F ⊙ (1 - S(σ))]
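The first formula concatenates the original feature map with a copy that is attenuated wherever the predicted uncertainty is high. A minimal sketch, assuming σ is a per-pixel variance map predicted by another branch of the network:

```python
import torch

def uncertainty_aware_features(F, sigma):
    """Fu = [F | F * (1 - S(sigma))]: concatenate F with F weighted by (1 - normalized variance)."""
    S = torch.sigmoid(sigma)                      # normalized variance map S(sigma), B x 1 x h x w
    return torch.cat([F, F * (1.0 - S)], dim=1)   # uncertainty perception feature map, B x 2c x h x w

F = torch.randn(2, 128, 64, 64)
sigma = torch.randn(2, 1, 64, 64)
Fu = uncertainty_aware_features(F, sigma)         # -> torch.Size([2, 256, 64, 64])
```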
and step three, mapping the uncertainty perception feature map of each face image sample through a linear mapping layer to obtain the implicit vector characterization of each face image sample.
In the embodiment of the present invention, before mapping the uncertainty perception feature map of each face image sample through the linear mapping layer, the uncertainty perception feature map Fu of each face image sample needs to be serialized.
Specifically, in the embodiment of the invention, the method for serializing the uncertainty perception feature map Fu of each face image sample comprises cutting Fu into M small blocks, wherein each small block is a square of the same size, and stretching each small block in 2D to obtain the vector Ft of each face image sample; then, a learnable linear mapping layer is used to map the vector Ft of each face image sample to the implicit vector characterization E0.
And step four, obtaining the region-level characterization of each face image sample by using the implicit vector characterization of each face image sample.
In the embodiment of the invention, the implicit vector characterization of each face image sample and the learnable position code of each face image sample are directly added to obtain the complete region-level characterization z0 of each face image sample, so as to maintain the position information between the regions of each face image sample.
Specifically, in the embodiment of the invention, the complete region-level characterization z0 of each face image sample is:
z0 = FtE0 + Epos,
where Epos is the learnable position code of each face image sample.
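Steps three and four can be sketched together: cut Fu into M equal square patches, flatten them, apply a learnable linear mapping, prepend a class token and add a learnable position code. Patch size, embedding dimension and feature-map size below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RegionEmbedding(nn.Module):
    """Serialize Fu into M patches, map them to E0 and form z0 = FtE0 + Epos."""
    def __init__(self, in_ch=256, patch=8, feat_hw=64, dim=256):
        super().__init__()
        self.patch = patch
        num_patches = (feat_hw // patch) ** 2                  # M
        self.proj = nn.Linear(in_ch * patch * patch, dim)      # learnable linear mapping layer
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))        # class token
        self.pos = nn.Parameter(torch.zeros(1, num_patches + 1, dim))   # learnable position code Epos

    def forward(self, Fu):                                     # Fu: B x C x h x w
        B, C, h, w = Fu.shape
        p = self.patch
        x = Fu.unfold(2, p, p).unfold(3, p, p)                 # B x C x h/p x w/p x p x p
        x = x.permute(0, 2, 3, 1, 4, 5).reshape(B, -1, C * p * p)   # B x M x (C*p*p): the vectors Ft
        tokens = self.proj(x)                                  # implicit vector characterization E0
        tokens = torch.cat([self.cls.expand(B, -1, -1), tokens], dim=1)
        return tokens + self.pos                               # complete region-level characterization z0

z0 = RegionEmbedding()(torch.randn(2, 256, 64, 64))            # -> torch.Size([2, 65, 256])
```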
And fifthly, inputting the area level characteristics and the category marks of each face image sample into the self-attention transformation network layer to obtain an area level prediction result of each face image sample.
In an embodiment of the present invention, the self-attention transformation network layer may include an L-layer MSA (Multi-head Self-Attention) module and an MLP (Multi-Layer Perceptron).
And, in an embodiment of the invention, the output of the l-th self-attention transformation network layer is:
z'l = MSA(LN(zl-1)) + zl-1
zl = MLP(LN(z'l)) + z'l
wherein LN(·) represents the layer regularization operation, z'l is the output variable of the corresponding multi-head self-attention module, and zl-1 and zl respectively represent the input and output encoded representations of the different self-attention transformation network layers.
Further, in embodiments of the present invention, after L self-attention transformation network layers, an encoded sequence output [Tcls, T1, ..., TM] may be obtained, where Tcls denotes the class token and T1, ..., TM are the region-level prediction results of each face image sample.
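A minimal sketch of one such self-attention transformation network layer, implementing the two equations above with layer normalization and residual connections; the embedding dimension, number of heads, MLP ratio and depth L are assumptions for illustration.

```python
import torch.nn as nn

class TransformerLayer(nn.Module):
    """z'l = MSA(LN(zl-1)) + zl-1 ;  zl = MLP(LN(z'l)) + z'l."""
    def __init__(self, dim=256, heads=8, mlp_ratio=4):
        super().__init__()
        self.ln1 = nn.LayerNorm(dim)
        self.msa = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ln2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio), nn.GELU(), nn.Linear(dim * mlp_ratio, dim)
        )

    def forward(self, z):
        h = self.ln1(z)
        z = z + self.msa(h, h, h, need_weights=False)[0]   # multi-head self-attention + residual
        return z + self.mlp(self.ln2(z))                   # MLP + residual

# Stacking L layers; splitting the output gives [Tcls, T1, ..., TM].
encoder = nn.Sequential(*[TransformerLayer() for _ in range(6)])   # L = 6 is an assumption
```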
And 1033, inputting the feature map of each face image sample into an image-level supervision network for prediction to obtain an image-level prediction result of each face image sample.
In the embodiment of the present invention, the image-level prediction result of each facial image sample includes a category to which each facial image sample belongs. For example, in the embodiment of the present invention, the image-level prediction result may use 0 to indicate that the category to which the input face image belongs is a fake face, and may use 1 to indicate that the category to which the input face image belongs is a real face.
And 104, calculating the prediction result of each level of supervision network by using a preset loss function to obtain a calculation result corresponding to each level of supervision network.
It should be noted that, in the embodiment of the present invention, a mask label with data uncertainty is used as an auxiliary supervision signal, and an uncertainty estimation method is used to process the data uncertainty naturally carried by the mask label.
In an embodiment of the present invention, the uncertainty estimation method may include:
the method comprises the following steps: to model the data uncertainty characterizing learning under hierarchical aiding signal supervision, the characterization can be a probability distribution z-p (z | x).
Specifically, in the embodiment of the present invention, the hierarchical characterization z of each sample x follows a multivariate gaussian distribution:
p(z | x) = N(z; μ, Σ)
wherein μ represents the mean of the Gaussian distribution and Σ represents the diagonal covariance of the multivariate Gaussian distribution. Given a sample x, two convolutional neural networks are used to predict the above parameters: the mean μ is the most likely prediction mask, and the diagonal covariance Σ is the data uncertainty.
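As an illustration of this step, two small convolutional heads can predict the Gaussian parameters from the shared feature map; their architectures are assumptions, since the embodiment only requires two convolutional neural networks, and predicting the log-variance is a common trick to keep the covariance positive.

```python
import torch.nn as nn

# Head predicting the mean mu (the most likely prediction mask).
mu_head = nn.Sequential(
    nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 1, 1),
)
# Head predicting the diagonal covariance Sigma (data uncertainty), as a log-variance map.
log_var_head = nn.Sequential(
    nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 1, 1),
)
```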
The second method comprises the following steps: the data uncertainty is estimated using a cross entropy loss function.
Specifically, in the embodiment of the present invention, the cross entropy loss function is:
Lce = -(1/N) ∑i [ yi log S(zi) + (1 - yi) log(1 - S(zi)) ],   zi ~ N(μi, Σi), i = 1, ..., N
wherein y is the binary mask label corresponding to the face image sample, zi denotes the hierarchical characterization corresponding to each pixel (the hierarchical characterization of each pixel obeys a multivariate Gaussian distribution), and N is the number of pixels of the face image sample. The mean μ can be used as the prediction mask of the face image, while Σ can be regarded as the data uncertainty, which is introduced so as to improve the accuracy of the detection result.
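A minimal sketch of this pixel-wise loss, assuming the representation is drawn with the reparameterization trick z = μ + σ·ε with ε ~ N(0, I) so that gradients can flow through the sampling step (the issue discussed under the third method below):

```python
import torch
import torch.nn.functional as F

def uncertainty_cross_entropy(mu, log_var, mask):
    """Cross entropy between the binary mask label y and sampled representations zi.

    mu, log_var: B x 1 x h x w Gaussian parameters; mask: binary mask label, same shape, float.
    """
    std = torch.exp(0.5 * log_var)
    z = mu + std * torch.randn_like(std)                     # zi ~ N(mu_i, sigma_i^2), reparameterized
    return F.binary_cross_entropy_with_logits(z, mask)       # averaged over the N pixels
```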
The third method comprises the following steps: the data uncertainty is estimated using a constraint loss function.
Specifically, in the embodiment of the present invention, the constraint loss function is:
Lkl = KL( N(z | μ, σ²) || N(ε | 0, I) )
where σ represents the variance of the gaussian distribution.
Further, in an embodiment of the present invention, the sampling operation required to estimate the probability distribution in the first method above is not differentiable, so the deep learning model cannot back-propagate the gradient to minimize the objective loss function. Moreover, the cross entropy loss function in the second method may cause the multi-level supervision network to always predict a very small Σ in order to minimize the objective function. Based on this, a KL divergence term is adopted to constrain the distribution N(z | μ, Σ) to be close to N(ε | 0, I). In the embodiment of the present invention, the cross entropy loss function and the constraint loss function can therefore be combined into a composite loss function that serves as the preset loss function of the multi-level supervision network, so that the multi-level supervision network can better estimate the uncertainty. Illustratively, in the embodiment of the present invention, the composite loss function is obtained by adding the cross entropy loss function and the constraint loss function.
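A sketch of the composite loss under these assumptions, reusing uncertainty_cross_entropy from the sketch above and taking the usual closed-form KL divergence between N(μ, σ²) and N(0, I); adding the two terms with equal weight simply follows the statement here, and any other weighting would be a further design choice.

```python
import torch

def kl_constraint(mu, log_var):
    """KL( N(z | mu, sigma^2) || N(0, I) ), averaged over the pixels."""
    return 0.5 * torch.mean(mu.pow(2) + log_var.exp() - log_var - 1.0)

def composite_loss(mu, log_var, mask):
    # Preset loss of the multi-level supervision network: cross entropy + KL constraint.
    return uncertainty_cross_entropy(mu, log_var, mask) + kl_constraint(mu, log_var)
```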
And 105, training each hierarchy supervision network in the multilevel supervision network according to the network gradient descent algorithm and the calculation result corresponding to each hierarchy supervision network to obtain the target multilevel supervision network.
And 106, predicting the face image to be detected by using the target multi-level supervision network, and judging whether the face image to be detected is a forged face according to the prediction result.
The application provides a face forgery detection method based on uncertainty perception level supervision, which comprises: obtaining a plurality of different face image samples; extracting a feature map of each face image sample in the plurality of different face image samples through a convolutional neural network and an upsampling layer; inputting the feature map of each face image sample into each level of supervision network in the multi-level supervision network for prediction to obtain the prediction result of each level of supervision network, wherein the multi-level supervision network comprises a plurality of supervision networks of different levels; calculating the prediction result of each level of supervision network with a preset loss function to obtain a calculation result corresponding to each level of supervision network; training each level of supervision network in the multi-level supervision network according to a network gradient descent algorithm and the calculation result corresponding to each level of supervision network to obtain a target multi-level supervision network; predicting the face image to be detected with the target multi-level supervision network; and judging whether the face image to be detected is a forged face according to the prediction result. The whole network structure is assisted by the binary mask label of the face image, the robustness and generalization of the network are improved through hierarchical supervision, the data uncertainty naturally carried by the mask label is handled by an uncertainty estimation method, and the features of the image are effectively extracted through a self-attention transformation network, thereby improving the accuracy of the detection result.
Example two
Further, fig. 2 is a schematic structural diagram of a face forgery detection apparatus based on uncertainty perception hierarchical supervision according to an embodiment of the present application, and as shown in fig. 2, the face forgery detection apparatus may include:
an obtaining module 201, configured to obtain multiple different face image samples;
an extraction module 202, configured to extract a feature map of each face image sample in multiple different face image samples through a convolutional neural network and an upsampling layer;
the prediction module 203 is configured to input the feature map of each face image sample into each level of supervision network in the multi-level supervision network to perform prediction to obtain the prediction result of each level of supervision network, where the multi-level supervision network includes a plurality of supervision networks of different levels;
the calculation module 204 is configured to calculate the prediction result of each level of supervision network by using a preset loss function to obtain a calculation result corresponding to each level of supervision network;
the training module 205 is configured to train each level of supervision network in the multi-level supervision network according to a network gradient descent algorithm and the calculation result corresponding to each level of supervision network to obtain a target multi-level supervision network;
and the judging module 206 is configured to predict the face image to be detected by using the target multi-level supervision network, and judge whether the face image to be detected is a forged face according to the prediction result.
To implement the above embodiments, the present disclosure also proposes a non-transitory computer-readable storage medium.
A non-transitory computer-readable storage medium provided by an embodiment of the present disclosure stores a computer program; when executed by a processor, the computer program can implement the face forgery detection method based on the uncertainty perception hierarchical supervision as shown in fig. 1.
In order to implement the above embodiments, the present disclosure also provides a computer device.
The computer device provided by the embodiment of the disclosure comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor; the processor, when executing the program, is capable of implementing the method as shown in fig. 1.
The application provides a face forgery detection method, device and storage medium based on uncertainty perception level supervision. A plurality of different face image samples are obtained, and a feature map of each face image sample is extracted through a convolutional neural network and an upsampling layer. The feature map of each face image sample is input into each level of supervision network in a multi-level supervision network for prediction to obtain the prediction result of each level of supervision network, wherein the multi-level supervision network comprises a plurality of supervision networks of different levels. The prediction result of each level of supervision network is calculated with a preset loss function to obtain a calculation result corresponding to each level of supervision network, and each level of supervision network in the multi-level supervision network is trained according to a network gradient descent algorithm and the calculation result corresponding to each level of supervision network to obtain a target multi-level supervision network. The target multi-level supervision network is then used to predict the face image to be detected, and whether the face image to be detected is a forged face is judged according to the prediction result. The whole network structure is assisted by the binary mask label of the face image, the robustness and generalization of the network are improved through hierarchical supervision, the data uncertainty naturally carried by the mask label is handled by an uncertainty estimation method, and the features of the image are effectively extracted through a self-attention transformation network, thereby improving the accuracy of the detection result.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are exemplary and should not be construed as limiting the present application and that changes, modifications, substitutions and alterations in the above embodiments may be made by those of ordinary skill in the art within the scope of the present application.

Claims (10)

1. A face forgery detection method based on uncertainty perception level supervision is characterized by comprising the following steps:
acquiring a plurality of different face image samples;
extracting a feature map of each face image sample in the plurality of different face image samples through a convolutional neural network and an upsampling layer;
inputting the feature map of each face image sample into each hierarchy supervision network in a multilevel supervision network for prediction to obtain a prediction result of each hierarchy supervision network, wherein the multilevel supervision network comprises a plurality of different hierarchy supervision networks;
calculating the prediction result of each level of supervision network by using a preset loss function to obtain a calculation result corresponding to each level of supervision network;
training each hierarchy supervision network in the multilevel supervision network according to a network gradient descent algorithm and a calculation result corresponding to each hierarchy supervision network to obtain a target multilevel supervision network;
and predicting the face image to be detected by using the target multi-level supervision network, and judging whether the face image to be detected is a forged face according to a prediction result.
2. The method of claim 1, wherein said extracting a feature map for each of said plurality of different face image samples via a convolutional neural network and an upsampling layer comprises: extracting an initial feature map of each face image sample in the plurality of different face image samples through a convolutional network;
and increasing the resolution of the initial feature map of each face image sample by adopting a plurality of continuous up-sampling convolution blocks to obtain the feature map of each face image sample.
3. The method of claim 1, wherein the multi-level supervisory network comprises a pixel level supervisory network, a region level supervisory network, an image level supervisory network,
inputting the feature map of each face image sample into each hierarchy supervision network in the multilevel supervision network to predict to obtain the prediction result of each hierarchy supervision network, wherein the prediction result comprises the following steps:
inputting the feature map of each face image sample into a pixel level supervision network for prediction to obtain a pixel level prediction result of each face image sample;
inputting the feature map of each face image sample into an area level supervision network for prediction to obtain an area level prediction result of each face image sample;
and inputting the feature map of each face image sample into an image-level supervision network for prediction to obtain an image-level prediction result of each face image sample.
4. The method of claim 3, wherein inputting the feature map of each face image sample into a pixel-level supervision network to predict the pixel-level prediction result of each face image sample comprises predicting the pixel-level prediction result of each face image sample by a plurality of convolutional neural networks.
5. The method as claimed in claim 3, wherein the inputting the feature map of each facial image sample into a region-level supervision network for prediction to obtain the region-level prediction result of each facial image sample comprises:
obtaining a normalized variance map corresponding to the feature map of each face image sample by using a Sigmoid function;
obtaining an uncertainty perception feature map of each face image sample based on the normalized variance map of each face image sample;
mapping the uncertainty perception feature map of each face image sample through a linear mapping layer to obtain an implicit vector characterization of each face image sample;
obtaining the region-level characterization of each facial image sample by using the implicit vector characterization of each facial image sample;
and inputting the region level characteristics and the class marks of each facial image sample into a self-attention transformation network layer to obtain a region level prediction result of each facial image sample.
6. A method as claimed in claim 3 wherein the image-level prediction result for each face image sample comprises the category to which said each face image sample belongs.
7. The method of claim 1, wherein the pre-set loss function comprises a composite loss function comprising a cross-entropy loss function and a constrained loss function,
the cross entropy loss function is:
Lce = -(1/N) ∑i [ yi log S(zi) + (1 - yi) log(1 - S(zi)) ],   zi ~ N(μi, σi), i = 1, ..., N
wherein y is the binary mask label corresponding to the face image sample, zi denotes the hierarchical characterization corresponding to each pixel, the hierarchical characterization of each pixel obeys a multivariate Gaussian distribution, σ represents the diagonal covariance of the multivariate Gaussian distribution, and N is the number of pixels of the face image sample.
The constraint loss function is:
Lkl = KL( N(z | μ, σ²) || N(ε | 0, I) )
where σ represents the variance of the gaussian distribution.
8. A human face forgery detection device based on uncertainty perception level supervision is characterized by comprising the following modules:
the acquisition module is used for acquiring a plurality of different face image samples;
the extraction module is used for extracting a feature map of each face image sample in the plurality of different face image samples through a convolutional neural network and an upsampling layer;
the prediction module is used for inputting the feature map of each face image sample into each hierarchy supervision network in a multilevel supervision network to predict to obtain the prediction result of each hierarchy supervision network, wherein the multilevel supervision network comprises a plurality of different levels of supervision networks;
the calculation module is used for calculating the prediction result of each level of supervision network by using a preset loss function to obtain a calculation result corresponding to each level of supervision network;
the training module is used for training each hierarchy supervision network in the multilevel supervision networks according to a network gradient descent algorithm and a calculation result corresponding to each hierarchy supervision network to obtain a target multilevel supervision network;
and the judging module is used for predicting the face image to be detected by utilizing the target multi-level supervision network and judging whether the face image to be detected is a forged face or not according to a prediction result.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to any one of claims 1-7 when executing the program.
10. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-7.
CN202210167833.1A 2022-02-23 2022-02-23 Face counterfeiting detection method based on uncertainty perception level supervision Pending CN114663936A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210167833.1A CN114663936A (en) 2022-02-23 2022-02-23 Face counterfeiting detection method based on uncertainty perception level supervision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210167833.1A CN114663936A (en) 2022-02-23 2022-02-23 Face counterfeiting detection method based on uncertainty perception level supervision

Publications (1)

Publication Number Publication Date
CN114663936A true CN114663936A (en) 2022-06-24

Family

ID=82026730

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210167833.1A Pending CN114663936A (en) 2022-02-23 2022-02-23 Face counterfeiting detection method based on uncertainty perception level supervision

Country Status (1)

Country Link
CN (1) CN114663936A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019128646A1 (en) * 2017-12-28 2019-07-04 深圳励飞科技有限公司 Face detection method, method and device for training parameters of convolutional neural network, and medium
CN112329696A (en) * 2020-11-18 2021-02-05 携程计算机技术(上海)有限公司 Face living body detection method, system, equipment and storage medium
CN112784781A (en) * 2021-01-28 2021-05-11 清华大学 Method and device for detecting forged faces based on difference perception meta-learning
CN112906676A (en) * 2021-05-06 2021-06-04 北京远鉴信息技术有限公司 Face image source identification method and device, storage medium and electronic equipment
CN113723295A (en) * 2021-08-31 2021-11-30 浙江大学 Face counterfeiting detection method based on image domain frequency domain double-flow network


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117253262A (en) * 2023-11-15 2023-12-19 南京信息工程大学 Fake fingerprint detection method and device based on commonality feature learning
CN117253262B (en) * 2023-11-15 2024-01-30 南京信息工程大学 Fake fingerprint detection method and device based on commonality feature learning

Similar Documents

Publication Publication Date Title
CN107766894B (en) Remote sensing image natural language generation method based on attention mechanism and deep learning
Hoang An Artificial Intelligence Method for Asphalt Pavement Pothole Detection Using Least Squares Support Vector Machine and Neural Network with Steerable Filter‐Based Feature Extraction
CN110084156B (en) Gait feature extraction method and pedestrian identity recognition method based on gait features
CN110287960A (en) The detection recognition method of curve text in natural scene image
CN112560831B (en) Pedestrian attribute identification method based on multi-scale space correction
CN116824307B (en) Image labeling method and device based on SAM model and related medium
CN109840483B (en) Landslide crack detection and identification method and device
CN114841319A (en) Multispectral image change detection method based on multi-scale self-adaptive convolution kernel
CN107103308A (en) A kind of pedestrian's recognition methods again learnt based on depth dimension from coarse to fine
CN113111716A (en) Remote sensing image semi-automatic labeling method and device based on deep learning
CN115661480A (en) Image anomaly detection method based on multi-level feature fusion network
CN111126155B (en) Pedestrian re-identification method for generating countermeasure network based on semantic constraint
CN114549470A (en) Method for acquiring critical region of hand bone based on convolutional neural network and multi-granularity attention
CN112446292A (en) 2D image salient target detection method and system
CN117036715A (en) Deformation region boundary automatic extraction method based on convolutional neural network
CN106203373A (en) A kind of human face in-vivo detection method based on deep vision word bag model
CN113780129B (en) Action recognition method based on unsupervised graph sequence predictive coding and storage medium
Zuo et al. A remote sensing image semantic segmentation method by combining deformable convolution with conditional random fields
CN114663936A (en) Face counterfeiting detection method based on uncertainty perception level supervision
CN118465876A (en) Two-stage approach precipitation prediction method based on EOF-Kmeans clustering and LDM
CN114581789A (en) Hyperspectral image classification method and system
CN117314938B (en) Image segmentation method and device based on multi-scale feature fusion decoding
CN117992919A (en) River flood early warning method based on machine learning and multi-meteorological-mode fusion
CN117829243A (en) Model training method, target detection device, electronic equipment and medium
CN111882545B (en) Fabric defect detection method based on bidirectional information transmission and feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination