CN114663936A - Face counterfeiting detection method based on uncertainty perception level supervision - Google Patents
Face counterfeiting detection method based on uncertainty perception level supervision
- Publication number
- CN114663936A (application CN202210167833.1A)
- Authority
- CN
- China
- Prior art keywords
- network
- face image
- image sample
- level
- supervision
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The method, apparatus and storage medium for detecting face forgery based on uncertainty-aware hierarchical supervision acquire a plurality of different face image samples; extract a feature map of each of the face image samples through a convolutional neural network and an upsampling layer; input the feature map of each face image sample into a multi-level supervision network to obtain a prediction result from each level of supervision network; apply a preset loss function to obtain a calculation result corresponding to each level of supervision network; train the network according to a network gradient descent algorithm and the calculation result corresponding to each level of supervision network to obtain a target multi-level supervision network; and predict a face image to be detected with the target multi-level supervision network, judging whether the face image to be detected is a forged face according to the prediction result. The method provided by the application improves the robustness and generalization of the network and the accuracy of the detection result.
Description
Technical Field
The present application relates to the field of computer vision and machine learning technologies, and in particular to a method and an apparatus for detecting face forgery based on uncertainty-aware hierarchical supervision, and a storage medium.
Background
With the continuous development of deep generative models, more and more face editing methods allow a user to arbitrarily edit facial attributes or even directly change the identity in a face image, and these methods can produce images so realistic that even the human eye cannot reliably distinguish them. Meanwhile, photos and even videos of other people can be easily obtained through the highly developed internet and social networks, providing abundant material for face editing methods; the resulting visual misinformation and the abuse of face editing technology may cause serious trust crises, for example when a face recognition system is attacked with forged images. Therefore, an effective face forgery detection method is required to determine whether a given face image has been edited.
In the related art, face forgery detection is performed by combining various kinds of additional information and prior knowledge, such as texture cues, with a convolutional neural network model. However, these methods mainly focus on the whole image or on a local region in the image, and the mask labels of the face data carry data uncertainty, so the robustness and generalization are insufficient and the accuracy of the detection result is reduced.
Disclosure of Invention
The application provides a face forgery detection method and apparatus based on uncertainty-aware hierarchical supervision, and a storage medium, which at least solve the technical problems of insufficient robustness and generalization and low accuracy of detection results in the related art.
An embodiment of a first aspect of the present application provides a face forgery detection method based on uncertainty-aware hierarchical supervision, including:
acquiring a plurality of different face image samples;
extracting a feature map of each face image sample in the plurality of different face image samples through a convolutional neural network and an upsampling layer;
inputting the feature map of each face image sample into each level of supervision network in a multi-level supervision network for prediction to obtain a prediction result of each level of supervision network, wherein the multi-level supervision network comprises a plurality of supervision networks of different levels;
calculating the prediction result of each layer of the supervision network by using a preset loss function to obtain a calculation result corresponding to each layer of the supervision network;
training each level of supervision network in the multi-level supervision network according to a network gradient descent algorithm and a calculation result corresponding to each level of supervision network to obtain a target multi-level supervision network;
and predicting the face image to be detected by using the target multi-level supervision network, and judging whether the face image to be detected is a forged face according to the prediction result.
An embodiment of a second aspect of the present application provides a face forgery detection apparatus based on uncertainty-aware hierarchical supervision, including:
the acquisition module is used for acquiring a plurality of different face image samples;
the extraction module is used for extracting a feature map of each face image sample in the plurality of different face image samples through a convolutional neural network and an upsampling layer;
the prediction module is used for inputting the feature map of each face image sample into each level of supervision network in a multi-level supervision network for prediction, to obtain the prediction result of each level of supervision network, wherein the multi-level supervision network comprises a plurality of supervision networks of different levels;
the calculation module is used for calculating the prediction result of each layer of the supervision network by using a preset loss function to obtain a calculation result corresponding to each layer of the supervision network;
the training module is used for training each level of supervision network in the multi-level supervision network according to a network gradient descent algorithm and the calculation result corresponding to each level of supervision network, to obtain a target multi-level supervision network;
and the judging module is used for predicting the face image to be detected by using the target multi-level supervision network and judging whether the face image to be detected is a forged face or not according to the prediction result.
A non-transitory computer-readable storage medium as set forth in an embodiment of the third aspect of the present application, wherein the non-transitory computer-readable storage medium stores a computer program; which when executed by a processor implements the method as shown in the first aspect above.
A computer device according to an embodiment of a fourth aspect of the present application includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method according to the first aspect is implemented.
The technical scheme provided by the embodiment of the application at least has the following beneficial effects:
the application provides a face forgery detection method, a device and a storage medium based on uncertainty perception level supervision, which obtains a plurality of different face image samples, extracts a characteristic diagram of each face image sample in the plurality of different face image samples through a convolution neural network and an upper sampling layer, inputs the characteristic diagram of each face image sample into each level supervision network in a multi-level supervision network for forecasting to obtain a forecasting result of each level supervision network, wherein the multi-level supervision network comprises a plurality of different levels of supervision networks, calculates the forecasting result of each level supervision network by using a preset loss function to obtain a calculating result corresponding to each level supervision network, trains each level supervision network in the multi-level supervision network according to a network gradient descent algorithm and the calculating result corresponding to each level supervision network to obtain a target multi-level supervision network, and predicting the face image to be detected by using a target multi-level surveillance network, and judging whether the face image to be detected is a forged face or not according to a prediction result. The method and the device have the advantages that the whole network structure is assisted through the binary mask label of the face image, the robustness and the generalization of the network are improved through a hierarchical supervision method, meanwhile, the uncertainty of data naturally carried by the mask label is processed through an uncertainty estimation method, the characteristics of the image are effectively extracted through a self-attention transformation network, and therefore the accuracy of a detection result is improved.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a face forgery detection method based on uncertainty-aware hierarchical supervision according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a face forgery detection apparatus based on uncertainty-aware hierarchical supervision according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
The following describes a face forgery detection method and apparatus based on uncertainty-aware hierarchical supervision according to an embodiment of the present application with reference to the drawings.
Example one
Fig. 1 is a schematic flowchart of a face forgery detection method based on uncertainty-aware hierarchical supervision according to an embodiment of the present application; as shown in fig. 1, the method may include the following steps.
Step 101, acquiring a plurality of different face image samples. In this embodiment, the number of different image samples obtained each time may be the same, for example, 120 different image samples are obtained each time.
Moreover, in the embodiment of the present invention, the face image samples may be obtained in matrix form; for example, one face image sample is I ∈ R^{H×W×3}.
Step 102, extracting a feature map of each face image sample in the plurality of different face image samples through a convolutional neural network and an upsampling layer.
In an embodiment of the present invention, a method for extracting a feature map of each face image sample in a plurality of different face image samples through a convolutional neural network and an upsampling layer may include the following steps:
Step a, extracting an initial feature map of each face image sample in the plurality of different face image samples through a convolutional network.
In the embodiment of the present invention, the initial feature map extracted from one face image sample is denoted F_0.
Step b, increasing the resolution of the initial feature map of each face image sample by adopting a plurality of consecutive upsampling convolution blocks to obtain the feature map of each face image sample.
In the embodiment of the invention, the size of the initial feature map F_0 of the face image sample is smaller than the size of the mask label; accordingly, a plurality of consecutive upsampling convolution blocks are adopted to increase the resolution of the initial feature map of each face image sample, so as to obtain the feature map of each face image sample. For example, in the embodiment of the present invention, three consecutive upsampling convolution blocks may be used, yielding the feature map F ∈ R^{h×w×c} of each face image sample.
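As a concrete illustration of this two-stage extraction, the sketch below builds a stand-in backbone (a few strided convolutions in place of the unspecified convolutional neural network) followed by three consecutive upsampling convolution blocks; all layer counts and channel sizes are illustrative assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

class UpsampleBlock(nn.Module):
    """Doubles the spatial resolution: bilinear upsampling followed by a convolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class FeatureExtractor(nn.Module):
    """Backbone producing the initial feature map F0, then three consecutive
    upsampling convolution blocks that raise its resolution to give F."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for any convolutional backbone
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.upsample = nn.Sequential(
            UpsampleBlock(256, 128),
            UpsampleBlock(128, 64),
            UpsampleBlock(64, 64),
        )
    def forward(self, x):            # x: (B, 3, H, W)
        f0 = self.backbone(x)        # initial feature map F0 at H/16 x W/16
        return self.upsample(f0)     # feature map F at H/2 x W/2

feat = FeatureExtractor()(torch.randn(2, 3, 256, 256))
print(feat.shape)                    # torch.Size([2, 64, 128, 128])
```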
Step 103, inputting the feature map of each face image sample into each level of supervision network in the multi-level supervision network for prediction, to obtain the prediction result of each level of supervision network.
In the embodiment of the invention, the multi-level supervision network comprises a plurality of supervision networks of different levels. Specifically, in the embodiment of the present invention, the multi-level supervision network may include a pixel-level supervision network, a region-level supervision network, and an image-level supervision network.
And the method for inputting the feature map of each face image sample into each level of supervision network in the multi-level supervision network to predict and obtain the prediction result of each level of supervision network can comprise the following steps:
step 1031, inputting the feature map of each face image sample into a pixel-level supervision network for prediction to obtain a pixel-level prediction result of each face image sample.
In the embodiment of the invention, the step of inputting the feature map of each face image sample into the pixel-level supervision network to predict to obtain the pixel-level prediction result of each face image sample comprises the step of predicting to obtain the pixel-level prediction result of each face image sample through a plurality of convolutional neural networks.
And, in an embodiment of the present invention, the pixel-level prediction result of each face image sample obtained through step 1031 includes a prediction result of each pixel in each face image sample.
Step 1032, inputting the feature map of each face image sample into a region-level supervision network for prediction to obtain a region-level prediction result of each face image sample.
In the embodiment of the present invention, the method for inputting the feature map of each face image sample into the region-level supervision network to predict and obtain the region-level prediction result of each face image sample may include the following steps:
Step one, obtaining a normalized variance map corresponding to the feature map of each face image sample by using a Sigmoid function.
In the embodiment of the invention, the normalized variance map S(σ) corresponding to the feature map of each face image sample is obtained by applying a Sigmoid function.
Step two, obtaining an uncertainty-aware feature map of each face image sample based on the normalized variance map of each face image sample.
In the embodiment of the invention, the uncertainty-aware feature map of each face image sample is obtained through the following formula:

F_u = [F | F ⊙ (1 − S(σ))]
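This formula can be sketched directly, assuming F is a feature map of shape (B, C, h, w) and σ is a per-pixel variance map of shape (B, 1, h, w) predicted elsewhere in the network; reading [· | ·] as channel-wise concatenation is an assumption.

```python
import torch

def uncertainty_aware_features(feat, sigma):
    """F_u = [F | F ⊙ (1 - S(σ))]: concatenate F with a copy of F attenuated
    where the normalized variance map S(σ) is high (i.e. where the mask is uncertain)."""
    s = torch.sigmoid(sigma)                       # normalized variance map S(σ) in (0, 1)
    attenuated = feat * (1.0 - s)                  # suppress features at uncertain pixels
    return torch.cat([feat, attenuated], dim=1)    # channel-wise concatenation

feat = torch.randn(2, 64, 128, 128)                # feature map F
sigma = torch.randn(2, 1, 128, 128)                # per-pixel variance predicted by the network
print(uncertainty_aware_features(feat, sigma).shape)   # torch.Size([2, 128, 128, 128])
```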
Step three, mapping the uncertainty-aware feature map of each face image sample through a linear mapping layer to obtain the implicit vector representation of each face image sample.
In the embodiment of the present invention, before mapping through the linear mapping layer, the uncertainty-aware feature map F_u of each face image sample needs to be serialized.
Specifically, in the embodiment of the invention, the uncertainty-aware feature map F_u of each face image sample is serialized as follows: F_u is cut into M patches, each patch being a square of the same size, and each 2D patch is flattened to obtain the vector F_t of each face image sample; then, a learnable linear mapping layer maps the vector F_t of each face image sample to the implicit vector representation E_0.
Step four, obtaining the region-level representation of each face image sample by using the implicit vector representation of each face image sample.
In the embodiment of the invention, the implicit vector representation of each face image sample and the learnable position encoding of each face image sample are added directly to obtain the complete region-level representation z_0 of each face image sample, so as to preserve the positional information between the regions of each face image sample.
Specifically, in the embodiment of the invention, the complete region-level representation z_0 of each face image sample is:

z_0 = F_t E_0 + E_pos,

where E_pos is the learnable position encoding of each face image sample.
Step five, inputting the region-level representation and the class token of each face image sample into the self-attention transformer network layers to obtain the region-level prediction result of each face image sample.
In an embodiment of the present invention, the self-attention transformer network layers may include L layers, each consisting of an MSA (multi-head self-attention) module and an MLP (multi-layer perceptron).
And, in an embodiment of the invention, the output of the l-th self-attention transformer layer is:

z'_l = MSA(LN(z_{l−1})) + z_{l−1}
z_l = MLP(LN(z'_l)) + z'_l

where LN(·) denotes the layer normalization operation, z'_l is the output variable of the corresponding multi-head self-attention module, and z_{l−1} and z_l respectively denote the input and output encoded representations of the l-th self-attention transformer layer.
Further, in embodiments of the present invention, after L self-attention transformer layers, the encoded sequence output [T_cls, T_1, ..., T_M] is obtained, where T_cls denotes the class token and T_1, ..., T_M are the region-level prediction results of the face image sample.
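A minimal sketch of the region-level branch under the descriptions in steps three to five: F_u is cut into square patches, each patch is flattened and linearly embedded, a class token is prepended, the learnable position encoding E_pos is added, and L self-attention layers are applied. PyTorch's nn.TransformerEncoderLayer stands in for the MSA + MLP blocks, and every size below is an illustrative assumption.

```python
import torch
import torch.nn as nn

class RegionLevelBranch(nn.Module):
    def __init__(self, in_ch=128, patch=16, feat_hw=128, dim=256, depth=4, heads=8):
        super().__init__()
        self.patch = patch
        num_patches = (feat_hw // patch) ** 2                # M square patches
        self.embed = nn.Linear(in_ch * patch * patch, dim)   # learnable linear mapping layer
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))  # E_pos
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=4 * dim,
                                           batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)  # L layers of MSA + MLP
        self.region_head = nn.Linear(dim, 1)                  # per-region prediction

    def forward(self, fu):                                    # fu: (B, C, h, w)
        B, C, H, W = fu.shape
        p = self.patch
        # Serialization: cut F_u into M square patches and flatten each one.
        patches = fu.unfold(2, p, p).unfold(3, p, p)          # (B, C, H/p, W/p, p, p)
        patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(B, -1, C * p * p)
        tokens = self.embed(patches)                          # implicit vector representation E_0
        cls = self.cls_token.expand(B, -1, -1)
        z0 = torch.cat([cls, tokens], dim=1) + self.pos_embed # region-level representation z_0
        z = self.encoder(z0)                                  # [T_cls, T_1, ..., T_M]
        return z[:, 0], self.region_head(z[:, 1:]).squeeze(-1)  # class token, region predictions

cls_tok, region_pred = RegionLevelBranch()(torch.randn(2, 128, 128, 128))
print(cls_tok.shape, region_pred.shape)   # torch.Size([2, 256]) torch.Size([2, 64])
```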
Step 1033, inputting the feature map of each face image sample into an image-level supervision network for prediction to obtain an image-level prediction result of each face image sample.
In the embodiment of the present invention, the image-level prediction result of each facial image sample includes a category to which each facial image sample belongs. For example, in the embodiment of the present invention, the image-level prediction result may use 0 to indicate that the category to which the input face image belongs is a fake face, and may use 1 to indicate that the category to which the input face image belongs is a real face.
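A minimal image-level head consistent with this convention (output near 1 for a real face, near 0 for a forged one); the global-pooling-plus-linear design is an assumption, since the patent only specifies the output convention.

```python
import torch
import torch.nn as nn

class ImageLevelHead(nn.Module):
    """Predicts the image-level class: 1 for a real face, 0 for a forged one."""
    def __init__(self, in_ch=64):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # global average pooling over the feature map
        self.fc = nn.Linear(in_ch, 1)

    def forward(self, feat):                  # feat: (B, C, h, w)
        x = self.pool(feat).flatten(1)        # (B, C)
        return torch.sigmoid(self.fc(x))      # probability that the face is real

prob_real = ImageLevelHead()(torch.randn(2, 64, 128, 128))
print(prob_real.shape)                        # torch.Size([2, 1])
```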
Step 104, calculating the prediction result of each level of supervision network by using a preset loss function to obtain a calculation result corresponding to each level of supervision network.
It should be noted that, in the embodiment of the present invention, a mask label carrying data uncertainty is used as an auxiliary supervision signal, and an uncertainty estimation method is used to handle the data uncertainty naturally carried by the mask label.
In an embodiment of the present invention, the uncertainty estimation method may include:
the method comprises the following steps: to model the data uncertainty characterizing learning under hierarchical aiding signal supervision, the characterization can be a probability distribution z-p (z | x).
Specifically, in the embodiment of the present invention, the hierarchical representation z of each sample x follows a multivariate Gaussian distribution:

p(z | x) = N(z; μ, Σ)

where μ denotes the mean of the Gaussian distribution and Σ denotes the diagonal covariance of the multivariate Gaussian distribution. Given a sample x, two convolutional neural networks are used to predict these parameters: the mean μ is the most likely prediction mask, and the diagonal covariance Σ represents the uncertainty of the data.
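A sketch of the two convolutional heads predicting the per-pixel Gaussian parameters, assuming they operate on the shared feature map; predicting the log-variance rather than Σ itself is an implementation convenience, not something stated in the patent.

```python
import torch
import torch.nn as nn

class GaussianMaskHead(nn.Module):
    """Two small CNNs predict the per-pixel mean μ (the most likely prediction mask)
    and the diagonal (log-)variance modelling the data uncertainty."""
    def __init__(self, in_ch=64):
        super().__init__()
        self.mu_head = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, 1, 1),
        )
        self.logvar_head = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, 1, 1),
        )

    def forward(self, feat):
        mu = self.mu_head(feat)          # prediction mask μ, shape (B, 1, h, w)
        logvar = self.logvar_head(feat)  # log σ², the per-pixel data uncertainty
        return mu, logvar

mu, logvar = GaussianMaskHead()(torch.randn(2, 64, 128, 128))
print(mu.shape, logvar.shape)            # torch.Size([2, 1, 128, 128]) twice
```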
Method two: the data uncertainty is estimated using a cross-entropy loss function.
Specifically, in the embodiment of the present invention, the cross-entropy loss function is:

L_ce = −(1/N) Σ_{i=1}^{N} [ y_i log S(z_i) + (1 − y_i) log(1 − S(z_i)) ]

where y is the binary mask label corresponding to the face image sample, z_i denotes the hierarchical representation corresponding to each pixel (the hierarchical representation of each pixel follows the multivariate Gaussian distribution above), S(·) is the Sigmoid function, and N is the number of pixels of the face image sample. The mean μ can be used as the prediction mask of the face image, and Σ can be regarded as the uncertainty of the data, which is introduced to improve the accuracy of the detection result.
Method three: the data uncertainty is estimated using a constraint loss function.
Specifically, in the embodiment of the present invention, the constraint loss function is:

L_kl = KL( N(z | μ, σ) ‖ N(ε | 0, I) )

where σ represents the variance of the Gaussian distribution.
Further, in an embodiment of the present invention, the sampling operation required to estimate the probability distribution in method one is not differentiable, so the deep learning model cannot back-propagate gradients through it to minimize the objective loss function. Moreover, the cross-entropy loss function in method two may cause the multi-level supervision network to always predict a very small Σ in order to minimize the objective. Based on this, a KL divergence term is adopted to constrain the distribution N(z | μ, Σ) to be close to N(ε | 0, I); in the embodiment of the present invention, the cross-entropy loss function and the constraint loss function can therefore be combined into a composite loss function used as the preset loss function of the multi-level supervision network, so that the network can better estimate the uncertainty. Illustratively, in the embodiment of the present invention, the composite loss function is obtained by adding the cross-entropy loss function and the constraint loss function.
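A sketch of the composite loss under these assumptions: the per-pixel representation is drawn from the predicted Gaussian with the standard reparameterization z = μ + σ·ε, the cross-entropy term is a per-pixel binary cross-entropy against the mask label, and the constraint term is the closed-form KL divergence to N(0, I); the relative weighting of the two terms is not specified in the patent, so it is left as a parameter.

```python
import torch
import torch.nn.functional as F

def composite_loss(mu, logvar, mask, kl_weight=1.0):
    """Cross-entropy on a sampled representation plus the KL constraint that keeps
    N(z | mu, sigma^2) close to N(0, I)."""
    std = torch.exp(0.5 * logvar)
    z = mu + std * torch.randn_like(std)                # reparameterized sample of z
    ce = F.binary_cross_entropy_with_logits(z, mask)    # per-pixel cross-entropy vs. mask label y
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL(N(mu, sigma^2) || N(0, I))
    return ce + kl_weight * kl

mask = (torch.rand(2, 1, 128, 128) > 0.5).float()       # binary mask label y
loss = composite_loss(torch.randn(2, 1, 128, 128), torch.randn(2, 1, 128, 128), mask)
print(loss.item())
```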
Step 105, training each level of supervision network in the multi-level supervision network according to the network gradient descent algorithm and the calculation result corresponding to each level of supervision network, to obtain the target multi-level supervision network.
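An illustrative training step under heavy simplification: a toy model with only a pixel-level mask head and an image-level classification head (the region-level branch and the uncertainty terms are omitted for brevity) is optimized with stochastic gradient descent on the sum of the per-level losses; the model, data, and optimizer settings below are placeholders, not the patent's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyForgeryNet(nn.Module):
    """Minimal stand-in: one shared conv trunk, a pixel-level mask head, an image-level head."""
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True))
        self.mask_head = nn.Conv2d(16, 1, 1)
        self.cls_head = nn.Linear(16, 1)

    def forward(self, x):
        f = self.trunk(x)
        return self.mask_head(f), self.cls_head(f.mean(dim=(2, 3)))

model = TinyForgeryNet()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

images = torch.randn(4, 3, 64, 64)                        # toy batch of face images
masks = (torch.rand(4, 1, 64, 64) > 0.5).float()          # binary mask labels
labels = torch.randint(0, 2, (4, 1)).float()              # 1 = real face, 0 = forged face

for step in range(3):                                     # a few gradient-descent steps
    mask_pred, cls_pred = model(images)
    loss = (F.binary_cross_entropy_with_logits(mask_pred, masks)     # pixel-level term
            + F.binary_cross_entropy_with_logits(cls_pred, labels))  # image-level term
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(float(loss))
```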
Step 106, predicting the face image to be detected by using the target multi-level supervision network, and judging whether the face image to be detected is a forged face according to the prediction result.
The application provides a face forgery detection method based on uncertainty-aware hierarchical supervision. A plurality of different face image samples are acquired; a feature map of each face image sample is extracted through a convolutional neural network and an upsampling layer; the feature map of each face image sample is input into each level of supervision network in the multi-level supervision network, which comprises a plurality of supervision networks of different levels, for prediction, yielding the prediction result of each level; a preset loss function is applied to the prediction result of each level of supervision network to obtain the corresponding calculation result; each level of supervision network is trained according to a network gradient descent algorithm and the corresponding calculation result to obtain a target multi-level supervision network; and the face image to be detected is predicted with the target multi-level supervision network and judged to be a forged face or not according to the prediction result. The binary mask label of the face image assists the whole network structure; the hierarchical supervision method improves the robustness and generalization of the network; the uncertainty estimation method handles the data uncertainty naturally carried by the mask label; and the self-attention transformer network effectively extracts image features, thereby improving the accuracy of the detection result.
Example two
Further, fig. 2 is a schematic structural diagram of a face forgery detection apparatus based on uncertainty-aware hierarchical supervision according to an embodiment of the present application; as shown in fig. 2, the face forgery detection apparatus may include:
an obtaining module 201, configured to obtain multiple different face image samples;
an extraction module 202, configured to extract a feature map of each face image sample in multiple different face image samples through a convolutional neural network and an upsampling layer;
the prediction module 203 is configured to input the feature map of each face image sample into each level of supervision network in the multi-level supervision network for prediction, to obtain the prediction result of each level of supervision network, where the multi-level supervision network includes supervision networks of multiple different levels;
the calculation module 204 is configured to calculate the prediction result of each layer of the supervision network by using a preset loss function to obtain a calculation result corresponding to each layer of the supervision network;
the training module 205 is configured to train each level of supervision network in the multi-level supervision network according to a network gradient descent algorithm and the calculation result corresponding to each level of supervision network, to obtain the target multi-level supervision network;
and the judging module 206 is configured to predict the face image to be detected by using the target multi-level supervision network, and judge whether the face image to be detected is a forged face according to the prediction result.
To implement the above embodiments, the present disclosure also proposes a non-transitory computer-readable storage medium.
The non-transitory computer-readable storage medium provided by the embodiment of the present disclosure stores a computer program; when executed by a processor, the computer program can implement the face forgery detection method based on uncertainty-aware hierarchical supervision as shown in fig. 1.
In order to implement the above embodiments, the present disclosure also provides a computer device.
The computer device provided by the embodiment of the disclosure comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor; the processor, when executing the program, is capable of implementing the method as shown in fig. 1.
The application provides a face forgery detection method, apparatus and storage medium based on uncertainty-aware hierarchical supervision. A plurality of different face image samples are acquired; a feature map of each face image sample is extracted through a convolutional neural network and an upsampling layer; the feature map of each face image sample is input into each level of supervision network in a multi-level supervision network, which comprises a plurality of supervision networks of different levels, for prediction, yielding the prediction result of each level; a preset loss function is applied to the prediction result of each level of supervision network to obtain the corresponding calculation result; each level of supervision network is trained according to a network gradient descent algorithm and the corresponding calculation result to obtain a target multi-level supervision network; and the face image to be detected is predicted with the target multi-level supervision network and judged to be a forged face or not according to the prediction result. The binary mask label of the face image assists the whole network structure; the hierarchical supervision method improves the robustness and generalization of the network; the uncertainty estimation method handles the data uncertainty naturally carried by the mask label; and the self-attention transformer network effectively extracts image features, thereby improving the accuracy of the detection result.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are exemplary and should not be construed as limiting the present application and that changes, modifications, substitutions and alterations in the above embodiments may be made by those of ordinary skill in the art within the scope of the present application.
Claims (10)
1. A face forgery detection method based on uncertainty-aware hierarchical supervision, characterized by comprising the following steps:
acquiring a plurality of different face image samples;
extracting a feature map of each face image sample in the plurality of different face image samples through a convolutional neural network and an upsampling layer;
inputting the feature map of each face image sample into each level of supervision network in a multi-level supervision network for prediction, to obtain a prediction result of each level of supervision network, wherein the multi-level supervision network comprises a plurality of supervision networks of different levels;
calculating the prediction result of each level of supervision network by using a preset loss function, to obtain a calculation result corresponding to each level of supervision network;
training each level of supervision network in the multi-level supervision network according to a network gradient descent algorithm and the calculation result corresponding to each level of supervision network, to obtain a target multi-level supervision network;
and predicting a face image to be detected by using the target multi-level supervision network, and judging whether the face image to be detected is a forged face according to the prediction result.
2. The method of claim 1, wherein said extracting a feature map for each of said plurality of different face image samples via a convolutional neural network and an upsampling layer comprises: extracting an initial feature map of each face image sample in the plurality of different face image samples through a convolutional network;
and increasing the resolution of the initial feature map of each face image sample by adopting a plurality of continuous up-sampling convolution blocks to obtain the feature map of each face image sample.
3. The method of claim 1, wherein the multi-level supervision network comprises a pixel-level supervision network, a region-level supervision network, and an image-level supervision network,
and inputting the feature map of each face image sample into each level of supervision network in the multi-level supervision network for prediction to obtain the prediction result of each level of supervision network comprises the following steps:
inputting the feature map of each face image sample into a pixel-level supervision network for prediction to obtain a pixel-level prediction result of each face image sample;
inputting the feature map of each face image sample into a region-level supervision network for prediction to obtain a region-level prediction result of each face image sample;
and inputting the feature map of each face image sample into an image-level supervision network for prediction to obtain an image-level prediction result of each face image sample.
4. The method of claim 3, wherein inputting the feature map of each face image sample into a pixel-level supervision network to predict the pixel-level prediction result of each face image sample comprises predicting the pixel-level prediction result of each face image sample through a plurality of convolutional neural networks.
5. The method as claimed in claim 3, wherein inputting the feature map of each face image sample into a region-level supervision network for prediction to obtain the region-level prediction result of each face image sample comprises:
obtaining a normalized variance map corresponding to the feature map of each face image sample by using a Sigmoid function;
obtaining an uncertainty-aware feature map of each face image sample based on the normalized variance map of each face image sample;
mapping the uncertainty-aware feature map of each face image sample through a linear mapping layer to obtain an implicit vector representation of each face image sample;
obtaining the region-level representation of each face image sample by using the implicit vector representation of each face image sample;
and inputting the region-level representation and the class token of each face image sample into a self-attention transformer network layer to obtain a region-level prediction result of each face image sample.
6. A method as claimed in claim 3 wherein the image-level prediction result for each face image sample comprises the category to which said each face image sample belongs.
7. The method of claim 1, wherein the preset loss function comprises a composite loss function comprising a cross-entropy loss function and a constraint loss function,
the cross entropy loss function is:
wherein y is a binary mask label corresponding to the face image sample, and z isiAnd (3) performing layered characterization corresponding to each pixel point, wherein the layered characterization of each pixel point obeys multivariate Gaussian distribution, sigma represents the diagonal covariance of the multivariate Gaussian distribution, and N is the number of the pixel points of the face image sample.
The constraint loss function is:

L_kl = KL( N(z | μ, σ) ‖ N(ε | 0, I) )

where μ and σ represent the mean and variance of the Gaussian distribution, respectively.
8. A face forgery detection apparatus based on uncertainty-aware hierarchical supervision, characterized by comprising the following modules:
the acquisition module is used for acquiring a plurality of different face image samples;
the extraction module is used for extracting a feature map of each face image sample in the plurality of different face image samples through a convolutional neural network and an upsampling layer;
the prediction module is used for inputting the feature map of each face image sample into each level of supervision network in a multi-level supervision network for prediction, to obtain the prediction result of each level of supervision network, wherein the multi-level supervision network comprises a plurality of supervision networks of different levels;
the calculation module is used for calculating the prediction result of each level of supervision network by using a preset loss function to obtain a calculation result corresponding to each level of supervision network;
the training module is used for training each level of supervision network in the multi-level supervision network according to a network gradient descent algorithm and the calculation result corresponding to each level of supervision network, to obtain a target multi-level supervision network;
and the judging module is used for predicting the face image to be detected by using the target multi-level supervision network and judging whether the face image to be detected is a forged face or not according to the prediction result.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to any one of claims 1-7 when executing the program.
10. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---
CN202210167833.1A | 2022-02-23 | 2022-02-23 | Face counterfeiting detection method based on uncertainty perception level supervision
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---
CN202210167833.1A | 2022-02-23 | 2022-02-23 | Face counterfeiting detection method based on uncertainty perception level supervision
Publications (1)
Publication Number | Publication Date |
---|---
CN114663936A | 2022-06-24
Family
ID=82026730
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---
CN202210167833.1A | Face counterfeiting detection method based on uncertainty perception level supervision | 2022-02-23 | 2022-02-23
Country Status (1)
Country | Link |
---|---
CN (1) | CN114663936A
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019128646A1 (en) * | 2017-12-28 | 2019-07-04 | 深圳励飞科技有限公司 | Face detection method, method and device for training parameters of convolutional neural network, and medium |
CN112329696A (en) * | 2020-11-18 | 2021-02-05 | 携程计算机技术(上海)有限公司 | Face living body detection method, system, equipment and storage medium |
CN112784781A (en) * | 2021-01-28 | 2021-05-11 | 清华大学 | Method and device for detecting forged faces based on difference perception meta-learning |
CN112906676A (en) * | 2021-05-06 | 2021-06-04 | 北京远鉴信息技术有限公司 | Face image source identification method and device, storage medium and electronic equipment |
CN113723295A (en) * | 2021-08-31 | 2021-11-30 | 浙江大学 | Face counterfeiting detection method based on image domain frequency domain double-flow network |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117253262A (en) * | 2023-11-15 | 2023-12-19 | 南京信息工程大学 | Fake fingerprint detection method and device based on commonality feature learning |
CN117253262B (en) * | 2023-11-15 | 2024-01-30 | 南京信息工程大学 | Fake fingerprint detection method and device based on commonality feature learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | 
 | SE01 | Entry into force of request for substantive examination | 