CN116206375A - Face counterfeiting detection method based on double-layer twin network and sustainable learning - Google Patents
- Publication number
- CN116206375A; CN202310474306.XA
- Authority
- CN
- China
- Prior art keywords
- face
- network
- supervised
- learning
- unsupervised
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/40—Spoof detection, e.g. liveness detection
- G06V40/45—Detection of the body part being alive
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- Medical Informatics (AREA)
- Human Computer Interaction (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a face counterfeiting detection method based on a double-layer twin network and sustainable learning, comprising the following steps: constructing an image training set for a continuous learning strategy; training the constructed double-layer twin network with the continuous learning strategy, wherein the double-layer twin network comprises a supervised subnetwork suited to fast learning, an unsupervised subnetwork suited to slow learning, and a memory module; the unsupervised subnetwork extracts features through unsupervised learning and guides the supervised subnetwork, and the supervised subnetwork extracts features through supervised learning under that guidance; the memory module consolidates the learned knowledge; and inputting the image to be detected into the trained detection model, which segments the image to detect the specific position of face counterfeiting. The invention improves the accuracy of the face counterfeiting detection model while also predicting the specific forged positions, and the continuous learning strategy improves the generalization performance of the model.
Description
Technical Field
The invention belongs to the field of artificial intelligence security, and particularly relates to a face counterfeiting detection method based on a double-layer twin network and sustainable learning.
Background
Face deep forgery uses generative deep learning algorithms to create or modify the facial features of a face. The human eye can hardly tell real from fake in the resulting videos or images, so how to detect face forgery effectively is a problem that needs to be solved.
Currently, most face counterfeiting detection methods are based on deep learning, and the methods generally use an artificial neural network to extract features of a face image and identify tamper marks so as to detect whether the face image is counterfeit. However, the neural network model of many face fake detection methods based on deep learning generally has only a single network architecture, and such a network architecture model usually tends to extract features in a single dimension when training to extract features, but ignores extracting features from a multi-dimension perspective, so that the accuracy of detection is not high. Therefore, if the network model is decoupled into two sub-networks and the multi-dimensional features are interactively fused, the capability of overall feature extraction of the model can be improved, and the accuracy of model detection can be improved.
Although the detection method based on deep learning can obtain a good detection effect, most methods focus on improving accuracy, but neglect generalization of the detection method. Therefore, when the forgery technology to be detected is different from the forgery technology at the time of training, the accuracy of detection may be greatly reduced, mainly because of insufficient generalization of the detection model.
Although a few new methods for improving detection generalization have been proposed, there is still considerable room for improvement. Continuous learning means that a model learns a sequence of different tasks one after another and improves its performance on new knowledge while forgetting as little as possible of what it has already learned. Optimizing the model with continuous learning can therefore compensate for its lack of generalization, so that it remains accurate when detecting unknown face forgery techniques. On the other hand, many continuous-learning methods apply iterative supervised learning to the data, so the resulting feature representations are prone to overfitting. Supervised learning tends to fit the old features of known tasks, so when a new, unknown task is learned during continuous learning, the previously learned tasks are easily forgotten. In general, a model trained only by supervised learning is prone to catastrophic forgetting when applied to a new, unknown task, and its accuracy drops sharply.
In real life, because the face counterfeiting modes are various, the influence of different face counterfeiting modes on the image is different, and the existing face counterfeiting detection generally only judges whether the face image is counterfeit or not, and rarely judges the specific counterfeiting position of the face counterfeiting image.
In summary, it can be obtained that the main defects and shortcomings of the existing face counterfeiting detection technology are as follows:
1) Most existing face forgery detection models use a single network structure. Such a structure tends to extract features along a single dimension and easily ignores multi-dimensional features, so the detection accuracy is not high enough;
2) Existing face forgery detection methods generalize poorly, and their prediction accuracy is low when they face unknown forgery techniques. In addition, because most existing continuous-learning methods rely on supervised learning only, they suffer to some extent from catastrophic forgetting and overfitting;
3) Existing face forgery detection methods generally only judge whether a face image is forged and do not detect the specific forged positions.
Disclosure of Invention
Purpose of the invention: the invention aims to provide a face counterfeiting detection method based on a double-layer twin network and sustainable learning that improves accuracy and generalization performance and detects the specific position of face counterfeiting.
The technical scheme is as follows: the face counterfeiting detection method of the invention comprises the following steps:
S1, preprocessing the large number of face forgery images obtained, dividing the image dataset into different task datasets according to forgery technique, and integrating all task datasets to form a complete face forgery image dataset;
S2, taking the face forgery image dataset as the training set and training a pre-constructed double-layer twin network based on a continuous learning strategy to obtain a face counterfeiting detection model based on the double-layer twin network and continuous learning;
S3, detecting the face image to be detected with the face counterfeiting detection model and predicting the forgery probability of each pixel in the image; if the forgery probability of a pixel is greater than the threshold, the pixel is judged to be forged, finally yielding the predicted forgery result image and the forged positions corresponding to the image to be detected.
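The thresholding in step S3 can be illustrated with a minimal Python sketch. The function names and the threshold value of 0.5 are assumptions made for illustration; the patent text does not state the threshold it uses.

```python
from typing import List, Tuple


def forgery_mask(prob_map: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Mark a pixel as forged (1) when its predicted probability exceeds the threshold.

    The default threshold of 0.5 is an assumed value, not one given in the source.
    """
    return [[1 if p > threshold else 0 for p in row] for row in prob_map]


def forged_positions(mask: List[List[int]]) -> List[Tuple[int, int]]:
    """Return the (row, column) coordinates of every pixel marked as forged."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v == 1]


# Toy 2 x 2 probability map standing in for the model's per-pixel output.
probs = [[0.1, 0.8], [0.6, 0.3]]
mask = forgery_mask(probs, threshold=0.5)
positions = forged_positions(mask)
```

Here `mask` plays the role of the predicted forgery result image and `positions` the list of forged locations.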
Further, in step S1, the implementation steps of the face counterfeit image dataset are as follows:
S11, selecting FaceForensics++, a public dataset for the face forgery detection task, and CelebA, a public face dataset;
FaceForensics++ contains a large number of face forgery videos produced with four face forgery techniques, DeepFakes, Face2Face, FaceSwap and NeuralTextures, and face forgery images are obtained from these videos by frame sampling;
CelebA contains a large number of real face images, which are forged with the same four face forgery techniques, DeepFakes, Face2Face, FaceSwap and NeuralTextures, one technique being chosen at random for each image;
S12, resizing all the face forgery images obtained to the same size and dividing them into t task datasets according to forgery technique, one dataset per technique; all task datasets are integrated to form a complete face forgery image dataset.
Further, in step S2, the dual-layer twin network mainly includes three parts: a supervised subnetwork for fast learning, an unsupervised subnetwork for slow learning, and a memory module;
the unsupervised sub-network fully learns the falsified trace features of the human face falsification in an unsupervised learning mode, fuses each layer of learned features to the supervised sub-network in a feature fusion mode to guide the learning of the supervised sub-network, calculates self-supervision loss by utilizing the current detection result of the supervised sub-network, and updates the weight parameters of the unsupervised sub-network;
the supervised subnetwork is trained under supervised learning with three losses: a loss calculated from the unsupervised detection result of the unsupervised subnetwork; a loss calculated from the supervised detection result of the supervised subnetwork under unsupervised guidance; and the KL divergence between the current model's detection result on a memory sample, obtained under the guidance of the unsupervised subnetwork, and the recorded old detection result for that memory sample;
the memory module consolidates the knowledge learned by the face counterfeiting detection model through continuous experience playback, a part of samples randomly extracted each time are copied into the memory module, and when the face counterfeiting detection model learns a new counterfeiting technology, sample data are continuously sampled from the memory module to train the face counterfeiting detection model.
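The experience-replay behaviour of the memory module can be sketched as follows. This is only an illustrative outline: the class name `ReplayMemory`, its methods, the unbounded storage and the fixed random seed are all assumptions, since the patent describes the module only in prose. The batch size of 32 and the 12 copied samples are the values given in the embodiment.

```python
import random
from typing import Any, List


class ReplayMemory:
    """Hypothetical memory module: stores a random part of each batch for later replay."""

    def __init__(self, seed: int = 0) -> None:
        self._items: List[Any] = []
        self._rng = random.Random(seed)  # fixed seed only to make the sketch reproducible

    def store_from_batch(self, batch: List[Any], n_keep: int) -> List[Any]:
        """Copy n_keep randomly chosen samples of the batch into memory."""
        kept = self._rng.sample(batch, min(n_keep, len(batch)))
        self._items.extend(kept)
        return kept

    def sample(self, n: int) -> List[Any]:
        """Draw n stored samples to replay while a new forgery technique is learned."""
        return self._rng.sample(self._items, min(n, len(self._items)))

    def __len__(self) -> int:
        return len(self._items)


memory = ReplayMemory(seed=0)
batch = list(range(32))                       # stand-in for a batch of 32 sample images
kept = memory.store_from_batch(batch, n_keep=12)
replayed = memory.sample(8)
```

A fuller version would probably store each sample together with the prediction recorded for it, since the KL term compares new predictions with recorded old ones.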
Further, the unsupervised subnetwork includes a network for capturing forgery traces, a self-supervised encoder and an affine network, wherein the network for capturing forgery traces is a stack of 6 convolution layers whose learning objective is to find the forged regions in the image and mask them; the self-supervised encoder is obtained by replacing, in order, each block of the encoder in the U-Net architecture with the corresponding block of ResNet34; the affine network consists of a fully connected layer, a batch normalization layer and a ReLU activation function layer;
the supervised subnetwork comprises a supervised encoder, a decoder and a segmentation detector, wherein the supervised encoder has the same structure as the self-supervised encoder of the unsupervised subnetwork; the decoder is composed of 6 convolution layers and a residual structure; the supervised encoder and the decoder of the supervised subnetwork are connected through residual structures at matching dimensions to form a U-Net structure; the segmentation detector is composed of a fully connected layer and a Sigmoid function layer.
Further, in step S2, with the face counterfeit image dataset as a training set, based on a continuous learning strategy, training the pre-built double-layer twin network includes the following steps:
S21, randomly taking, from the training set, the j-th batch of labeled sample images of the dataset composed of the t-th forgery technique, and copying a portion of the samples of that batch into the memory module; the memory module is a storage area, set up when the double-layer twin network is constructed, for storing sample data;
S22, duplicating the j-th batch of sample images of the dataset composed of the t-th forgery technique into two copies and inputting them into the unsupervised subnetwork through two different routes:
At this point, the Barlow Twins self-supervised loss $\mathcal{L}_{BT}$ of the two features $z^A$ and $z^B$ is calculated in an unsupervised manner. The expression is as follows:
$$\mathcal{L}_{BT} = \sum_{m} (1 - C_{mm})^2 + \lambda \sum_{m} \sum_{n \neq m} C_{mn}^2$$
where $\lambda$ is a hyper-parameter weighting factor; $C$ is the cross-correlation matrix of $z^A$ and $z^B$; m and n index the dimensions of the two feature vectors; $C_{mm}$ denotes the diagonal elements of $C$; and $C_{mn}$ denotes the element in row m, column n of $C$;
$$C_{mn} = \frac{\sum_{b} z^A_{b,m}\, z^B_{b,n}}{\sqrt{\sum_{b} (z^A_{b,m})^2}\,\sqrt{\sum_{b} (z^B_{b,n})^2}}$$
where $C_{mn}$ is equivalent to the normalized correlation between the m-th dimension of $z^A$ and the n-th dimension of $z^B$, the feature vectors of the two differently augmented views, and b is the index of the samples in the current batch;
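As a worked illustration of the Barlow Twins loss described above, the following pure-Python sketch computes the cross-correlation matrix of two feature batches and the resulting loss. It follows the standard Barlow Twins formulation and assumes the features are already mean-centered along the batch and have no all-zero dimension. The function names are hypothetical, and the default weighting factor of 0.5 is taken from the embodiment on the assumption that it refers to this factor.

```python
import math
from typing import List

Matrix = List[List[float]]


def cross_correlation(z_a: Matrix, z_b: Matrix) -> Matrix:
    """C[m][n]: batch-normalized correlation of dimension m of z_a with dimension n of z_b.

    Rows are samples, columns are feature dimensions (assumed mean-centered).
    """
    dim_a, dim_b = len(z_a[0]), len(z_b[0])
    norm_a = [math.sqrt(sum(row[m] ** 2 for row in z_a)) for m in range(dim_a)]
    norm_b = [math.sqrt(sum(row[n] ** 2 for row in z_b)) for n in range(dim_b)]
    return [
        [
            sum(a[m] * b[n] for a, b in zip(z_a, z_b)) / (norm_a[m] * norm_b[n])
            for n in range(dim_b)
        ]
        for m in range(dim_a)
    ]


def barlow_twins_loss(z_a: Matrix, z_b: Matrix, lam: float = 0.5) -> float:
    """Invariance term on the diagonal plus lam times the redundancy term off the diagonal."""
    c = cross_correlation(z_a, z_b)
    on_diag = sum((1.0 - c[m][m]) ** 2 for m in range(len(c)))
    off_diag = sum(
        c[m][n] ** 2 for m in range(len(c)) for n in range(len(c[m])) if m != n
    )
    return on_diag + lam * off_diag


# Two toy feature batches (4 samples, 2 dimensions) with decorrelated dimensions.
decorrelated = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
loss_ideal = barlow_twins_loss(decorrelated, decorrelated)

# Perfectly anti-correlated dimensions are penalized by the off-diagonal term.
redundant = [[1.0, -1.0], [-1.0, 1.0]]
loss_redundant = barlow_twins_loss(redundant, redundant, lam=0.5)
```

The loss is zero when the two views agree on every dimension and the dimensions are mutually decorrelated, which is the behaviour the unsupervised subnetwork is trained toward.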
S23, the j-th batch of labeled sample images of the dataset composed of the t-th forgery technique is first input into the unsupervised subnetwork, which outputs its prediction result; this prediction result is used to calculate the first segmentation loss function, a Dice loss, which constrains the feature extraction of the face counterfeiting detection model under unsupervised learning; in the expression of the first Dice loss, y denotes the annotated true forgery position;
then, sample data are drawn from the memory module and predicted by the supervised subnetwork under the guidance of the unsupervised subnetwork, giving the prediction result of the supervised subnetwork; this prediction result is used to calculate the second segmentation loss function, also a Dice loss, which constrains the feature extraction of the face counterfeiting detection model under supervised learning; the expression of the second Dice loss involves the number of sample data drawn from the memory module and the annotated true forgery positions of those sampled data;
the KL divergence between the prediction result of the supervised subnetwork and the old prediction result recorded when the same sample was learned is calculated to obtain the third loss; in its expression, KL denotes the calculation of the KL divergence between the two inputs, the loss carries a hyper-parameter degradation coefficient, the predictions are activated through the Softmax layer of the neural network, and a hyper-parameter temperature coefficient is applied;
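The exact expression of this KL term is not recoverable from the text, so the following Python sketch shows only one common way to combine the named ingredients: a temperature-scaled Softmax, a KL divergence between old and new predictions, and a scalar coefficient. The function names, the argument order of the divergence, the placement of the coefficient `alpha` and the default values are all assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the maximum for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p: List[float], q: List[float]) -> float:
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def consistency_loss(
    new_logits: List[float],
    old_logits: List[float],
    alpha: float = 1.0,        # assumed role of the degradation coefficient
    temperature: float = 2.0,  # assumed value of the temperature coefficient
) -> float:
    """Penalize drift of the current prediction away from the recorded old prediction."""
    p_old = softmax(old_logits, temperature)
    p_new = softmax(new_logits, temperature)
    return alpha * kl_divergence(p_old, p_new)


unchanged = consistency_loss([1.0, 2.0], [1.0, 2.0])
drifted = consistency_loss([2.0, 0.0], [0.0, 2.0])
```

The loss is zero when the model still reproduces the recorded prediction and grows as the prediction on the memory sample drifts, which is how such a term discourages forgetting.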
finally, the supervised loss of the face counterfeiting detection model is obtained by combining the three losses above;
the supervised loss is used to optimize the weight parameters of the supervised subnetwork and the unsupervised subnetwork simultaneously, iterating until the stopping condition is met, so as to finally obtain the optimal weight parameters.
Further, in step S2, based on the continuous learning strategy, the subtask datasets composed of the different forgery techniques are selected from the task dataset one after another and used to continue training the double-layer twin network until all t subtask datasets have been selected; the optimal weight parameters with high generalization are finally obtained.
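The task-by-task training schedule can be outlined in Python as below. This sketch shows only the control flow: `train_step` is a hypothetical callback standing in for one optimization step of the double-layer twin network, the tasks are visited in list order (the source does not fix the order), and the memory is kept as a plain list.

```python
import random
from typing import Any, Callable, List, Sequence


def continual_schedule(
    task_datasets: Sequence[Sequence[Any]],
    batch_size: int,
    n_keep: int,
    n_replay: int,
    train_step: Callable[[int, List[Any], List[Any]], None],
    seed: int = 0,
) -> List[Any]:
    """Visit every task dataset once, replaying stored samples at each step."""
    rng = random.Random(seed)
    memory: List[Any] = []
    for task_id, dataset in enumerate(task_datasets):
        for start in range(0, len(dataset), batch_size):
            batch = list(dataset[start:start + batch_size])
            # Copy part of the current batch into memory before training on it.
            memory.extend(rng.sample(batch, min(n_keep, len(batch))))
            replay = rng.sample(memory, min(n_replay, len(memory)))
            train_step(task_id, batch, replay)
    return memory


calls = []
tasks = [["a1", "a2", "a3", "a4"], ["b1", "b2", "b3", "b4"]]
final_memory = continual_schedule(
    tasks,
    batch_size=2,
    n_keep=1,
    n_replay=2,
    train_step=lambda task_id, batch, replay: calls.append((task_id, batch, replay)),
)
```

When the second task is trained, the replayed samples can include items from the first task, which is what lets earlier forgery techniques keep influencing the weights.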
Compared with the prior art, the invention has the following remarkable effects:
1. the double-layer twin network model designed by the invention consists of two sub-networks, wherein the two sub-networks respectively correspond to the extracted features under the unsupervised learning and the extracted features under the supervised learning, the two sub-networks interact, the features are fused, the model can acquire the multi-dimensional features, the feature extraction capability of the face counterfeiting image of the model is reasonably enhanced, and the accuracy of the face counterfeiting detection is effectively improved;
2. the invention uses a continuous learning strategy to train the double-layer twin network model, thereby effectively overcoming the defect of the existing method in the aspect of insufficient detection generalization performance; meanwhile, the feature is extracted by adding the unsupervised learning in the continuous learning, so that the problems of disastrous forgetting and overfitting caused by the fact that the existing continuous learning-based method only adopts the supervised learning mode are solved, and the generalization performance of the model face counterfeiting detection is further improved;
3. the invention adopts a special loss function for segmentation detection and a U-Net network structure, judges whether the image is forged or not, and can also segment the image to detect the specific position of face forging.
Drawings
FIG. 1 is a diagram of the whole structure in the training process of a face fake detection model;
FIG. 2 is a schematic diagram of an unsupervised subnetwork in a two-layer twin network of the present invention;
FIG. 3 is a schematic diagram of a supervised subnetwork in a two-tier twin network of the present invention;
FIG. 4 is a schematic diagram of the interaction between two subnetworks in a two-layer twin network of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and the detailed description.
The face counterfeiting detection method based on the double-layer twin network and sustainable learning improves the accuracy of model detection by combining supervised and unsupervised learning. The continuous learning strategy greatly improves the detection model's ability to generalize to unknown forgery methods, and the addition of unsupervised learning also alleviates catastrophic forgetting and overfitting in continuous learning. In addition, the invention can accurately detect the specific forged positions.
As shown in fig. 1, the embodiment provides a face counterfeiting detection method based on a double-layer twin network and sustainable learning, and the specific details include the following steps:
Step 1, preprocessing the large number of face forgery images obtained, dividing the image dataset into different task datasets according to forgery technique, and integrating all task datasets to form a complete face forgery image dataset, so as to facilitate the subsequent continuous learning.
In this embodiment, two datasets are selected as experimental datasets: FaceForensics++, the dataset most commonly used for face forgery, and the public face dataset CelebA. FaceForensics++ contains a large number of face forgery videos produced with four face forgery techniques: DeepFakes, Face2Face, FaceSwap and NeuralTextures. The videos are frame-sampled with the opencv-python library at 10 frames per second. CelebA contains a large number of real face images; these are forged with the same four techniques, one technique being chosen at random for each image so that the four techniques are applied to approximately the same number of images. The large number of face forgery images obtained in these two ways are preprocessed by scaling every image to 158 x 158. The resized images are divided into t task datasets according to forgery technique, one dataset per technique, and all of them are integrated into one large, complete face forgery image dataset.
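The frame sampling at 10 frames per second can be sketched as an index calculation. The helper below is hypothetical and only decides which frame indices to keep for a given source frame rate. In practice the frame rate and frame count would typically be read from the video with opencv-python (for example through `cv2.VideoCapture`), and each selected frame would then be decoded and resized to 158 x 158.

```python
from typing import List


def sampled_frame_indices(
    total_frames: int, video_fps: float, target_fps: float = 10.0
) -> List[int]:
    """Indices of the frames to keep so that about target_fps frames remain per second."""
    if video_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Never sample more densely than the source video provides frames.
    step = max(video_fps / target_fps, 1.0)
    indices: List[int] = []
    k = 0
    while int(k * step) < total_frames:
        indices.append(int(k * step))
        k += 1
    return indices


# A 3-second clip at 30 frames per second keeps every third frame.
kept_30fps = sampled_frame_indices(total_frames=90, video_fps=30.0)
# A source rate that is not a multiple of 10 gives unevenly spaced indices.
kept_25fps = sampled_frame_indices(total_frames=10, video_fps=25.0)
```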
Preferably, t is 4 in this embodiment, so the dataset is divided into 4 different tasks. Each subtask dataset contains 30,000 face forgery images, and the total dataset contains 120,000 face forgery images for training and testing.
Step 2, training a pre-constructed double-layer twin network based on a continuous learning strategy, driven by the face forgery image dataset from step 1. The double-layer twin network mainly comprises three parts: a supervised subnetwork suited to fast learning, an unsupervised subnetwork suited to slow learning, and a memory module.
The unsupervised sub-network fully learns the falsified trace features of the human face falsification in an unsupervised learning mode, fuses each layer of learned features to the supervised sub-network in a feature fusion mode to guide the learning of the supervised sub-network, and calculates self-supervision loss by utilizing the current detection result of the supervised sub-network so as to update the weight parameters of the unsupervised sub-network.
The supervised subnetwork is trained under supervised learning with three losses jointly: A1) a loss calculated from the unsupervised detection result of the unsupervised subnetwork; A2) a loss calculated from the supervised detection result of the supervised subnetwork under unsupervised guidance; A3) the KL divergence between the current model's detection result on a memory sample, obtained under the guidance of the unsupervised subnetwork, and the previously recorded old detection result for that memory sample. Combining the three losses constrains the model's feature extraction from all three sides at once, and the unsupervised and supervised subnetworks complement each other.
In addition, the memory module consolidates the knowledge learned by the face counterfeiting detection model through continual experience replay. Each time, a randomly chosen part of the samples is copied into the memory module; when the model learns a new forgery technique, sample data are repeatedly drawn from the memory module to train the model, which consolidates the old knowledge learned before. The weight parameters of the whole model are optimized through continued iteration until the optimal weight parameters are obtained, so that features common to the various face forgery techniques can be extracted and high generalization in face counterfeiting detection is achieved.
Training a pre-constructed double-layer twin network by using the face fake image data set in the step 1, wherein the method comprises the following steps of:
Step 21, randomly selecting, from the training image dataset, the j-th batch of labeled sample images of the dataset composed of the t-th forgery technique, and copying part of the samples of that batch into the memory module of the model. The memory module is a storage area, set up when the double-layer twin network is constructed, for storing sample data.
Preferably, the number of sample images per batch taken in this embodiment is 32, and the number of sample images copied into the memory module therein is 12.
Step 22, as shown in FIG. 2, building the unsupervised subnetwork of the double-layer twin network, which consists of a network for capturing forgery traces, a self-supervised encoder and an affine network. The network for capturing forgery traces is a stack of 6 convolution layers; the self-supervised encoder is obtained by replacing, in order, each block of the encoder in the U-Net architecture with the corresponding block of ResNet34; the affine network consists of a fully connected layer, a batch normalization layer and a ReLU activation function layer.
The j-th batch of sample images of the dataset composed of the t-th forgery technique is duplicated into two copies, which are input into the unsupervised subnetwork through two different routes:
Route 2 passes through the network for capturing forgery traces, which aims to find as many potentially forged regions in the image as possible and mask them. The rest of the route is the same as route 1, and the feature $z^B$ is finally obtained.
At this point, the same sample batch has passed through two different routes and yielded a pair of features $z^A$ and $z^B$. The Barlow Twins self-supervised loss $\mathcal{L}_{BT}$ of these two features is calculated in an unsupervised manner and used to optimize the weight parameters of the unsupervised subnetwork, so that the unsupervised subnetwork extracts more suitable forgery features. The unsupervised learning mode also reduces overfitting of the model and alleviates catastrophic forgetting in continuous learning:
$$\mathcal{L}_{BT} = \sum_{m} (1 - C_{mm})^2 + \lambda \sum_{m} \sum_{n \neq m} C_{mn}^2 \qquad (1)$$
where
$$C_{mn} = \frac{\sum_{b} z^A_{b,m}\, z^B_{b,n}}{\sqrt{\sum_{b} (z^A_{b,m})^2}\,\sqrt{\sum_{b} (z^B_{b,n})^2}} \qquad (2)$$
In formula (1), $\lambda$ is a hyper-parameter weighting factor; $C$ is the cross-correlation matrix of $z^A$ and $z^B$; m and n index the dimensions of the two feature vectors; $C_{mm}$ denotes the diagonal elements of $C$; and $C_{mn}$ denotes the element in row m, column n of $C$. In formula (2), $C_{mn}$ is equivalent to the normalized correlation between the m-th dimension of $z^A$ and the n-th dimension of $z^B$, the feature vectors of the two differently augmented views, and b is the index of the samples in the current batch. In this embodiment the hyper-parameter $\lambda$ is set to 0.5.
Step 23: as shown in FIG. 3, construct the supervised subnetwork of the double-layer twin network. The supervised subnetwork consists of a supervised encoder, a decoder, and a segmentation detector. The supervised encoder has the same structure as the self-supervised encoder in the unsupervised subnetwork. The decoder is composed of multiple convolutional layers and residual structures. The supervised encoder and the decoder are connected at matching dimensions through the residual structures, forming a U-Net structure. The segmentation detector consists of a fully connected layer and a Sigmoid layer.
As shown in FIG. 4, a feature fusion mechanism is placed between the supervised encoder of the supervised subnetwork and the self-supervised encoder of the unsupervised subnetwork, acting on the same-dimensional features of each feature-extraction block. The connection is one-way, from the unsupervised subnetwork to the supervised subnetwork: same-dimensional features are multiplied element by element, so that the unsupervised subnetwork guides the features of the supervised subnetwork.
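The element-by-element fusion can be illustrated with a short sketch. It assumes, purely for the example, that a feature map is a 2-D list of floats; the function name `fuse_features` is hypothetical. The unsupervised features are only read, which mirrors the one-way connection.

```python
def fuse_features(supervised_feat, unsupervised_feat):
    """Multiply two same-shaped 2-D feature maps element by element.

    The result replaces the supervised feature; the unsupervised feature
    is left untouched, so guidance flows in one direction only.
    """
    if len(supervised_feat) != len(unsupervised_feat):
        raise ValueError("feature maps must have the same number of rows")
    fused = []
    for row_s, row_u in zip(supervised_feat, unsupervised_feat):
        if len(row_s) != len(row_u):
            raise ValueError("feature maps must have the same number of columns")
        fused.append([s * u for s, u in zip(row_s, row_u)])
    return fused
```

Positions where the unsupervised feature is close to zero suppress the corresponding supervised feature, while large values emphasise it.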
A batch of labelled sample images from the dataset of any one of the t forgery techniques is first input into the unsupervised subnetwork, which outputs its prediction result. This prediction is used to compute the first segmentation loss, a Dice loss, which constrains the feature extraction of the double-layer twin network under unsupervised learning.
In the expression of the first Dice loss, y denotes the labelled true forgery position, ∩ denotes the intersection operation, and ∪ denotes the union operation.
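A Dice loss of this kind can be sketched as follows. The sketch assumes the common soft Dice formulation, 1 - 2|P ∩ Y| / (|P| + |Y|), evaluated on flattened per-pixel probabilities; this may differ in detail from the expression used in the patent. The name `dice_loss` and the smoothing term `eps` are illustrative.

```python
def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss between predicted probabilities and a binary mask.

    pred:   flat list of per-pixel forgery probabilities in [0, 1]
    target: flat list of labels, 1 for forged pixels and 0 otherwise
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)
```

The loss is 0 when the predicted mask matches the label exactly and approaches 1 when the two do not overlap at all.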
Sample data are then drawn from the memory module and predicted by the supervised subnetwork under the guidance of the unsupervised subnetwork, giving the prediction result of the supervised subnetwork. This prediction is used to compute the second segmentation loss, also a Dice loss, which constrains the feature extraction of the double-layer twin network under supervised learning. Sampling from the memory module lets the supervised subnetwork better learn the features common to the various forgery techniques.
The second Dice loss is evaluated over the sample data drawn from the memory module, against the labelled true forgery positions of those samples. In this embodiment, the number of samples drawn from the memory module is 12.
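Sampling from the memory module can be pictured as a small replay buffer. The class below is a hypothetical sketch: the name `MemoryModule`, the `fraction` parameter and the choice to keep each sample together with the prediction recorded when it was stored are assumptions for illustration; only the replay size of 12 is taken from the embodiment.

```python
import random


class MemoryModule:
    """Replay buffer holding (image, label, recorded_prediction) triples."""

    def __init__(self, seed=0):
        self._items = []
        self._rng = random.Random(seed)

    def store(self, batch, predictions, fraction=0.25):
        """Copy a random part of a labelled batch into memory.

        batch:       list of (image, label) pairs
        predictions: model outputs for the batch, recorded at storage time
        fraction:    share of the batch to keep (illustrative value)
        """
        keep = max(1, int(len(batch) * fraction))
        for idx in self._rng.sample(range(len(batch)), keep):
            image, label = batch[idx]
            self._items.append((image, label, predictions[idx]))

    def sample(self, n=12):
        """Draw up to n stored triples without replacement."""
        return self._rng.sample(self._items, min(n, len(self._items)))

    def __len__(self):
        return len(self._items)
```

Keeping the recorded prediction next to each sample makes it possible to compare a later prediction for the same sample with the one made when it was first learned.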
The KL divergence is then computed between the prediction that the supervised subnetwork produces for the samples drawn from the memory module and the old prediction recorded when the same samples were learned; this gives the KL divergence loss. The loss mitigates catastrophic forgetting in continual learning: while the model learns new knowledge, it is prevented from producing predictions on previously learned samples that differ completely from the recorded ones. Minimizing the KL divergence keeps the difference between the current prediction of the supervised subnetwork and the old prediction small, which ultimately forces the double-layer twin network to learn the features common to the various face forgery techniques, so that the face forgery detection model generalizes well.
The expression of the KL divergence loss involves the KL divergence between the two predictions, a hyperparameter degradation coefficient, activation by the Softmax layer of the neural network, and a hyperparameter temperature coefficient.
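A loss of this form can be sketched with the standard library. The sketch assumes a distillation-style formulation: both predictions are divided by the temperature, passed through Softmax, and the KL divergence from the old to the new distribution is scaled by the degradation coefficient. The direction of the divergence, the default values of `alpha` and `tau`, and all function names are assumptions, not values stated in the patent.

```python
import math


def softmax(logits, tau=1.0):
    """Temperature-scaled softmax over a list of scores."""
    scaled = [x / tau for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def kl_replay_loss(new_logits, old_logits, alpha=1.0, tau=2.0):
    """Penalise drift of the current prediction away from the recorded one."""
    p_old = softmax(old_logits, tau)
    p_new = softmax(new_logits, tau)
    return alpha * kl_divergence(p_old, p_new)
```

The loss is zero when the current prediction for a replayed sample equals the recorded one and grows as the two diverge.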
Finally, the supervised loss of the double-layer twin network is the sum of the three loss functions above: the first Dice loss, the second Dice loss, and the KL divergence loss.
The supervised loss is used to optimize the weight parameters of the supervised subnetwork and the unsupervised subnetwork at the same time, and training is iterated until the optimal weight parameters are obtained.
Following the continual learning strategy, subtask datasets composed of different forgery techniques are selected from the task dataset one after another and used to continue training the double-layer twin network, until all t subtask datasets in the task dataset have been selected. This finally yields the optimal weight parameters with high generalization.
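The sequential selection of subtask datasets can be sketched as a simple loop. The function `train_continually` and the `train_step` callback are hypothetical placeholders: in a real system `train_step` would run the unsupervised and supervised updates described above and return the updated model state together with the loss.

```python
def train_continually(subtask_datasets, train_step, state=None):
    """Train on each subtask dataset in turn, carrying the model state forward.

    subtask_datasets: list of t datasets, one per forgery technique,
                      each given as a list of batches
    train_step:       callable (state, batch) -> (new_state, loss)
    """
    history = []
    for task_index, batches in enumerate(subtask_datasets, start=1):
        for batch in batches:
            state, loss = train_step(state, batch)
            history.append((task_index, loss))
    return state, history
```

Because the state returned for one forgery technique is the starting point for the next, the weights are never reset between subtasks.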
Step 3: use the face forgery detection model trained in step 2, based on the double-layer twin network and sustainable learning, to detect the face image under test and obtain the forgery prediction probability of each pixel in the image. If the probability is greater than a threshold, the pixel is judged to be forged. This finally yields the predicted forgery result image corresponding to the image under test; in the prediction result image of FIG. 1, the white pixel positions are the forgery positions predicted by the model.
Preferably, in this embodiment the threshold is set to 0.5; that is, a pixel whose prediction probability is greater than 0.5 is judged to be forged.
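The thresholding step can be shown in a few lines. The function name `forgery_mask` and the use of 255 for white pixels are assumptions for the example; the threshold of 0.5 and the strict "greater than" comparison follow the embodiment.

```python
def forgery_mask(prob_map, threshold=0.5):
    """Turn a per-pixel forgery probability map into a binary result image.

    Pixels whose probability is strictly greater than the threshold are
    marked white (255, forged); all other pixels are black (0).
    """
    return [[255 if p > threshold else 0 for p in row] for row in prob_map]
```

For example, a probability of exactly 0.5 stays black, while 0.51 is marked as forged.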
The above is a preferred embodiment of the present invention. All changes made according to the technical solution of the present invention fall within the protection scope of the present invention, provided that the resulting functional effects do not go beyond the scope of the technical solution.
Claims (6)
1. A face forgery detection method based on a double-layer twin network and sustainable learning, characterized by comprising the following steps:
S1, preprocessing the large number of face forgery images obtained, dividing the image dataset into different task datasets according to the forgery technique, and integrating all task datasets into a complete face forgery image dataset;
S2, training a pre-constructed double-layer twin network based on a continual learning strategy, with the face forgery image dataset as the training set, to obtain a face forgery detection model based on the double-layer twin network and sustainable learning;
S3, detecting a face image under test with the face forgery detection model and predicting the forgery probability of each pixel in the image; if the forgery probability is greater than a threshold, judging the pixel to be forged, and finally obtaining the predicted forgery result image and the forgery positions corresponding to the image under test.
2. The face forgery detection method based on a double-layer twin network and sustainable learning according to claim 1, wherein in step S1 the face forgery image dataset is produced as follows:
S11, selecting FaceForensics++, a public dataset for the face forgery detection task, and CelebA, a public face dataset;
FaceForensics++ contains a large number of face forgery videos made with four face forgery techniques, DeepFakes, Face2Face, FaceSwap and NeuralTextures; face forgery images are obtained from the videos in this dataset by frame sampling;
CelebA contains a large number of real face images; the face images in CelebA are forged with the four face forgery techniques DeepFakes, Face2Face, FaceSwap and NeuralTextures, with one forgery technique chosen at random for each image;
S12, resizing all the face forgery images obtained to the same size and dividing them by forgery technique into t task datasets, each consisting of the images produced by one forgery technique; all task datasets are integrated into a complete face forgery image dataset.
3. The face forgery detection method based on a double-layer twin network and sustainable learning according to claim 1, wherein in step S2 the double-layer twin network mainly comprises three parts: a supervised subnetwork for fast learning, an unsupervised subnetwork for slow learning, and a memory module;
the unsupervised subnetwork fully learns the forgery-trace features of face forgeries by unsupervised learning, fuses the features learned at each layer into the supervised subnetwork by feature fusion to guide the learning of the supervised subnetwork, computes a self-supervised loss using the current detection result of the supervised subnetwork, and updates the weight parameters of the unsupervised subnetwork;
the supervised subnetwork is trained under supervised learning with three losses: a loss computed from the unsupervised detection result of the unsupervised subnetwork; a loss computed from the supervised detection result of the supervised subnetwork under unsupervised guidance; and the KL divergence, computed under the guidance of the unsupervised subnetwork, between the current model's detection result on the memory samples and the recorded old detection result of those samples;
the memory module consolidates the knowledge learned by the face forgery detection model through continual experience replay: a randomly drawn part of each batch of samples is copied into the memory module, and when the face forgery detection model learns a new forgery technique, sample data are repeatedly sampled from the memory module to train the face forgery detection model.
4. The face forgery detection method based on a double-layer twin network and sustainable learning according to claim 3, wherein the unsupervised subnetwork comprises a forgery-trace capture network, a self-supervised encoder and an affine network; the forgery-trace capture network is a stack of 6 convolutional layers whose learning objective is to find the forged regions in the image and to mask them; the self-supervised encoder is the encoder of a U-Net architecture in which each block is replaced, in order, by the corresponding block of ResNet34; the affine network consists of a fully connected layer, a batch normalization layer and a ReLU activation layer;
the supervised subnetwork comprises a supervised encoder, a decoder and a segmentation detector; the supervised encoder has the same structure as the self-supervised encoder in the unsupervised subnetwork; the decoder is composed of 6 convolutional layers and residual structures; the supervised encoder and the decoder are connected at matching dimensions through the residual structures to form a U-Net structure; the segmentation detector consists of a fully connected layer and a Sigmoid layer.
5. The face forgery detection method based on a double-layer twin network and sustainable learning according to claim 4, wherein in step S2, training the pre-constructed double-layer twin network based on the continual learning strategy with the face forgery image dataset as the training set comprises the following steps:
S21, randomly taking a batch of labelled sample images from the dataset of one forgery technique in the training set, and copying part of the samples of the batch into the memory module; the memory module is a storage area, set up when the double-layer twin network is constructed, for storing sample data;
S22, duplicating the batch of sample images into two copies and inputting them into the unsupervised subnetwork along two different routes:
route 1 inputs the batch directly into the self-supervised encoder; the features extracted at each layer of the self-supervised encoder are multiplied element by element with the same-dimensional features, by feature fusion, to guide the learning process of the supervised encoder in the supervised subnetwork and obtain a prediction result; in addition, the extracted features are sent into the affine network to obtain the first feature;
route 2 first passes the batch through the forgery-trace capture network, which aims to find as many potentially forged regions in the image as possible and to mask them; the rest of the route is the same as route 1 and finally yields the second feature;
at this point, the Barlow Twins self-supervised loss $L_{BT}$ of the two features $z^{A}$ and $z^{B}$ is computed in an unsupervised manner, with the following expression:

$$L_{BT} = \sum_{m} \left(1 - C_{mm}\right)^{2} + \lambda \sum_{m} \sum_{n \neq m} C_{mn}^{2}$$

where $\lambda$ is a hyperparameter weighting factor; $C$ is the cross-correlation matrix of $z^{A}$ and $z^{B}$; $m$ and $n$ index the dimensions of the two feature vectors; $C_{mm}$ denotes the elements on the diagonal of $C$; $C_{mn}$ denotes the element in row $m$, column $n$ of $C$;

$$C_{mn} = \frac{\sum_{b} z^{A}_{b,m}\, z^{B}_{b,n}}{\sqrt{\sum_{b} \left(z^{A}_{b,m}\right)^{2}}\, \sqrt{\sum_{b} \left(z^{B}_{b,n}\right)^{2}}}$$

where $C_{mn}$ is computed from the sum, over the batch, of the products of the $m$-th dimension of $z^{A}$ and the $n$-th dimension of $z^{B}$, the feature vectors of the two differently augmented copies, and $b$ indexes the samples of the current batch;
S23, first inputting the batch of labelled sample images from the dataset of one forgery technique into the unsupervised subnetwork, which outputs its prediction result; using this prediction to compute the first segmentation loss, a Dice loss, which constrains the feature extraction of the face forgery detection model under unsupervised learning;
then sampling the sample data in the memory module and predicting the sampled data with the supervised subnetwork guided by the unsupervised subnetwork, to obtain the prediction result of the supervised subnetwork; using this prediction to compute the second segmentation loss, also a Dice loss, which constrains the feature extraction of the face forgery detection model under supervised learning;
the second Dice loss is evaluated over the number of sample data sampled from the memory module, against the labelled true forgery positions of the sampled data;
computing the KL divergence between the prediction result of the supervised subnetwork and the old prediction result recorded when the same sample was learned, to obtain the KL divergence loss;
the expression of the KL divergence loss involves the KL divergence between the two inputs, a hyperparameter degradation coefficient, activation by the Softmax layer of the neural network, and a hyperparameter temperature coefficient;
finally, the supervised loss of the face forgery detection model is the sum of the first Dice loss, the second Dice loss and the KL divergence loss.
6. The face forgery detection method based on a double-layer twin network and sustainable learning according to claim 4, wherein in step S2, based on the continual learning strategy, subtask datasets composed of different forgery techniques are selected from the task dataset one after another and used to continue training the double-layer twin network, until all t subtask datasets in the task dataset have been selected; finally, the optimal weight parameters with high generalization are obtained.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310474306.XA CN116206375B (en) | 2023-04-28 | 2023-04-28 | Face counterfeiting detection method based on double-layer twin network and sustainable learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116206375A (en) | 2023-06-02 |
CN116206375B (en) | 2023-07-25 |
Family
ID=86515030
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202310474306.XA | Active | CN116206375B (en) | 2023-04-28 | 2023-04-28 | Face counterfeiting detection method based on double-layer twin network and sustainable learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116206375B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117056874A (en) * | 2023-08-17 | 2023-11-14 | 国网四川省电力公司营销服务中心 | Unsupervised electricity larceny detection method based on deep twin autoregressive network |
CN117894083A (en) * | 2024-03-14 | 2024-04-16 | 中电科大数据研究院有限公司 | Image recognition method and system based on deep learning |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021137946A1 (en) * | 2019-12-30 | 2021-07-08 | Microsoft Technology Licensing, Llc | Forgery detection of face image |
CN113283403A (en) * | 2021-07-21 | 2021-08-20 | 武汉大学 | Counterfeited face video detection method based on counterstudy |
CN113822377A (en) * | 2021-11-19 | 2021-12-21 | 南京理工大学 | Fake face detection method based on contrast self-learning |
CN114267063A (en) * | 2021-12-22 | 2022-04-01 | 浙江大学 | Unsupervised face forgery assessment method |
CN114627412A (en) * | 2022-03-07 | 2022-06-14 | 公安部第三研究所 | Method, device and processor for realizing unsupervised depth forgery video detection processing based on error reconstruction and computer storage medium thereof |
CN114694220A (en) * | 2022-03-25 | 2022-07-01 | 上海大学 | Double-flow face counterfeiting detection method based on Swin transform |
Non-Patent Citations (2)
Title |
---|
PEIPENG YU,ETC.: "Improving Generalization by Commonality Learning in Face Forgery Detection", IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, vol. 17 * |
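Liu Xia; Qin Huafeng: "Fake finger vein image detection algorithm based on deep belief network", Journal of Chongqing Technology and Business University (Natural Science Edition), no. 05 * |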
Also Published As
Publication number | Publication date |
---|---|
CN116206375B (en) | 2023-07-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||