CN116580462A - Deep fake image detection method based on clustering decision and related equipment - Google Patents

Info

Publication number
CN116580462A
Authority
CN
China
Prior art keywords
training data
image
feature extraction
face image
fake
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310465572.6A
Other languages
Chinese (zh)
Inventor
花忠云
侯泽铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40 Spoof detection, e.g. liveness detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/763 Non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The invention provides a deep fake image detection method based on clustering decision and related equipment, and relates to the technical field of image processing. The method comprises the following steps: acquiring a face image to be detected, and performing image segmentation on the face image to be detected to obtain at least one local face image; inputting the face image to be detected and the local face image into a trained feature extraction model, and obtaining the fusion feature output by the feature extraction model; and acquiring a first distance between the fusion feature and a trained first clustering center and a second distance between the fusion feature and a trained second clustering center, and determining a fake detection result of the face image to be detected according to the first distance and the second distance, wherein the fake detection result reflects whether the face image to be detected is a fake image, the first clustering center reflects the fusion features corresponding to fake images, and the second clustering center reflects the fusion features corresponding to real images. The invention can improve the generalization of deep fake image detection.

Description

Deep fake image detection method based on clustering decision and related equipment
Technical Field
The invention relates to the technical field of image processing, and in particular to a deep fake image detection method based on clustering decision and related equipment.
Background
Deep fake generation technology can produce forged face images, and malicious users can exploit such images to harm the interests of others. At present, high-quality deep fake images cannot be distinguished by the naked eye alone, so detecting deep fake face images is an important problem for the industry to solve.
Disclosure of Invention
The invention provides a deep fake image detection method based on clustering decision and related equipment, which address the difficulty of detecting high-quality deep fake images in the prior art and achieve accurate detection of deep fake images.
The invention provides a deep fake image detection method based on clustering decision, which comprises the following steps:
acquiring a face image to be detected, and performing image segmentation on the face image to be detected to obtain at least one local face image, wherein the local face image reflects a facial feature (such as an eye, the nose or the mouth) in the face image to be detected;
inputting the face image to be detected and the local face image into a trained feature extraction model, and obtaining fusion features output by the feature extraction model;
acquiring a first distance between the fusion feature and a trained first clustering center and a second distance between the fusion feature and a trained second clustering center, and determining a fake detection result of the face image to be detected according to the first distance and the second distance, wherein the fake detection result reflects whether the face image to be detected is a fake image, the first clustering center reflects the fusion features corresponding to fake images, and the second clustering center reflects the fusion features corresponding to real images;
the trained feature extraction model, the trained first clustering center and the trained second clustering center are obtained by training based on a plurality of groups of training data, and each group of training data comprises a sample face image to be detected and a fake detection label corresponding to the sample face image to be detected; the training process of the feature extraction model comprises the following steps:
training an initial feature extraction model and a binary classifier based on a plurality of groups of training data to obtain an intermediate feature extraction model, wherein the binary classifier is used for outputting a fake prediction result based on the fusion features output by the feature extraction model, and the fake prediction result is used for classifying the training data into a real image or a fake image;
training the intermediate feature extraction model based on a plurality of groups of the training data to obtain the trained feature extraction model, the trained first clustering center and the trained second clustering center.
According to the deep fake image detection method based on clustering decision provided by the invention, the training of the intermediate feature extraction model based on a plurality of groups of the training data to obtain the trained feature extraction model, the trained first clustering center and the trained second clustering center comprises the following steps:
determining initial values of a first clustering center and a second clustering center based on a plurality of groups of training data;
determining target training data batches in a plurality of groups of the training data, wherein the target training data batches comprise a plurality of groups of target training data;
segmenting the sample face image to be detected in the target training data to obtain a sample local face image, inputting the sample face image to be detected and the sample local face image into the intermediate feature extraction model, and obtaining the sample fusion feature output by the intermediate feature extraction model;
updating parameters of the intermediate feature extraction model based on the sample fusion features and the fake detection labels corresponding to each of the target training data in the target training data batch, and updating the first clustering center and the second clustering center based on the sample fusion features and the fake detection labels corresponding to each of the target training data in the target training data batch;
and re-executing the step of determining a target training data batch in the plurality of groups of training data until the parameters of the intermediate feature extraction model, the first clustering center and the second clustering center have converged, taking the intermediate feature extraction model after parameter convergence as the trained feature extraction model, taking the first clustering center after convergence as the trained first clustering center, and taking the second clustering center after convergence as the trained second clustering center.
According to the deep fake image detection method based on clustering decision provided by the invention, the determining of the initial values of the first clustering center and the second clustering center based on a plurality of groups of training data comprises the following steps:
classifying each group of training data to obtain a first set and a second set, wherein the fake detection label in the training data in the first set represents that the corresponding sample face image to be detected is a fake image, and the fake detection label in the training data in the second set represents that the corresponding sample face image to be detected is a real image;
and determining, based on the intermediate feature extraction model, a first fusion feature corresponding to each sample face image to be detected in the first set and a second fusion feature corresponding to each sample face image to be detected in the second set, taking the average value of the first fusion features as the initial value of the first clustering center, and taking the average value of the second fusion features as the initial value of the second clustering center.
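The initialization described above amounts to taking a per-class mean of the fusion features. The following is a minimal NumPy sketch rather than the patented implementation: the function name `initial_cluster_centers` and the label encoding (1 = fake, 0 = real) are assumptions made for illustration, and the fusion features are assumed to have already been computed by the intermediate feature extraction model.

```python
import numpy as np

def initial_cluster_centers(features, labels):
    """Return (fake_center, real_center) as per-class means of fusion features.

    features: array of shape (N, D), one fusion feature per training sample.
    labels:   array of shape (N,); 1 = fake, 0 = real (assumed encoding).
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    fake_mask = labels == 1   # first set: samples labelled as fake images
    real_mask = labels == 0   # second set: samples labelled as real images
    if not fake_mask.any() or not real_mask.any():
        raise ValueError("need at least one fake and one real sample")
    fake_center = features[fake_mask].mean(axis=0)  # initial first clustering center
    real_center = features[real_mask].mean(axis=0)  # initial second clustering center
    return fake_center, real_center
```

With the features in one array, the whole initialization is a single call: `cf0, cr0 = initial_cluster_centers(all_features, all_labels)`.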
According to the method for detecting the deep fake image based on the clustering decision provided by the invention, the parameters of the intermediate feature extraction model are updated based on the sample fusion features and the fake detection labels corresponding to each target training data in the target training data batch, and the method comprises the following steps:
calculating a loss function based on the sample fusion characteristics and the fake detection label corresponding to each target training data in the target training data batch;
updating parameters of the intermediate feature extraction model with the aim of minimizing the loss function;
wherein the loss function includes a first loss function and a second loss function [the two formulas are not reproduced in this text]; in these formulas, F represents the sample fusion feature corresponding to the target training data, Fake represents the set of sample fusion features of the target training data in the target training data batch whose fake detection label is a fake image, Real represents the set of sample fusion features of the target training data in the target training data batch whose fake detection label is a real image, CF represents the current first clustering center, CR represents the current second clustering center, and β is a constant.
According to the deep fake image detection method based on clustering decision provided by the invention, the updating of the first clustering center and the second clustering center based on the sample fusion features and the fake detection labels corresponding to each target training data in the target training data batch comprises the following steps:
updating the first clustering center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label is a fake image;
and updating the second clustering center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label is a real image.
According to the deep fake image detection method based on clustering decision provided by the invention, the updating of the first clustering center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label is a fake image comprises the following steps:
updating the first clustering center based on a first formula;
the updating of the second clustering center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label is a real image comprises:
updating the second clustering center based on a second formula;
the first formula is $CF_t = (1-\alpha)\,CF_{t-1} + \alpha\,\mathrm{Avg}\left(\sum_{F \in \mathrm{Fake}} F\right)$, and the second formula is $CR_t = (1-\alpha)\,CR_{t-1} + \alpha\,\mathrm{Avg}\left(\sum_{F \in \mathrm{Real}} F\right)$;
wherein $CF_t$ represents the first clustering center updated based on the t-th target training data batch, $CR_t$ represents the second clustering center updated based on the t-th target training data batch, F represents the sample fusion feature corresponding to the target training data, Fake represents the set of sample fusion features of the target training data in the target training data batch whose fake detection label is a fake image, Real represents the set of sample fusion features of the target training data in the target training data batch whose fake detection label is a real image, Avg() represents averaging the contents in brackets, and α is a constant.
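The two update formulas are exponential moving averages of the per-batch class means. The sketch below illustrates them in NumPy under stated assumptions: Avg(∑ F) is read as the mean of the fusion features in the set, the label encoding (1 = fake, 0 = real) and the default `alpha=0.1` are illustrative choices not given in the text, and leaving a center unchanged when its class is absent from the batch is also an assumption.

```python
import numpy as np

def update_centers(cf_prev, cr_prev, batch_features, batch_labels, alpha=0.1):
    """One moving-average update of the fake (CF) and real (CR) clustering centers.

    batch_features: array of shape (B, D) with the sample fusion features.
    batch_labels:   array of shape (B,); 1 = fake, 0 = real (assumed encoding).
    alpha:          the constant from the formulas; 0.1 is only a placeholder.
    """
    batch_features = np.asarray(batch_features, dtype=np.float64)
    batch_labels = np.asarray(batch_labels)
    cf = np.asarray(cf_prev, dtype=np.float64)
    cr = np.asarray(cr_prev, dtype=np.float64)

    fake = batch_features[batch_labels == 1]
    real = batch_features[batch_labels == 0]

    # CF_t = (1 - alpha) * CF_{t-1} + alpha * mean of fake features in the batch
    if len(fake) > 0:
        cf = (1.0 - alpha) * cf + alpha * fake.mean(axis=0)
    # CR_t = (1 - alpha) * CR_{t-1} + alpha * mean of real features in the batch
    if len(real) > 0:
        cr = (1.0 - alpha) * cr + alpha * real.mean(axis=0)
    return cf, cr
```

A small α makes the centers change slowly from batch to batch, which damps the noise of any single batch; a larger α tracks the current feature extractor more closely.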
According to the deep fake image detection method based on clustering decision provided by the invention, the feature extraction model comprises a first frequency feature extraction module, a second frequency feature extraction module, a first spatial feature extraction module and a second spatial feature extraction module; the obtaining of the fusion feature output by the feature extraction model comprises the following steps:
the face image to be detected is respectively input into the first spatial feature extraction module and the first frequency feature extraction module, the first spatial feature of the face image to be detected is extracted based on the first spatial feature extraction module, and the first frequency feature of the face image to be detected is extracted based on the first frequency feature extraction module;
the local face image is respectively input into the second spatial feature extraction module and the second frequency feature extraction module, the second spatial feature of the local face image is extracted based on the second spatial feature extraction module, and the second frequency feature of the local face image is extracted based on the second frequency feature extraction module;
and splicing all the first spatial features, the first frequency features, the second spatial features and the second frequency features to obtain splicing features, and acquiring the fusion features based on the splicing features.
The invention also provides a deep fake image detection device based on clustering decision, which comprises:
the image acquisition module is used for acquiring a face image to be detected and carrying out image segmentation on the face image to be detected to obtain at least one local face image, wherein the local face image reflects a facial feature (such as an eye, the nose or the mouth) in the face image to be detected;
the feature fusion module is used for inputting the face image to be detected and the local face image into a trained feature extraction model to obtain fusion features output by the feature extraction model;
the fake detection module is used for respectively acquiring a first distance between the fusion feature and a trained first clustering center and a second distance between the fusion feature and a trained second clustering center, determining a fake detection result of the face image to be detected according to the first distance and the second distance, wherein the fake detection result reflects whether the face image to be detected is a fake image or not, the first clustering center reflects the fusion feature corresponding to the fake image, and the second clustering center reflects the fusion feature corresponding to a real image;
the trained feature extraction model, the trained first clustering center and the trained second clustering center are obtained by training based on a plurality of groups of training data, and each group of training data comprises a sample face image to be detected and a fake detection label corresponding to the sample face image to be detected; the training process of the feature extraction model comprises the following steps:
training an initial feature extraction model and a binary classifier based on a plurality of groups of training data to obtain an intermediate feature extraction model, wherein the binary classifier is used for outputting a fake prediction result based on the fusion features output by the feature extraction model, and the fake prediction result is used for classifying the training data into a real image or a fake image;
and training the intermediate feature extraction model based on a plurality of groups of training data to obtain the trained feature extraction model, the trained first clustering center and the trained second clustering center.
The invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the deep fake image detection method based on clustering decision described above when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the deep fake image detection method based on clustering decision as described in any one of the above.
According to the deep fake image detection method based on clustering decision and the related equipment provided by the invention, the face image is segmented, and both the face image and the local face images are input into the feature extraction model to extract features. Fake image detection is then performed according to the distances between the fusion feature output by the feature extraction model and the clustering centers of real images and fake images. During training, the feature extraction model is first trained with a binary classifier, so that it concentrates on the tampering traces of certain selected face regions; contrastive learning based on the clustering decision is then used to enlarge the difference between real and fake samples. As a result, the fake detection result generated from the feature distances between the face image to be detected and the clustering centers of real and fake images is more accurate, and accurate detection of deep fake images is achieved.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow diagram of the deep fake image detection method based on clustering decision provided by the invention;
FIG. 2 is a schematic diagram of the image segmentation process in the deep fake image detection method based on clustering decision provided by the invention;
FIG. 3 is a schematic diagram of the process by which the feature extraction model generates the fusion feature in the deep fake image detection method based on clustering decision provided by the invention;
FIG. 4 is a schematic diagram of the model training process in the deep fake image detection method based on clustering decision provided by the invention;
FIG. 5 is a schematic structural diagram of the deep fake image detection device based on clustering decision provided by the invention;
FIG. 6 is a schematic structural diagram of the electronic device provided by the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The deep fake image detection method based on clustering decision provided by the invention is described below with reference to fig. 1 to 4.
As shown in fig. 1, the deep fake image detection method based on clustering decision provided by the invention comprises the following steps:
S110, acquiring a face image to be detected, and performing image segmentation on the face image to be detected to obtain at least one local face image, wherein the local face image reflects a facial feature (such as an eye, the nose or the mouth) in the face image to be detected;
S120, inputting the face image to be detected and the local face image into a trained feature extraction model, and obtaining the fusion feature output by the feature extraction model;
S130, acquiring a first distance between the fusion feature and a trained first clustering center and a second distance between the fusion feature and a trained second clustering center, and determining a fake detection result of the face image to be detected according to the first distance and the second distance, wherein the fake detection result reflects whether the face image to be detected is a fake image, the first clustering center reflects the fusion features corresponding to fake images, and the second clustering center reflects the fusion features corresponding to real images.
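Step S130 can be illustrated with a short NumPy sketch. The text only states that the result is determined "according to the first distance and the second distance", so two choices here are assumptions for illustration: the Euclidean distance as the metric, and the rule that the image is judged fake when its fusion feature lies closer to the first (fake) clustering center than to the second (real) one. The function name `detect_forgery` is likewise hypothetical.

```python
import numpy as np

def detect_forgery(fusion_feature, fake_center, real_center):
    """Compare a fusion feature with the two trained clustering centers.

    Assumed decision rule: the nearer center (Euclidean distance) wins.
    """
    f = np.asarray(fusion_feature, dtype=np.float64)
    first_distance = float(np.linalg.norm(f - np.asarray(fake_center, dtype=np.float64)))
    second_distance = float(np.linalg.norm(f - np.asarray(real_center, dtype=np.float64)))
    return {
        "first_distance": first_distance,    # distance to the fake clustering center
        "second_distance": second_distance,  # distance to the real clustering center
        "is_fake": first_distance < second_distance,
    }
```

Because the decision uses a plain distance over the whole fusion feature, every feature dimension contributes with equal weight, which matches the stated motivation of not relying on a few dominant features.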
The inventors found that when a deep learning model is simply applied to detect deep fake images, most such methods can only detect specific tampering types, namely the deep fake traces present in the data set used to train the model, and the model often performs poorly on deep fake images of types it has not encountered; this can be described as insufficient generalization of the model. To make a deep fake detection model more practical than a detector of specific tampering types, one possible implementation is to design and train the model so that it autonomously attends to more general regions of feature anomalies rather than only the few regions that most influence the final result. However, such models or training strategies tend to be quite complex, and their parameter counts may be larger than those of common baseline models, making them unsuitable for deployment on devices with limited computing power and memory.
In the method provided by the invention, both the face image and the local face images are input into the feature extraction model to extract features, and fake image detection is performed according to the distances between the fusion feature output by the feature extraction model and the clustering centers of real images and fake images. During training, the feature extraction model is first trained with a binary classifier, so that it concentrates on the tampering traces of certain selected face regions, and contrastive learning based on the clustering decision is then used to enlarge the difference between real and fake samples. The fake detection result finally generated from the feature distances between the face image to be detected and the clustering centers of real and fake images is therefore more accurate, achieving accurate detection of deep fake images. Because all features are treated equally in the distance calculation, the method avoids relying excessively on particular features, which effectively improves the generalization of the model. At the same time, no complex model structure is required, and the model has fewer parameters.
Specifically, as shown in fig. 2, the face image to be detected may be a face image taken from a single image or from a video frame. The face image to be detected is segmented to obtain at least one local face image, where a local face image may be a facial feature region of the face image to be detected, such as the eyes, mouth or nose. In one possible implementation, there are four local face images: a left eye region image, a right eye region image, a nose region image and a mouth region image. An existing face detector, such as BlazeFace, may be used to segment the face image to be detected into the local face images.
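As a rough illustration of the segmentation step, the sketch below crops fixed-size square regions around facial keypoints. It does not call any real face detector: the `keypoints` dictionary stands in for whatever a detector such as BlazeFace would return, and the function name `crop_local_regions`, the region names and the `half_size` parameter are all hypothetical.

```python
import numpy as np

def crop_local_regions(face, keypoints, half_size=16):
    """Crop a square patch around each keypoint of a face image.

    face:      array of shape (H, W, C).
    keypoints: dict mapping a region name to (x, y) pixel coordinates
               (hypothetical detector output).
    half_size: half of the patch side length in pixels (illustrative value).
    """
    h, w = face.shape[:2]
    crops = {}
    for name, (x, y) in keypoints.items():
        x, y = int(round(x)), int(round(y))
        # Clamp the window to the image borders so the slice is always valid.
        x0, x1 = max(x - half_size, 0), min(x + half_size, w)
        y0, y1 = max(y - half_size, 0), min(y + half_size, h)
        crops[name] = face[y0:y1, x0:x1].copy()
    return crops

# Example with made-up keypoints for the four regions named in the text.
face = np.zeros((128, 128, 3), dtype=np.uint8)
keypoints = {"left_eye": (40, 50), "right_eye": (88, 50),
             "nose": (64, 70), "mouth": (64, 100)}
local_faces = crop_local_regions(face, keypoints)
```

Patches near the border come out smaller than the nominal size, so a real pipeline would typically resize every patch to a fixed resolution before feature extraction.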
The feature extraction model comprises a first frequency feature extraction module, a second frequency feature extraction module, a first spatial feature extraction module and a second spatial feature extraction module; the obtaining of the fusion feature output by the feature extraction model comprises the following steps:
the face image to be detected is respectively input into the first spatial feature extraction module and the first frequency feature extraction module, the first spatial feature of the face image to be detected is extracted based on the first spatial feature extraction module, and the first frequency feature of the face image to be detected is extracted based on the first frequency feature extraction module;
the local face image is respectively input into the second spatial feature extraction module and the second frequency feature extraction module, the second spatial feature of the local face image is extracted based on the second spatial feature extraction module, and the second frequency feature of the local face image is extracted based on the second frequency feature extraction module;
and splicing all the first spatial features, the first frequency features, the second spatial features and the second frequency features to obtain splicing features, and acquiring the fusion features based on the splicing features.
As shown in fig. 3, after the face image to be detected and the local face image are input into the feature extraction model, features of the face image to be detected are extracted by the first spatial feature extraction module and the first frequency feature extraction module, and features of the local face image are extracted by the second spatial feature extraction module and the second frequency feature extraction module. Specifically, the first and second spatial feature extraction modules may use existing image spatial feature extraction modules, such as convolution modules, and each first spatial feature and each second spatial feature is a vector of size 1×256. To improve detection efficiency and accuracy, the main basis for discriminating a deep fake image is concentrated on the facial features, and the rest of the face serves only as a supplementary feature: the number of channels of the first spatial feature extraction module is set smaller than that of the second spatial feature extraction module, so the first spatial feature extraction module has fewer parameters than the second. To further enhance the feature representation, the method also extracts frequency features from the face image to be detected and the local face image. Specifically, the first and second frequency feature extraction modules extract frequency features through the Discrete Cosine Transform (DCT); for each image, the average amplitude of each frequency is computed to form a 1×128 frequency vector, which serves as the first frequency feature or the second frequency feature.
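One way to obtain a 1×128 vector of average DCT amplitudes is sketched below in NumPy. The text does not say how "each frequency" is defined, so the binning is an assumption: each DCT coefficient (u, v) is assigned to one of 128 bands according to its normalised index sum, and the amplitudes in each band are averaged. The function names and the use of a single-channel (grayscale) input are also illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m

def frequency_feature(gray, n_bins=128):
    """Average DCT amplitude per frequency band for a 2-D grayscale image."""
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    # Separable 2-D DCT-II: transform the rows, then the columns.
    coeffs = dct_matrix(h) @ gray @ dct_matrix(w).T
    amplitude = np.abs(coeffs)
    # Assumed banding: normalised (u + v) / 2 in [0, 1) mapped to n_bins bands.
    u = np.arange(h)[:, None] / h
    v = np.arange(w)[None, :] / w
    band = np.minimum(((u + v) / 2.0 * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(band.ravel(), weights=amplitude.ravel(), minlength=n_bins)
    counts = np.bincount(band.ravel(), minlength=n_bins)
    return sums / np.maximum(counts, 1)  # empty bands stay at 0
```

For a constant image all of the energy sits in the lowest band, while images with fine texture or upsampling artifacts shift amplitude towards the higher bands.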
Finally, all the first frequency features, second frequency features, first spatial features and second spatial features are concatenated and then projected through a fully connected layer into a fusion feature F of size 1×200, which is used for classification.
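The concatenation and projection can be sketched with NumPy as follows. This assumes the four-region configuration mentioned earlier (left eye, right eye, nose, mouth), which gives 1 × (256 + 128) + 4 × (256 + 128) = 1920 input dimensions; the random vectors and weights are placeholders for the real module outputs and the trained fully connected layer, and `fuse_features` is a hypothetical name.

```python
import numpy as np

def fuse_features(global_spatial, global_freq, local_spatial, local_freq, weight, bias):
    """Concatenate all spatial and frequency features and project them linearly."""
    parts = [global_spatial, global_freq, *local_spatial, *local_freq]
    concat = np.concatenate([np.ravel(p) for p in parts])  # spliced feature
    return concat @ weight + bias                           # fully connected layer

rng = np.random.default_rng(0)
n_local = 4  # assumed number of local face images
g_spatial = rng.standard_normal(256)                              # first spatial feature
g_freq = rng.standard_normal(128)                                 # first frequency feature
l_spatial = [rng.standard_normal(256) for _ in range(n_local)]    # second spatial features
l_freq = [rng.standard_normal(128) for _ in range(n_local)]       # second frequency features

in_dim = (256 + 128) * (1 + n_local)            # 1920 for four local regions
W = rng.standard_normal((in_dim, 200)) * 0.01   # placeholder weights, not trained
b = np.zeros(200)

F = fuse_features(g_spatial, g_freq, l_spatial, l_freq, W, b)
print(F.shape)  # (200,)
```

With a different number of local regions only `in_dim` changes; the output fusion feature keeps its size of 200.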
In most approaches that detect fake images with a deep learning model, the features extracted from the image to be detected are input into a binary classifier, which outputs whether the image is real or fake, and the feature extraction model and the binary classifier are trained together. The method provided by the invention does not use the binary classifier for the final decision, so that all features are treated equally and generalization improves. Instead, as shown in fig. 4, the feature extraction model is first trained with a conventional binary classifier, and the centers of the positive samples and the negative samples in the feature space are obtained (positive samples correspond to real images and negative samples to fake images). After the positive and negative sample centers of the data set are obtained, the feature extraction model and the clustering centers are trained further, which separates the features of real and fake images more widely and thereby improves the generalization of fake image detection based on the feature extraction model and the clustering centers. The trained feature extraction model, the trained first clustering center and the trained second clustering center are obtained by training based on a plurality of groups of training data, wherein each group of training data comprises a sample face image to be detected and a fake detection label corresponding to the sample face image to be detected; the training process of the feature extraction model comprises the following steps:
training an initial feature extraction model and a binary classifier based on a plurality of groups of training data to obtain an intermediate feature extraction model, wherein the binary classifier is used for outputting a fake prediction result based on the fusion features output by the feature extraction model;
and training the intermediate feature extraction model based on a plurality of groups of training data to obtain the trained feature extraction model, the trained first clustering center and the trained second clustering center.
Specifically, the training the intermediate feature extraction model based on the plurality of sets of training data to obtain the trained feature extraction model, the trained first cluster center and the trained second cluster center includes:
determining initial values of a first clustering center and a second clustering center based on a plurality of groups of training data;
determining target training data batches in a plurality of groups of the training data, wherein the target training data batches comprise a plurality of groups of target training data;
dividing the sample face image to be detected in the target training data to obtain a sample local face image, inputting the sample face image to be detected and the sample local face image into the intermediate feature extraction model, and obtaining sample fusion features output by the intermediate feature extraction model;
updating parameters of the intermediate feature extraction model based on the sample fusion features and the fake detection labels corresponding to each of the target training data in the target training data batch, and updating the first cluster center and the second cluster center based on the sample fusion features and the fake detection labels corresponding to each of the target training data in the target training data batch;
and re-executing the step of determining target training data batches in a plurality of groups of training data until the number of the target training data batches reaches a preset value, finishing training when the number of the target training data batches reaches the preset value, taking the intermediate feature extraction model at the end of training as the trained feature extraction model, taking the first clustering center at the end of training as the trained first clustering center, and taking the second clustering center at the end of training as the trained second clustering center.
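The batch loop described in the steps above can be outlined as a short sketch. The callables `segment`, `extract` and `update_model` are hypothetical stand-ins for image segmentation, the intermediate feature extraction model and its parameter update; the order "parameters first, then centers" and the string labels `"fake"` and `"real"` are assumptions made for illustration.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def move_center(center, batch_features, alpha):
    """Shift a center toward the batch mean; leave it unchanged for an empty set."""
    if not batch_features:
        return list(center)
    batch_mean = mean_vector(batch_features)
    return [(1 - alpha) * c + alpha * m for c, m in zip(center, batch_mean)]


def train_stage_two(batches, segment, extract, update_model,
                    center_fake, center_real, alpha=0.1, max_batches=100):
    """Run the clustering-decision stage until max_batches batches are processed."""
    processed = 0
    for batch in batches:
        if processed >= max_batches:
            break
        # Sample fusion feature for every (image, label) pair in the batch.
        feats = [(extract(image, segment(image)), label) for image, label in batch]
        # Hypothetical hook: update model parameters using the current centers.
        update_model(feats, center_fake, center_real)
        fake = [f for f, label in feats if label == "fake"]
        real = [f for f, label in feats if label == "real"]
        center_fake = move_center(center_fake, fake, alpha)
        center_real = move_center(center_real, real, alpha)
        processed += 1
    return center_fake, center_real, processed
```

Training stops purely on the batch count, matching the preset-value criterion in the last step; no convergence test is implied.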
The initial feature extraction model and the trained feature extraction model have the same structure but different parameters. The initial feature extraction model is trained based on the plurality of groups of training data and its parameters are updated until they converge, which yields the intermediate feature extraction model. The intermediate feature extraction model, the initial feature extraction model and the trained feature extraction model therefore all have the same structure.
Based on the intermediate feature extraction model, a first cluster center reflecting the features of fake image samples and a second cluster center reflecting the features of real image samples can be obtained. Determining the initial values of the first cluster center and the second cluster center based on the plurality of groups of training data comprises:
classifying each group of training data to obtain a first set and a second set, wherein the fake detection label in the training data in the first set represents that the corresponding sample face image to be detected is a fake image, and the fake detection label in the training data in the second set represents that the corresponding sample face image to be detected is a real image;
and determining a first fusion feature corresponding to each sample face image to be detected in the first set and a second fusion feature corresponding to each sample face image to be detected in the second set based on the intermediate feature extraction model, taking the average value of the first fusion features as the initial value of the first clustering center, and taking the average value of the second fusion features as the initial value of the second clustering center.
Specifically, after the intermediate feature extraction model is obtained, the training data set is traversed: each sample face image to be detected that is a fake image is input into the intermediate feature extraction model to obtain a first fusion feature, and each sample face image to be detected that is a real image is input into the intermediate feature extraction model to obtain a second fusion feature. The average of the first fusion features is taken as the initial value of the first cluster center, and the average of the second fusion features is taken as the initial value of the second cluster center.
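The initialization amounts to a per-class mean of the fusion features. A minimal sketch, assuming the fusion features have already been computed by the intermediate model and that labels are the strings `"fake"` and `"real"` (both names are illustrative):

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def initial_centers(fusion_features, labels):
    """Return (first center from fake samples, second center from real samples)."""
    first_set = [f for f, y in zip(fusion_features, labels) if y == "fake"]
    second_set = [f for f, y in zip(fusion_features, labels) if y == "real"]
    return mean_vector(first_set), mean_vector(second_set)


features = [[0.0, 0.0], [2.0, 2.0], [10.0, 10.0], [20.0, 20.0]]
labels = ["fake", "fake", "real", "real"]
center_fake, center_real = initial_centers(features, labels)
```

In this toy example the first cluster center starts at (1, 1) and the second at (15, 15).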
Updating the parameters of the intermediate feature extraction model based on the sample fusion features and the fake detection labels corresponding to each target training data in the target training data batch includes:
calculating a loss function based on the sample fusion characteristics and the fake detection label corresponding to each target training data in the target training data batch;
updating parameters of the intermediate feature extraction model with the aim of minimizing the loss function;
wherein the loss function includes a first loss function and a second loss function, the first loss function being [formula missing from the source text] and the second loss function being [formula missing from the source text], wherein F represents the sample fusion feature corresponding to the target training data, Fake represents the set of sample fusion features corresponding to the target training data in the target training data batch whose fake detection label indicates a fake image, Real represents the set of sample fusion features corresponding to the target training data in the target training data batch whose fake detection label indicates a real image, CF represents the current first cluster center, CR represents the current second cluster center, and β is a constant.
After the initial values of the first cluster center and the second cluster center are obtained, the feature extraction model and the cluster centers are trained and updated by means of clustering decision; by further separating the features of real and fake images, the generalization ability of fake image detection based on the feature extraction model and the cluster centers is improved. Minimizing the loss function enlarges the distance between the feature centers of real images and fake images and gathers samples with the same label, with β controlling the margin that can be tolerated. Minimizing the loss function may mean minimizing a weighted sum of the first loss function and the second loss function.
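The two loss formulas themselves are not available in this text, so the following is not the patented loss. It is only an illustrative hinge-style stand-in that matches the stated behaviour (pull each feature toward its own center, push it away from the other center, tolerate a margin β, and combine both terms as a weighted sum). The Euclidean distance, the function names and the `weight` parameter are all assumptions.

```python
import math


def margin_center_loss(feature, own_center, other_center, beta):
    """Zero once the feature is at least beta closer to its own center than to the other."""
    d_own = math.dist(feature, own_center)
    d_other = math.dist(feature, other_center)
    return max(0.0, d_own - d_other + beta)


def mean_or_zero(values):
    """Mean of a list, or 0.0 when the list is empty."""
    return sum(values) / len(values) if values else 0.0


def batch_loss(fake_feats, real_feats, center_fake, center_real, beta=1.0, weight=1.0):
    """Weighted sum of an assumed first (fake) term and second (real) term."""
    first = mean_or_zero(
        [margin_center_loss(f, center_fake, center_real, beta) for f in fake_feats]
    )
    second = mean_or_zero(
        [margin_center_loss(f, center_real, center_fake, beta) for f in real_feats]
    )
    return first + weight * second
```

Under this stand-in, a batch whose features already sit on their own centers contributes no loss, while features sitting on the wrong center are penalised by the center distance plus β.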
Updating the first cluster center and the second cluster center based on the sample fusion features and the fake detection labels corresponding to each target training data in the target training data batch includes:
updating the first cluster center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label indicates a fake image;
and updating the second cluster center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label indicates a real image.
In the process of training the feature extraction model based on the clustering decision, each cluster center is updated according to the features of the current training batch and the center movement rate α; a momentum update strategy is adopted so that a center neither stays fixed nor changes position too quickly. Updating the first cluster center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label indicates a fake image comprises:
updating the first cluster center based on a first formula;
updating the second cluster center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label indicates a real image comprises:
updating the second cluster center based on a second formula;
the first formula is $CF_t = (1-\alpha)\,CF_{t-1} + \alpha\,\mathrm{Avg}\left(\sum_{F \in Fake} F\right)$ and the second formula is $CR_t = (1-\alpha)\,CR_{t-1} + \alpha\,\mathrm{Avg}\left(\sum_{F \in Real} F\right)$;
wherein $CF_t$ represents the first cluster center updated based on the t-th target training data batch, $CR_t$ represents the second cluster center updated based on the t-th target training data batch, F represents the sample fusion feature corresponding to the target training data, Fake represents the set of sample fusion features corresponding to the target training data in the target training data batch whose fake detection label indicates a fake image, Real represents the set of sample fusion features corresponding to the target training data in the target training data batch whose fake detection label indicates a real image, Avg() denotes averaging the contents of the brackets, and α is a constant.
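The two update formulas share one form, so a single helper covers both centers. This sketch reads Avg(∑ F) as the mean of the fusion features of one class in the current batch, which is an interpretation of the notation; the name `momentum_update` is illustrative.

```python
def momentum_update(center, batch_features, alpha):
    """Return (1 - alpha) * center + alpha * mean(batch_features).

    The center is returned unchanged when the batch holds no sample of the class.
    """
    if not batch_features:
        return list(center)
    n = len(batch_features)
    batch_mean = [sum(column) / n for column in zip(*batch_features)]
    return [(1 - alpha) * c + alpha * m for c, m in zip(center, batch_mean)]


previous_center = [0.0, 0.0]
fake_batch = [[2.0, 4.0], [4.0, 8.0]]  # batch mean is (3, 6)
new_center = momentum_update(previous_center, fake_batch, alpha=0.25)
```

With α = 0.25 the center moves a quarter of the way toward the batch mean, here from (0, 0) to (0.75, 1.5). At α = 0 the center would never move, and at α = 1 it would jump to each batch mean, which are the two extremes the momentum strategy is meant to avoid.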
After training, the trained feature extraction model is used for fake detection of the face image to be detected. The fusion feature corresponding to the face image to be detected is obtained with the trained feature extraction model, and the fake detection result is determined from the distances between the fusion feature and the first cluster center and the second cluster center, respectively. Specifically, when the fusion feature is closer to the first cluster center, the face image to be detected is determined to be a fake image; otherwise, it is determined to be a real image.
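The decision rule can be written in a few lines. The distance metric is not specified in the text, so Euclidean distance is assumed here; the function name and the returned strings are illustrative.

```python
import math


def detect(fusion_feature, first_center, second_center):
    """Return "fake" if the feature is strictly closer to the first (fake) center."""
    first_distance = math.dist(fusion_feature, first_center)
    second_distance = math.dist(fusion_feature, second_center)
    return "fake" if first_distance < second_distance else "real"


first_center = [0.0, 0.0]    # reflects fusion features of fake images
second_center = [4.0, 4.0]   # reflects fusion features of real images
result_near_fake = detect([1.0, 1.0], first_center, second_center)
result_near_real = detect([3.0, 3.5], first_center, second_center)
```

A feature exactly equidistant from both centers falls into the "otherwise" branch and is reported as real.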
The clustering decision-based deepfake image detection device provided by the invention is described below; the device described below and the clustering decision-based deepfake image detection method described above may be referred to in correspondence with each other. As shown in Fig. 5, the clustering decision-based deepfake image detection device provided by the invention comprises:
the image acquisition module 510 is configured to acquire a face image to be detected, and perform image segmentation on the face image to be detected to obtain at least one local face image, where the local face image reflects the facial features in the face image to be detected;
the feature fusion module 520 is configured to input the face image to be detected and the local face image into a trained feature extraction model, and obtain fusion features output by the feature extraction model;
the fake detection module 530 is configured to obtain a first distance between the fusion feature and a trained first cluster center, and a second distance between the fusion feature and a trained second cluster center, determine a fake detection result of the face image to be detected according to the first distance and the second distance, where the fake detection result reflects whether the face image to be detected is a fake image, the first cluster center reflects the fusion feature corresponding to the fake image, and the second cluster center reflects the fusion feature corresponding to a real image;
The trained feature extraction model, the trained first clustering center and the trained second clustering center are obtained by training based on a plurality of groups of training data, and each group of training data comprises a sample face image to be detected and a fake detection label corresponding to the sample face image to be detected; the training process of the feature extraction model comprises the following steps:
training an initial feature extraction model and a binary classifier based on a plurality of groups of training data to obtain an intermediate feature extraction model, wherein the binary classifier is used for outputting a fake prediction result based on the fusion features output by the feature extraction model, and the fake prediction result is used for classifying the training data into a real image or a fake image;
and training the intermediate feature extraction model based on a plurality of groups of training data to obtain the trained feature extraction model, the trained first clustering center and the trained second clustering center.
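The division into three modules can be pictured as a small composition of callables. This is a structural sketch only: `segment` and `extract` are hypothetical stand-ins for the image acquisition module 510 and the feature fusion module 520, the class name is invented, and Euclidean distance is assumed for the fake detection module 530.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class DeepfakeDetectionDevice:
    segment: Callable   # module 510: image -> list of local face images
    extract: Callable   # module 520: (image, local images) -> fusion feature
    first_center: Sequence[float]   # trained center for fake images
    second_center: Sequence[float]  # trained center for real images

    def detect(self, image):
        """Module 530: compare the two distances and return the detection result."""
        local_images = self.segment(image)
        feature = self.extract(image, local_images)
        first_distance = math.dist(feature, self.first_center)
        second_distance = math.dist(feature, self.second_center)
        return "fake" if first_distance < second_distance else "real"


# Toy stand-ins: the "image" is already a feature vector.
device = DeepfakeDetectionDevice(
    segment=lambda image: [image],
    extract=lambda image, local_images: image,
    first_center=[0.0, 0.0],
    second_center=[4.0, 4.0],
)
```

Keeping the centers as plain data next to the model reflects that they are produced by the same training process as the feature extraction model.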
Fig. 6 illustrates a schematic diagram of the physical structure of an electronic device. As shown in Fig. 6, the electronic device may include: a processor 610, a communication interface (Communications Interface) 620, a memory 630, and a communication bus 640, wherein the processor 610, the communication interface 620, and the memory 630 communicate with each other via the communication bus 640. The processor 610 may invoke logic instructions in the memory 630 to perform the clustering decision-based deepfake image detection method, the method comprising: acquiring a face image to be detected, and performing image segmentation on the face image to be detected to obtain at least one local face image, wherein the local face image reflects the facial features in the face image to be detected;
Inputting the face image to be detected and the local face image into a trained feature extraction model, and obtaining fusion features output by the feature extraction model;
respectively acquiring a first distance between the fusion feature and a trained first clustering center and a second distance between the fusion feature and a trained second clustering center, and determining a fake detection result of the face image to be detected according to the first distance and the second distance, wherein the fake detection result reflects whether the face image to be detected is a fake image or not, the first clustering center reflects the fusion feature corresponding to the fake image, and the second clustering center reflects the fusion feature corresponding to a real image;
the trained feature extraction model, the trained first clustering center and the trained second clustering center are obtained by training based on a plurality of groups of training data, and each group of training data comprises a sample face image to be detected and a fake detection label corresponding to the sample face image to be detected; the training process of the feature extraction model comprises the following steps:
training an initial feature extraction model and a binary classifier based on a plurality of groups of training data to obtain an intermediate feature extraction model, wherein the binary classifier is used for outputting a fake prediction result based on the fusion features output by the feature extraction model, and the fake prediction result is used for classifying the training data into a real image or a fake image;
And training the intermediate feature extraction model based on a plurality of groups of training data to obtain the trained feature extraction model, the trained first clustering center and the trained second clustering center.
Further, the logic instructions in the memory 630 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program, the computer program being storable on a non-transitory computer-readable storage medium; when executed by a processor, the computer program is capable of performing the clustering decision-based deepfake image detection method provided above, the method comprising: acquiring a face image to be detected, and performing image segmentation on the face image to be detected to obtain at least one local face image, wherein the local face image reflects the facial features in the face image to be detected;
Inputting the face image to be detected and the local face image into a trained feature extraction model, and obtaining fusion features output by the feature extraction model;
respectively acquiring a first distance between the fusion feature and a trained first clustering center and a second distance between the fusion feature and a trained second clustering center, and determining a fake detection result of the face image to be detected according to the first distance and the second distance, wherein the fake detection result reflects whether the face image to be detected is a fake image or not, the first clustering center reflects the fusion feature corresponding to the fake image, and the second clustering center reflects the fusion feature corresponding to a real image;
the trained feature extraction model, the trained first clustering center and the trained second clustering center are obtained by training based on a plurality of groups of training data, and each group of training data comprises a sample face image to be detected and a fake detection label corresponding to the sample face image to be detected; the training process of the feature extraction model comprises the following steps:
training an initial feature extraction model and a binary classifier based on a plurality of groups of training data to obtain an intermediate feature extraction model, wherein the binary classifier is used for outputting a fake prediction result based on the fusion features output by the feature extraction model, and the fake prediction result is used for classifying the training data into a real image or a fake image;
And training the intermediate feature extraction model based on a plurality of groups of training data to obtain the trained feature extraction model, the trained first clustering center and the trained second clustering center.
In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the clustering decision-based deepfake image detection method provided above, the method comprising: acquiring a face image to be detected, and performing image segmentation on the face image to be detected to obtain at least one local face image, wherein the local face image reflects the facial features in the face image to be detected;
inputting the face image to be detected and the local face image into a trained feature extraction model, and obtaining fusion features output by the feature extraction model;
respectively acquiring a first distance between the fusion feature and a trained first clustering center and a second distance between the fusion feature and a trained second clustering center, and determining a fake detection result of the face image to be detected according to the first distance and the second distance, wherein the fake detection result reflects whether the face image to be detected is a fake image or not, the first clustering center reflects the fusion feature corresponding to the fake image, and the second clustering center reflects the fusion feature corresponding to a real image;
The trained feature extraction model, the trained first clustering center and the trained second clustering center are obtained by training based on a plurality of groups of training data, and each group of training data comprises a sample face image to be detected and a fake detection label corresponding to the sample face image to be detected; the training process of the feature extraction model comprises the following steps:
training an initial feature extraction model and a binary classifier based on a plurality of groups of training data to obtain an intermediate feature extraction model, wherein the binary classifier is used for outputting a fake prediction result based on the fusion features output by the feature extraction model, and the fake prediction result is used for classifying the training data into a real image or a fake image;
and training the intermediate feature extraction model based on a plurality of groups of training data to obtain the trained feature extraction model, the trained first clustering center and the trained second clustering center.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A deepfake image detection method based on clustering decision, characterized by comprising:
acquiring a face image to be detected, and performing image segmentation on the face image to be detected to obtain at least one local face image, wherein the local face image reflects the facial features in the face image to be detected;
inputting the face image to be detected and the local face image into a trained feature extraction model, and obtaining fusion features output by the feature extraction model;
respectively acquiring a first distance between the fusion feature and a trained first clustering center and a second distance between the fusion feature and a trained second clustering center, and determining a fake detection result of the face image to be detected according to the first distance and the second distance, wherein the fake detection result reflects whether the face image to be detected is a fake image or not, the first clustering center reflects the fusion feature corresponding to the fake image, and the second clustering center reflects the fusion feature corresponding to a real image;
the trained feature extraction model, the trained first clustering center and the trained second clustering center are obtained by training based on a plurality of groups of training data, and each group of training data comprises a sample face image to be detected and a fake detection label corresponding to the sample face image to be detected; the training process of the feature extraction model comprises the following steps:
Training an initial feature extraction model and a binary classifier based on a plurality of groups of training data to obtain an intermediate feature extraction model, wherein the binary classifier is used for outputting a fake prediction result based on the fusion features output by the feature extraction model, and the fake prediction result is used for classifying the training data into a real image or a fake image;
and training the intermediate feature extraction model based on a plurality of groups of training data to obtain the trained feature extraction model, the trained first clustering center and the trained second clustering center.
2. The clustering decision-based deepfake image detection method of claim 1, wherein said training said intermediate feature extraction model based on said plurality of sets of training data to obtain said trained feature extraction model, said trained first cluster center and said trained second cluster center comprises:
determining initial values of a first clustering center and a second clustering center based on a plurality of groups of training data;
determining target training data batches in a plurality of groups of the training data, wherein the target training data batches comprise a plurality of groups of target training data;
Dividing the sample face image to be detected in the target training data to obtain a sample local face image, inputting the sample face image to be detected and the sample local face image into the intermediate feature extraction model, and obtaining sample fusion features output by the intermediate feature extraction model;
updating parameters of the intermediate feature extraction model based on the sample fusion features and the falsification detection tags corresponding to each of the target training data in the target training data batch, and updating the first clustering center and the second clustering center based on the sample fusion features and the falsification detection tags corresponding to each of the target training data in the target training data batch;
and re-executing the step of determining target training data batches in a plurality of groups of training data until the number of the target training data batches reaches a preset value, finishing training when the number of the target training data batches reaches the preset value, taking the intermediate feature extraction model at the end of training as the trained feature extraction model, taking the first clustering center at the end of training as the trained first clustering center, and taking the second clustering center at the end of training as the trained second clustering center.
3. The clustering decision-based deepfake image detection method of claim 2, wherein said determining initial values of said first cluster center and said second cluster center based on a plurality of sets of said training data comprises:
classifying each group of training data to obtain a first set and a second set, wherein the fake detection label in the training data in the first set represents that the corresponding sample face image to be detected is a fake image, and the fake detection label in the training data in the second set represents that the corresponding sample face image to be detected is a real image;
and determining a first fusion feature corresponding to each sample face image to be detected in the first set and a second fusion feature corresponding to each sample face image to be detected in the second set based on the intermediate feature extraction model, taking the average value of the first fusion features as the initial value of the first clustering center, and taking the average value of the second fusion features as the initial value of the second clustering center.
4. The clustering decision-based deepfake image detection method of claim 2, wherein updating parameters of the intermediate feature extraction model based on the sample fusion features and the fake detection labels corresponding to each of the target training data in the target training data batch comprises:
calculating a loss function based on the sample fusion features and the fake detection label corresponding to each target training data in the target training data batch;
updating parameters of the intermediate feature extraction model with the aim of minimizing the loss function;
wherein the loss function includes a first loss function and a second loss function, the first loss function being [formula missing from the source text] and the second loss function being [formula missing from the source text], wherein F represents the sample fusion feature corresponding to the target training data, Fake represents the set of sample fusion features corresponding to the target training data in the target training data batch whose fake detection label indicates a fake image, Real represents the set of sample fusion features corresponding to the target training data in the target training data batch whose fake detection label indicates a real image, CF represents the current first cluster center, CR represents the current second cluster center, and β is a constant.
5. The clustering decision-based deepfake image detection method of claim 2, wherein said updating said first and second cluster centers based on said sample fusion features and said fake detection labels corresponding to each of said target training data in said target training data batch comprises:
updating the first cluster center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label indicates a fake image;
and updating the second cluster center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label indicates a real image.
6. The clustering decision-based deepfake image detection method of claim 5, wherein updating the first cluster center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label indicates a fake image comprises:
updating the first cluster center based on a first formula;
updating the second cluster center according to the sample fusion features corresponding to all target training data in the target training data batch whose fake detection label indicates a real image comprises:
updating the second cluster center based on a second formula;
the first formula is $CF_t = (1-\alpha)\,CF_{t-1} + \alpha\,\mathrm{Avg}\left(\sum_{F \in Fake} F\right)$ and the second formula is $CR_t = (1-\alpha)\,CR_{t-1} + \alpha\,\mathrm{Avg}\left(\sum_{F \in Real} F\right)$;
wherein $CF_t$ represents the first cluster center updated based on the t-th target training data batch, $CR_t$ represents the second cluster center updated based on the t-th target training data batch, F represents the sample fusion feature corresponding to the target training data, Fake represents the set of sample fusion features corresponding to the target training data in the target training data batch whose fake detection label indicates a fake image, Real represents the set of sample fusion features corresponding to the target training data in the target training data batch whose fake detection label indicates a real image, Avg() denotes averaging the contents of the brackets, and α is a constant.
7. The deep fake image detection method based on clustering decision according to claim 1, wherein the feature extraction model comprises a first frequency feature extraction module, a second frequency feature extraction module, a first spatial feature extraction module and a second spatial feature extraction module; the obtaining the fusion feature output by the feature extraction model comprises:
the face image to be detected is respectively input into the first spatial feature extraction module and the first frequency feature extraction module, the first spatial feature of the face image to be detected is extracted based on the first spatial feature extraction module, and the first frequency feature of the face image to be detected is extracted based on the first frequency feature extraction module;
the local face image is respectively input into the second spatial feature extraction module and the second frequency feature extraction module, the second spatial feature of the local face image is extracted based on the second spatial feature extraction module, and the second frequency feature of the local face image is extracted based on the second frequency feature extraction module;
and splicing all the first spatial features, the first frequency features, the second spatial features and the second frequency features to obtain splicing features, and acquiring the fusion features based on the splicing features.
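The splicing step of claim 7 can be illustrated with a short NumPy sketch. The claim only states that the fusion feature is acquired "based on" the spliced feature, so the linear projection (`weight`, `bias`) used here is a hypothetical stand-in for whatever mapping the model actually learns. The function name `fuse_features` and all feature dimensions are likewise assumptions.

```python
import numpy as np

def fuse_features(s1, f1, s2_list, f2_list, weight, bias):
    """Splice global and local features, then map them to a fusion feature.

    s1, f1:           first spatial / first frequency feature of the full face image.
    s2_list, f2_list: second spatial / second frequency features, one per local face image.
    weight, bias:     hypothetical linear projection applied to the spliced feature.
    """
    # Splicing = concatenating all 1-D feature vectors end to end.
    spliced = np.concatenate([s1, f1, *s2_list, *f2_list])
    return weight @ spliced + bias

# Toy dimensions: 4-D global features, two local images with 3-D features each.
s1, f1 = np.ones(4), 2.0 * np.ones(4)
s2_list = [np.full(3, 3.0), np.full(3, 4.0)]
f2_list = [np.full(3, 5.0), np.full(3, 6.0)]
rng = np.random.default_rng(0)
weight, bias = rng.normal(size=(8, 20)), np.zeros(8)  # 20 = 4 + 4 + 2*3 + 2*3
fusion = fuse_features(s1, f1, s2_list, f2_list, weight, bias)
```

The spliced vector grows with the number of local face images, so any mapping applied afterwards has to be sized for a fixed number of local regions.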
8. A deep fake image detection device based on clustering decision, comprising:
the image acquisition module is used for acquiring a face image to be detected and carrying out image segmentation on the face image to be detected to obtain at least one local face image, wherein the local face image reflects the facial features (the five sense organs) in the face image to be detected;
the feature fusion module is used for inputting the face image to be detected and the local face image into a trained feature extraction model to obtain fusion features output by the feature extraction model;
the fake detection module is used for respectively acquiring a first distance between the fusion feature and a trained first clustering center and a second distance between the fusion feature and a trained second clustering center, determining a fake detection result of the face image to be detected according to the first distance and the second distance, wherein the fake detection result reflects whether the face image to be detected is a fake image or not, the first clustering center reflects the fusion feature corresponding to the fake image, and the second clustering center reflects the fusion feature corresponding to a real image;
The trained feature extraction model, the trained first clustering center and the trained second clustering center are obtained by training based on a plurality of groups of training data, and each group of training data comprises a sample face image to be detected and a fake detection label corresponding to the sample face image to be detected; the training process of the feature extraction model comprises the following steps:
training an initial feature extraction model and a binary classifier based on a plurality of groups of training data to obtain an intermediate feature extraction model, wherein the binary classifier is used for outputting a fake prediction result based on the fusion features output by the feature extraction model, and the fake prediction result is used for classifying the training data into a real image or a fake image;
and training the intermediate feature extraction model based on a plurality of groups of training data to obtain the trained feature extraction model, the trained first clustering center and the trained second clustering center.
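The decision made by the fake detection module in claim 8 can be sketched as a nearest-center rule. The claim says only that the result is determined "according to the first distance and the second distance"; the use of Euclidean distance and the rule "fake if closer to the first cluster center" are assumptions for illustration, as is the function name `detect_forgery`.

```python
import numpy as np

def detect_forgery(fusion, fake_center, real_center):
    """Classify a fusion feature by comparing its distances to the two cluster centers.

    Returns (label, d_fake, d_real), where label is "fake" or "real".
    Euclidean distance and the nearest-center rule are assumed, not taken from the claim.
    """
    fusion = np.asarray(fusion, dtype=float)
    d_fake = float(np.linalg.norm(fusion - fake_center))  # first distance
    d_real = float(np.linalg.norm(fusion - real_center))  # second distance
    label = "fake" if d_fake < d_real else "real"
    return label, d_fake, d_real

# Toy 2-D centers standing in for the trained first and second cluster centers.
fake_center = np.array([1.0, 1.0])
real_center = np.array([-1.0, -1.0])
label, d_fake, d_real = detect_forgery([0.8, 0.9], fake_center, real_center)
```

Here the fusion feature (0.8, 0.9) lies much closer to the fake center than to the real one, so the sketch labels it "fake". A deployed system might instead compare a ratio or margin of the two distances against a threshold.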
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the deep fake image detection method based on clustering decision according to any one of claims 1 to 7.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the deep fake image detection method based on clustering decision according to any one of claims 1 to 7.
CN202310465572.6A 2023-04-20 2023-04-20 Deep fake image detection method based on clustering decision and related equipment Pending CN116580462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310465572.6A CN116580462A (en) 2023-04-20 2023-04-20 Deep fake image detection method based on clustering decision and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310465572.6A CN116580462A (en) 2023-04-20 2023-04-20 Deep fake image detection method based on clustering decision and related equipment

Publications (1)

Publication Number Publication Date
CN116580462A (en) 2023-08-11

Family

ID=87535117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310465572.6A Pending CN116580462A (en) 2023-04-20 2023-04-20 Deep fake image detection method based on clustering decision and related equipment

Country Status (1)

Country Link
CN (1) CN116580462A (en)

Similar Documents

Publication Publication Date Title
US11188783B2 (en) Reverse neural network for object re-identification
CN112686812B (en) Bank card inclination correction detection method and device, readable storage medium and terminal
Tayab Khan et al. Smart Real‐Time Video Surveillance Platform for Drowsiness Detection Based on Eyelid Closure
CN106663157A (en) User authentication method, device for executing same, and recording medium for storing same
Türkyılmaz et al. License plate recognition system using artificial neural networks
CN103136504A (en) Face recognition method and device
CN111291863B (en) Training method of face changing identification model, face changing identification method, device and equipment
CN106203373B (en) A kind of human face in-vivo detection method based on deep vision bag of words
CN114926892A (en) Fundus image matching method and system based on deep learning and readable medium
CN115240280A (en) Construction method of human face living body detection classification model, detection classification method and device
Huang et al. Human emotion recognition based on face and facial expression detection using deep belief network under complicated backgrounds
Devadethan et al. Face detection and facial feature extraction based on a fusion of knowledge based method and morphological image processing
CN113468954B (en) Face counterfeiting detection method based on local area features under multiple channels
CN113033305B (en) Living body detection method, living body detection device, terminal equipment and storage medium
CN116524269A (en) Visual recognition detection system
CN111191549A (en) Two-stage face anti-counterfeiting detection method
CN113887357B (en) Face representation attack detection method, system, device and medium
CN116580462A (en) Deep fake image detection method based on clustering decision and related equipment
García et al. Pollen grains contour analysis on verification approach
CN111860288B (en) Face recognition method, device and system and readable storage medium
CN114005184A (en) Handwritten signature authenticity identification method and device based on small amount of samples
CN104102896B (en) A kind of method for recognizing human eye state that model is cut based on figure
CN112800941A (en) Face anti-fraud method and system based on asymmetric auxiliary information embedded network
CN111428670A (en) Face detection method, face detection device, storage medium and equipment
CN110688972A (en) System and method for improving face generation performance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination