CN116758643A - Method and device for detecting deepfake face images - Google Patents


Info

Publication number
CN116758643A
CN116758643A
Authority
CN
China
Prior art keywords
face
image
face image
partial
evaluation
Prior art date
Legal status
Pending
Application number
CN202310608618.5A
Other languages
Chinese (zh)
Inventor
罗曼
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202310608618.5A
Publication of CN116758643A


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G06V40/171Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30168Image quality inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Abstract

An embodiment of this specification discloses a method for detecting deepfake face images, comprising the following steps: acquiring a face image to be detected in response to a face image detection request; segmenting the face image to obtain face part images; for each face part image, inputting it into a corresponding authenticity evaluation sub-model to obtain an authenticity evaluation score for that part image; and obtaining a detection result for the face image based on the authenticity evaluation scores of the face part images, the detection result indicating whether the face image is forged. A corresponding apparatus for detecting deepfake face images is also disclosed.

Description

Method and device for detecting deepfake face images
Technical Field
The embodiments of this specification relate to the technical field of machine learning, and in particular to a method and a device for detecting deepfake face images.
Background
Deepfake (deep forgery) refers to techniques that use artificial intelligence and deep learning to manipulate and tamper with audio-visual media data. Current methods for detecting deepfake images generally perform a single binary classification on the whole face image; such schemes have low detection accuracy when only a local region of the face has been forged.
Disclosure of Invention
One object of the invention is to provide a method for detecting deepfake face images that examines finer-grained local face regions and yields more accurate, more interpretable detection results.
In accordance with this object, an embodiment of the present specification provides a method for detecting deepfake face images, comprising the following steps:
acquiring a face image to be detected in response to a face image detection request;
segmenting the face image to obtain face part images;
for each face part image, inputting the face part image into a corresponding authenticity evaluation sub-model to obtain an authenticity evaluation score for that part image; and
obtaining a detection result for the face image based on the authenticity evaluation scores of the face part images, the detection result indicating whether the face image is forged.
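Purely as an illustration (not taken from the patent text — the function names below are our own), the four claimed steps can be sketched in Python, with the segmentation routine, per-part sub-models, and aggregation rule passed in as callables:

```python
from typing import Any, Callable, Dict

def detect_deepfake_face(face_image: Any,
                         segment_fn: Callable,
                         submodels: Dict[str, Callable],
                         aggregate_fn: Callable) -> Any:
    """Sketch of the claimed flow: segment the face into part images,
    score each part with its own authenticity sub-model, then
    aggregate the per-part scores into a detection result."""
    parts = segment_fn(face_image)                     # step 2: segment
    scores = {name: submodels[name](img)               # step 3: per-part score
              for name, img in parts.items()}
    return aggregate_fn(scores)                        # step 4: aggregate
```

Either of the two aggregation rules described later (the any-part-forged rule or the weighted sum) can be supplied as `aggregate_fn`.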
Unlike the conventional whole-face binary classification, the embodiments of this specification detect deepfakes at the level of local face regions: the facial features are cut into image blocks covering several local regions, a refined authenticity evaluation sub-model is trained separately for each kind of face part image to score its authenticity, and the detection result for the whole face image is then obtained from the per-part authenticity evaluation scores. Because the differing difficulty of forging different facial features is taken into account, the deepfake detection result is more accurate; the per-part scores also help the user locate and verify the local regions most likely to be forged, making the result highly interpretable.
Further, in some embodiments, before the face image is segmented, the method further includes:
inputting the face image into a face quality evaluation model to obtain a face quality evaluation score for the face image; and
re-acquiring the face image when the face quality evaluation score is below a preset face quality score threshold.
Still further, in some embodiments, the face quality evaluation model is a classification model, and obtaining the face quality evaluation score for the face image includes:
extracting image features of the face image with the face quality evaluation model, and determining the face quality evaluation score corresponding to the classification result of the face image based on the model's decision boundary and the extracted image features.
Further, in some embodiments, before the face image is segmented, the method further includes:
adjusting the pose of the face region in the face image to a preset position.
Still further, in some embodiments, the face region is a frontal face region, and adjusting its pose to the preset position specifically includes:
extracting the face region from the face image and determining its key points;
determining an affine transformation matrix that maps the key points to preset standard position points; and
adjusting the face region to the preset position based on the affine transformation matrix.
Further, in some embodiments, the face part images are obtained by segmenting along the facial features and include a left-eye part image, a right-eye part image, a nose part image, and a mouth part image.
Further, in some embodiments, the face image is segmented by a face image segmentation model comprising a key point detection network and a region segmentation model; inputting the face image into the segmentation model to obtain the face part images specifically includes:
extracting key point information from the face image with the key point detection network;
inputting the key point information into the region segmentation model to obtain a correspondence between the key points and preset segmentation regions; and
determining the face part image corresponding to each preset segmentation region based on that correspondence.
Further, in some embodiments, each authenticity evaluation sub-model is pre-trained as follows:
acquiring training sample images, including non-forged positive sample images and forged negative sample images;
determining a classification supervision label for each training sample image, the label representing the authenticity evaluation score that the sample should receive;
defining an evaluation loss for the sub-model, representing the difference between the sub-model's prediction on a training sample image and that image's classification supervision label; and
inputting the training sample images into a pre-constructed network model and training the network model against the evaluation loss to obtain the authenticity evaluation sub-model.
Still further, in some embodiments, the evaluation loss is a cross-entropy loss that minimizes the difference between the sub-model's predictions on the training sample images and their classification supervision labels.
Further, in some embodiments, obtaining the detection result from the authenticity evaluation scores specifically includes:
determining, for each face part image, whether it is a forged image based on its authenticity evaluation score and an authenticity score threshold for that part; and
for a given face image, determining that a forgery is present if at least one of its face part images is forged, and that no forgery is present otherwise.
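The per-part decision rule above can be sketched as follows (a minimal illustration; the function and variable names are ours, not the patent's):

```python
def detect_by_part_thresholds(scores, thresholds):
    """Flag the face as forged if any part's authenticity score
    falls below that part's own threshold (the OR-rule above)."""
    fake_parts = [name for name, s in scores.items()
                  if s < thresholds[name]]
    return {"forged": bool(fake_parts), "fake_parts": fake_parts}
```

Returning the list of failing parts, and not just the verdict, is what lets the user locate the suspect local regions.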
Further, in some embodiments, obtaining the detection result from the authenticity evaluation scores specifically includes:
determining, for a given face image, a weight coefficient for the authenticity evaluation score of each of its face part images;
computing a weighted sum of the authenticity evaluation scores of all face part images using these weight coefficients to obtain a total authenticity score for the face image; and
obtaining the detection result by comparing the total authenticity score against a preset authenticity threshold for the face image.
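The alternative weighted-sum rule can likewise be sketched (illustrative names; the weight values would come from the per-feature forging difficulty discussed later):

```python
def detect_by_weighted_sum(scores, weights, threshold):
    """Combine per-part authenticity scores with per-part weight
    coefficients and compare the total against a whole-face
    authenticity threshold."""
    total = sum(weights[name] * s for name, s in scores.items())
    return {"total": total, "forged": total < threshold}
```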
Another object of the invention is to provide an apparatus for detecting deepfake face images that obtains more accurate evaluation results through finer-grained detection of local face regions.
In accordance with this object, an embodiment of the present specification provides an apparatus for detecting deepfake face images, comprising:
an image acquisition module configured to acquire a face image to be detected in response to a face image detection request;
a face image segmentation module configured to segment the face image into face part images;
a face part image evaluation module configured to evaluate each face part image with its authenticity evaluation sub-model to obtain a per-part authenticity evaluation score; and
a detection result generation module configured to obtain, from the authenticity evaluation scores, a detection result indicating whether the face image is forged.
Further, in some embodiments, the image acquisition module includes an image capture module and a face quality evaluation module;
the image capture module is configured to capture the face image in response to the face image detection request or to a resampling signal sent by the face quality evaluation module; and
the face quality evaluation module is configured to input the face image into a face quality evaluation model to obtain a face quality evaluation score for the face image, and to send the resampling signal to the image capture module when that score is below a preset face quality score threshold.
Further, in some embodiments, the apparatus further comprises:
an image preprocessing module configured to adjust the pose of the face region in the face image to a preset position.
Still further, in some embodiments, the face region is a frontal face region, and the image preprocessing module is specifically configured to extract the face region from the face image, determine its key points, determine an affine transformation matrix that maps the key points to preset standard position points, and adjust the face region to the preset position based on that matrix.
Further, in some embodiments, the face image segmentation module segments the face image into the face part images with a face image segmentation model comprising a key point detection network and a region segmentation model: it extracts key point information from the face image with the key point detection network, inputs that information into the region segmentation model to obtain a correspondence between the key points and preset segmentation regions, and determines the face part image corresponding to each preset segmentation region based on that correspondence.
Further, in some embodiments, the detection result generation module is specifically configured to determine whether each face part image is a forged image based on its authenticity evaluation score and a per-part authenticity score threshold, and, for a given face image, to report that a forgery is present if at least one of its face part images is forged and that no forgery is present otherwise.
Further, in some embodiments, the detection result generation module is specifically configured to determine, for a given face image, a weight coefficient for the authenticity evaluation score of each of its face part images, compute the weighted sum of those scores as the face image's total authenticity score, and obtain the detection result by comparing the total against a preset authenticity threshold for the face image.
Embodiments of this specification also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above method for detecting deepfake face images.
Embodiments of this specification also provide an electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors, storing program instructions which, when read and executed by the one or more processors, perform the above method for detecting deepfake face images.
The method for detecting deepfake face images disclosed in the embodiments of this specification has the following advantages: the face image is cut into image blocks covering several local face regions, and a refined authenticity evaluation sub-model is trained for each kind of face part image to score its authenticity, achieving finer-grained authenticity evaluation and deepfake detection; the per-feature authenticity scores also make the detection result more interpretable, guiding the user to inspect and verify the local regions most likely to be forged; overall, the ability to identify deepfake face images is improved.
Assigning different weights to different face part images accounts for the differing difficulty of forging different facial features, focusing the deepfake detection on the face regions that matter and yielding a more accurate detection result.
The apparatus for detecting deepfake face images disclosed in the embodiments of this specification has the same advantages.
Drawings
Fig. 1 schematically shows a flow chart of the method for detecting deepfake face images according to an embodiment of the present disclosure.
Fig. 2 schematically shows a flow chart of the method in one specific implementation.
Fig. 3 schematically shows the structure of the apparatus for detecting deepfake face images according to an embodiment of the present disclosure.
Detailed Description
It is first noted that the terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. In addition, the embodiments and features in the embodiments in the present specification may be combined with each other without conflict.
The method and apparatus for detecting deep counterfeited face images according to the embodiments of the present disclosure will be described in further detail below with reference to the accompanying drawings and specific embodiments of the present disclosure, but the detailed description is not meant to limit the embodiments of the present disclosure.
Face recognition algorithms collect face images, extract face information such as texture, color, and illumination features, and obtain a recognition result containing the identity of the face by analyzing and comparing this information. They are widely applied in identity authentication, real-name transactions, and similar fields, and have become a vital technology in daily life.
The deepfake face image detection method disclosed in the embodiments of this specification achieves more accurate detection through face image segmentation, which can be based on different facial characteristics. For example, the positions of the facial features can be identified by extracting face key point information, completing key-point-based segmentation; alternatively, the facial features can be distinguished by facial texture, completing texture-based segmentation.
Face key point detection is one of the cores of face recognition and a precondition for further operations such as expression analysis, three-dimensional face reconstruction, and three-dimensional face animation. A face key point detection algorithm binds key points to the face contour and the facial features; the positions of these points quantitatively represent facial expression and head motion, and when the expression or head pose changes, the change is reflected in the movement of the key points. Some embodiments of this specification use the landmark-68 algorithm, which, as its name implies, yields 68 key points bound to the face contour and the individual facial features; these key points serve as the basis for adjusting the face pose, dividing the face into regions, and checking for deepfakes. In other embodiments, the landmark-5 or landmark-101 algorithm may be chosen instead.
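For concreteness, the 68-point convention groups its indices by facial part roughly as follows (this is the commonly documented iBUG/dlib ordering, stated from general knowledge of that convention rather than from the patent itself):

```python
# Index ranges of the 68-point ("landmark-68") convention, grouped by
# the facial part each key point is bound to.
LANDMARK68_PARTS = {
    "jaw": range(0, 17),
    "right_eyebrow": range(17, 22),
    "left_eyebrow": range(22, 27),
    "nose": range(27, 36),
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "mouth": range(48, 68),
}

def part_of_keypoint(idx):
    """Map a key point index to the facial part it is bound to."""
    for name, idxs in LANDMARK68_PARTS.items():
        if idx in idxs:
            return name
    raise ValueError(f"index {idx} is outside the 68-point convention")
```

A grouping of this kind is what makes key points usable for region segmentation: each detected point carries an index, and the index determines the facial part.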
One embodiment of the present specification provides a method for detecting deepfake face images. Fig. 1 schematically shows a flow chart of the method in one implementation.
As shown in fig. 1, the method comprises the following steps:
100: and responding to the face image detection request, and acquiring a face image to be detected.
When the detection of the deep fake face image is needed, the face image to be detected can be acquired in real time by using image acquisition equipment such as a camera, or can be arranged into a data set in advance, and the face image is directly acquired from the data set when the detection is started.
When the face image is acquired, only one face can be detected at a time, so that a face area with the largest visual field range is identified from a picture captured by a camera or an image acquired from a data set by adopting a face key point detection algorithm as a face to be detected.
102: and dividing the face image to obtain a face partial image.
In some embodiments, before the face image is segmented, the method further includes:
inputting the face image into a face quality evaluation model to obtain a face quality evaluation score for the face image; and
re-acquiring the face image when the face quality evaluation score is below a preset face quality score threshold.
After the face image is obtained, its quality is evaluated. If it passes, segmentation into face part images proceeds; if not, a new face image must be acquired and evaluated in turn.
To ensure the efficiency and accuracy of deepfake detection, unqualified face images that would distort the detection result must be filtered out by quality evaluation. The quality evaluation score reflects problems such as image sharpness and completeness: sharpness covers both the whole face image and each key facial part; completeness covers whether the face is occluded, whether any key part cannot be identified, and whether the image shows the full face. The lower the score, the worse the quality: overly blurred, severely damaged or occluded, or badly incomplete face images are evaluated as failing, because running deepfake detection on them cannot guarantee an accurate result.
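The acquire-then-gate loop described above can be sketched as follows (an illustrative sketch; the function names and the retry cap are our own assumptions, not from the patent):

```python
def acquire_acceptable_face(acquire_fn, quality_fn, threshold, max_tries=5):
    """Re-acquire the face image until its quality evaluation score
    reaches the preset face quality score threshold."""
    for _ in range(max_tries):
        image = acquire_fn()
        if quality_fn(image) >= threshold:
            return image
    return None  # give up after max_tries low-quality captures
```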
In some specific embodiments, the face quality evaluation model is a classification model, and obtaining its score for a face image includes:
extracting image features of the face image with the face quality evaluation model, and determining the face quality evaluation score corresponding to the classification result of the face image based on the model's decision boundary and the extracted image features.
The image features capture the sharpness and completeness of the whole face and of its key parts; comparing these features against the model's preset decision boundary yields the face quality evaluation score and thus the pass/fail judgment.
In some more specific embodiments, a CNN model may be used for quality evaluation. CNNs are common, readily available image processing models, and through deep learning they complete the quality evaluation quickly and well.
In some embodiments, before the face image is segmented, the method further includes:
adjusting the pose of the face region in the face image to a preset position.
The pose of a face is the set of rotation angles of the head about three mutually perpendicular coordinate axes in three-dimensional space; in other words, it describes the orientation of the head. Unifying the pose of the face region brings the face into a position more convenient for segmentation, improving segmentation efficiency and quality while preventing pose differences from influencing the deepfake detection result.
In some more specific embodiments, the face region is a frontal face region, and adjusting its pose to the preset position specifically includes:
extracting the face region from the face image and determining its key points;
determining an affine transformation matrix that maps the key points to preset standard position points; and
adjusting the face region to the preset position based on the affine transformation matrix.
The key points of the face region mark the face contour and the positions of the facial features, such as the eyes, nose, and mouth, so that the face image can subsequently be processed feature by feature.
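As an illustration of the alignment step, the affine matrix mapping detected key points to the preset standard position points can be estimated by least squares (a minimal sketch with NumPy; in practice a library routine such as OpenCV's affine estimation would typically be used):

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares 2x3 affine matrix A mapping key points to the
    preset standard positions: [x', y'] = A @ [x, y, 1]."""
    src = np.asarray(src_pts, float)
    dst = np.asarray(dst_pts, float)
    X = np.hstack([src, np.ones((len(src), 1))])      # N x 3
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)       # 3 x 2
    return A.T                                        # 2 x 3

def apply_affine(A, pts):
    """Apply the 2x3 affine matrix to a list of (x, y) points."""
    pts = np.asarray(pts, float)
    return (A @ np.hstack([pts, np.ones((len(pts), 1))]).T).T
```

Applying the same matrix to the whole image (e.g. with an image-warping routine) then brings the face region to the preset position.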
In some embodiments, the face part images are obtained by segmenting along the facial features and include a left-eye part image, a right-eye part image, a nose part image, and a mouth part image.
The eyes, nose, and mouth are the basic key local regions of the face; in other embodiments, eyebrow, forehead, and cheek part images can additionally be segmented to capture the face image information in more detail, making the deepfake detection result more accurate.
In some embodiments, the face image is segmented by a face image segmentation model comprising a key point detection network and a region segmentation model; inputting the face image into the segmentation model to obtain the face part images specifically includes:
extracting key point information from the face image with the key point detection network;
inputting the key point information into the region segmentation model to obtain a correspondence between the key points and preset segmentation regions; and
determining the face part image corresponding to each preset segmentation region based on that correspondence.
After the key point information has been extracted, the region segmentation model judges, for each key point, which preset segmentation region it belongs to, establishing the correspondence between key points and regions; the key points belonging to the same preset region are then gathered to obtain the corresponding face part image. For example, if the region segmentation model judges a mouth-corner key point to belong to the preset mouth region, that key point is associated with the mouth region, and gathering all key points associated with the mouth region yields the mouth part image.
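One simple way to turn a region's gathered key points into a part image is to crop their padded bounding box (an illustrative choice on our part; the patent does not fix a particular cropping rule):

```python
def crop_part(image, keypoints, margin=4):
    """Crop the part image as the padded bounding box of the key
    points gathered for one preset segmentation region.
    `image` is a row-major 2-D array (list of rows); key points are
    (x, y) = (column, row) pairs."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    x0, x1 = max(min(xs) - margin, 0), max(xs) + margin + 1
    y0, y1 = max(min(ys) - margin, 0), max(ys) + margin + 1
    return [row[x0:x1] for row in image[y0:y1]]
```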
104: and inputting the face partial images into corresponding authenticity evaluation submodels aiming at each face partial image to obtain the authenticity evaluation score aiming at the face partial images.
The authenticity evaluation sub-model corresponding to each face partial image is trained in advance, and the authenticity evaluation sub-models corresponding to different face partial images can adopt the same network structure, and personalized adjustment of the network structure can be performed according to the forging difficulty of different face partial areas. The higher the authenticity evaluation score obtained through the authenticity evaluation, the greater the possibility that the face partial image is not forged or tampered, that is, the greater the probability that the face partial image is an actual image.
In some embodiments, the authenticity evaluation sub-model is pre-trained in the following manner:
acquiring training sample images, wherein the training sample images comprise non-forged positive sample images and forged negative sample images;
determining, for each training sample image, a binary classification supervision label of the training sample image, wherein the label is used to represent the authenticity evaluation score that the training sample image should receive;
determining an evaluation loss of the authenticity evaluation sub-model, wherein the evaluation loss represents the difference between the prediction result obtained by the authenticity evaluation sub-model for the training sample image and the binary classification supervision label of the training sample image;
inputting the training sample images into a pre-constructed network model, and training the network model based on the evaluation loss to obtain the authenticity evaluation sub-model.
The binary classification supervision labels of the training sample images comprise "not forged" and "forged". For a non-forged positive sample image, the higher the authenticity evaluation score output by the authenticity evaluation sub-model, the smaller the difference between the evaluation result and the positive sample's supervision label, and hence the smaller the evaluation loss; conversely, for a forged negative sample image, the lower the authenticity evaluation score, the smaller the difference from the negative sample's supervision label, and the smaller the evaluation loss.
In some more specific embodiments, the evaluation loss employs a cross-entropy loss function, which is minimized to reduce the difference between the prediction result of the authenticity evaluation sub-model on the training sample image and the binary classification supervision label of the training sample image.
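The evaluation loss can be written out concretely. The minimal sketch below computes the binary cross-entropy between the sub-model's predicted authenticity score (interpreted here as the probability that the image is genuine, an interpretation assumed for illustration) and the binary supervision label; it shows why a high score on a positive sample and a low score on a negative sample both yield a small loss:

```python
import math

def binary_cross_entropy(score, label):
    """Cross-entropy between the predicted authenticity score in [0, 1]
    (probability the image is genuine) and the binary supervision label
    (1 = non-forged positive sample, 0 = forged negative sample)."""
    eps = 1e-12  # clamp to avoid log(0)
    score = min(max(score, eps), 1 - eps)
    return -(label * math.log(score) + (1 - label) * math.log(1 - score))
```

For a positive sample, a score of 0.9 gives a much smaller loss than 0.5; for a negative sample, a score of 0.1 likewise gives a small loss, matching the behaviour described above.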
A corresponding authenticity evaluation sub-model is trained separately for each kind of face partial image; the loss function takes the same form in each training process, and each sub-model is trained with the goal of minimizing its own evaluation loss.
In some more specific embodiments, the authenticity evaluation sub-model may be built on a CNN network structure; of course, in other embodiments, it may instead be built on an SVM.
106: obtaining a detection result of the face image based on the authenticity evaluation score of each face partial image, wherein the detection result is used to represent whether the face image is forged.
The detection result obtained from the authenticity evaluation scores of the face partial images may take a binary classification form, for example, directly outputting that the face image is a real image or a forged image; it may also be presented as a score, for example, by calculating an authenticity total score for the whole face image from the authenticity evaluation scores of the face partial images and judging from that total score whether the face image is forged.
In some embodiments, obtaining a detection result of the face image based on the authenticity evaluation score specifically includes:
determining, for each face partial image, whether the face partial image is a forged image based on the authenticity evaluation score of the face partial image and the authenticity score threshold of the face partial image;
for the same face image, if at least one face partial image is a forged image, determining that the detection result of the face image is that a forged object exists; otherwise, determining that the detection result of the face image is that no forged object exists.
Considering that different face partial regions differ in forgery difficulty, the authenticity score threshold set for each face partial image also differs, which makes the detection result more accurate; for example, a lower authenticity score threshold is set for the eye and mouth regions, which are more difficult to forge.
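The per-region decision rule can be sketched as follows. The threshold values are illustrative assumptions (the specification only states that thresholds differ per region); a region whose authenticity score falls below its threshold is flagged as forged, and one flagged region is enough to report the whole face as forged:

```python
# Illustrative per-region thresholds (assumed values): harder-to-forge
# regions such as the eyes and mouth get a lower threshold, per the text.
REGION_THRESHOLDS = {
    "left_eye": 0.55, "right_eye": 0.55, "nose": 0.65, "mouth": 0.55,
}

def detect_by_region(scores, thresholds=REGION_THRESHOLDS):
    """A region is flagged as forged when its authenticity evaluation score
    falls below that region's threshold; the whole face image is reported
    forged if at least one region is flagged."""
    forged = [r for r, s in scores.items() if s < thresholds[r]]
    return {"forged": bool(forged), "forged_regions": forged}
```

Reporting `forged_regions` alongside the verdict is what makes the decision interpretable: the flagged list directly names the tampered area.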
By performing a deep forgery judgment on each face partial image and then deciding the overall detection result from all the partial judgments, the detection result obtained is more interpretable, and the specific tampered region can be located more easily.
In some embodiments, obtaining a detection result of the face image based on the authenticity evaluation score specifically includes:
determining a weight coefficient corresponding to the authenticity evaluation score of each face partial image in the face image aiming at the same face image;
Weighting and summing the authenticity evaluation scores of all the face partial images of the face image based on the weight coefficient to obtain the authenticity total score of the face image;
and obtaining a detection result of the face image based on the total score of the authenticity and a preset threshold value of the authenticity score of the face image.
Considering that the forging difficulty of different face partial areas is different, the weight coefficients set for different face partial images are also different, for example, compared with the nose, forehead and cheek, the forging difficulty is higher, whether the eyes and the mouth are forged or not is easy to distinguish, so that higher weights are set for the eyes and the mouth, and the final depth forging detection result is more reliable.
Exposing the authenticity evaluation of each face partial image effectively alleviates the black-box problem of the model and enhances the interpretability of the deep forgery detection result, allowing a user to focus on the regions with a higher forgery probability according to the authenticity evaluation scores.
In some other embodiments, the authenticity evaluation score of each face partial image may be directly output, and then analyzed by other methods to determine whether the face partial image is forged.
The deep forgery face image detection method described in the embodiments of the present specification is further illustrated below by a specific implementation, which does not limit the embodiments of the present specification.
Fig. 2 schematically illustrates a flow chart of a method for detecting a deep counterfeited face image according to an embodiment of the present disclosure in a specific implementation manner.
As shown in Fig. 2, in response to a face image detection request, the camera used to acquire the face image is first initialized and its parameters are set, adjusting the illumination and the camera's height and distance to suitable positions. A face key point detection algorithm (landmark-68) is then run to extract the largest face region in the camera's field of view, yielding the face image to be checked for deep forgery together with 68 face key points. The face image is input into the face quality evaluation model for quality score prediction, and the corresponding face quality evaluation score is output. When the face quality evaluation score is smaller than the preset face quality score threshold, the quality evaluation fails: a notification to re-acquire the face image is displayed on the interface, the user whose face is being acquired is reminded to adjust their pose, the face image is re-acquired, and the newly acquired image is evaluated again. If the face quality evaluation score is greater than or equal to the preset face quality score threshold, the quality evaluation passes, and the flow proceeds to the per-facial-feature authenticity scoring stage.
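The acquire-then-gate loop in this flow can be sketched as below. `capture` and `quality_model` stand in for the camera capture routine and the face quality evaluation model; both are caller-supplied callables introduced here as assumptions, and the retry cap is an illustrative safeguard not stated in the specification:

```python
def acquire_until_qualified(capture, quality_model, threshold, max_tries=5):
    """Re-acquisition loop from the flow of Fig. 2: capture a face image,
    score its quality, and re-capture until the score meets the preset
    face quality score threshold (or the retry cap is hit)."""
    for _ in range(max_tries):
        image = capture()
        if quality_model(image) >= threshold:
            return image  # qualified image proceeds to segmentation
    raise RuntimeError("no qualified face image after %d attempts" % max_tries)
```

In a real system `capture` would also trigger the on-screen re-acquisition notification; here it is kept abstract so the gating logic stands out.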
The pose of the face region in the face image is first uniformly adjusted to the preset position so that the image can be segmented more accurately. Based on the extracted 68 face key points, the whole face image is divided by facial feature into a left-eye partial image block r1, a right-eye partial image block r2, a nose partial image block r3 and a mouth partial image block r4: the face image segmentation model determines the correspondence between each key point and a preset segmentation region, and the face partial image corresponding to each preset segmentation region is obtained. The partial image blocks are then input into the corresponding authenticity evaluation sub-models M1, M2, M3 and M4 for local deep forgery detection, yielding the corresponding authenticity scores s1, s2, s3 and s4; each sub-model is a pre-trained, refined sub-model built on a CNN network structure.
Different weights w1, w2, w3 and w4 are assigned according to the differing forgery difficulty of the left-eye, right-eye, nose and mouth regions; since the eyes and mouth are more difficult to forge than the nose, whether they have been forged is easier to distinguish, so higher weights are set for the eyes and mouth. The final deep forgery detection output is a comprehensive authenticity score obtained by the weighted sum of the authenticity scores s and the weights w; the higher the score, the greater the probability that the face image has not been forged. The comprehensive authenticity score is compared with the preset face image authenticity score threshold to judge whether the face image is forged.
According to the deep forgery face image detection method provided by the embodiments of this specification, the face image is cut into image blocks of several face local regions, and a refined authenticity evaluation sub-model is trained for each face partial image to score its authenticity, achieving finer authenticity evaluation and deep forgery detection. At the same time, the per-feature authenticity scores enhance the interpretability of the detection result, guiding the user to focus on and verify the local regions with a higher forgery probability, which improves the overall ability to discriminate deep forgery face images. In addition, giving different face partial images different weights fully accounts for the differing forgery difficulty of the facial features, concentrating the deep forgery detection on the key face regions and yielding a more accurate detection result.
Another embodiment of the present specification provides a deep forgery face image detection apparatus. Fig. 3 exemplarily shows a schematic structural diagram of the deep forgery face image detection apparatus according to an embodiment of the present specification.
As shown in Fig. 3, the apparatus includes:
the image acquisition module is configured to respond to the face image detection request and acquire a face image to be detected;
The face image segmentation module is configured to segment the face image to obtain a face partial image;
the face local image evaluation module is configured to evaluate the face local images based on the authenticity evaluation sub-model to obtain the authenticity evaluation score of each face local image;
the detection result generation module is configured to obtain a detection result of the face image based on the authenticity evaluation score, wherein the detection result is used for representing whether the face image is forged or not.
In some embodiments, the image acquisition module includes an image capture module and a face quality evaluation module;
the image capture module is configured to capture a face image in response to the face image detection request or a resampling signal sent by the face quality evaluation module;
the face quality evaluation module is configured to input the face image into the face quality evaluation model to obtain a face quality evaluation score of the face quality evaluation model for the face image, and to send the resampling signal to the image capture module when the face quality evaluation score is smaller than the preset face quality score threshold.
The image capture module can capture the face image to be detected in real time using image capture equipment such as a camera, or the face images to be detected can be arranged into a dataset in advance and fetched directly from the dataset when detection starts. Since only one face can be detected at a time, a face key point detection algorithm is used to identify, from the picture captured by the camera or the image obtained from the dataset, the face region with the largest extent in the field of view as the face to be detected.
In the face quality evaluation module, if the quality evaluation is passed, the face image segmentation is continued to obtain a face partial image; if the quality evaluation is not passed, the face image is required to be acquired again, and then the quality evaluation is carried out on the newly acquired face image.
To ensure the efficiency and accuracy of deep forgery detection, the face quality evaluation module is needed to filter out unqualified face images that would affect the deep forgery detection result. The quality evaluation score can reflect quality issues such as image sharpness and completeness: sharpness covers the sharpness of the whole face image and of each key face part; completeness covers whether the acquired face image is occluded, whether key parts of the face cannot be identified, and whether the image is a full-face image. The lower the quality evaluation score, the more serious the quality problems of the face image; for example, an overly blurred, severely damaged or occluded, or overly incomplete face image will be evaluated as unqualified, and performing deep forgery detection on such an image cannot guarantee the accuracy of the result.
In some embodiments, the apparatus further comprises:
and the image preprocessing module is configured to adjust the pose of the face area in the face image to a preset position.
In some more specific embodiments, the image preprocessing module is specifically configured to extract a face area in the face image, and determine key points of the face area; determining an affine transformation matrix for adjusting the key points to the standard position points based on the key points in the face area and the preset standard position points; adjusting the face region to a preset position based on the affine transformation matrix; the face region is a frontal face region.
The pose of a face represents the angles by which the head rotates about three mutually perpendicular coordinate axes in three-dimensional space; in other words, it represents the orientation of the head. By unifying the pose of the face region in the face image, the image preprocessing module adjusts the face to a pose more convenient for image segmentation, improving both segmentation efficiency and segmentation quality, while avoiding differences in face pose affecting the deep forgery detection result.
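The affine adjustment step can be made concrete. Given three face keypoints and three preset standard position points, the 2x3 affine matrix is the exact solution of two 3x3 linear systems, one per output coordinate. The pure-Python sketch below uses Cramer's rule for self-containment; production code would typically estimate the transform from many landmarks by least squares (e.g. via an image library) rather than from exactly three points:

```python
def solve3(M, v):
    """Solve a 3x3 linear system M x = v by Cramer's rule."""
    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det3(M)
    xs = []
    for col in range(3):
        Mc = [row[:] for row in M]
        for r in range(3):
            Mc[r][col] = v[r]
        xs.append(det3(Mc) / d)
    return xs

def affine_from_points(src, dst):
    """2x3 affine matrix A mapping three source keypoints onto three
    standard position points: [x', y'] = A @ [x, y, 1]."""
    M = [[x, y, 1.0] for (x, y) in src]
    row_x = solve3(M, [p[0] for p in dst])
    row_y = solve3(M, [p[1] for p in dst])
    return [row_x, row_y]

def apply_affine(A, pt):
    """Apply the 2x3 affine matrix to one point."""
    x, y = pt
    return (A[0][0] * x + A[0][1] * y + A[0][2],
            A[1][0] * x + A[1][1] * y + A[1][2])
```

Warping every pixel of the face region with `apply_affine` moves the face to the preset standard position before segmentation.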
In some embodiments, the face image segmentation module segments according to facial features to obtain a face partial image, including: a left eye partial image, a right eye partial image, a nose partial image, and a mouth partial image.
The eyes, nose and mouth are the basic and key local areas of the face. Of course, in other embodiments, the face image segmentation module can further segment eyebrow, forehead and cheek partial images to capture the face image information in more detail, making the deep forgery detection result more accurate.
In some embodiments, the face image segmentation module segments the face image into a face partial image through a face image segmentation model; the face image segmentation model comprises a key point detection network and a region segmentation model; the face image segmentation module is specifically used for extracting key point information of a face image through a key point detection network; and inputting the key point information into the region segmentation model to obtain the corresponding relation between the key point information and the preset segmentation region, and determining the face local image corresponding to the preset segmentation region based on the corresponding relation between the key point information and the preset segmentation region.
The face image segmentation module judges which preset segmentation area the key point belongs to according to the area segmentation model aiming at each key point, so that the corresponding relation between the key point information and the preset segmentation area is established; and then, collecting key points belonging to the same preset segmentation area to obtain a corresponding face partial image.
The authenticity evaluation sub-model corresponding to each face partial image in the face partial image evaluation module is trained in advance. The sub-models for different face partial images may adopt the same network structure, or the network structure may be adjusted individually according to the forgery difficulty of different face partial regions. Supervised training is performed separately for the sub-model of each kind of face partial image; the loss function takes the same form in each training process, and each sub-model is trained with the goal of minimizing its evaluation loss. The higher the authenticity evaluation score obtained by the face partial image evaluation module, the greater the possibility that the face partial image has not been forged or tampered with, that is, the greater the probability that it is a real image.
The detection result obtained by the detection result generation module can be in a two-classification form, for example, the face image is directly output as a real image or a fake image; the method can also be presented in a score form, for example, the total score of the authenticity evaluation of the whole face image is calculated according to the authenticity evaluation score of each face partial image, and whether the face image is forged or not is judged according to the total score.
In some embodiments, the detection result generation module is specifically configured to determine whether each face partial image is a counterfeit image based on a score of the authenticity evaluation of the face partial image and a threshold of the authenticity score of the face partial image; judging whether at least one partial face image exists as a fake image according to the same face image, if so, determining that a fake object exists as a detection result of the face image; otherwise, determining that the detection result of the face image is that no fake object exists.
The detection result generation module accounts for the differing forgery difficulty of different face partial regions by setting a different authenticity score threshold for each face partial image, which makes the detection result more accurate. Performing a deep forgery judgment on each face partial image and then deciding the overall detection result from all the partial judgments makes the detection result more interpretable and the specific tampered region easier to locate.
In some embodiments, the detection result generation module is specifically configured to determine, for the same face image, a weight coefficient corresponding to a score for evaluating the authenticity of each face partial image in the face image; weighting and summing the true degree evaluation scores of all the face partial images corresponding to the face image based on the weight coefficient to obtain the true degree total score of the face image; and obtaining a detection result of the face image based on the total score of the authenticity and a preset threshold value of the authenticity score of the face image.
The detection result generation module likewise sets different weight coefficients for different face partial images according to their forgery difficulty; for example, the eyes and mouth are more difficult to forge than the nose, forehead and cheeks, so whether they have been forged is easier to distinguish, and higher weights for the eyes and mouth make the final deep forgery detection result more reliable. Exposing the authenticity evaluation of each face partial image effectively alleviates the black-box problem of the model, enhances the interpretability of the deep forgery detection result, and lets a user focus on the regions with a higher forgery probability according to the authenticity evaluation scores.
In some other embodiments, the detection result generating module may also directly output the score of the evaluation of the authenticity of each face partial image, and then analyze by using other methods and make a judgment as to whether the face partial image is forged.
An embodiment of the present specification further provides a computer-readable storage medium having a computer program stored thereon which, when executed by a processor, implements the deep forgery face image detection method described above.
One embodiment in the present specification also provides an electronic device, including:
One or more processors; and
and a memory associated with the one or more processors, the memory being configured to store program instructions that, when read and executed by the one or more processors, perform the deep forgery face image detection method described above.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
It should be noted that the above-mentioned embodiments are merely examples of the present invention, and it is obvious that the present invention is not limited to the above-mentioned embodiments, and many similar variations are possible. All modifications attainable or obvious from the present disclosure set forth herein should be deemed to be within the scope of the present disclosure.

Claims (20)

1. A method for detecting a deep forgery face image, comprising:
Responding to a face image detection request, and acquiring a face image to be detected;
dividing the face image to obtain a face partial image;
inputting the face partial images into corresponding authenticity evaluation submodels aiming at each face partial image to obtain authenticity evaluation scores aiming at the face partial images;
and obtaining a detection result of the face image based on the authenticity evaluation score of each face partial image, wherein the detection result is used for representing whether the face image is forged or not.
2. The method of claim 1, further comprising, prior to segmenting the face image:
inputting the face image into a face quality evaluation model to obtain a face quality evaluation score of the face quality evaluation model aiming at the face image;
and when the face quality evaluation score is smaller than a preset face quality score threshold, re-acquiring the face image.
3. The method of claim 2, wherein the face quality assessment model is a classification model; obtaining a face quality evaluation score of the face quality evaluation model for the face image, including:
and extracting image characteristics of the face image through the face quality evaluation model, and determining a face quality evaluation score corresponding to a classification result of the face image based on a decision boundary of the face quality evaluation model and the image characteristics.
4. The method of claim 1, further comprising, prior to segmenting the face image:
and adjusting the pose of the face region in the face image to a preset position.
5. The method of claim 4, wherein the face region is a frontal face region; the method for adjusting the pose of the face region in the face image to the preset position specifically comprises the following steps:
extracting a face region in the face image, and determining key points of the face region;
determining an affine transformation matrix for adjusting the key points to the standard position points based on the key points in the face area and the preset standard position points;
and adjusting the face area to the preset position based on the affine transformation matrix.
6. The method of claim 1, wherein the face partial image is obtained by facial feature segmentation of the face, and the face partial image comprises: a left eye partial image, a right eye partial image, a nose partial image, and a mouth partial image.
7. The method of claim 1, the face image being segmented by a face image segmentation model; the face image segmentation model comprises a key point detection network and a region segmentation model; inputting the face image into a face image segmentation model to obtain a face partial image, which specifically comprises the following steps:
Extracting key point information of the face image through the key point detection network;
inputting the key point information into the region segmentation model to obtain a corresponding relation between the key point information and a preset segmentation region;
and determining a face local image corresponding to the preset segmentation area based on the corresponding relation between the key point information and the preset segmentation area.
8. The method of claim 1, the authenticity evaluation sub-model being pre-trained in the following manner:
acquiring a training sample image, wherein the training sample image comprises a non-forged positive sample image and a forged negative sample image;
determining, for each training sample image, a binary classification supervision label of the training sample image, wherein the label is used to represent the authenticity evaluation score that the training sample image should receive;
determining an evaluation loss of the authenticity evaluation sub-model, wherein the evaluation loss represents the difference between a prediction result obtained by the authenticity evaluation sub-model for the training sample image and the binary classification supervision label of the training sample image;
inputting the training sample image into a pre-constructed network model, and training the network model based on the evaluation loss to obtain the authenticity evaluation sub-model.
9. The method of claim 8, the evaluation loss employing a cross-entropy loss function for minimizing a difference between a prediction result of the authenticity evaluation sub-model on the training sample image and the binary classification supervision label of the training sample image.
10. The method of claim 1, wherein the obtaining the detection result of the face image based on the authenticity evaluation score specifically includes:
determining, for each face partial image, whether the face partial image is a forged image based on the authenticity evaluation score of the face partial image and the authenticity score threshold of the face partial image;
aiming at the same face image, if at least one face partial image is a fake image, determining that a fake object exists as a detection result of the face image; otherwise, determining that the detection result of the face image is that no fake object exists.
11. The method of claim 1, wherein the obtaining the detection result of the face image based on the authenticity evaluation score specifically includes:
determining a weight coefficient corresponding to the authenticity evaluation score of each face partial image in the face image aiming at the same face image;
Weighting and summing the authenticity evaluation scores of all the face partial images of the face image based on the weight coefficient to obtain the authenticity total score of the face image;
and obtaining a detection result of the face image based on the authenticity total score and a preset authenticity score threshold of the face image.
12. A deep forgery face image detection apparatus, comprising:
the image acquisition module is configured to respond to the face image detection request and acquire a face image to be detected;
the face image segmentation module is configured to segment the face image to obtain a face partial image;
the face local image evaluation module is configured to evaluate the face local images based on the authenticity evaluation sub-model to obtain the authenticity evaluation score of each face local image;
and the detection result generation module is configured to obtain a detection result of the face image based on the authenticity evaluation score, wherein the detection result is used for representing whether the face image is forged or not.
13. The apparatus of claim 12, the image acquisition module comprising an image capture module and a face quality evaluation module;
the image capture module being configured to capture the face image in response to the face image detection request or a resampling signal sent by the face quality evaluation module;
the face quality evaluation module being configured to input the face image into a face quality evaluation model to obtain a face quality evaluation score of the face quality evaluation model for the face image, and to send the resampling signal to the image capture module when the face quality evaluation score is smaller than a preset face quality score threshold.
14. The apparatus of claim 12, further comprising:
and the image preprocessing module is configured to adjust the pose of the face area in the face image to a preset position.
15. The apparatus of claim 14, wherein the image preprocessing module is specifically configured to extract a face region in the face image, and determine a key point of the face region; determining an affine transformation matrix for adjusting the key points to the standard position points based on the key points in the face area and the preset standard position points; adjusting the face region to the preset position based on the affine transformation matrix; the face area is a frontal face area.
16. The apparatus of claim 12, wherein the face image segmentation module segments the face image into the face partial images through a face image segmentation model, the face image segmentation model comprising a key point detection network and a region segmentation model; the face image segmentation module is specifically configured to: extract key point information of the face image through the key point detection network; input the key point information into the region segmentation model to obtain a correspondence between the key point information and preset segmentation regions; and determine the face partial image corresponding to each preset segmentation region based on that correspondence.
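One simple way to realize the region segmentation of claim 16 is to group detected key points by preset region and crop each region's bounding box as a face partial image. This is only a sketch under that assumption; the region names, key-point grouping, and margin are hypothetical.

```python
import numpy as np

def segment_face(image, keypoints_by_region, margin=2):
    """Crop one partial image per preset region from the bounding box of its
    key points, padded by a small margin and clamped to the image bounds."""
    partials = {}
    h, w = image.shape[:2]
    for region, points in keypoints_by_region.items():
        pts = np.asarray(points)                      # rows are (x, y)
        x0, y0 = np.maximum(pts.min(axis=0) - margin, 0)
        x1, y1 = np.minimum(pts.max(axis=0) + margin, [w - 1, h - 1])
        partials[region] = image[y0:y1 + 1, x0:x1 + 1]
    return partials

image = np.arange(100 * 100).reshape(100, 100)        # stand-in for a face image
regions = {"left_eye": [(20, 30), (35, 38)], "mouth": [(40, 70), (60, 80)]}
partials = segment_face(image, regions)
```

Each crop would then be fed to the authenticity evaluation sub-model of claim 12.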
17. The apparatus of claim 12, wherein the detection result generation module is specifically configured to: determine whether each face partial image is a forged image based on the authenticity evaluation score of that face partial image and an authenticity score threshold for that face partial image; and, for the same face image, judge whether at least one face partial image is a forged image; if so, determine that the detection result of the face image is that a forgery exists; otherwise, determine that the detection result of the face image is that no forgery exists.
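The per-region decision rule of claim 17 — flag the whole face as forged as soon as any one partial image falls below its threshold — can be sketched as follows. The region names, scores, and thresholds are illustrative values, not values from the patent.

```python
def detect_forgery(scores, thresholds):
    """scores/thresholds map region name -> authenticity score / threshold.
    A region counts as forged when its score is below its threshold."""
    forged_regions = [r for r, s in scores.items() if s < thresholds[r]]
    return {"forged": bool(forged_regions), "regions": forged_regions}

result = detect_forgery(
    scores={"eyes": 0.91, "nose": 0.45, "mouth": 0.88},
    thresholds={"eyes": 0.6, "nose": 0.6, "mouth": 0.6},
)
```

Here the low nose score alone is enough to report that a forgery exists.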
18. The apparatus of claim 12, wherein the detection result generation module is specifically configured to: determine, for the same face image, a weight coefficient corresponding to the authenticity evaluation score of each face partial image in the face image; weight and sum the authenticity evaluation scores of all the face partial images of the face image based on the weight coefficients to obtain a total authenticity score of the face image; and obtain the detection result of the face image based on the total authenticity score and a preset face image authenticity threshold.
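The aggregate rule of claim 18 is a weighted sum of the per-region scores compared against a whole-image threshold. A minimal sketch; the weights, scores, and threshold below are illustrative.

```python
def weighted_total(scores, weights):
    """Combine per-region authenticity scores using per-region weights."""
    return sum(scores[r] * weights[r] for r in scores)

scores = {"eyes": 0.9, "nose": 0.5, "mouth": 0.8}
weights = {"eyes": 0.4, "nose": 0.3, "mouth": 0.3}   # weights sum to 1.0
total = weighted_total(scores, weights)               # 0.36 + 0.15 + 0.24 = 0.75
is_real = total >= 0.7                                # preset whole-image threshold
```

Unlike claim 17's any-region rule, one weak region here can be offset by strong scores elsewhere, so the two claims trade sensitivity for robustness differently.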
19. A computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any of claims 1 to 11.
20. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors, the memory configured to store program instructions that, when read and executed by the one or more processors, perform the steps of the method of any of claims 1 to 11.
CN202310608618.5A 2023-05-26 2023-05-26 Method and device for detecting deeply forged face image Pending CN116758643A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310608618.5A CN116758643A (en) 2023-05-26 2023-05-26 Method and device for detecting deeply forged face image


Publications (1)

Publication Number Publication Date
CN116758643A true CN116758643A (en) 2023-09-15

Family

ID=87958072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310608618.5A Pending CN116758643A (en) 2023-05-26 2023-05-26 Method and device for detecting deeply forged face image

Country Status (1)

Country Link
CN (1) CN116758643A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117095239A (en) * 2023-10-16 2023-11-21 光轮智能(北京)科技有限公司 Simulation asset authenticity evaluation method, control device and storage medium
CN117095239B (en) * 2023-10-16 2023-12-15 光轮智能(北京)科技有限公司 Simulation asset authenticity evaluation method, control device and storage medium

Similar Documents

Publication Publication Date Title
Yuan et al. Fingerprint liveness detection using an improved CNN with image scale equalization
CN101558431B (en) Face authentication device
JP5629803B2 (en) Image processing apparatus, imaging apparatus, and image processing method
JP4743823B2 (en) Image processing apparatus, imaging apparatus, and image processing method
CN105740779B (en) Method and device for detecting living human face
KR20080033486A (en) Automatic biometric identification based on face recognition and support vector machines
CN108182409A (en) Biopsy method, device, equipment and storage medium
CN105095827A (en) Facial expression recognition device and facial expression recognition method
Bigun et al. Assuring liveness in biometric identity authentication by real-time face tracking
CN108846269A (en) One kind is towards manifold identity identifying method and identification authentication system
CN113537027A (en) Face depth forgery detection method and system based on facial segmentation
CN116758643A (en) Method and device for detecting deeply forged face image
Dey et al. Computer vision based gender detection from facial image
Saeed A framework for recognition of facial expression using HOG features
CN113627256B (en) False video inspection method and system based on blink synchronization and binocular movement detection
CN108875549A (en) Image-recognizing method, device, system and computer storage medium
Ngxande et al. Detecting inter-sectional accuracy differences in driver drowsiness detection algorithms
RU2005100267A (en) METHOD AND SYSTEM OF AUTOMATIC VERIFICATION OF THE PRESENCE OF A LIVING FACE OF A HUMAN IN BIOMETRIC SECURITY SYSTEMS
Albalawi et al. A comprehensive overview on biometric authentication systems using artificial intelligence techniques
Boncolmo et al. Gender Identification Using Keras Model Through Detection of Face
Scherhag Face Morphing and Morphing Attack Detection
CN111428670B (en) Face detection method, face detection device, storage medium and equipment
CN101739571A (en) Block principal component analysis-based device for confirming face
Goranin et al. Evolutionary Algorithms Application Analysis in Biometric Systems.
CN113920567A (en) ConvLSTM network model-based face image age estimation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination