CN111768336B - Face image processing method and device, computer equipment and storage medium

Info

Publication number
CN111768336B
CN111768336B (application CN202010659115.7A)
Authority
CN
China
Prior art keywords
image
sample image
face
positive
reconstructed
Prior art date
Legal status
Active
Application number
CN202010659115.7A
Other languages
Chinese (zh)
Other versions
CN111768336A (en)
Inventor
姚太平
张克越
吴双
孟嘉
丁守鸿
李季檩
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010659115.7A
Publication of CN111768336A
Application granted
Publication of CN111768336B
Legal status: Active
Anticipated expiration


Classifications

    • G06T 3/04 Context-preserving transformations, e.g. by using an importance map
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06V 40/168 Feature extraction; Face representation
    • G06V 40/172 Classification, e.g. identification
    • G06V 40/40 Spoof detection, e.g. liveness detection


Abstract

The embodiments of this application disclose a face image processing method and apparatus, a computer device, and a storage medium. Positive and negative sample images can be obtained, where the positive sample image is a real face image, the negative sample image is a forged face image, and the label of each sample image includes a real face image probability. At least one block of image content is replaced in the positive and negative sample images respectively to obtain reconstructed positive and negative sample images, and the real face image probability in the labels of the reconstructed positive and negative sample images is adjusted accordingly. The data distribution of the positive and negative sample images is thereby changed, so a face detection model trained on these sample images can, to a certain extent, ignore the influence of the data distribution, which improves its generalization capability; furthermore, from the labels of the reconstructed positive and negative sample images, the face detection model learns to give classification results that are not overly certain, alleviating the model overconfidence problem.

Description

Face image processing method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technology, and in particular to a face image processing method and apparatus, a computer device, and a storage medium.
Background
At present, in deployed face recognition systems, liveness detection serves as an important component, safeguarding face recognition by resisting face spoofing attacks.
In the related art, liveness detection is often implemented with a deep neural network model, and training such a network model requires massive amounts of training data. Owing to their strong data-fitting capability, neural networks can perform remarkably well on the training set; when applied in real production environments, however, their generalization capability is insufficient, and when they encounter data from a different distribution they may confidently give wrong answers, so that performance drops sharply.
Disclosure of Invention
The embodiments of the invention provide a face image processing method, a face image processing apparatus, and a storage medium, which can effectively alleviate the overconfidence problem of networks used for face recognition and improve the generalization capability of the model.
The embodiment of the invention provides a face image processing method, which comprises the following steps:
acquiring positive and negative sample images, wherein the positive sample image is a real face image, the negative sample image is a forged face image, and labels of the positive and negative sample images comprise real face image probabilities;
respectively replacing at least one block of image content of the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image;
adjusting the real face image probability in the labels of the reconstructed positive sample image and the reconstructed negative sample image, respectively, based on the proportion of the replaced image content in the whole image content of the reconstructed positive and negative sample images, wherein the replaced image content in the reconstructed positive sample image and the reconstructed negative sample image is regarded as negative sample image content and positive sample image content, respectively;
and training a face detection model based on the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image to obtain a trained face detection model.
The embodiment of the invention also provides a face image processing device, which comprises:
the sample acquisition unit is used for acquiring positive and negative sample images, wherein the positive sample image is a real face image, the negative sample image is a forged face image, and the labels of the positive and negative sample images comprise: the probability of a real face image and the probability of a fake face image;
the image reconstruction unit is used for respectively replacing at least one block of image content of the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image;
the label resetting unit is used for respectively adjusting the probability of the real face image in the label of the reconstructed positive sample image and the probability of the real face image in the label of the reconstructed negative sample image based on the proportion of the replaced image content in the reconstructed positive sample image and the reconstructed negative sample image to the whole image content, wherein the replaced image content in the reconstructed positive sample image and the reconstructed negative sample image is respectively regarded as the negative sample image content and the positive sample image content;
and the model training unit is used for training the face detection model based on the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image to obtain the trained face detection model.
In some embodiments of the present invention, there may also be provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method as described above when executing the computer program.
In some embodiments of the invention, there may also be provided a storage medium having stored thereon a computer program which, when run on a computer, causes the computer to perform the steps of the method as described above.
By adopting the embodiments of this application, positive and negative sample images can be obtained, where the positive sample image is a real face image, the negative sample image is a forged face image, and the labels of the positive and negative sample images include a real face image probability. By replacing at least one block of image content in the positive sample image and the negative sample image respectively, a reconstructed positive sample image and a reconstructed negative sample image are obtained, and the real face image probability in their labels is modified based on the proportion of the replaced image content in the whole image content, with the replaced content in the reconstructed positive and negative sample images regarded as negative and positive sample image content respectively. The reconstructed images therefore differ in data distribution from the original positive and negative sample images; a face detection model trained on these sample images gains generalization capability, and the labels of the reconstructed samples teach the model to give classification results that are not overly certain, which effectively alleviates the model overconfidence problem.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained based on these drawings without creative efforts.
Fig. 1a is a schematic structural diagram of a face image processing system according to an embodiment of the present invention;
fig. 1b is a flowchart of a face image processing method according to an embodiment of the present invention;
FIG. 2a is a schematic diagram of the image processing flow for an original face image according to an embodiment of the present invention;
FIG. 2b is a schematic diagram of deconstruction and collage of positive and negative sample images according to an embodiment of the present invention;
fig. 2c is a schematic structural diagram of a face detection model according to an embodiment of the present invention;
FIG. 2d is a schematic structural diagram of another face detection model according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a face image processing apparatus according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a face image processing method and device, computer equipment and a storage medium.
The embodiment of the invention provides a face image processing system, which comprises a face image processing device suitable for computer equipment. The computer device may be a terminal or a server.
The terminal can be a mobile phone, a tablet computer, a notebook computer and other terminal equipment, and also can be wearable equipment, an intelligent television or other intelligent terminals with display modules.
The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN, a big data and artificial intelligence platform, but is not limited thereto.
The face image processing apparatus of this embodiment may be integrated in a face image processing terminal, and specifically, may be integrated in the face image processing terminal in an application program manner.
Referring to fig. 1a, the face image processing system provided by the present embodiment includes a face image processing terminal 10, a face detection terminal 20, and the like.
The face image processing terminal 10 may be configured to: acquire positive and negative sample images, wherein the positive sample image is a real face image, the negative sample image is a forged face image, and the labels of the positive and negative sample images include a real face image probability; replace at least one block of image content in the positive sample image and the negative sample image respectively to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image; adjust the real face image probability in the labels of the reconstructed positive and negative sample images respectively, based on the proportion of the replaced image content in the whole image content, wherein the replaced image content in the reconstructed positive sample image and the reconstructed negative sample image is regarded as negative sample image content and positive sample image content respectively; and train a face detection model based on the positive and negative sample images, the reconstructed positive sample image, and the reconstructed negative sample image to obtain a trained face detection model.
The face detection terminal 20 may be configured to: obtain a face image to be detected; extract image feature information of the face image to be detected through a feature extraction module; perform real face image classification prediction based on the image feature information through a classification module to obtain the predicted real face image probability and predicted forged face image probability of the face image to be detected; perform feature regression on the image feature information through a feature regression module to obtain predicted feature information of the face image to be detected in a preset image feature dimension; determine a detection score of the face image to be detected in the preset image feature dimension based on the predicted feature information; determine a total detection score of the face image to be detected based on the predicted real face image probability and the detection score; and when the total detection score is greater than a preset score threshold, determine that the face image to be detected is a real face image, otherwise determine that it is a forged face image.
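As an illustration only, this detection flow might be sketched as follows in Python. Everything here is an assumption: `encoder`, `classifier`, and `depth_head` stand in for the feature extraction, classification, and feature regression modules, and the rule for combining the predicted real face image probability with the depth-dimension score (adding the mean of the regressed depth map) is one plausible choice, not a formula prescribed by the patent.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def detect(encoder, classifier, depth_head, img, score_thresh=0.8):
    """Hypothetical inference flow for the face detection terminal 20.

    img: a (1, 3, H, W) tensor. The class order (real, fake) and the
    score combination rule are assumptions for illustration.
    """
    feat = encoder(img)                           # image feature information
    probs = F.softmax(classifier(feat), dim=1)    # (real, fake) probabilities
    real_prob = probs[0, 0].item()                # predicted real face image probability
    depth_score = depth_head(feat).mean().item()  # detection score in the depth dimension
    total_score = real_prob + depth_score         # total detection score (assumed rule)
    return total_score > score_thresh             # True -> real face image
```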
The following are detailed below. It should be noted that the following description of the embodiments is not intended to limit the preferred order of the embodiments.
The embodiments of the present invention will be described in terms of a face image processing apparatus, which may be specifically integrated in a terminal, for example, in the form of an application program.
The embodiment of the invention provides a face image processing method, which can be executed by a processor of a terminal.
The face detection model in this embodiment is an application of Computer Vision technology. Computer Vision (CV) is a science that studies how to make machines "see": it uses cameras and computers in place of human eyes to recognize, track, and measure targets, and further performs graphics processing so that the result is an image more suitable for human observation or for transmission to an instrument for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can capture information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, simultaneous localization and mapping, and other technologies, as well as common biometric technologies such as face recognition and fingerprint recognition.
The trained face detection model in this embodiment is a model capable of performing face detection. It is implemented based on AI (Artificial Intelligence) technology, and in particular on Computer Vision (CV) and Machine Learning (ML) within artificial intelligence.
The face detection model in this embodiment may be constructed based on artificial neural network techniques in machine learning, which simulate or implement human learning behavior to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve performance.
As shown in fig. 1b, the flow of the face image processing method may be as follows:
101. acquiring positive and negative sample images, wherein the positive sample image is a real face image, the negative sample image is a forged face image, and labels of the positive and negative sample images comprise real face image probabilities;
In this embodiment, a real face image contains a real face, while a forged face image contains a forged face rather than a real one. The real face image may be captured with a camera, a terminal, or the like; the forged face image may be obtained from an existing image or video by re-shooting, screen capture, copying, and so on, or it may be a simulated face image produced by image generation, which is not limited in this embodiment.
The real face image probability of the present embodiment represents the probability that the image is a real face image. It can be understood that, for a real face image, the probability of the real face image is 1, and for a fake face image, the probability of the real face image is 0.
In one example, the labels of the positive and negative sample images may include only the real face image probability mentioned above. Since the real face image probability and the forged face image probability of a sample image sum to 1, knowing either probability determines the other, so the forged face image probability of the positive and negative sample images can be calculated only when needed. The forged face image probability represents the probability that an image is a forged face image: for a real face image it is 0, and for a forged face image it is 1.
For example, suppose a positive sample image is denoted by A and a negative sample image by B, with labels label_A and label_B respectively. label_A and label_B may carry only one kind of identification information, such as the real face image probability, which identifies whether the image corresponding to the label is a real face image or a forged face image. For instance, label_A = 1 indicates that the real face image probability of image A is 1, i.e., image A is a real face image, and label_B = 0 indicates that the real face image probability of image B is 0, i.e., image B is a forged face image.
In another example, the labels of the positive and negative sample images may include both the above two probabilities, that is, the label of the positive sample image includes the probability of the real face image and the probability of the fake face image, and the label of the negative sample image also includes the probability of the real face image and the probability of the fake face image.
For example, the labels label_A and label_B of the positive and negative sample images may each include both probabilities. For a positive sample image, label_A = (1, 0), where 1 and 0 indicate that the real face image probability of the image is 1 and the forged face image probability is 0. For a negative sample image, label_B = (0, 1), where 0 and 1 indicate that the real face image probability of the image is 0 and the forged face image probability is 1.
In this embodiment, the step of acquiring the positive and negative sample images may include:
after an original face image is obtained, carrying out face detection on the original face image to determine a face area in the original face image;
in an original face image, a face area is expanded by taking the face area as a center to obtain an expanded face area;
intercepting an image of the expanded face area from an original face image to serve as a positive sample image;
and acquiring a negative sample image, wherein the negative sample image comprises a forged face.
The original face image may be obtained by photographing a real person, and the negative sample image may be obtained by re-photographing or screen-capturing an existing video or image.
Referring to the image processing flow shown in Fig. 2a: after the original face image is obtained, the face region in which the user's face is located can be framed with a face detection technique (see the image labeled 21 in Fig. 2a). The face region is then enlarged about its center to a preset multiple of its area, for example 1.8 times, to obtain the enlarged face region (see the image labeled 22 in Fig. 2a), and the image of the enlarged face region is cropped from the original face image to serve as the positive sample image (see the image labeled 23 in Fig. 2a).
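As a minimal sketch of this cropping step, assuming a face detector that returns a bounding box `(x, y, w, h)` and a NumPy image in H × W × C layout (the detector itself and whether the 1.8× factor applies to the area or the side length are assumptions):

```python
def crop_face_region(img, box, scale=1.8):
    """Enlarge the detected face box about its center by `scale` and crop.

    img: H x W x C NumPy array; box: (x, y, w, h) from any face detector.
    Here `scale` is applied to the side lengths (an assumption).
    """
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0   # center of the face region
    nw, nh = w * scale, h * scale       # enlarged width and height
    x0, y0 = max(0, int(cx - nw / 2)), max(0, int(cy - nh / 2))
    x1, y1 = min(img.shape[1], int(cx + nw / 2)), min(img.shape[0], int(cy + nh / 2))
    return img[y0:y1, x0:x1]            # positive sample image
```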
Further, the image proportion of the face in the negative sample image is preferably close to that in the positive sample image, for example both within a specific range such as 80% to 90%, to avoid a face proportion that is too small, which would degrade the deconstruction and collage effects on the positive and negative sample images in subsequent steps.
If the image sizes are inconsistent, the positive and negative sample images are resized so that all of them have the same image size.
In this embodiment, the negative sample image may be generated by a face generator, for example, a face generator may generate a forged face image as the negative sample image.
Alternatively, the negative sample image may be acquired in a similar manner to the positive sample image. For example, acquiring negative sample images includes:
acquiring an original forged face image, carrying out face detection on the original forged face image, and determining a face region in the original forged face image;
in the original forged face image, a face area is expanded by taking the face area as a center to obtain an expanded face area;
and intercepting the image of the expanded face area from the original forged face image as a negative sample image.
In this embodiment, after the positive and negative sample images are obtained, label setting may be performed on the positive and negative sample images, the label of the positive sample image may be set to (1, 0), and the label of the negative sample image may be set to (0, 1).
102. Respectively replacing at least one block of image content of the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image;
in this example, the content replacement of the positive sample image and the negative sample image may be implemented according to a face part included in the replaced image content in the positive sample image and the negative sample image.
For example, for a positive sample image, image content to be replaced in the image is determined first, replacement image content containing the same face part is obtained based on the face part in the image content, and the replacement image content is used for replacing the image content to be replaced in the positive sample image, so that a reconstructed positive sample image is obtained.
Similarly for the negative sample image, the image content to be replaced in the image can be determined first, the replacement image content containing the same face part is obtained based on the face part in the image content, and the replacement image content is used for replacing the image content to be replaced in the negative sample image to obtain the reconstructed negative sample image.
In another example, reconstruction of positive and negative sample images may be achieved based on content interchange between the positive and negative sample images.
Optionally, the step of respectively performing replacement of at least one block of image content on the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image may include:
determining a positive sample image and a negative sample image of which the image contents need to be exchanged;
and exchanging the image content of at least one block of the same position in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
Specifically, N positive sample images and N negative sample images may first be randomly drawn from the positive and negative sample images to form N positive-negative sample image pairs, and the image content of the positive sample image and the negative sample image in each pair is then exchanged.
Alternatively, the exchange of positive and negative sample images may be implemented based on the division of the images. The step of exchanging image contents of at least one block of same position in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image may include:
dividing the positive sample image and the negative sample image into image blocks with the same quantity according to the same division rule;
and selecting at least one image block positioned at the same position from the positive sample image and the negative sample image for exchanging to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
In this embodiment, the exchanged image blocks may be selected in such a way that the number of exchanged blocks is not exactly the same across different positive-negative sample image pairs, and the positions of the actually exchanged image blocks also differ across pairs. This strengthens the content deconstruction of the sample images and improves the generalization capability of the model.
Optionally, the step of selecting at least one image block located at the same position from the positive sample image and the negative sample image to perform switching to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image may include:
randomly selecting a numerical value from a preset numerical value range as the image block exchange quantity of the positive sample image and the negative sample image;
numbering the image blocks of the positive and negative sample images to obtain an image block number sequence;
the sequence of the numbers in the image block number sequence is disordered to obtain the disordered image block number sequence;
selecting the number of the exchange number of the image blocks from the image block number sequence after disorder as an exchange number;
and exchanging the image blocks indicated by the exchange numbers in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
The numerical values included in the preset numerical range may be integers, which is not limited in this embodiment.
For example, suppose N positive sample images and N negative sample images are obtained, denoted img_Ai and img_Bi, i ∈ [1, N], with label_A = 1 for img_Ai and label_B = 0 for img_Bi. Each sample image is uniformly divided into M × M image blocks. For any pair img_Ai and img_Bi, a number B is sampled from the uniform distribution U[0, 2] as the number of image blocks to be exchanged between the positive and negative sample images. The block number sequence range(0, M × M) is then shuffled to give a new sequence L, and the first B entries of L are taken out and denoted index_j, j ∈ [0, B). The block specified by each index_j is exchanged between img_Ai and img_Bi, yielding the deconstructed and collaged images img_cutmix_Ai and img_cutmix_Bi.
For example, referring to the sample image deconstruction and collage diagram in Fig. 2b, img_Ai and img_Bi are a positive and a negative sample image, respectively. A number B is randomly sampled from the uniform distribution U[0, 2] (suppose B = 2) as the number of blocks to be exchanged, and the positive and negative sample images are divided into 3 × 3 image blocks, giving the block number sequence range(0, 9). Shuffling this sequence yields a new sequence L, say L = (2, 4, 0, 8, 5, 1, 3, 7, 6). The first 2 block numbers are taken from L and denoted index_j, j ∈ [0, 2), and the block specified by each index_j is exchanged between img_Ai and img_Bi. As shown in Fig. 2b, exchanging the blocks numbered 2 and 4 in img_Ai and img_Bi produces the deconstructed and collaged images img_cutmix_Ai and img_cutmix_Bi (i.e., the reconstructed positive and negative sample images).
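The procedure just described might be rendered as the following sketch, using the stated parameters (an M × M grid with M = 3 and B sampled from U[0, 2]). This is an illustrative reading of the steps, not the patent's code:

```python
import numpy as np

def cutmix_pair(img_a, img_b, grid=3, max_blocks=2, rng=None):
    """Exchange B randomly chosen, same-position grid blocks between a
    positive sample img_a and a negative sample img_b.

    Both images are H x W x C NumPy arrays with H and W divisible by `grid`.
    Returns the reconstructed pair and the chosen block numbers (index_j).
    """
    rng = rng or np.random.default_rng()
    h, w = img_a.shape[:2]
    bh, bw = h // grid, w // grid

    num_swap = int(rng.integers(0, max_blocks + 1))  # B ~ U[0, max_blocks]
    order = rng.permutation(grid * grid)             # shuffled block numbers L
    swap_ids = order[:num_swap]                      # first B entries: index_j

    out_a, out_b = img_a.copy(), img_b.copy()
    for idx in swap_ids:
        r, c = divmod(int(idx), grid)                # block row / column
        ys = slice(r * bh, (r + 1) * bh)
        xs = slice(c * bw, (c + 1) * bw)
        out_a[ys, xs], out_b[ys, xs] = img_b[ys, xs].copy(), img_a[ys, xs].copy()

    return out_a, out_b, swap_ids    # img_cutmix_Ai, img_cutmix_Bi, index_j
```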
In one example, instead of dividing the sample image into image blocks, some image content may be randomly selected each time for exchange. Optionally, the step of exchanging the image content of at least one same position in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image may include:
randomly selecting at least one piece of image content in the positive sample image as a first exchange image content;
determining second exchange image content located at the same position in the negative sample image based on the position of the first exchange image content in the positive sample image;
and exchanging the first exchange image content and the second exchange image content which correspond to each other in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
In this example, the image content of the positive and negative sample image exchange is the same in position and size in both images.
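A minimal sketch of this grid-free variant, assuming same-sized NumPy images; the bounds on the randomly chosen region size are illustrative assumptions:

```python
import numpy as np

def swap_random_region(img_a, img_b, rng=None):
    """Swap one randomly placed, randomly sized rectangle at the same
    position and of the same size in both images."""
    rng = rng or np.random.default_rng()
    h, w = img_a.shape[:2]
    rh = int(rng.integers(1, h // 2))     # region height (assumed bound)
    rw = int(rng.integers(1, w // 2))     # region width (assumed bound)
    y = int(rng.integers(0, h - rh + 1))  # top-left corner, same in both images
    x = int(rng.integers(0, w - rw + 1))
    out_a, out_b = img_a.copy(), img_b.copy()
    out_a[y:y + rh, x:x + rw] = img_b[y:y + rh, x:x + rw]
    out_b[y:y + rh, x:x + rw] = img_a[y:y + rh, x:x + rw]
    return out_a, out_b
```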
103. Adjusting the real face image probability in the labels of the reconstructed positive sample image and the reconstructed negative sample image, respectively, based on the proportion of the replaced image content in the whole image content, wherein the replaced image content in the reconstructed positive sample image and the reconstructed negative sample image is regarded as negative sample image content and positive sample image content, respectively;
it can be understood that after the image content is replaced, a part of the image content in each of the positive sample image and the negative sample image is not the original image content of itself, and the proportion of the part of the image content in the image can be used for modifying the label.
The step of respectively adjusting the probabilities of the real face images in the labels of the reconstructed positive and negative sample images based on the proportion of the replaced image content to the whole image content in the reconstructed positive and negative sample images may include:
calculating, based on the replaced image content in the reconstructed positive sample image, the proportion of the un-replaced positive sample image content in the reconstructed positive sample image, as the positive sample content proportion of the reconstructed positive sample image;
adjusting the real face image probability of the reconstructed positive sample image based on the positive sample image content proportion of the reconstructed positive sample image;
calculating the proportion of the replaced image content in the reconstructed negative sample image, as the positive sample content proportion of the reconstructed negative sample image;
and adjusting the probability of the real face image of the reconstructed negative sample image based on the content proportion of the positive sample image of the reconstructed negative sample image.
The positive sample image content proportion in the sample image can be determined as the true face image probability.
For example, regarding the replaced image content in a reconstructed positive sample image as negative sample image content, suppose the calculated negative sample content proportion of the reconstructed positive sample image is 0.18, so that the positive sample content proportion (the proportion of the original image content remaining in the reconstructed positive sample image) is 0.82. The label of the reconstructed positive sample image may then include 0.82; if the label originally records both probabilities, it includes (0.82, 0.18), i.e., the real face image probability of the reconstructed positive sample image is 0.82 and the forged face image probability is 0.18.
In the scheme of dividing the image blocks of the positive and negative sample images and then exchanging the image blocks with each other to obtain the reconstructed positive and negative sample images, the various proportions can be calculated based on the number of the exchanged image blocks and the total number of the image blocks of the sample images.
Optionally, the step of "calculating a ratio of the positive sample image content in the reconstructed positive sample image that is not replaced to the positive sample image content in the reconstructed positive sample image based on the replaced image content in the reconstructed positive sample image" may include:
and for the reconstructed positive sample image, counting the actual positive sample image blocks based on the negative sample image blocks, and calculating the proportion of the positive sample image blocks to all the image blocks to serve as the content proportion of the positive sample image.
Optionally, if the actual probability of the forged face image of the reconstructed positive sample image needs to be calculated, the result of subtracting the content ratio of the positive sample image from 1 may be calculated, or the ratio of the negative sample image block to all the image blocks may be calculated as the content ratio of the negative sample image as the probability of the forged face image.
Optionally, the step of "calculating a ratio of the replaced image content in the reconstructed negative sample image to the reconstructed negative sample image, and taking the ratio as a ratio of the positive sample image content in the reconstructed negative sample image" may include:
for the reconstructed negative sample image, the ratio of the positive sample image block to all the image blocks is calculated as the positive sample image content ratio.
In this embodiment, for reconstructing the positive and negative sample images, the positive sample image content ratio of the image may be used as the probability of the real face image.
If the forged face image probability of the reconstructed positive and negative sample images is required, after the positive sample content proportion is calculated, the value obtained by subtracting the positive sample content proportion from 1 is used as the negative sample content proportion, and the negative sample content proportion is taken as the forged face image probability.
For example, continuing with images A and B: after B image blocks are exchanged between the positive and negative sample images, the real face image probability of the reconstructed positive sample image is calculated as

$$label_{real}(img\_cutmix_{Ai}) = \frac{M \times M - B}{M \times M}$$

and the forged face image probability as

$$label_{fake}(img\_cutmix_{Ai}) = \frac{B}{M \times M}$$

For the reconstructed negative sample image, the real face image probability is calculated as

$$label_{real}(img\_cutmix_{Bi}) = \frac{B}{M \times M}$$

and the forged face image probability as

$$label_{fake}(img\_cutmix_{Bi}) = \frac{M \times M - B}{M \times M}$$
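In code, the label adjustment reduces to the four formulas above; the `(real, fake)` tuple ordering is an assumption for illustration:

```python
def soft_labels(num_swap, grid=3):
    """Soft labels after exchanging `num_swap` of the grid*grid blocks
    between a positive and a negative sample image."""
    total = grid * grid
    real_pos = (total - num_swap) / total   # real face image probability of img_cutmix_Ai
    label_pos = (real_pos, 1.0 - real_pos)  # reconstructed positive sample: (real, fake)
    label_neg = (1.0 - real_pos, real_pos)  # reconstructed negative sample: (real, fake)
    return label_pos, label_neg

# With a 3 x 3 grid and B = 2 exchanged blocks:
# soft_labels(2) -> ((7/9, 2/9), (2/9, 7/9))
```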
104. And training the face detection model based on the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image to obtain the trained face detection model.
In one example, the face detection model of the present embodiment includes: a feature extraction module and a classification module;
the step of training the face detection model based on the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image to obtain the trained face detection model may include:
taking the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image as training sample images of the face detection model to be trained;
extracting image characteristic information of a training sample image through a characteristic extraction module;
performing classification prediction on a real face image and a forged face image on a training sample image through a classification module based on image characteristic information;
calculating the classification loss of the face detection model based on the classification prediction result of the training sample image and the label of the training sample image;
and based on the classification loss, adjusting parameters of the face detection model to obtain the trained face detection model.
In one example, the classification module has two classification outputs. One is the real face image class, whose result is the predicted real face image probability, indicating the probability, as predicted by the classification module, that the image is a real face image; the other is the forged face image class, whose result is the predicted forged face image probability, indicating the probability, as predicted by the classification module, that the image is a forged face image.
The feature extraction module and the classification module can be implemented with any network structure in the related art that supports feature extraction and detection classification. Optionally, the feature extraction module may include several convolutional layers for extracting image feature information, and the classification module may be built from several fully connected layers, although the structures of the two modules are not limited to convolutional and fully connected layers.
For example, referring to Fig. 2c, the face detection model 20 includes a feature extraction module 201 and a classification module 202. The training sample images fed to the feature extraction module 201 include the positive and negative sample images img_Ai and img_Bi and the reconstructed positive and negative sample images img_cutmix_Ai and img_cutmix_Bi. The feature extraction module 201 extracts the image feature information Z of a training sample image, and the classification module 202 performs classification prediction based on Z.
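A minimal PyTorch-style sketch of this two-module structure is given below; the layer sizes and depths are illustrative assumptions, not the network actually used in the patent:

```python
import torch.nn as nn

class FaceDetectionModel(nn.Module):
    """Sketch of the encoder + classifier layout of Fig. 2c."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(                  # feature extraction module 201
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Sequential(               # classification module 202
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 2),                          # (real, fake) logits
        )

    def forward(self, x):
        z = self.encoder(x)        # image feature information Z
        return self.classifier(z)  # logits; softmax is applied in the loss
```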
In one example, after the face image processing is finished, a face image to be detected can be obtained; extracting image characteristic information of a human face image to be detected through a characteristic extraction module; and performing classification prediction on the real face image and the forged face image through a classification module based on the image characteristic information to obtain the predicted real face image probability and the predicted forged face image probability of the face image to be detected, and determining whether the face image to be detected is the real face image based on the predicted real face image probability.
In one example, the classification prediction result of a training sample image includes the predicted real face image probability and the predicted forged face image probability of the sample image. The step of calculating the classification loss of the face detection model based on the classification prediction result of the training sample image and the label of the training sample image may include:
calculating a first classification loss of the training sample image based on the real face image probability in the label of the training sample image and the predicted real face image probability in the classification prediction result;
determining the actual forged face image probability of the training sample image based on the real face image probability of the training sample image;
calculating a second classification loss of the training sample image based on the actual probability of the forged face image of the training sample image and the probability of the predicted forged face image in the classification prediction result;
and obtaining the total classification loss of the face detection model based on the first classification loss and the second classification loss of the training sample image.
In one example, the first classification loss and the second classification loss are weighted and summed to obtain the total classification loss of the face detection model.
In this embodiment, for each training sample image, the first classification loss and the second classification loss are calculated.
The total classification loss of the face detection model is illustrated below. Let x_i denote the i-th training sample image in the current batch and y_i the label corresponding to that training sample image, which is (1, 0) for a positive sample image and (0, 1) for a negative sample image; the labels of the reconstructed positive and negative sample images img_cutmix_Ai and img_cutmix_Bi are determined according to the scheme described above and are not repeated here.
After a training sample image x_i is processed by the feature extraction module and the classification module of the face detection model and the classification result is normalized, two probabilities in the range 0 to 1 are obtained, namely the predicted real face image probability and the predicted forged face image probability, and the total classification loss can then be calculated as:

$$Loss = -\frac{1}{N}\sum_{i=1}^{N} y_i \cdot \log\left(\mathrm{softmax}\big(\mathrm{Classifier}(\mathrm{Enc}(x_i))\big)\right)$$
Here, softmax(Classifier(Enc(x_i))) denotes the classification prediction result of training sample image x_i, which includes the two predicted probabilities of x_i, assumed to be (0.12, 0.88); y_i is the label corresponding to training sample image x_i, which includes the real face image probability and the forged face image probability, assumed to be (0.2, 0.8).
For training sample image x_i, the corresponding loss based on the above formula for the total classification loss is calculated as follows:

$$Loss_i = -\big(0.2 \cdot \log 0.12 + 0.8 \cdot \log 0.88\big) \approx 0.53$$
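The soft-label cross-entropy and the worked example can be checked with a short PyTorch sketch; the function name and tensor shapes are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def soft_label_cross_entropy(logits, soft_targets):
    """-(1/N) * sum_i y_i . log softmax(logits_i), with soft labels y_i."""
    log_p = F.log_softmax(logits, dim=1)
    return -(soft_targets * log_p).sum(dim=1).mean()

# Worked example from the text: predictions (0.12, 0.88), label (0.2, 0.8).
p = torch.tensor([0.12, 0.88])  # already-normalized predicted probabilities
y = torch.tensor([0.2, 0.8])    # soft label of the training sample image
loss_i = -(y * p.log()).sum()   # = 0.2*(-log 0.12) + 0.8*(-log 0.88) ~ 0.53
```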
in one example, the face detection model further comprises a feature regression module connected to the feature extraction module; the face image processing method further comprises the following steps:
acquiring actual feature information of the positive and negative sample images in a preset image feature dimension;
acquiring actual characteristic information of the reconstructed positive sample image and the reconstructed negative sample image in a preset image characteristic dimension;
Before the parameters of the face detection model are adjusted based on the classification loss to obtain the trained face detection model, the method further includes:
performing feature regression on the image feature information of the training sample image through a feature regression module to obtain predicted feature information of the training sample image in a preset image feature dimension;
obtaining the dimension loss of the face detection model in the preset image feature dimension based on the actual feature information and the predicted feature information of the training sample image in the preset image feature dimension;
based on the classification loss, adjusting parameters of the face detection model to obtain the trained face detection model, including:
and adjusting parameters of the face detection model based on the classification loss and the dimension loss to obtain the trained face detection model.
For example, referring to the face detection model shown in fig. 2d, compared to the structure shown in fig. 2c, the feature extraction module 201 is further followed by a feature regression module 203. The feature regression module 203 and the classification module 202 share the feature extraction module 201.
In the example shown in Fig. 2d, after a training sample image is input into the feature extraction module 201, the feature extraction module 201 outputs image feature information, such as a feature map, to the feature regression module 203 and the classification module 202. The classification module 202 performs real face image and forged face image prediction based on the feature map to obtain the predicted real face image probability and predicted forged face image probability, while the feature regression module 203 performs feature regression (or feature mapping) based on the feature map, mapping the information in the feature map to the preset image feature dimension to obtain the predicted feature information in the preset image feature dimension.
In an example of this embodiment, the actual feature information in the preset image feature dimension includes, but is not limited to, an LBP (Local Binary Pattern) map and a HOG (Histogram of Oriented Gradients) map. LBP is an operator used to describe the local texture features of an image; the HOG feature is a feature descriptor used for object detection in computer vision and image processing, constructed by computing and aggregating histograms of gradient directions over local regions of an image.
In another example, the actual feature information in the preset image feature dimension may also be an actual face depth map in the face depth dimension.
Optionally, the feature regression module includes a depth feature regression module, and the step of performing feature regression on the image feature information of the training sample image through the feature regression module to obtain predicted feature information of the training sample image in the preset image feature dimension may include:
performing depth regression on the image feature information of the training sample image through a depth feature regression module to obtain a predicted face depth map of the training sample image;
and the step of obtaining the dimension loss of the face detection model in the preset image feature dimension based on the actual feature information and the predicted feature information of the training sample image in that dimension may include:
calculating the loss of the depth map based on the actual face depth map and the predicted face depth map of the same training sample image;
determining the depth map loss as the dimension loss of the face detection model in the face depth dimension.
The depth feature regression module in this embodiment includes depth feature regression parameters and can map the image feature information extracted by the feature extraction module to a face depth map.
In this embodiment, based on the classification loss and the dimension loss, adjusting parameters of the face detection model to obtain the trained face detection model may include:
and adjusting parameters of the classification module and the feature extraction module based on the classification loss, and adjusting parameters of the feature regression module and the feature extraction module based on the dimension loss.
For example, referring to the module structure shown in fig. 2d, assuming that the feature regression module 203 is a depth feature regression module, the module 203 performs depth feature regression (or depth feature mapping) based on the feature map Z output by the module 201, and maps information in the feature map to a face depth dimension to obtain a predicted face depth map of the training sample image.
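A hypothetical joint training step combining the two losses might look as follows. The shared-encoder wiring, the use of MSE as the depth-map loss, and the loss weighting are assumptions consistent with the description (the patent does not fix these choices); `classifier` is assumed to pool the feature map internally, and `depth_head` to be a small convolutional decoder:

```python
import torch.nn.functional as F

def train_step(encoder, classifier, depth_head, imgs, soft_targets,
               gt_depths, optimizer, depth_weight=1.0):
    feat_map = encoder(imgs)            # shared feature extraction module
    logits = classifier(feat_map)       # classification branch
    pred_depth = depth_head(feat_map)   # depth feature regression branch

    # Soft-label classification loss (trains classifier + encoder).
    cls_loss = -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
    # Depth-map dimension loss (trains depth_head + encoder); MSE is an assumption.
    depth_loss = F.mse_loss(pred_depth, gt_depths)

    loss = cls_loss + depth_weight * depth_loss  # weighting is an assumption
    optimizer.zero_grad()
    loss.backward()                              # gradients reach all three modules
    optimizer.step()
    return cls_loss.item(), depth_loss.item()
```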
In this embodiment, the step of "obtaining the positive and negative sample images, and presetting the actual feature information of the image feature dimension" may include:
carrying out depth estimation on a face region in the positive sample image to obtain an actual face depth image of the positive sample image;
and for the negative sample image, setting a depth map without depth information as an actual face depth map of the negative sample image.
The depth estimation may be implemented with an existing depth estimation model; alternatively, if the original face image corresponding to the positive sample image was captured by a depth camera, the actual face depth map of the positive sample image may be determined from the depth information in the image captured by the depth camera.
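As a sketch, the ground-truth depth maps might be constructed as follows; `depth_estimator` stands for any off-the-shelf monocular depth estimation model and is an assumption, as is the map size:

```python
import numpy as np

def make_gt_depth(img, is_real, depth_estimator=None, size=(32, 32)):
    """Actual face depth map for training: estimated depth for a real face,
    an all-zero (black) map, i.e. no depth information, for a forged one."""
    if is_real:
        return depth_estimator(img)          # actual face depth map
    return np.zeros(size, dtype=np.float32)  # black map: no depth information
```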
For example, referring to Fig. 2a, after the positive sample image is obtained by cropping, depth recognition is performed on it through a depth estimation model to obtain the actual face depth map of the positive sample image (see the depth map labeled 24 in Fig. 2a). Taking positive sample image A and negative sample image B as examples, their actual face depth maps can be denoted depth_Ai and depth_Bi respectively, where depth_Bi may be set to a black image, indicating that it contains no depth information.
In one example, the step of acquiring the actual feature information of the reconstructed positive sample image and the reconstructed negative sample image in the preset image feature dimension may include the following steps:
determining an actual face depth map of a positive sample image corresponding to the reconstructed positive sample image as a first initial face depth map of the reconstructed positive sample image;
replacing the depth information at the same position in the first initial face depth map with the depth information of the content of the negative sample image based on the position of the content of the negative sample image in the reconstructed positive sample image to obtain an actual face depth map of the reconstructed positive sample image;
determining the actual face depth map of the negative sample image corresponding to the reconstructed negative sample image as a second initial face depth map of the reconstructed negative sample image;
and replacing the depth information at the same position in the second initial face depth map with the depth information of the content of the positive sample image based on the position of the content of the positive sample image in the reconstructed negative sample image to obtain the actual face depth map of the reconstructed negative sample image.
In this embodiment, the depth information corresponding to the negative sample image content is no depth information, and an image area without depth information may be represented in black on the face depth map.
In the scenario where image reconstruction is realized by replacement rather than by exchanging positive and negative sample image content, replacing the depth information at the same position in the first initial face depth map with the depth information of the negative sample image content, based on the position of the negative sample image content in the reconstructed positive sample image, amounts to replacing the depth map content at that position in the first initial face depth map with a black image. Conversely, when replacing the depth information at the same position in the second initial face depth map with the depth information of the positive sample image content, based on the position of the positive sample image content in the reconstructed negative sample image, a depth map of the positive sample image content in the reconstructed negative sample image may be obtained first, and the black content at the same position as the positive sample image content in the second initial face depth map is then replaced with that depth map.
In the example of obtaining the reconstructed positive and negative sample images through positive and negative sample image content exchange, the image content in the face depth map may be exchanged while the image content of the positive and negative sample images is exchanged.
Referring to fig. 2b, taking the positive sample image A and the negative sample image B as examples: when the positive sample image img_Ai and the negative sample image img_Bi are divided into image blocks according to an M x M grid, the actual face depth maps depth_Ai and depth_Bi of the positive sample image A and the negative sample image B are also divided into M x M depth tiles, so that the number of depth tiles corresponds to the number of image blocks in the positive and negative sample images. When the image blocks specified by each index_j (j ∈ [0, B)) are exchanged between img_Ai and img_Bi, the depth tiles specified by index_j in the actual face depth maps depth_Ai and depth_Bi are exchanged at the same time, yielding the reconstructed images and depth maps img_cutmix_Ai, img_cutmix_Bi, depth_cutmix_Ai and depth_cutmix_Bi, where depth_Bi may be set to a black image, indicating that there is no depth information therein. Here depth_cutmix_Ai and depth_cutmix_Bi respectively represent the actual face depth maps of the reconstructed positive and negative sample images img_cutmix_Ai and img_cutmix_Bi.
See fig. 2b: after the depth tiles numbered 2 and 4 in depth_Ai are swapped with the depth tiles numbered 2 and 4 in depth_Bi, the depth tiles numbered 2 and 4 in depth_cutmix_Ai are black, while the depth tiles numbered 2 and 4 in depth_cutmix_Bi are not black and carry depth information.
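The joint exchange can be sketched as follows, assuming images and depth maps of identical size divided into an M x M grid of equal tiles (function and variable names are illustrative):

```python
import numpy as np

def swap_tiles(img_a, img_b, depth_a, depth_b, m, swap_indices):
    """Exchange the grid tiles listed in swap_indices (numbered row-major,
    0 .. m*m-1) between image A and image B, and exchange the tiles at the
    same numbers between their actual face depth maps at the same time."""
    img_a, img_b = img_a.copy(), img_b.copy()
    depth_a, depth_b = depth_a.copy(), depth_b.copy()
    th, tw = img_a.shape[0] // m, img_a.shape[1] // m          # tile size
    for idx in swap_indices:
        r, c = divmod(idx, m)
        ys, xs = slice(r * th, (r + 1) * th), slice(c * tw, (c + 1) * tw)
        img_a[ys, xs], img_b[ys, xs] = img_b[ys, xs].copy(), img_a[ys, xs].copy()
        depth_a[ys, xs], depth_b[ys, xs] = depth_b[ys, xs].copy(), depth_a[ys, xs].copy()
    # img_cutmix_A, img_cutmix_B and their actual depth maps depth_cutmix_A, depth_cutmix_B
    return img_a, img_b, depth_a, depth_b
```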
In this embodiment, the depth map loss may be calculated by any available loss calculation method in the related art, and this embodiment is not limited thereto. In a depth map, each pixel value represents the actual distance from an image capturing device, such as a sensor, to the object captured in the image; the depth map loss can therefore be calculated based on the pixel values in the depth maps.
In one example, the step "calculating a depth map loss based on an actual face depth map and a predicted face depth map of the same training sample image" may include:
and calculating the absolute difference of each pixel between the actual face depth map and the predicted face depth map of the same training sample image, and averaging to obtain the average absolute pixel difference as the depth map loss.
Optionally, in one example, the dimension loss of the face detection model in the face depth dimension, i.e. the depth map loss $l_{depth}$, is calculated as follows:

$$l_{depth} = \frac{1}{|X|} \sum_{x \in X} \operatorname{mean}\left( \left| dep_x - \operatorname{Depth}(\operatorname{Enc}(x)) \right| \right)$$

wherein $X$ denotes the training sample images in the current batch, comprising the real face images (i.e. positive sample images), the forged face images (i.e. negative sample images), and the reconstructed positive and negative sample images img_cutmix; $dep_x$ is the actual face depth map corresponding to the training sample image $x$. A training sample image $x$ is input into the face detection model to obtain the prediction result, namely the predicted face depth map Depth(Enc(x)); the absolute difference of each pixel (the absolute differences of pixels at the same position in the two depth maps) between the actual face depth map $dep_x$ and the prediction result Depth(Enc(x)) is then calculated and averaged, and the resulting average absolute pixel difference is $l_{depth}$.
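In code, this per-pixel mean absolute difference is simply an L1 loss over the batch of depth maps; a hedged PyTorch sketch:

```python
import torch

def depth_map_loss(pred_depth: torch.Tensor, actual_depth: torch.Tensor) -> torch.Tensor:
    """l_depth: absolute difference of pixels at the same position in the
    predicted and actual face depth maps, averaged over all pixels and all
    training samples in the current batch."""
    return (pred_depth - actual_depth).abs().mean()

# Equivalently: torch.nn.functional.l1_loss(pred_depth, actual_depth)
```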
In one example, after the parameters of the face detection model are adjusted based on the classification loss and the dimension loss to obtain the trained face detection model, the method further includes:
acquiring a human face image to be detected;
extracting image characteristic information of a face image to be detected through a characteristic extraction module;
classifying and predicting the real face image and the forged face image based on the image characteristic information through a classification module to obtain the predicted real face image probability and the predicted forged face image probability of the face image to be detected;
performing feature regression on the image feature information through a feature regression module to obtain predicted feature information of the face image to be detected in a preset image feature dimension;
determining the detection score of the face image to be detected on the preset image feature dimension based on the predicted feature information;
determining the total detection score of the face image to be detected based on the probability of predicting the real face image and the detection score;
and when the total detection score is greater than a preset score threshold, determining that the face image to be detected is a real face image, and otherwise determining that the face image to be detected is a forged face image.
When the feature regression module is a depth feature regression module, the step of performing feature regression on the image feature information through the feature regression module to obtain the predicted feature information of the face image to be detected in the preset image feature dimension may include: performing depth feature regression on the image feature information through the depth feature regression module to obtain a predicted face depth map of the face image to be detected.
After the face detection model of this embodiment is trained, it has learned the depth information rules of real face image content and forged face image content, namely that forged face image content carries no depth information. It can be understood that, for the face detection model of this embodiment, the predicted face depth map recognized from a face image to be detected may contain a black region, i.e. a region without depth information, indicating that the image region belongs to forged face image content.
Correspondingly, the step of determining the detection score of the face image to be detected in the preset image feature dimension based on the predicted feature information may include:
and calculating an average pixel value of the predicted face depth map, and determining the detection score of the face image to be detected in the face depth dimension based on the average pixel value.
In one example, the average pixel value may be used as a detection score of the face image to be detected in the face depth dimension.
In another example, a maximum pixel value may be obtained from the predicted face depth map, an average pixel value of the predicted face depth map is obtained, and the average pixel value is normalized based on the maximum pixel value so as to be in a range of 0-1, for example, a ratio of the average pixel value to the maximum pixel value is used as a detection score of the face image to be detected in the face depth dimension.
In one embodiment, the predicted real face image probability and the detection score may be weighted and summed to obtain the total detection score of the face image to be detected. The preset score threshold in this embodiment may be set according to the strictness of the detection and/or according to experience, and this embodiment is not limited in this respect.
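Putting the scoring steps together, a minimal sketch might look as follows; the weights and the score threshold are assumptions to be set per deployment, not values fixed by this application:

```python
import numpy as np

def detect_face(pred_real_prob, pred_depth_map, w_cls=0.5, w_depth=0.5, threshold=0.5):
    """Combine the predicted real face image probability with the detection
    score in the face depth dimension. Weights/threshold are illustrative."""
    max_px = float(pred_depth_map.max())
    # Depth-dimension detection score: average pixel value of the predicted
    # face depth map, normalized to [0, 1] by its maximum pixel value.
    depth_score = float(pred_depth_map.mean()) / max_px if max_px > 0 else 0.0
    total_score = w_cls * pred_real_prob + w_depth * depth_score  # weighted sum
    return ("real" if total_score > threshold else "forged"), total_score
```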
In this embodiment, the trained face detection model may be deployed in a required device. The device may be a networked device or an unconnected device, and is set according to actual needs, which is not limited in this embodiment.
Taking a residential area face recognition access control system as an example, the face detection model of this embodiment can serve as the face detection module of the face recognition access control system and be integrated into the access control system. The access control system includes an image acquisition module; after acquiring a face image to be detected, the image acquisition module transmits it to the face detection module, which recognizes it using the packaged face detection model (for the specific recognition process, refer to the description above). After the detection result is obtained (i.e. the face image to be detected is a real face image or a forged face image), whether to open the door is determined based on the detection result and the user authentication result of the face-based residential user authentication module in the access control system.
Taking a client providing a face detection function as an example, assume that a client application on a terminal provides a face detection entry for a user. When the user triggers the face detection entry, the terminal can call a camera, for example the front camera, to collect a face image to be detected, and then transmits the face image to be detected to the server corresponding to the client application. The server is deployed with the face detection model trained in this embodiment; it detects the face image to be detected based on the face detection model to obtain a face detection result, namely whether the face image to be detected is a real face image or a forged face image, and sends the detection result to the terminal. Based on the result, the terminal can judge whether a real face is in front of the camera and, further, combined with other face-based recognition results, judge whether the identity authentication of the user in front of the camera passes.
By adopting the embodiment of the application, the image contents in the positive and negative sample images can be exchanged to obtain the reconstructed positive and negative sample images, and the labels of the reconstructed positive and negative sample images are modified based on the exchange. The data distribution in the positive and negative sample images thereby changes, so training on the positive and negative sample images together with the reconstructed positive and negative sample images improves the generalization capability of the resulting face detection model; moreover, based on the modified labels of the reconstructed positive and negative sample images, the face detection model can learn to give classification results that are not overly confident, effectively alleviating the problem of model over-confidence.
In order to better implement the method, correspondingly, the embodiment of the invention further provides a face image processing apparatus, which may be specifically integrated in a terminal or other computer device.
Referring to fig. 3, the apparatus includes:
a sample obtaining unit 301, configured to obtain positive and negative sample images, where the positive sample image is a real face image, the negative sample image is a forged face image, and labels of the positive and negative sample images include probabilities of the real face image;
an image reconstruction unit 302, configured to perform at least one block of image content replacement on the positive sample image and the negative sample image, respectively, to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image;
a label resetting unit 303, configured to adjust probabilities of real face images in labels of the reconstructed positive and negative sample images based on a ratio of the replaced image content in the reconstructed positive and negative sample images to the whole image content, where the replaced image content in the reconstructed positive and negative sample images is regarded as the negative sample image content and the positive sample image content, respectively;
and the model training unit 304 is configured to train the face detection model based on the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image, so as to obtain a trained face detection model.
In an alternative example, the image reconstruction unit is configured to:
determining a positive sample image and a negative sample image of which the image contents need to be exchanged;
and exchanging the image content of at least one block of the same position in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
Correspondingly, the label resetting unit is used for the following (a sketch of one possible resetting rule follows this list):
calculating, based on the replaced image content in the reconstructed positive sample image, the proportion of the unreplaced positive sample image content in the reconstructed positive sample image to the whole reconstructed positive sample image, as the positive sample image content proportion of the reconstructed positive sample image;
adjusting the real face image probability of the reconstructed positive sample image based on the positive sample image content proportion of the reconstructed positive sample image;
calculating the proportion of the replaced image content in the reconstructed negative sample image to the whole reconstructed negative sample image, as the positive sample image content proportion of the reconstructed negative sample image;
and adjusting the real face image probability of the reconstructed negative sample image based on the positive sample image content proportion of the reconstructed negative sample image.
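As a sketch of one possible resetting rule (assuming equal-area blocks and original labels of 1.0 for the positive sample and 0.0 for the negative sample; the exact mapping is not fixed by this passage):

```python
def reset_labels(n_swapped: int, n_total: int):
    """One simple choice: set the new real face image probability equal to
    the positive sample image content proportion of each reconstructed image."""
    label_reconstructed_pos = 1.0 - n_swapped / n_total  # unreplaced positive content
    label_reconstructed_neg = n_swapped / n_total        # injected positive content
    return label_reconstructed_pos, label_reconstructed_neg

# Example: swapping 2 of 9 blocks gives soft labels (0.777..., 0.222...)
```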
In an alternative example, the image reconstruction unit is configured to:
dividing the positive sample image and the negative sample image into image blocks with the same quantity according to the same division rule;
and selecting at least one image block positioned at the same position from the positive sample image and the negative sample image for exchanging to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
In an alternative example (a sketch of this selection procedure follows this list), the image reconstruction unit is configured to:
randomly selecting a numerical value from a preset numerical value range as the image block exchange quantity of the positive sample image and the negative sample image;
numbering image blocks of the positive and negative sample images to obtain an image block number sequence;
the sequence of the numbers in the image block number sequence is disordered to obtain the disordered image block number sequence;
selecting the number of the exchange number of the image blocks from the image block number sequence after disorder as an exchange number;
and exchanging the image blocks indicated by the exchange numbers in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
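A numpy sketch of this selection procedure; the preset value range for the number of exchanged blocks is an assumption:

```python
import numpy as np

def pick_exchange_numbers(num_blocks, low=1, high=None, rng=None):
    """Randomly choose how many image blocks to exchange from a preset range,
    shuffle the block number sequence, and take that many numbers from the
    front of the shuffled sequence as the exchange numbers."""
    rng = rng if rng is not None else np.random.default_rng()
    high = high if high is not None else num_blocks
    n_swap = int(rng.integers(low, high + 1))   # number of blocks to exchange
    numbers = rng.permutation(num_blocks)       # shuffled block number sequence
    return numbers[:n_swap].tolist()            # exchange numbers

# Usage with the tile-swap sketch above: swap_tiles(..., swap_indices=pick_exchange_numbers(m * m))
```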
In an alternative example, the image reconstruction unit is configured to:
randomly selecting at least one piece of image content in the positive sample image as first exchange image content;
determining second exchange image content located at the same position in the negative sample image based on the position of the first exchange image content in the positive sample image;
and exchanging the first exchange image content and the second exchange image content which correspond to each other in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
In an alternative example, the face detection model includes a feature extraction module and a classification module, and the model training unit is configured to perform the following (a compact sketch of this training step follows the list):
taking the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image as training sample images of the face detection model to be trained;
extracting image characteristic information of a training sample image through a characteristic extraction module;
performing classification prediction of real face images and forged face images on training sample images through a classification module based on image characteristic information;
calculating the classification loss of the face detection model based on the classification prediction result of the training sample image and the label of the training sample image;
and adjusting parameters of the face detection model based on the classification loss to obtain the trained face detection model.
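A compact sketch of one such training step, assuming PyTorch-style `feature_extractor` and `classifier` submodules and an optimizer (all names are assumptions, not the application's API):

```python
def train_step(model, optimizer, images, label_real_prob, classification_loss_fn):
    """One illustrative step: extract features, run the classification
    module, compute the classification loss against the (possibly soft)
    labels, and adjust the model parameters."""
    optimizer.zero_grad()
    features = model.feature_extractor(images)           # feature extraction module
    pred_real, pred_fake = model.classifier(features)    # classification module
    loss = classification_loss_fn(pred_real, pred_fake, label_real_prob)
    loss.backward()                                      # classification loss gradient
    optimizer.step()                                     # parameter adjustment
    return loss.item()
```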
In an alternative example, the classification prediction result of the training sample image includes a predicted real face image probability and a predicted forged face image probability, and the model training unit is configured to perform the following (one plausible loss form is sketched after this list):
calculating a first classification loss of the training sample image based on the real face image probability in the label of the training sample image and the predicted real face image probability in the classification prediction result;
determining the actual probability of the forged face image of the training sample image based on the real face image probability of the training sample image;
calculating a second classification loss of the training sample image based on the actual probability of the forged face image of the training sample image and the probability of the predicted forged face image in the classification prediction result;
and obtaining the total classification loss of the face detection model based on the first classification loss and the second classification loss of the training sample image.
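The exact loss functions are not pinned down in this passage; one plausible reading is a two-term cross-entropy over the soft labels, sketched below (an assumption, not the definitive formula):

```python
import torch

def total_classification_loss(pred_real, pred_fake, label_real, eps=1e-7):
    """First loss: predicted real face probability vs. the real face label.
    Second loss: predicted forged face probability vs. the actual forged
    face probability (1 - real face label). Sum gives the total loss."""
    label_fake = 1.0 - label_real
    loss_real = -(label_real * torch.log(pred_real + eps)).mean()   # first classification loss
    loss_fake = -(label_fake * torch.log(pred_fake + eps)).mean()   # second classification loss
    return loss_real + loss_fake                                    # total classification loss
```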
In an optional example, the face detection model further comprises a feature regression module connected to the feature extraction module; a model training unit further to:
acquiring the actual feature information of the positive and negative sample images in the preset image feature dimension;
acquiring actual characteristic information of the reconstructed positive sample image and the reconstructed negative sample image in a preset image characteristic dimension;
before parameters of a face detection model are adjusted based on classification loss to obtain a trained face detection model, feature regression is carried out on image feature information of a training sample image through a feature regression module to obtain predicted feature information of the training sample image in a preset image feature dimension;
and obtaining dimension loss of the face detection model in the preset image feature dimension based on the training sample image, the actual feature information and the predicted feature information in the preset image feature dimension.
The model training unit adjusts parameters of the face detection model based on the classification loss, and the mode of obtaining the trained face detection model specifically comprises: and adjusting parameters of the face detection model based on the classification loss and the dimension loss to obtain the trained face detection model.
In an optional example, the actual feature information of the preset image feature dimension includes an actual face depth map in the face depth dimension; the feature regression module includes a depth feature regression module, and the model training unit is configured to:
performing depth regression on the image feature information of the training sample image through a depth feature regression module to obtain a predicted face depth map of the training sample image;
calculating the loss of the depth map based on the actual face depth map and the predicted face depth map of the same training sample image;
determining the dimension loss of the face detection model in the face depth dimension based on the depth map loss.
In an alternative example, the model training unit is configured to:
carrying out depth estimation on the face region in the positive sample image to obtain an actual face depth map of the positive sample image;
setting a depth map without depth information as an actual face depth map of the negative sample image for the negative sample image;
determining an actual face depth map of a positive sample image corresponding to the reconstructed positive sample image as a first initial face depth map of the reconstructed positive sample image;
replacing depth information at the same position in the first initial face depth map with depth information of the negative sample image content based on the position of the negative sample image content in the reconstructed positive sample image to obtain an actual face depth map of the reconstructed positive sample image;
determining the actual face depth map of the negative sample image corresponding to the reconstructed negative sample image as a second initial face depth map of the reconstructed negative sample image;
and replacing the depth information at the same position in the second initial face depth map with the depth information of the positive sample image content, based on the position of the positive sample image content in the reconstructed negative sample image, to obtain the actual face depth map of the reconstructed negative sample image.
In an optional example, the sample acquiring unit is configured to: after an original face image is obtained, carrying out face detection on the original face image to determine a face area in the original face image; in an original face image, a face area is expanded by taking the face area as a center to obtain an expanded face area; intercepting an image of the expanded face area from the original face image as a positive sample image; and acquiring a negative sample image, wherein the negative sample image comprises a forged face.
In an optional example, the face image processing apparatus further includes a detection unit, where after the model training unit adjusts parameters of the face detection model based on the classification loss and the dimension loss to obtain a trained face detection model, the detection unit is configured to:
acquiring a human face image to be detected;
extracting image characteristic information of a face image to be detected through a characteristic extraction module;
classifying and predicting the real face image and the forged face image based on the image characteristic information through a classification module to obtain the predicted real face image probability and the predicted forged face image probability of the face image to be detected;
performing feature regression on the image feature information through a feature regression module to obtain predicted feature information of the face image to be detected in a preset image feature dimension;
determining the detection score of the face image to be detected on the preset image feature dimension based on the predicted feature information;
determining the total detection score of the face image to be detected based on the probability of predicting the real face image and the detection score;
and when the total detection score is greater than a preset score threshold, determining that the face image to be detected is a real face image, and otherwise determining that the face image to be detected is a forged face image.
By adopting the embodiment of the application, the image contents in the positive and negative sample images can be exchanged to obtain the reconstructed positive and negative sample images, and the labels of the reconstructed positive and negative sample images are modified based on the exchange. The data distribution in the positive and negative sample images thereby changes, so training on the positive and negative sample images together with the reconstructed positive and negative sample images improves the generalization capability of the face detection model; based on the modified labels of the reconstructed positive and negative sample images, the face detection model can learn to give classification results that are not overly confident, effectively alleviating the problem of model over-confidence.
In addition, an embodiment of the present invention further provides a computer device, where the computer device may be a terminal or a server, as shown in fig. 4, which shows a schematic structural diagram of the computer device according to the embodiment of the present invention, and specifically:
the computer device may include components such as a processor 401 of one or more processing cores, memory 402 of one or more computer-readable storage media, a power supply 403, and an input unit 404. Those skilled in the art will appreciate that the computer device configuration illustrated in FIG. 4 does not constitute a limitation of computer devices, and may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components. Wherein:
the processor 401 is a control center of the computer device, connects various parts of the entire computer device using various interfaces and lines, and performs various functions of the computer device and processes data by operating or executing software programs and/or modules stored in the memory 402 and calling data stored in the memory 402. Optionally, processor 401 may include one or more processing cores; preferably, the processor 401 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 401.
The memory 402 may be used to store software programs and modules, and the processor 401 executes various functional applications and data processing by operating the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to use of the computer device, and the like. Further, the memory 402 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory 402 may also include a memory controller to provide the processor 401 access to the memory 402.
The computer device further comprises a power supply 403 for supplying power to the various components, and preferably, the power supply 403 is logically connected to the processor 401 through a power management system, so that the functions of managing charging, discharging, and power consumption are realized through the power management system. The power supply 403 may also include any component of one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The computer device may also include an input unit 404, the input unit 404 being operable to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
Although not shown, the computer device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 401 in the computer device loads the executable file corresponding to the process of one or more application programs into the memory 402 according to the following instructions, and the processor 401 runs the application programs stored in the memory 402, thereby implementing various functions as follows:
acquiring positive and negative sample images, wherein the positive sample image is a real face image, the negative sample image is a forged face image, and the labels of the positive and negative sample images comprise: the probability of a real face image;
respectively replacing at least one block of image content of the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image;
respectively adjusting the probability of reconstructing real face images in labels of the positive sample image and the negative sample image based on the proportion of the replaced image content in the reconstructed positive sample image and the reconstructed negative sample image to the whole image content, wherein the replaced image content in the reconstructed positive sample image and the reconstructed negative sample image is respectively regarded as the negative sample image content and the positive sample image content;
and training a face detection model based on the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image to obtain a trained face detection model.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions, or by instructions controlling associated hardware, which may be stored in a computer-readable storage medium and loaded and executed by a processor.
Therefore, the embodiment of the present invention further provides a storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute the method for processing a face image according to the embodiment of the present invention.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
Wherein the storage medium may include: read Only Memory (ROM), random Access Memory (RAM), magnetic or optical disks, and the like.
Since the instructions stored in the storage medium can execute the steps in the face image processing method provided in the embodiment of the present invention, the beneficial effects that can be achieved by the face image processing method provided in the embodiment of the present invention can be achieved, which are detailed in the foregoing embodiments and will not be described herein again.
The face image processing method, apparatus, computer device and storage medium provided by the embodiments of the present invention are described in detail above, and a specific example is applied in the text to explain the principle and the implementation of the present invention, and the description of the above embodiments is only used to help understanding the method and the core idea of the present invention; meanwhile, for those skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (15)

1. A face image processing method is characterized by comprising the following steps:
acquiring positive and negative sample images, wherein the positive sample image is a real face image, the negative sample image is a forged face image, and labels of the positive and negative sample images comprise real face image probabilities;
exchanging at least one block of image content at the same position in a positive sample image and a negative sample image of the image content to be exchanged to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image, wherein the image sizes of the positive sample image and the negative sample image are consistent;
calculating a ratio of the positive sample image content in the reconstructed positive sample image that is not replaced to the positive sample image content in the reconstructed positive sample image based on the replaced image content in the reconstructed positive sample image, wherein the replaced image content in the reconstructed positive sample image is regarded as negative sample image content;
adjusting the real face image probability of the reconstructed positive sample image based on the positive sample image content proportion of the reconstructed positive sample image;
calculating the proportion of the replaced image content in the reconstructed negative sample image to the reconstructed negative sample image, and taking the proportion as the proportion of the positive sample image content in the reconstructed negative sample image, wherein the replaced image content in the reconstructed negative sample image is regarded as the positive sample image content;
adjusting the probability of a real face image of the reconstructed negative sample image based on the content proportion of the positive sample image of the reconstructed negative sample image;
and training a face detection model based on the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image to obtain a trained face detection model.
2. The method of claim 1, wherein before exchanging image contents of at least one block of same position in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image, the method further comprises:
positive and negative sample images are determined that require an exchange of image content.
3. The method according to claim 2, wherein the exchanging the image content of at least one block of the positive sample image and the negative sample image at the same position to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image comprises:
dividing the positive sample image and the negative sample image into image blocks with the same quantity according to the same division rule;
and selecting at least one image block positioned at the same position from the positive sample image and the negative sample image for exchange to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
4. The method according to claim 3, wherein the selecting at least one image block located at the same position from the positive sample image and the negative sample image for swapping to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image comprises:
randomly selecting a numerical value from a preset numerical value range as the number of image block exchanges of the positive sample image and the negative sample image;
numbering the image blocks of the positive and negative sample images to obtain an image block number sequence;
the sequence of the serial numbers in the image block serial number sequence is disordered to obtain an image block serial number sequence after disorder;
selecting the number of the image block exchange number from the image block number sequence after disorder as an exchange number;
and exchanging the image blocks indicated by the exchange numbers in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
5. The method according to claim 2, wherein the exchanging the image content of at least one block of the positive sample image and the negative sample image at the same position to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image comprises:
randomly selecting at least one piece of image content in the positive sample image as a first exchange image content;
determining second exchange image content located at the same position in a negative sample image based on the position of the first exchange image content in the positive sample image;
and exchanging the first exchange image content and the second exchange image content which correspond to each other in the positive sample image and the negative sample image to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image.
6. The method for processing the human face image according to any one of claims 1 to 5, wherein the human face detection model comprises a feature extraction module and a classification module;
based on positive and negative sample images, and a reconstructed positive sample image and a reconstructed negative sample image, the face detection model is trained to obtain a trained face detection model, which comprises the following steps:
taking the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image as training sample images of a face detection model to be trained;
extracting image characteristic information of a training sample image through the characteristic extraction module;
performing classification prediction of real face images and forged face images on the training sample images through the classification module based on the image characteristic information;
calculating the classification loss of the face detection model based on the classification prediction result of the training sample image and the label of the training sample image;
and adjusting parameters of the face detection model based on the classification loss to obtain the trained face detection model.
7. The method of claim 6, wherein the classification prediction result of the training sample image comprises: a predicted real face image probability and a predicted forged face image probability of the training sample image;
the calculating the classification loss of the face detection model based on the classification prediction result of the training sample image and the label of the training sample image comprises:
calculating a first classification loss of the training sample image based on a real face image probability in a label of the training sample image and a predicted real face image probability in the classification prediction result;
determining the actual forged face image probability of the training sample image based on the real face image probability of the training sample image;
calculating a second classification loss of the training sample image based on the actual probability of the forged face image of the training sample image and the probability of the predicted forged face image in the classification prediction result;
and obtaining the total classification loss of the face detection model based on the first classification loss and the second classification loss of the training sample image.
8. The method of claim 6, wherein the face detection model further comprises a feature regression module connected to the feature extraction module; the face image processing method further comprises the following steps:
acquiring actual characteristic information of the positive and negative sample images in a preset image characteristic dimension;
acquiring the actual characteristic information of the reconstructed positive sample image and the reconstructed negative sample image in the preset image characteristic dimension;
based on the classification loss, adjusting parameters of the face detection model, and before obtaining the trained face detection model, the method further comprises:
performing feature regression on the image feature information of the training sample image through the feature regression module to obtain predicted feature information of the training sample image in the preset image feature dimension;
obtaining dimension loss of the face detection model in the preset image feature dimension based on the training sample image, actual feature information and predicted feature information of the preset image feature dimension;
based on the classification loss, adjusting parameters of the face detection model to obtain a trained face detection model, including:
and adjusting parameters of the face detection model based on the classification loss and the dimension loss to obtain the trained face detection model.
9. The method of claim 8, wherein the actual feature information of the preset image feature dimension comprises: an actual face depth map of a face depth dimension, the feature regression module comprising a depth feature regression module,
performing feature regression on the image feature information of the training sample image through the feature regression module to obtain predicted feature information of the training sample image in the preset image feature dimension, including:
performing depth regression on the image feature information of the training sample image through the depth feature regression module to obtain a predicted face depth map of the training sample image;
the obtaining of the dimension loss of the face detection model in the preset image feature dimension based on the training sample image, the actual feature information and the predicted feature information in the preset image feature dimension, includes:
calculating the loss of the depth map based on the actual face depth map and the predicted face depth map of the same training sample image;
determining a dimension loss of the face detection model in the face depth dimension based on the depth map loss.
10. The method for processing the face image according to claim 9, wherein the acquiring of the actual feature information of the positive and negative sample images in a preset image feature dimension includes:
carrying out depth estimation on the face region in the positive sample image to obtain an actual face depth map of the positive sample image;
setting a depth map without depth information as an actual face depth map of the negative sample image for the negative sample image;
the acquiring of the actual feature information of the preset image feature dimension of the reconstructed positive sample image and the reconstructed negative sample image includes:
determining an actual face depth map of a positive sample image corresponding to a reconstructed positive sample image as a first initial face depth map of the reconstructed positive sample image;
replacing the depth information at the same position in the first initial face depth map with the depth information of the content of the negative sample image based on the position of the content of the negative sample image in the reconstructed positive sample image to obtain an actual face depth map of the reconstructed positive sample image;
determining an actual face depth map of a negative sample image corresponding to a reconstructed negative sample image as a second initial face depth map of the reconstructed negative sample image;
and replacing the depth information at the same position in the second initial face depth map with the depth information of the content of the positive sample image based on the position of the content of the positive sample image in the reconstructed negative sample image to obtain the actual face depth map of the reconstructed negative sample image.
11. The method for processing the human face image according to any one of claims 1 to 5, wherein the acquiring of the positive and negative sample images comprises:
after an original face image is obtained, carrying out face detection on the original face image to determine a face area in the original face image;
in the original face image, the face area is expanded by taking the face area as a center, and an expanded face area is obtained;
intercepting the image of the expanded face area from the original face image to be used as a positive sample image;
and acquiring a negative sample image, wherein the negative sample image comprises a forged face.
12. The method of claim 8, wherein the adjusting parameters of the face detection model based on the classification loss and the dimension loss to obtain a trained face detection model further comprises:
acquiring a human face image to be detected;
extracting image characteristic information of the facial image to be detected through the characteristic extraction module;
classifying and predicting real face images and forged face images based on the image characteristic information through the classification module to obtain the predicted real face image probability and the predicted forged face image probability of the face images to be detected;
performing feature regression on the image feature information through the feature regression module to obtain predicted feature information of the face image to be detected in the preset image feature dimension;
determining the detection score of the facial image to be detected on the preset image feature dimension based on the predicted feature information;
determining the total detection score of the face image to be detected based on the probability of predicting the real face image and the detection score;
and when the total detection score is greater than a preset score threshold, determining that the face image to be detected is a real face image, and otherwise determining that the face image to be detected is a forged face image.
13. A face image processing apparatus characterized by comprising:
the system comprises a sample acquisition unit, a comparison unit and a comparison unit, wherein the sample acquisition unit is used for acquiring positive and negative sample images, the positive sample image is a real face image, the negative sample image is a forged face image, and labels of the positive and negative sample images comprise real face image probabilities;
the image reconstruction unit is used for exchanging the image content of at least one block of the positive sample image and the negative sample image at the same position to obtain a reconstructed positive sample image corresponding to the positive sample image and a reconstructed negative sample image corresponding to the negative sample image, wherein the image sizes of the positive sample image and the negative sample image are consistent;
a label resetting unit, configured to calculate, based on the replaced image content in the reconstructed positive sample image, a ratio of the positive sample image content in the reconstructed positive sample image that is not replaced to the positive sample image content in the reconstructed positive sample image, where the replaced image content in the reconstructed positive sample image is regarded as negative sample image content; adjusting the probability of a real face image of the reconstructed positive sample image based on the content proportion of the positive sample image of the reconstructed positive sample image; calculating the proportion of the replaced image content in the reconstructed negative sample image to the reconstructed negative sample image, and taking the proportion as the proportion of the positive sample image content in the reconstructed negative sample image, wherein the replaced image content in the reconstructed negative sample image is regarded as the positive sample image content; adjusting the probability of a real face image of the reconstructed negative sample image based on the content proportion of the positive sample image of the reconstructed negative sample image;
and the model training unit is used for training the face detection model based on the positive and negative sample images, the reconstructed positive sample image and the reconstructed negative sample image to obtain the trained face detection model.
14. A computer arrangement comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 12 when executing the computer program.
15. A computer-readable storage medium, on which a computer program is stored, which, when the computer program is run on a computer, causes the computer to carry out the steps of the method according to any one of claims 1 to 12.
CN202010659115.7A 2020-07-09 2020-07-09 Face image processing method and device, computer equipment and storage medium Active CN111768336B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010659115.7A CN111768336B (en) 2020-07-09 2020-07-09 Face image processing method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010659115.7A CN111768336B (en) 2020-07-09 2020-07-09 Face image processing method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111768336A CN111768336A (en) 2020-10-13
CN111768336B true CN111768336B (en) 2022-11-01

Family

ID=72726032

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010659115.7A Active CN111768336B (en) 2020-07-09 2020-07-09 Face image processing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111768336B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112116592B (en) * 2020-11-19 2021-04-02 北京瑞莱智慧科技有限公司 Image detection method, training method, device and medium of image detection model
CN112989085B (en) * 2021-01-29 2023-07-25 腾讯科技(深圳)有限公司 Image processing method, device, computer equipment and storage medium
CN113420666A (en) * 2021-06-23 2021-09-21 上海应用技术大学 Memory assisting method and device based on face recognition technology
CN113762138B (en) * 2021-09-02 2024-04-23 恒安嘉新(北京)科技股份公司 Identification method, device, computer equipment and storage medium for fake face pictures
CN113850717A (en) * 2021-11-30 2021-12-28 北京爱笔科技有限公司 Image processing method and device
CN114333031A (en) * 2021-12-31 2022-04-12 北京瑞莱智慧科技有限公司 Vulnerability detection method and device of living body detection model and storage medium
CN114529759B (en) * 2022-01-25 2023-01-17 北京医准智能科技有限公司 Thyroid nodule classification method and device and computer readable medium
CN114648814A (en) * 2022-02-25 2022-06-21 北京百度网讯科技有限公司 Face living body detection method, training method, device, equipment and medium of model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104680121A (en) * 2013-11-27 2015-06-03 腾讯科技(深圳)有限公司 Method and device for processing face image
CN110569721A (en) * 2019-08-01 2019-12-13 平安科技(深圳)有限公司 Recognition model training method, image recognition method, device, equipment and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107590430A (en) * 2017-07-26 2018-01-16 百度在线网络技术(北京)有限公司 Biopsy method, device, equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104680121A (en) * 2013-11-27 2015-06-03 腾讯科技(深圳)有限公司 Method and device for processing face image
CN110569721A (en) * 2019-08-01 2019-12-13 平安科技(深圳)有限公司 Recognition model training method, image recognition method, device, equipment and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Aurora Guard: Real-Time Face Anti-Spoofing via Light Reflection;Yao Liu等;《arXiv》;20190227;第1-7页 *

Also Published As

Publication number Publication date
CN111768336A (en) 2020-10-13

Similar Documents

Publication Publication Date Title
CN111768336B (en) Face image processing method and device, computer equipment and storage medium
Yang et al. Towards perceptual image dehazing by physics-based disentanglement and adversarial training
CN110458154B (en) Face recognition method, face recognition device and computer-readable storage medium
Fu et al. Estimating human age by manifold analysis of face pictures and regression on aging features
CN111754396B (en) Face image processing method, device, computer equipment and storage medium
CN110210276A (en) A kind of motion track acquisition methods and its equipment, storage medium, terminal
CN112801057B (en) Image processing method, image processing device, computer equipment and storage medium
CN111666919B (en) Object identification method and device, computer equipment and storage medium
CN110598019B (en) Repeated image identification method and device
CN111444826B (en) Video detection method, device, storage medium and computer equipment
CN112132197A (en) Model training method, image processing method, device, computer equipment and storage medium
CN112084917A (en) Living body detection method and device
CN113033519B (en) Living body detection method, estimation network processing method, device and computer equipment
CN110222572A (en) Tracking, device, electronic equipment and storage medium
CN112036284B (en) Image processing method, device, equipment and storage medium
CN112232258A (en) Information processing method and device and computer readable storage medium
CN112733802A (en) Image occlusion detection method and device, electronic equipment and storage medium
CN105684046A (en) Generating image compositions
CN112052771A (en) Object re-identification method and device
Wang et al. Face aging on realistic photos by generative adversarial networks
Liu et al. 3d action recognition using data visualization and convolutional neural networks
CN114724218A (en) Video detection method, device, equipment and medium
CN114973349A (en) Face image processing method and training method of face image processing model
CN112883827B (en) Method and device for identifying specified target in image, electronic equipment and storage medium
CN113052150A (en) Living body detection method, living body detection device, electronic apparatus, and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40030906

Country of ref document: HK

GR01 Patent grant
GR01 Patent grant