CN113971822A - Face detection method, intelligent terminal and storage medium - Google Patents

Face detection method, intelligent terminal and storage medium

Info

Publication number
CN113971822A
Authority
CN
China
Prior art keywords
face
attribute value
detection model
scene
original image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010710797.XA
Other languages
Chinese (zh)
Inventor
汪浩
刘阳兴
王树朋
李秀阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan TCL Group Industrial Research Institute Co Ltd
Original Assignee
Wuhan TCL Group Industrial Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan TCL Group Industrial Research Institute Co Ltd filed Critical Wuhan TCL Group Industrial Research Institute Co Ltd
Priority to CN202010710797.XA
Publication of CN113971822A
Legal status: Pending

Classifications

    • G06T5/70
    • G06T5/77
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10024 Color image
    • G06T2207/20 Special algorithmic details
    • G06T2207/20024 Filtering details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30201 Face

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a face detection method, an intelligent terminal and a storage medium. The method comprises the following steps: acquiring an original image to be detected; inputting the original image into a trained attribute value detection model set for processing, and outputting a face attribute value and a scene attribute value; and inputting the original image, the face attribute value and the scene attribute value into a trained face key point detection model for processing, and outputting face key points, wherein the accuracy with which the face key point detection model detects the face key points is greater than or equal to a preset accuracy. The method and the device can improve the accuracy of face key point detection.

Description

Face detection method, intelligent terminal and storage medium
Technical Field
The invention relates to the technical field of image processing, in particular to a face detection method, an intelligent terminal and a storage medium.
Background
As people rely more and more on smartphones, their expectations of them also rise. To meet user demand, manufacturers have steadily increased camera resolution, and with it the demand for processing photos after capture. Modern beautification functions include skin smoothing (buffing), whitening, face thinning, eye enhancement, and the like. The first step of beautification is to accurately locate the facial skin region in the image.
Two methods are commonly used at present to obtain the facial skin region. One segments the skin with a traditional algorithm; the other detects face key points through deep learning and derives the skin region from them. However, traditional segmentation performs poorly: it may include hair or unrelated regions whose color resembles skin, and if the background in the photograph is flesh-colored, a large portion of the background ends up in the segmented skin region. The edges of skin obtained by the traditional algorithm also beautify badly, with unnatural transitions. Although deep learning detection improves segmentation accuracy, the beautification still cannot account for factors such as lighting in the image, so the beautified face can feel inconsistent with the image background, and the beautification effect is poor. At present, face key point detection is mainly realized by training convolutional neural networks on images, and the training process does not consider factors such as the illumination environment of the picture, so the accuracy of face key point detection is low in complex environments.
Disclosure of Invention
The invention mainly aims to provide a face detection method, an intelligent terminal and a storage medium, and aims to improve the accuracy of face key point detection.
In order to achieve the above object, the present invention provides a face detection method, which comprises the following steps:
acquiring an original image to be detected;
inputting an original image into a trained attribute value detection model set for processing, and outputting a face attribute value and a scene attribute value;
and inputting the original image, the face attribute value and the scene attribute value into a trained face key point detection model for processing, and outputting face key points, wherein the accuracy of the face key point detection model for detecting the face key points is greater than or equal to the preset accuracy.
Optionally, the face detection method further includes:
determining a skin area to be beautified in the original image according to the key points of the face;
and beautifying the skin area according to the face attribute value and the scene attribute value to obtain a beautified target image.
In addition, to achieve the above object, the present invention further provides an intelligent terminal, wherein the intelligent terminal includes: a memory, a processor, and a face detection program stored on the memory and executable on the processor, wherein the face detection program, when executed by the processor, implements the steps of the face detection method described above.
In addition, in order to achieve the above object, the present invention further provides a storage medium, wherein the storage medium stores a face detection program, and the face detection program, when executed by a processor, implements the steps of the face detection method described above.
The invention provides a face detection method that takes both the face attribute value and the scene attribute value into account. The original image to be detected is first input into the attribute value detection model set to obtain the face attribute value and the scene attribute value; the face attribute value and the scene attribute value are then input, together with the original image, into the face key point detection model, which detects the face key points with both attribute values taken into consideration. The invention can therefore eliminate or weaken the interference of face attributes and scene attributes in the picture on face key point detection, thereby improving the accuracy of face key point detection.
The invention also includes beautifying the skin region corresponding to the detected face key points. Once the face attribute value and the scene attribute value are determined, the corresponding beautification parameters, such as whitening amplitude, filtering algorithm and filtering parameters, can be determined, so that different beautification methods are matched to different face attribute values and scene attribute values, making the beautification adaptive, more flexible and more intelligent.
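By way of illustration, the two-stage flow can be sketched as follows in Python; the three predict callables are hypothetical stand-ins for the trained attribute value detection model set and the trained face key point detection model:

```python
import numpy as np
from typing import Callable, Dict

def detect_face_keypoints(
    original_image: np.ndarray,
    predict_face_attrs: Callable[[np.ndarray], Dict],
    predict_scene_attrs: Callable[[np.ndarray], Dict],
    predict_keypoints: Callable[[np.ndarray, Dict, Dict], np.ndarray],
) -> np.ndarray:
    # Stage 1: the attribute value detection model set.
    face_attrs = predict_face_attrs(original_image)    # e.g. age, gender
    scene_attrs = predict_scene_attrs(original_image)  # e.g. light intensity
    # Stage 2: key point detection conditioned on both attribute values.
    return predict_keypoints(original_image, face_attrs, scene_attrs)
```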
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the face detection method of the present invention;
FIG. 2 is a flowchart of the process of obtaining the face attribute value in step S200 according to the preferred embodiment of the face detection method of the present invention;
FIG. 3 is a flowchart of the steps preceding step S200 according to the preferred embodiment of the face detection method of the present invention;
FIG. 4 is a flowchart of the steps following step S300 according to the preferred embodiment of the face detection method of the present invention;
FIG. 5 is a flowchart of step S500 when the beautification to be performed is whitening according to the preferred embodiment of the face detection method of the present invention;
FIG. 6 is a flowchart of step S500 when the beautification to be performed is skin smoothing according to the preferred embodiment of the face detection method of the present invention;
FIG. 7 is a schematic diagram of the operating environment of an intelligent terminal according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer and clearer, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, the face detection method of the preferred embodiment of the present invention includes the following steps:
and S100, the intelligent terminal acquires an original image to be detected.
In this embodiment, the intelligent terminal executing the face detection method is a smartphone; in practical applications it may also be a television, a computer, or the like. When the user captures an image with the smartphone, the camera passes the captured frame to the smartphone as the original image to be detected, so that face key point detection can be performed on it.
And S200, inputting the original image into the trained attribute value detection model set by the intelligent terminal for processing, and outputting a face attribute value and a scene attribute value.
Specifically, the attribute value detection model set comprises a plurality of detection models, which detect the original image to obtain the corresponding face attribute value and scene attribute value.
The face attribute value refers to face attributes that can affect subsequent beautification parameters, such as gender, age and race. For example, if the person in the original image is a baby, the corresponding age is small and the skin is already fair, so the whitening parameter can be reduced, making the beautified image more realistic.
The scene attribute values refer to attributes of the scene in the original image that can influence subsequent beautification parameters, such as the light intensity and the light source. For example, if the light is strong, the whitening amplitude can be increased; if the light is weak, it can be decreased. This makes the beautification more natural.
Further, the attribute value detection model set includes a face attribute value detection model and a scene attribute value detection model, and step S200 includes:
the intelligent terminal inputs the original image into the face attribute value detection model for face detection and outputs the face attribute value, and then inputs the original image into the scene attribute value detection model for scene detection and outputs the scene attribute value; or,
the intelligent terminal inputs the original image into the scene attribute value detection model for scene detection and outputs the scene attribute value, and then inputs the original image into the face attribute value detection model for face detection and outputs the face attribute value; or,
the intelligent terminal inputs the original image into the scene attribute value detection model for scene detection and outputs the scene attribute value while inputting the original image into the face attribute value detection model for face detection and outputting the face attribute value.
Specifically, a face attribute value detection model and a scene attribute value detection model are preset, and the original image to be detected is input into the face attribute value detection model. The face attribute value detection model adopted in this embodiment is a neural network model obtained in advance by learning from a large number of training samples.
After the original image is input into the face attribute value detection model (taking a convolutional neural network as an example), the model extracts features from the original image, classifies the face attributes according to the extracted features, for instance classifying the person in the original image as male or female by gender, and thereby obtains the predicted face attribute value.
When, after or before the original image is input into the face attribute value detection model, the original image is also input into the scene attribute value detection model, and the scene attribute value is obtained based on the same principle.
In addition to using a Convolutional Neural Network (CNN) model or a Recurrent Neural Network (RNN) model alone, the two models may be combined; for example, the RNN model is used for feature extraction and the CNN model for classification, so as to obtain the face attribute value or the scene attribute value.
Further, referring to fig. 2, in this embodiment, the face attribute value includes age and gender, the face attribute value detection model includes a face alignment model, an age detection model, and a gender detection model, and the specific process of obtaining the face attribute value in step S200 includes:
and step S211, the intelligent terminal inputs the original image into the face alignment model for alignment, and outputs an aligned image.
Specifically, face alignment means aligning the facial features of the input original image with those of a standard face. To reduce interference from other factors, the face in the original image needs to be standardized, that is, aligned, before being fed to models such as face recognition; the face attribute value detection therefore includes a face alignment model. The original image is first input into the face alignment model, which locates key points such as the nose, eyes and mouth in the original image and then processes the face by means of rotation, translation and the like, so that the face in the original image is aligned with the faces in the training sample images.
For example, if the faces in the training sample images are not smiling but the face in the original image is, the face alignment model finds the mouth key points, moves them downward, and changes the mouth in the original image to a non-smiling state, obtaining an aligned image consistent with the training sample images.
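By way of illustration, such an alignment can be approximated with a similarity transform estimated between the detected landmarks and a canonical template; the 5-point template coordinates below are an assumption for a 112x112 crop, not values from the patent:

```python
import cv2
import numpy as np

# Illustrative 5-point template (eyes, nose tip, mouth corners) for a
# 112x112 crop; these coordinates are assumptions, not patent values.
TEMPLATE = np.float32([
    [38.3, 51.7], [73.5, 51.5],   # left eye, right eye
    [56.0, 71.7],                 # nose tip
    [41.5, 92.4], [70.7, 92.2],   # mouth corners
])

def align_face(image: np.ndarray, landmarks: np.ndarray,
               size: int = 112) -> np.ndarray:
    """Warp `image` so its 5 landmarks match the template positions."""
    # Estimate rotation + translation + uniform scale (partial affine).
    matrix, _ = cv2.estimateAffinePartial2D(np.float32(landmarks), TEMPLATE)
    return cv2.warpAffine(image, matrix, (size, size))
```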
And S212, the intelligent terminal respectively inputs the aligned images into an age detection model and a gender detection model for age identification and gender identification, and outputs the age and the gender.
In this embodiment, the age detection model and the gender detection model are obtained by training neural network models, such as a CNN model or an RNN model. During training, a large number of training samples are forward-propagated to obtain predictions, the loss between the predictions and the annotation file is calculated, and the parameters are optimized through back propagation. Below, a CNN model is used as the basic model of the age detection model for a brief description.
First, a large number of training sample images containing faces of different ages are acquired; the age corresponding to the face in each training sample image is labeled, and the labels are stored as an annotation file. The training sample images are then input into the CNN model, which comprises convolutional layers, pooling layers and fully connected layers. Since an image is essentially a grid of pixels, each representable by color values such as RGB, a training sample image can be regarded as a matrix. The convolutional layers of the CNN model convolve the training sample image, and the pooling layers pool the convolved feature maps; the convolution-pooling combination may appear several times in the hidden layers. The feature map produced by the last pooling layer is flattened row by row into a vector and fed into the fully connected network, and a classifier finally outputs the prediction for the training sample image. The prediction may differ from the age in the annotation file; the difference between the two, i.e. the loss value, is computed by a loss function and propagated back through the CNN model to adjust its parameters until the model converges, yielding the age detection model.
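By way of illustration, a minimal PyTorch sketch of the convolve, pool, flatten and classify training step described above is given below; the layer sizes, input resolution and number of age classes are assumptions made for illustration, not values from the patent:

```python
import torch
import torch.nn as nn

class AgeNet(nn.Module):
    def __init__(self, num_age_classes: int = 100):
        super().__init__()
        self.features = nn.Sequential(          # conv + pool, repeated
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 28 * 28, num_age_classes)

    def forward(self, x):                       # x: (N, 3, 112, 112)
        f = self.features(x)
        return self.classifier(f.flatten(1))    # flatten rows into a vector

model = AgeNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

images = torch.randn(8, 3, 112, 112)            # stand-in for aligned crops
ages = torch.randint(0, 100, (8,))              # stand-in for the annotation file

optimizer.zero_grad()
logits = model(images)                          # forward propagation
loss = criterion(logits, ages)                  # loss vs. annotated ages
loss.backward()                                 # back propagation
optimizer.step()                                # parameter update
```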
A gender detection model can be obtained by the same method. The aligned image produced by the face alignment model is then input into the age detection model and the gender detection model, which perform age detection and gender detection respectively to obtain the age and gender corresponding to the original image.
Further, the loss function of the face attribute value detection model and the scene attribute value detection model in the above embodiments is the Focal Loss function.
Specifically, the Focal Loss function mainly addresses the severe imbalance between positive and negative samples in one-stage object detection by reducing the weight that the large number of easy negative samples occupy in training. It is obtained by modifying the cross-entropy loss function. The formula of the Focal Loss function is as follows:
L_fl = −α · y · (1 − y′)^γ · log(y′) − (1 − α) · (1 − y) · (y′)^γ · log(1 − y′)

where L_fl is the loss value, y′ is the prediction result (y′ ∈ [0, 1]), y is the value annotated in the annotation file, α is a balance factor and γ is a weight index. By adjusting α and γ, the weight ratio between positive and negative samples can be adjusted, reducing the error and improving the accuracy of the result.
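By way of illustration, a minimal PyTorch sketch of this Focal Loss is given below; the default values of α and γ are common choices and are assumptions, since the patent does not specify them:

```python
import torch

def focal_loss(pred: torch.Tensor, target: torch.Tensor,
               alpha: float = 0.25, gamma: float = 2.0) -> torch.Tensor:
    """Binary focal loss matching the formula above.

    pred   -- predicted probabilities y' in [0, 1]
    target -- annotated labels y in {0, 1}
    alpha, gamma -- balance factor and weight index (typical defaults,
                    assumed here since the patent gives no values)
    """
    eps = 1e-7                                   # numerical safety
    pred = pred.clamp(eps, 1.0 - eps)
    pos = -alpha * (1 - pred) ** gamma * torch.log(pred)        # y = 1 term
    neg = -(1 - alpha) * pred ** gamma * torch.log(1 - pred)    # y = 0 term
    return (target * pos + (1 - target) * neg).mean()
```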
Further, referring to fig. 3, before step S200, the method further includes:
and step S110, the intelligent terminal calculates the face skin color value of the original image based on a preset face skin color algorithm.
Specifically, current face skin color algorithms include skin color recognition based on the RGB (Red, Green, Blue) color space, among others. The face skin color value is mainly obtained by computing over the color values of the pixels; since calculating the face skin color value is a conventional technique, it is not described further here.
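By way of illustration, one such RGB-space rule is sketched below; the specific thresholds follow the widely cited Peer et al. uniform-daylight rule and are an assumption, since the patent leaves the algorithm as a conventional means:

```python
import numpy as np

def mean_skin_color(image_rgb: np.ndarray) -> float:
    """Return one possible face skin color value: the mean intensity of
    pixels classified as skin by an RGB threshold rule (assumed rule)."""
    r = image_rgb[..., 0].astype(int)
    g = image_rgb[..., 1].astype(int)
    b = image_rgb[..., 2].astype(int)
    skin = ((r > 95) & (g > 40) & (b > 20)
            & (image_rgb.max(-1).astype(int) - image_rgb.min(-1) > 15)
            & (abs(r - g) > 15) & (r > g) & (r > b))
    return float(image_rgb[skin].mean()) if skin.any() else 0.0
```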
And step S112, the intelligent terminal respectively determines a face attribute value detection model and a scene attribute value detection model corresponding to the detection model set according to the face skin color value.
Specifically, the detection model set includes a plurality of different face attribute value detection models and scene attribute value detection models, with faces grouped in advance (for example, into three groups), each group corresponding to a different range of face skin color values. Different skin color ranges correspond to different face attribute value detection models and scene attribute value detection models. After the face skin color value is calculated, the range it falls into is determined, which in turn determines the corresponding face attribute value detection model and scene attribute value detection model. This improves the fit of the detection and reduces the error rate.
And step S300, the intelligent terminal inputs the original image, the face attribute value and the scene attribute value into a trained face key point detection model for processing, and outputs face key points, wherein the accuracy of the face key point detection model for detecting the face key points is greater than or equal to the preset accuracy.
Specifically, after the face attribute value and the scene attribute value are obtained, they are input into the face key point detection model together with the original image. The face key point detection model can also be obtained by training a neural network model. Because the face key point detection model detects the original image with reference to the face attribute value and the scene attribute value, the resulting face key points are obtained on the basis of both; compared with a traditional face key point detection model, this scheme reduces detection inaccuracy caused by differences in face attributes and scenes.
Further, the face key point detection model is based on a depth convolution generation countermeasure network constructed by a self-attention mechanism.
In this embodiment, the face key point detection model is obtained by training a Deep Convolutional Generative Adversarial Network (DCGAN). The DCGAN network model integrates a CNN model with a Generative Adversarial Network (GAN) model.
First, the GAN model is introduced: it includes a generative model and a discriminative model. The generative model is used to generate near-real images or data, and the discriminative model is used to judge whether the images or data produced by the generative model are real or counterfeit. The two models play against each other continuously, which makes GAN models more sensitive to differences.
The DCGAN model builds the generative model and the discriminative model with CNNs, while dispensing with structures such as pooling layers and fully connected layers. The face key point detection model obtained by training the DCGAN network model is therefore closer to the true values and more stable. Taking the input original image and face attribute value as a simple example: the generative model generates a series of pseudo-images resembling real images from the input face attribute values. The discriminative model then judges whether the pseudo-images and the original image are real, and predicts the face key points and attribute values on these images. The loss values between the predicted face key points and face attribute values and those in the input calibration file are calculated and propagated back to the generative model and the discriminative model. The generative model further optimizes the generated pseudo-images according to the loss values, and the discriminative model further optimizes the accuracy of its face key point detection and authenticity judgment, until the two converge and balance, at which point training is complete. The original image, face attribute value and scene attribute value are then input into the trained discriminative model, which detects the original image to determine the face key points in it. The training here is similar to neural network training, continuing until the model converges; in this embodiment, the convergence criterion is that the accuracy of the face key point detection model in detecting face key points approximately equals the preset accuracy. In this embodiment, the loss function of the face key point detection model can be expressed by the following formula:
L_gan = E[log D(l_t, v_t)] + E[log(1 − D(l_g, v_g))]

where L_gan is the loss value; E denotes the expectation, i.e. over the pre-calibrated training samples; l_t denotes the input face attribute values and v_t the input face key points; D denotes the discriminative model and G the generative model, with (l_g, v_g) being the face attribute values and face key points of a pseudo-image produced via G. When the scene attribute value is added, it enters both the generation performed by the generative model and the judgment performed by the discriminative model.
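By way of illustration, a minimal PyTorch sketch of this adversarial objective is given below; the interfaces of D and G are assumptions made for illustration, since the patent specifies the loss only at the level of the formula above:

```python
import torch

def discriminator_loss(D, l_t, v_t, l_g, v_g) -> torch.Tensor:
    """L_gan for the discriminative model D, per the formula above.

    l_t, v_t -- face attribute values and key points of real samples
    l_g, v_g -- face attribute values and key points of pseudo-images
                produced via the generative model G (assumed interface)
    """
    real_term = torch.log(D(l_t, v_t))        # E[log D(l_t, v_t)]
    fake_term = torch.log(1 - D(l_g, v_g))    # E[log(1 - D(l_g, v_g))]
    # The discriminative model maximizes L_gan, while the generative
    # model minimizes the fake term; each is updated in turn.
    return (real_term + fake_term).mean()
```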
In addition, the DCGAN model adopts a self-attention mechanism: in the feature extraction stage, the learning weights of the face attribute values and scene attribute values are increased, so that more of the extracted features related to face attributes and scene attributes are retained. For example, when the light is dim, the face key point detection model can increase the weight of pixels close to boundary values when detecting face key points.
Further, referring to fig. 4, after step S300, the method further includes:
and step S400, the intelligent terminal determines a skin area to be beautified in the original image according to the key points of the face.
In this embodiment, the face key points include the facial features, such as the nose, eyes and mouth, and the contour of the face, including the points where the skin meets hair or background. Points on exposed skin such as the arms or neck may also be selected. After the face key points are obtained, they can be associated together through their positional relationships, thereby framing the skin region to be beautified, as sketched below.
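One way to frame the skin region from the key points is sketched below; taking the convex hull of the contour points is an implementation assumption, since the patent only says the points are associated together through their positional relationships:

```python
import cv2
import numpy as np

def skin_mask(image: np.ndarray, keypoints: np.ndarray) -> np.ndarray:
    """Build a binary mask of the skin region from face key points."""
    mask = np.zeros(image.shape[:2], dtype=np.uint8)
    hull = cv2.convexHull(np.int32(keypoints))   # outer boundary of the points
    cv2.fillConvexPoly(mask, hull, 255)          # filled region = skin area
    return mask
```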
And S500, the intelligent terminal beautifies the skin area according to the face attribute value and the scene attribute value to obtain a beautified target image.
Specifically, beautification includes skin smoothing (buffing), whitening, face thinning, eye enhancement, highlighting, and the like.
Taking face thinning as an example: face thinning enlarges or reduces the face by shifting the pixels of the face contour in the original image. After the face attribute value and the scene attribute value are determined, a corresponding offset mapping relation can be determined; for example, a pixel at coordinates (x1, y1) is mapped to (x2, y2) by the specific offset mapping relation. After all the pixels have been shifted, face thinning is complete and the thinned target image is obtained; the target image is finally output and displayed on the smartphone's display screen.
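As an illustration of applying such an offset mapping, the following sketch shifts contour pixels horizontally with cv2.remap; the constant shift toward the vertical center line is an illustrative assumption, whereas in the patent the offsets follow from the mapping relation determined by the face attribute value and the scene attribute value:

```python
import cv2
import numpy as np

def thin_face(image: np.ndarray, contour_mask: np.ndarray,
              shift_px: float = 2.0) -> np.ndarray:
    """Resample the image so contour pixels are shifted horizontally."""
    h, w = image.shape[:2]
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    cx = w / 2.0                                  # illustrative: shift contour
    direction = np.sign(xs - cx)                  # pixels relative to center
    xs = (xs + direction * shift_px * (contour_mask > 0)).astype(np.float32)
    return cv2.remap(image, xs, ys, interpolation=cv2.INTER_LINEAR)
```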
Further, referring to fig. 5, when the beautifying to be performed is whitening, step S500 includes:
step S511, the intelligent terminal obtains color values of all pixel points in the skin area.
Furthermore, since an image is composed of many pixels, each pixel can be represented by a color value, and the skin region likewise has its own color values. The color values of all pixels in the skin region are obtained first, and the whitening effect is subsequently achieved by increasing these color values.
And S512, the intelligent terminal determines whitening amplitude according to the face attribute value and the scene attribute value.
Specifically, the whitening amplitude is the amount by which the color values of the pixels are increased; this adjustment turns colors that are dark or dull into lighter or brighter ones, visually producing a whitening effect. The whitening amplitudes corresponding to the face attribute values and scene attribute values are set in advance. For example, if the light is dim, the whitening amplitude is reduced, because too large a whitening amplitude would create a strong sense of contrast between the background light and the whitening effect, making the result look unnatural.
And step S513, the intelligent terminal increases the color value according to the whitening increase range to obtain the whitened target image.
Specifically, the whitening amplitude is a fixed value; after it is obtained, the color value of each pixel in the skin region is increased by this amount, so that the color values of the skin region become larger and a whitening effect is presented visually.
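By way of illustration, the whitening step can be sketched as follows, assuming the skin mask from the key points and a whitening amplitude already determined from the face attribute value and scene attribute value (the default of 20 is illustrative):

```python
import numpy as np

def whiten(image: np.ndarray, mask: np.ndarray,
           amplitude: int = 20) -> np.ndarray:
    """Increase color values inside the skin region by a fixed amplitude."""
    out = image.astype(np.int16)                   # avoid uint8 overflow
    out[mask > 0] += amplitude                     # raise skin color values
    return np.clip(out, 0, 255).astype(np.uint8)   # keep values in range
```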
Further, referring to fig. 6, when the beautification to be performed is skin smoothing (buffing), step S500 includes:
step S521, the intelligent terminal obtains color values of all pixel points in the skin area.
In step S522, the intelligent terminal determines a filtering algorithm and filtering parameters according to the face attribute value and the scene attribute value.
Specifically, skin smoothing essentially filters the skin region with a certain filtering algorithm so that the region is blurred and blemishes are smoothed away. Once the face attribute value and the scene attribute value are determined, the filtering algorithm to use and the filtering parameters it needs can be further determined. The preferred optional filtering algorithms in this embodiment are bilateral filtering and Gaussian filtering. Bilateral filtering is a nonlinear filtering method that combines a compromise between the spatial proximity and the color similarity of the image, considering spatial information and gray-level similarity at the same time, so it preserves edges while removing noise; it is simple, non-iterative and local. Gaussian filtering is a linear smoothing filter mainly suitable for processing Gaussian noise and is very effective at suppressing noise that follows a normal distribution.
Step S523, the intelligent terminal filters the skin area according to the filtering algorithm and the filtering parameter, so as to obtain a target image after beautifying.
Specifically, taking bilateral filtering as the example: the filtering parameters are substituted into the filtering algorithm to obtain the filter, each color value is input into the filter for filtering to obtain the filtered color values and the resulting target image, and the target image is finally output.
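By way of illustration, the bilateral-filtering variant can be sketched as follows with OpenCV; the diameter and sigma values are illustrative assumptions, whereas in the patent the filtering parameters are determined from the face attribute value and the scene attribute value:

```python
import cv2
import numpy as np

def smooth_skin(image: np.ndarray, mask: np.ndarray,
                d: int = 9, sigma_color: float = 75.0,
                sigma_space: float = 75.0) -> np.ndarray:
    """Bilateral-filter the image and keep the result only on skin pixels."""
    filtered = cv2.bilateralFilter(image, d, sigma_color, sigma_space)
    out = image.copy()
    out[mask > 0] = filtered[mask > 0]   # smooth only inside the skin area
    return out
```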
Because beautification comprises multiple parts, each beautification step is implemented as a module, and the modules are chained serially: after one module finishes its beautification step, processing passes to the next module for the next beautification.
This scheme takes the face attribute value and the scene attribute value as the reference basis for beautification; different attribute values yield different beautification parameters, such as whitening amplitude, filtering algorithm, filtering parameters and offset mapping relation, thereby improving the naturalness of the beautification.
Further, as shown in fig. 7, based on the above-mentioned face detection method, the present invention also provides an intelligent terminal, and the intelligent terminal includes a processor 10, a memory 20, and a display 30. Fig. 7 shows only some of the components of the smart terminal, but it should be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
The memory 20 may in some embodiments be an internal storage unit of the intelligent terminal, such as a hard disk or memory of the intelligent terminal. In other embodiments the memory 20 may also be an external storage device of the intelligent terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the intelligent terminal. Further, the memory 20 may include both an internal storage unit and an external storage device of the intelligent terminal. The memory 20 is used for storing the application software installed on the intelligent terminal and various data, such as the program code installed on the intelligent terminal, and may also be used to temporarily store data that has been or is to be output. In one embodiment, the memory 20 stores a face detection program 40, and the face detection program 40 can be executed by the processor 10 to implement the face detection method of the present application.
The processor 10 may be a Central Processing Unit (CPU), microprocessor or other data Processing chip in some embodiments, and is used for executing program codes stored in the memory 20 or Processing data, such as executing a face detection method.
The display 30 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like in some embodiments. The display 30 is used for displaying information at the smart terminal and for displaying a visual user interface. The components 10-30 of the intelligent terminal communicate with each other via a system bus.
In one embodiment, the following steps are implemented when the processor 10 executes the face detection program 40 in the memory 20:
acquiring an original image to be detected;
inputting an original image into a trained attribute value detection model set for processing, and outputting a face attribute value and a scene attribute value;
and inputting the original image, the face attribute value and the scene attribute value into a trained face key point detection model for processing, and outputting face key points, wherein the accuracy of the face key point detection model for detecting the face key points is greater than or equal to the preset accuracy.
The face detection method further comprises the following steps:
determining a skin area to be beautified in the original image according to the key points of the face;
and beautifying the skin area according to the face attribute value and the scene attribute value to obtain a beautified target image.
The attribute value detection model set comprises a face attribute value detection model and a scene attribute value detection model, original images are input into the trained attribute value detection model set for processing, and face attribute values and scene attribute values are output, and the method comprises the following steps:
inputting the original image into the face attribute value detection model for face detection, and outputting a face attribute value; inputting the original image into the scene attribute value detection model for scene detection, and outputting a scene attribute value; or,
inputting the original image into the scene attribute value detection model for scene detection, and outputting a scene attribute value; inputting the original image into the face attribute value detection model for face detection, and outputting a face attribute value; or,
in the process of inputting the original image into the face attribute value detection model for face detection and outputting the face attribute value, inputting the original image into the scene attribute value detection model for scene detection and outputting the scene attribute value.
The face attribute value comprises age and gender, and the face attribute value detection model comprises a face alignment model, an age detection model and a gender detection model.
Inputting an original image into a human face attribute value detection model for human face detection, and outputting a human face attribute value, wherein the method comprises the following steps:
inputting an original image into a face alignment model for alignment, and outputting an alignment image;
and respectively inputting the aligned images into an age detection model and a gender detection model for age identification and gender identification, and outputting the age and the gender.
The method for beautifying the skin area according to the face attribute value and the scene attribute value to obtain a target image after beautifying comprises the following steps:
acquiring color values of all pixel points in a skin area;
determining whitening amplitude according to the face attribute value and the scene attribute value;
and increasing the color values according to the whitening amplitude to obtain the whitened target image.
The method for beautifying the skin area according to the face attribute value and the scene attribute value to obtain a target image after beautifying comprises the following steps:
acquiring color values of all pixel points in a skin area;
determining a filtering algorithm and filtering parameters according to the face attribute value and the scene attribute value;
and filtering the skin area according to the filtering algorithm and the filtering parameters to obtain the target image after beautifying.
Wherein, before inputting the original image into the trained attribute value detection model set for processing and outputting the face attribute value and the scene attribute value, the method further comprises:
calculating a human face skin color value of an original image based on a preset human face skin color algorithm;
and respectively determining a face attribute value detection model and a scene attribute value detection model which correspond to the detection model set according to the face skin color value.
The invention also provides a storage medium, wherein the storage medium stores a face detection program, and the steps of the face detection method are realized when the face detection program is executed by the processor.
In summary, the present invention provides a face detection method, an intelligent terminal and a storage medium, wherein the method includes: acquiring an original image to be detected; inputting an original image into a trained attribute value detection model set for processing, and outputting a face attribute value and a scene attribute value; and inputting the original image, the face attribute value and the scene attribute value into a trained face key point detection model for processing, and outputting face key points, wherein the accuracy of the face key point detection model for detecting the face key points is greater than or equal to the preset accuracy. The method and the device can improve the accuracy of the face key point detection.
Of course, it will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware (such as a processor, a controller, etc.) to complete, and the program can be stored in a computer readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a memory, a magnetic disk, an optical disk, etc.
It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.

Claims (10)

1. A face detection method, comprising:
acquiring an original image to be detected;
inputting the original image into a trained attribute value detection model set for processing, and outputting a face attribute value and a scene attribute value;
and inputting the original image, the face attribute value and the scene attribute value into a trained face key point detection model for processing, and outputting face key points, wherein the accuracy of the face key point detection model for detecting the face key points is greater than or equal to the preset accuracy.
2. The method of claim 1, further comprising:
determining a skin area to be beautified in the original image according to the face key points;
and beautifying the skin area according to the face attribute value and the scene attribute value to obtain a beautified target image.
3. The method according to claim 1 or 2, wherein the attribute value detection model set includes a face attribute value detection model and a scene attribute value detection model, and the inputting the original image into the trained attribute value detection model set for processing and outputting a face attribute value and a scene attribute value includes:
inputting the original image into the face attribute value detection model for face detection, and outputting a face attribute value; inputting the original image into the scene attribute value detection model for scene detection, and outputting a scene attribute value; or,
inputting the original image into the scene attribute value detection model for scene detection, and outputting a scene attribute value; inputting the original image into the face attribute value detection model for face detection, and outputting a face attribute value; or,
in the process of inputting the original image into the face attribute value detection model for face detection and outputting the face attribute value, inputting the original image into the scene attribute value detection model for scene detection and outputting the scene attribute value.
4. The face detection method of claim 3, wherein the face attribute values comprise age and gender, and the face attribute value detection model comprises a face alignment model, an age detection model, and a gender detection model.
5. The method of claim 4, wherein the inputting the original image into the face attribute value detection model for face detection and outputting the face attribute value comprises:
inputting the original image into the face alignment model for alignment, and outputting an aligned image;
and respectively inputting the aligned images into the age detection model and the gender detection model for age identification and gender identification, and outputting the age and the gender.
6. The method of claim 2, wherein the beautifying the skin region according to the face attribute value and the scene attribute value to obtain a target image after beautifying, comprises:
acquiring color values of all pixel points in the skin area;
determining whitening amplitude according to the face attribute value and the scene attribute value;
and increasing the color values according to the whitening amplitude to obtain a whitened target image.
7. The method of claim 2, wherein the beautifying the skin region according to the face attribute value and the scene attribute value to obtain a target image after beautifying, comprises:
acquiring color values of all pixel points in the skin area;
determining a filtering algorithm and a filtering parameter according to the face attribute value and the scene attribute value;
and filtering the skin area according to the filtering algorithm and the filtering parameters to obtain a target image after beautifying.
8. The method of any one of claims 5-7, wherein before inputting the original image into a trained attribute value detection model set for processing and outputting the face attribute value and the scene attribute value, the method further comprises:
calculating the human face skin color value of the original image based on a preset human face skin color algorithm;
and respectively determining a face attribute value detection model and a scene attribute value detection model which correspond to the detection model set according to the face skin color value.
9. An intelligent terminal, characterized in that, intelligent terminal includes: a memory, a processor and a face detection program stored on the memory and executable on the processor, the face detection program when executed by the processor implementing the steps of the face detection method according to any one of claims 1 to 8.
10. A storage medium, characterized in that the storage medium stores a face detection program, which when executed by a processor implements the steps of the face detection method according to any one of claims 1 to 8.
CN202010710797.XA 2020-07-22 2020-07-22 Face detection method, intelligent terminal and storage medium Pending CN113971822A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010710797.XA CN113971822A (en) 2020-07-22 2020-07-22 Face detection method, intelligent terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010710797.XA CN113971822A (en) 2020-07-22 2020-07-22 Face detection method, intelligent terminal and storage medium

Publications (1)

Publication Number Publication Date
CN113971822A 2022-01-25

Family

ID=79584783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010710797.XA Pending CN113971822A (en) 2020-07-22 2020-07-22 Face detection method, intelligent terminal and storage medium

Country Status (1)

Country Link
CN (1) CN113971822A (en)


Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040179719A1 (en) * 2003-03-12 2004-09-16 Eastman Kodak Company Method and system for face detection in digital images
US20190392564A1 (en) * 2017-03-09 2019-12-26 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Electronic Device, and Control Method and Control Apparatus for the Same
CN110168562A (en) * 2017-03-09 2019-08-23 Oppo广东移动通信有限公司 Control method based on depth, control device and electronic device based on depth
US20190050681A1 (en) * 2017-08-09 2019-02-14 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and non-transitory computer-readable storage medium
WO2019128508A1 (en) * 2017-12-28 2019-07-04 Oppo广东移动通信有限公司 Method and apparatus for processing image, storage medium, and electronic device
CN108734754A (en) * 2018-05-28 2018-11-02 北京小米移动软件有限公司 Image processing method and device
CN109359575A (en) * 2018-09-30 2019-02-19 腾讯科技(深圳)有限公司 Method for detecting human face, method for processing business, device, terminal and medium
CN109671137A (en) * 2018-10-26 2019-04-23 广东智媒云图科技股份有限公司 A kind of picture matches method, electronic equipment and the storage medium of text
CN109376684A (en) * 2018-11-13 2019-02-22 广州市百果园信息技术有限公司 A kind of face critical point detection method, apparatus, computer equipment and storage medium
CN109919866A (en) * 2019-02-26 2019-06-21 Oppo广东移动通信有限公司 Image processing method, device, medium and electronic equipment
CN110147740A (en) * 2019-04-30 2019-08-20 北京迈格威科技有限公司 Face identification method, device, equipment and storage medium
CN110443769A (en) * 2019-08-08 2019-11-12 Oppo广东移动通信有限公司 Image processing method, image processing apparatus and terminal device
CN110659596A (en) * 2019-09-11 2020-01-07 高新兴科技集团股份有限公司 Face key point positioning method under case and management scene, computer storage medium and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MENG Lingjun; WANG Jingbo: "Face Key Point Detection Based on PyTorch and OpenCV" (基于Pytorch和Opencv的人脸关键点检测), Video Engineering (电视技术), no. 14, 25 July 2019 (2019-07-25) *

Similar Documents

Publication Publication Date Title
CN108229278B (en) Face image processing method and device and electronic equipment
CA2934514C (en) System and method for identifying faces in unconstrained media
US8638993B2 (en) Segmenting human hairs and faces
JP6312714B2 (en) Multispectral imaging system for shadow detection and attenuation
CN108268859A (en) A kind of facial expression recognizing method based on deep learning
CN113313657B (en) Unsupervised learning method and system for low-illumination image enhancement
CN103902958A (en) Method for face recognition
CN110276767A (en) Image processing method and device, electronic equipment, computer readable storage medium
CN106683080B (en) A kind of retinal fundus images preprocess method
WO2017048845A1 (en) Feature detection and masking in images based on color distributions
WO2020048140A1 (en) Living body detection method and apparatus, electronic device, and computer readable storage medium
CN112784773B (en) Image processing method and device, storage medium and terminal
TW200834459A (en) Video object segmentation method applied for rainy situations
CN111127476B (en) Image processing method, device, equipment and storage medium
Chen et al. Face illumination manipulation using a single reference image by adaptive layer decomposition
CN107545536A (en) The image processing method and image processing system of a kind of intelligent terminal
CN110969631B (en) Method and system for dyeing hair by refined photos
CN102024156A (en) Method for positioning lip region in color face image
CN110046544A (en) Digital gesture identification method based on convolutional neural networks
CN116681636B (en) Light infrared and visible light image fusion method based on convolutional neural network
CN111832405A (en) Face recognition method based on HOG and depth residual error network
CN109389076B (en) Image segmentation method and device
US7620246B2 (en) Method and apparatus for image processing
CN111539246A (en) Cross-spectrum face recognition method and device, electronic equipment and storage medium thereof
Manaf et al. Color recognition system with augmented reality concept and finger interaction: Case study for color blind aid system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination