CN114399803A - Face key point detection method and device

Info

Publication number: CN114399803A
Application number: CN202111462801.6A
Authority: CN
Inventors: 贺克赛; 程新景; 杨睿刚
Original assignee: International Network Technology Shanghai Co Ltd
Current assignee: International Network Technology Shanghai Co Ltd
Priority date: 2021-11-30
Filing date: 2021-11-30
Publication date: 2022-04-26

Abstract

The invention provides a method and a device for detecting face key points. The method comprises: acquiring a face image; and inputting the face image into a face key point detection model to obtain a face key point detection result output by the model. The face key point detection model is trained on a sample image together with the angle category label and the face key point true value corresponding to the sample image. The model predicts an angle category for the face image and detects the face key points of the face image based on that category to obtain the face key point detection result. Because key point detection is performed according to the angle category, that is, the angle range to which the predicted pose angle of the face image belongs, the loss of key point detection accuracy caused by pose is avoided and the accuracy of face key point detection is improved.

Description

Face key point detection method and device

Technical Field

The invention relates to the technical field of image processing, in particular to a method and a device for detecting key points of a human face.

Background

With the development of science and technology, face recognition technology is used in more and more real-world scenarios. For example, a 3D face model is reconstructed from faces recognized in one or more 2D images to achieve face swapping in video; or liveness detection is performed on a user's face to verify the user's identity or authority and thereby resist various fraud attacks. Face key point detection is an important component of face recognition technology and greatly affects the overall performance of face recognition, analysis and search systems.

At present, face key point detection models mainly train all key points together and obtain a globally optimal solution mainly by considering the relative positions of the key points.

However, the existing face key point detection process does not take into account additional information relevant to the key point task, in particular the pose angle of the face. For example, compared with a face image in a frontal view, a face with a shifted pose in the original face image has shifted facial feature positions, so the detected positions of the face key points deviate greatly from their target positions. This in turn leads to an excessive amount of computation, and the final position of each face key point in the original face image cannot be located accurately within the set number of iterations, which increases the difficulty of face key point detection and reduces the positioning accuracy of the face key points.

Disclosure of Invention

The invention provides a method and a device for detecting face key points, which address the poor accuracy of face key point detection caused by pose in the prior art and achieve accurate face key point detection under multiple poses.

The invention provides a face key point detection method, which comprises the following steps: acquiring a face image; inputting the face image into a face key point detection model to obtain a face key point detection result output by the face key point detection model; the face key point detection model is obtained by training based on a sample image, an angle class label corresponding to the sample image and a face key point true value; the face key point model is used for detecting face key points of the face image based on the angle category obtained by predicting the face image so as to obtain a face key point detection result.

According to the face key point detection method provided by the invention, the face key point detection model comprises: an angle prediction layer, which predicts the pose angle category of the input face image to obtain the angle category; and a face key point detection layer, which detects the face image based on the angle category to obtain the face key point detection result.

According to the face key point detection method provided by the invention, predicting the pose angle category of the input face image comprises: extracting features from the input face image to obtain a pose angle; and determining, based on preset angle categories, the angle range to which the pose angle belongs to obtain the angle category.

According to the human face key point detection method provided by the invention, the attitude angle comprises a pitch angle and a yaw angle, and the angle prediction layer comprises a first convolution layer and a second convolution layer, wherein the first convolution layer is used for predicting the pitch angle and the second convolution layer is used for predicting the yaw angle respectively.

According to the face key point detection method provided by the invention, training the face key point detection model comprises: acquiring a sample image, and an angle category label and a face key point true value corresponding to the sample image; and training a model to be trained, with the sample image as its input data, the angle category label as the label for the angle prediction category that the model predicts from the sample image, and the face key point true value as the label for the face key point prediction result that the model obtains by detecting face key points based on the angle prediction category, to obtain the face key point detection model for generating the face key point detection result of a face image.

According to the face key point detection method provided by the invention, training the model to be trained comprises: inputting the sample image into an angle prediction layer to obtain an angle prediction category output by the angle prediction layer; inputting the angle prediction category and the sample image into a face key point detection layer to obtain a face key point prediction result output by the face key point detection layer; constructing an angle loss function based on the angle prediction category and the angle category label, and constructing a key point loss function based on the face key point prediction result and the face key point true value; and obtaining a total loss function from the angle loss function and the key point loss function, and ending the training when the total loss function converges.

The invention also provides a face key point detection device, which comprises: the data acquisition module acquires a face image; the face key point detection module is used for inputting the face image into a face key point detection model to obtain a face key point detection result output by the face key point detection model; the face key point detection model is obtained by training based on a sample image, an angle class label corresponding to the sample image and a face key point true value; the face key point model is used for detecting face key points of the face image based on the angle category obtained by predicting the face image so as to obtain a face key point detection result.

The invention also provides an electronic device, which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the processor executes the program to realize the steps of any one of the human face key point detection methods.

The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method for detecting a face keypoint as described in any of the above.

The invention also provides a computer program product comprising a computer program, wherein the computer program realizes the steps of the human face key point detection method when being executed by a processor.

According to the method and the device for detecting face key points provided by the invention, the acquired face image is input into the face key point detection model, and key point detection is performed on the face image according to the angle category, that is, the angle range to which the predicted pose angle of the face image belongs. This avoids the loss of key point detection accuracy caused by pose and improves the accuracy of face key point detection.

Drawings

In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.

FIG. 1 is a schematic flow chart of a face key point detection method provided by the present invention;

FIG. 2 is a schematic diagram of a face keypoint detection model according to the present invention;

FIG. 3 is a schematic flow chart of training the face key point detection model according to the present invention;

FIG. 4 is a schematic structural diagram of a face key point detection apparatus provided in the present invention;

FIG. 5 is a schematic diagram of a training module according to the present invention;

FIG. 6 is a schematic structural diagram of an electronic device provided in the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Fig. 1 shows a schematic flow chart of a face key point detection method of the present invention, which includes:

S11, acquiring a face image;

S12, inputting the face image into the face key point detection model to obtain the face key point detection result output by the face key point detection model; the face key point detection model is obtained by training based on the sample image, the angle category label corresponding to the sample image and a face key point true value; the face key point detection model is used for detecting face key points of the face image based on the angle category obtained by predicting the face image so as to obtain a face key point detection result.

It should be noted that the step labels S1N in this specification do not represent the order of the steps of the face key point detection method. The face key point detection method of the present invention is described in detail below with reference to FIG. 2.

In step S11, a face image is acquired.

In this embodiment, acquiring a face image includes: acquiring the face image on which face key point detection is to be performed from the electronic device or application platform that applies the face key point detection method; or acquiring the face image from a terminal device connected to that electronic device or application platform. The terminal device may obtain a face image of a person in the recognition area through a visual sensor connected to the terminal device. The face image may be a single picture frame or a sequence of picture frames obtained by shooting, or an image frame or a sequence of image frames that is captured from a video and associated with the face to be detected.

In an alternative embodiment, the acquired face image may be derived from images of a specific face captured by millimeter-wave radar, lidar, detectors, cameras or other imaging devices; the source of the face image is not further limited herein.

Step S12, inputting the face image into the face key point detection model to obtain the face key point detection result output by the face key point detection model; the face key point detection model is obtained by training based on the sample image, the corresponding angle class label of the sample image and a face key point true value; the face key point model is used for detecting face key points of the face image based on the angle category obtained by predicting the face image so as to obtain a face key point detection result.

In this embodiment, referring to FIG. 2, the face key point detection model includes: an angle prediction layer, which predicts the pose angle category of the input face image to obtain the angle category; and a face key point detection layer, which detects the face image based on the angle category to obtain the face key point detection result. The angle prediction layer performs angle prediction on the input face image so that the angle range to which the pose angle of the face image belongs is determined from the predicted angle category. Key point detection is then performed on the face image according to that range, which avoids the influence of the pose angle on the detection result and improves the accuracy of face key point detection.

Specifically, first, the angle prediction layer performs pose angle type prediction on an input face image to obtain an angle type. In this embodiment, the performing pose angle type prediction on an input face image includes: extracting features based on the input human face image to obtain an attitude angle; and determining the angle range of the attitude angle based on the preset angle category to obtain the angle category.

It should be noted that, before the angle range to which the pose angle belongs is determined based on the preset angle categories, the pose angles within a given overall range need to be divided into categories by angle value, for example one category for every ten degrees. Dividing the pose angle values into ranges groups face images with similar pose angles into one category, so that each face image is detected based on the angle range to which it belongs and the influence of the pose angle on face key point detection is avoided.
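The range division described above can be sketched as follows. This is only a minimal illustration: the ten-degree width comes from the example in the text, while the function name `angle_to_category`, the overall range of -90 to 90 degrees, and the clamping of out-of-range angles are assumptions that the source does not specify.

```python
import math


def angle_to_category(angle_deg: float,
                      bin_width: float = 10.0,
                      min_angle: float = -90.0,
                      max_angle: float = 90.0) -> int:
    """Map a pose angle in degrees to the index of its angle range.

    The range limits and the clamping behaviour are illustrative
    assumptions; only the ten-degree width is taken from the text.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    num_bins = math.ceil((max_angle - min_angle) / bin_width)
    # Clamp so that out-of-range angles fall into the first or last range.
    clamped = min(max(angle_deg, min_angle), max_angle)
    index = int((clamped - min_angle) // bin_width)
    # max_angle itself would index one past the end; keep it in the last range.
    return min(index, num_bins - 1)
```

With a pitch angle and a yaw angle, each angle could be mapped separately, for example `(angle_to_category(pitch), angle_to_category(yaw))`, so that face images with similar poses share the same pair of categories.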

It should be noted that when the pose angle includes a pitch angle and a yaw angle, the angle prediction layer includes a first convolution layer that predicts the pitch angle and a second convolution layer that predicts the yaw angle.

Second, the face key point detection layer detects the face image based on the angle category to obtain the face key point detection result. In this embodiment, detecting the face image based on the angle category to obtain the face key point detection result includes: assigning corresponding weight data based on the angle category and performing key point detection on the face image with that weight data to obtain the face key point detection result; or performing key point detection on the face image to obtain initial key points, and performing angle correction on the initial key points based on the angle category to obtain the face key point detection result.
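The first alternative, in which weight data is assigned according to the angle category, might look like the sketch below. The source does not say what the weight data is or how it is applied; here it is assumed, purely for illustration, to be one linear regression matrix per category applied to a feature vector. The names `detect_keypoints` and `weights_by_category` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# One row of feature weights per output coordinate (x0, y0, x1, y1, ...).
Weights = List[List[float]]


def detect_keypoints(features: Sequence[float],
                     category: int,
                     weights_by_category: Dict[int, Weights]
                     ) -> List[Tuple[float, float]]:
    """Regress (x, y) key points with the weights assigned to a category.

    A linear regressor stands in for the key point detection layer;
    this is an assumption, not the method stated in the source.
    """
    weights = weights_by_category[category]
    if len(weights) % 2 != 0:
        raise ValueError("expected one x row and one y row per key point")
    coords = [sum(w * f for w, f in zip(row, features)) for row in weights]
    # Pair consecutive outputs as (x, y).
    return list(zip(coords[0::2], coords[1::2]))
```

The point of the sketch is only the dispatch: the same features yield different key points depending on which category's weights are selected.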

In an alternative embodiment, referring to fig. 3, the method further comprises: before inputting a face image into a face key point detection model, training the face key point detection model, specifically comprising:

S31, acquiring a sample image, an angle category label and a face key point true value corresponding to the sample image;

S32, taking the sample image as input data of the model to be trained, taking the angle category label as a label of an angle prediction category obtained by the model to be trained based on sample image prediction, taking the face key point true value as a label of a face key point prediction result obtained by the model to be trained based on the angle prediction category for face key point detection, and training the model to be trained to obtain the face key point detection model for generating the face key point detection result of the face image.

It should be noted that the step labels S3N in this specification do not represent the order of the steps. The training of the face key point detection model is described in detail below.

Step S31, obtaining the sample image and the angle category label and the face key point true value corresponding to the sample image.

In this embodiment, obtaining the sample image, the angle category label corresponding to the sample image, and the face key point true value includes: acquiring a sample image; and labeling the sample image to obtain an angle category label and a face key point true value.

It should be noted that acquiring a sample image includes: acquiring a video stream and taking a certain number of video frame images from it at a preset interval as sample images; or continuously shooting at least one frame of image of at least one face as a sample image. When the video stream is acquired or the face images are shot, acquisition may be carried out under varying external conditions such as different pose angles, occlusions and lighting.
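Taking frames from a video stream at a preset interval can be sketched as below. The sketch assumes the frames have already been decoded into a sequence; the function name `sample_frames` and the optional cap `max_samples` are hypothetical and stand in for the "certain number" of frames mentioned in the text.

```python
from typing import List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def sample_frames(frames: Sequence[Frame],
                  interval: int,
                  max_samples: Optional[int] = None) -> List[Frame]:
    """Keep every `interval`-th frame, starting with the first one."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    sampled = list(frames[::interval])
    if max_samples is not None:
        # Optionally cap the number of sample images.
        sampled = sampled[:max_samples]
    return sampled
```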

In addition, after the sample image is acquired, the method further comprises: performing face recognition on the acquired sample images to remove images that do not contain a face. The sample images may be understood as a set of image data covering at least one face target, with images of each face target at different angles, with different pixel colors, and so on; the amount of image data is usually large and can reach millions of images. The sample images are used to train the network to be trained so as to complete the construction of the model.

In an optional embodiment, after the sample image and its corresponding angle category label and face key point true value are obtained, the method further includes: performing data enhancement on the sample image using a data enhancement strategy. Specifically, the data enhancement strategy comprises at least one of flipping, rotating, cropping, deforming and scaling; and/or at least one of noise, blurring, color transformation, erasing and padding. Performing data enhancement with the selected strategy increases the amount of training image data, which helps the subsequent training improve the model's key point detection accuracy in scenes involving illumination changes, occlusion, incomplete faces, large deflection angles, varied expressions and the like.

When the data enhancement strategy is actually selected, any one of flipping, rotating, cropping, deforming and scaling may be selected; or any one of noise, blurring, color transformation, erasing and padding; or at least two of flipping, rotating, cropping, deforming and scaling; or at least two of noise, blurring, color transformation, erasing and padding; or at least two of flipping, rotating, cropping, deforming, scaling, noise, blurring, color transformation, erasing and padding.
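Geometric enhancements such as flipping move the key points as well as the pixels, so the key point true values have to be transformed together with the image. The sketch below shows this for a horizontal flip only. The image representation (rows of pixel values) and the name `flip_horizontal` are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]       # rows of pixel values
Point = Tuple[float, float]   # (x, y) in pixel coordinates


def flip_horizontal(image: Image,
                    keypoints: Sequence[Point]
                    ) -> Tuple[Image, List[Point]]:
    """Mirror an image left to right and move its key points with it."""
    width = len(image[0])
    flipped_image = [list(reversed(row)) for row in image]
    # A pixel in column x moves to column (width - 1 - x).
    flipped_points = [(width - 1 - x, y) for x, y in keypoints]
    return flipped_image, flipped_points
```

A full pipeline would likely also need to swap the indices of left and right key points (for example, left eye and right eye) and mirror the yaw label after such a flip; those steps are omitted here.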

Step S32, using the sample image as the input data of the model to be trained, using the angle class label as the label of the angle prediction type obtained by the model to be trained based on the sample image prediction, using the face key point truth value as the label of the face key point prediction result obtained by the model to be trained based on the angle prediction type to perform face key point detection, and training the model to be trained to obtain the face key point detection model for generating the face key point detection result of the face image.

In this embodiment, the network to be trained generally includes an angle prediction layer for predicting the pose angle category of the sample image, a face key point detection layer for performing key point detection on the face image based on the angle category obtained by the angle prediction layer, and a loss function. The sample images, or the sample images after data enhancement, are input into the model to be trained and trained according to a preset iteration rule to obtain the trained face key point detection model.

Specifically, training the model to be trained includes: inputting the sample image into the angle prediction layer to obtain an angle prediction category output by the angle prediction layer; inputting the angle prediction category and the sample image into the face key point detection layer to obtain a face key point prediction result output by the face key point detection layer; constructing an angle loss function based on the angle prediction category and the angle category label, and constructing a key point loss function based on the face key point prediction result and the face key point true value; and obtaining a total loss function from the angle loss function and the key point loss function, and ending the training when the total loss function converges. It should be noted that using the angle prediction category to which the pose angle of the sample image belongs improves the ability of the model to be trained to learn the face key points and improves the detection accuracy of the model, thereby avoiding poor key point detection accuracy caused by pose.
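The combination of the two losses can be sketched as follows. The source only states that the total loss function is obtained from the angle loss function and the key point loss function; the use of cross-entropy for the angle category, mean squared error for the key points, and a weighted sum with the hypothetical parameters `angle_weight` and `keypoint_weight` are all assumptions made for this illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def angle_loss(logits: Sequence[float], label: int) -> float:
    """Cross-entropy between angle-category scores and the category label."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[label]


def keypoint_loss(pred: Sequence[Point], truth: Sequence[Point]) -> float:
    """Mean squared error over all key point coordinates."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equal in length")
    total = sum((px - tx) ** 2 + (py - ty) ** 2
                for (px, py), (tx, ty) in zip(pred, truth))
    return total / (2 * len(pred))


def total_loss(logits: Sequence[float], label: int,
               pred: Sequence[Point], truth: Sequence[Point],
               angle_weight: float = 1.0,
               keypoint_weight: float = 1.0) -> float:
    """Weighted sum of the angle loss and the key point loss (assumed form)."""
    return (angle_weight * angle_loss(logits, label)
            + keypoint_weight * keypoint_loss(pred, truth))
```

Training would then repeat prediction and parameter updates until this total loss stops decreasing, which corresponds to the convergence condition in the text.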

In summary, the embodiment of the present invention inputs the acquired face image into the face key point detection model and performs key point detection on the face image according to the angle category, that is, the angle range to which the predicted pose angle of the face image belongs. This avoids poor key point detection accuracy caused by pose and improves the accuracy of face key point detection.

The following describes the face key point detection device provided by the present invention, and the face key point detection device described below and the face key point detection method described above may be referred to in correspondence with each other.

Fig. 4 shows a schematic structural diagram of a face key point detection device, which includes:

a data acquisition module 41 for acquiring a face image;

the face key point detection module 42 is used for inputting the face image into the face key point detection model to obtain a face key point detection result output by the face key point detection model;

the face key point detection model is obtained by training based on the sample image, the corresponding angle class label of the sample image and a face key point true value;

the face key point model is used for detecting face key points of the face image based on the angle category obtained by predicting the face image so as to obtain a face key point detection result.

In this embodiment, the data acquisition module 41 includes a data acquisition unit, which acquires the face image on which face key point detection is to be performed from the electronic device or application platform that applies the face key point detection method, or acquires the face image from a terminal device connected to that electronic device or application platform. The terminal device may obtain a face image of a person in the recognition area through a visual sensor connected to the terminal device. The face image may be a single picture frame or a sequence of picture frames obtained by shooting, or an image frame or a sequence of image frames that is captured from a video and associated with the face to be detected.

In an alternative embodiment, the face image acquired by the data acquisition module 41 may be derived from images of a specific face captured by millimeter-wave radar, lidar, detectors, cameras or other imaging devices; the source of the face image is not further limited herein.

The face key point detection module 42 includes: an angle prediction unit, which predicts the pose angle category of the input face image to obtain the angle category; and a face key point detection unit, which detects the face image based on the angle category to obtain the face key point detection result. The angle prediction layer performs angle prediction on the input face image so that the angle range to which the pose angle of the face image belongs is determined from the predicted angle category. Key point detection is then performed on the face image according to that range, which avoids the influence of the pose angle on the detection result and improves the accuracy of face key point detection.

Specifically, the angle prediction unit includes: the angle prediction subunit is used for extracting features based on the input human face image to obtain an attitude angle; and the classification subunit determines the angle range to which the attitude angle belongs based on the preset angle category to obtain the angle category.

The angle prediction unit further includes a range dividing subunit, which divides the pose angles within a given range into a plurality of categories by angle value, for example one category for every ten degrees. Before the classification subunit determines, based on the preset angle categories, the angle range to which the pose angle belongs, the range dividing subunit divides the pose angle values into ranges so that face images with similar pose angles are grouped into one category. The face images are then detected based on the angle range to which they belong, which avoids the influence of the pose angle on face key point detection.

It should be noted that when the pose angle includes a pitch angle and a yaw angle, the angle prediction unit includes a first convolution unit that predicts the pitch angle and a second convolution unit that predicts the yaw angle. In other words, if there are at least two pose angles, the number of angle prediction units is determined by the number of pose angles, so that the angle category of the face image is predicted separately for each angle.

Second, the face key point detection unit includes: a detection subunit, which assigns corresponding weight data based on the angle category and performs key point detection on the face image with that weight data to obtain the face key point detection result; or a detection subunit, which performs key point detection on the face image to obtain initial key points, and a deviation rectifying subunit, which performs angle correction on the initial key points based on the angle category to obtain the face key point detection result.

In an alternative embodiment, referring to fig. 5, in order to train the face key point detection model, the apparatus further includes a training module, which specifically includes:

the sample acquiring unit 51 is used for acquiring a sample image, an angle class label corresponding to the sample image and a face key point true value;

the training unit 52 trains a model to be trained, with the sample image as its input data, the angle category label as the label for the angle prediction category that the model predicts from the sample image, and the face key point true value as the label for the face key point prediction result that the model obtains by detecting face key points based on the angle prediction category, to obtain the face key point detection model for generating the face key point detection result of the face image.

Specifically, the sample acquiring unit 51 includes: a sample acquisition subunit that acquires a sample image; and the labeling subunit is used for labeling the sample image to obtain an angle class label and a face key point true value.

It should be noted that the sample acquisition subunit includes: a video acquisition sub-subunit, which acquires a video stream, and an image acquisition sub-subunit, which takes a certain number of video frame images at a preset interval as sample images; or an image acquisition sub-subunit, which continuously shoots at least one frame of image of at least one face as a sample image. When the video stream is acquired or the face images are shot, acquisition may be carried out under varying external conditions such as different pose angles, occlusions and lighting.

In addition, the sample acquisition subunit further includes a screening subunit, which performs face recognition on the acquired sample images to remove images that do not contain a face. The sample images may be understood as a set of image data covering at least one face target, with images of each face target at different angles, with different pixel colors, and so on; the amount of image data is usually large and can reach millions of images. The sample images are used to train the network to be trained so as to complete the construction of the model.

In an optional embodiment, the apparatus further comprises a data enhancement module, which performs data enhancement on the sample image using a data enhancement strategy. Specifically, the data enhancement strategy comprises at least one of flipping, rotating, cropping, deforming and scaling; and/or at least one of noise, blurring, color transformation, erasing and padding. After the sample image and its corresponding angle category label and face key point true value are obtained, the selected data enhancement strategy is applied to the sample image to increase the amount of training image data, which helps the subsequent training improve the model's key point detection accuracy in scenes involving illumination changes, occlusion, incomplete faces, large deflection angles, varied expressions and the like.

The training unit 52 comprises: an angle prediction subunit, which inputs the sample image to the angle prediction layer to obtain the angle prediction category output by the angle prediction layer, and inputs the angle prediction category and the sample image into the face key point detection layer to obtain the face key point prediction result output by the face key point detection layer; a loss function construction subunit, which constructs an angle loss function based on the angle prediction category and the angle category label, and constructs a key point loss function based on the face key point prediction result and the face key point true value; and a training subunit, which obtains a total loss function from the angle loss function and the key point loss function, and ends the training when the total loss function converges. It should be noted that, by relying on the angle prediction category to which the attitude angle of the sample image belongs, the model's ability to learn face key points is improved and its detection accuracy is increased, which avoids the situation in which key point detection accuracy is degraded by the influence of pose.
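The combination of the two losses can be sketched as follows. The original text does not specify the form of either loss or how they are combined, so the choices here are assumptions made purely for illustration: softmax cross-entropy for the angle loss, mean squared error for the key point loss, and a weighted sum for the total loss. All function and parameter names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def angle_loss(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy between angle category scores and the label."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def keypoint_loss(pred: Sequence[Point], truth: Sequence[Point]) -> float:
    """Mean squared distance between predicted and true key points."""
    squared = [(px - tx) ** 2 + (py - ty) ** 2
               for (px, py), (tx, ty) in zip(pred, truth)]
    return sum(squared) / len(squared)


def total_loss(angle_logits: Sequence[float], angle_label: int,
               kp_pred: Sequence[Point], kp_true: Sequence[Point],
               angle_weight: float = 1.0,
               keypoint_weight: float = 1.0) -> float:
    """Assumed combination: weighted sum of the two losses."""
    return (angle_weight * angle_loss(angle_logits, angle_label)
            + keypoint_weight * keypoint_loss(kp_pred, kp_true))
```

In a training loop, this total would be minimized until it stops decreasing, which corresponds to the convergence condition mentioned above; the weights let the two tasks be balanced if one loss dominates the other.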

Fig. 6 illustrates a schematic diagram of the physical structure of an electronic device. As shown in Fig. 6, the electronic device may include: a processor 61, a communication interface 62, a memory 63 and a communication bus 64, wherein the processor 61, the communication interface 62 and the memory 63 communicate with one another through the communication bus 64. The processor 61 may invoke logic instructions in the memory 63 to perform a face key point detection method comprising: acquiring a face image; inputting the face image into a face key point detection model to obtain a face key point detection result output by the face key point detection model; the face key point detection model is obtained by training based on a sample image, an angle class label corresponding to the sample image and a face key point true value; the face key point detection model is used for detecting face key points of the face image based on the angle category obtained by predicting the face image, so as to obtain the face key point detection result.

Furthermore, the logic instructions in the memory 63 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.

In another aspect, the present invention further provides a computer program product, where the computer program product includes a computer program that can be stored on a non-transitory computer-readable storage medium; when the computer program is executed by a processor, the computer is able to execute the face key point detection method provided by the method embodiments above, the method including: acquiring a face image; inputting the face image into a face key point detection model to obtain a face key point detection result output by the face key point detection model; the face key point detection model is obtained by training based on a sample image, an angle class label corresponding to the sample image and a face key point true value; the face key point detection model is used for detecting face key points of the face image based on the angle category obtained by predicting the face image, so as to obtain the face key point detection result.

In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the face key point detection method provided by the method embodiments above, the method comprising: acquiring a face image; inputting the face image into a face key point detection model to obtain a face key point detection result output by the face key point detection model; the face key point detection model is obtained by training based on a sample image, an angle class label corresponding to the sample image and a face key point true value; the face key point detection model is used for detecting face key points of the face image based on the angle category obtained by predicting the face image, so as to obtain the face key point detection result.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A face key point detection method is characterized by comprising the following steps:

acquiring a face image;

inputting the face image into a face key point detection model to obtain a face key point detection result output by the face key point detection model;

the face key point detection model is obtained by training based on a sample image, an angle class label corresponding to the sample image and a face key point true value;

2. The method of claim 1, wherein the face keypoint detection model comprises:

an angle prediction layer, used for predicting the type of the attitude angle of the input face image to obtain the angle category;

and a face key point detection layer, used for detecting the face image based on the angle category to obtain the face key point detection result.

3. The method according to claim 2, wherein predicting the type of the attitude angle of the input face image comprises:

performing feature extraction on the input face image to obtain an attitude angle;

and determining, based on preset angle categories, the angle range into which the attitude angle falls, to obtain the angle category.

4. The method of claim 2, wherein the attitude angle comprises a pitch angle and a yaw angle, and the angle prediction layer comprises a first convolution layer for predicting the pitch angle and a second convolution layer for predicting the yaw angle, respectively.

5. The method of claim 1, wherein training the face keypoint detection model comprises:

acquiring a sample image, and an angle category label and a face key point true value corresponding to the sample image;

and taking the sample image as input data of a model to be trained, taking the angle class label as a label of an angle prediction type obtained by predicting the model to be trained based on the sample image, taking the face key point truth value as a label of a face key point prediction result obtained by detecting the face key point of the model to be trained based on the angle prediction type, and training the model to be trained to obtain the face key point detection model for generating the face key point detection result of the face image.

6. The method for detecting face key points according to claim 5, wherein the training of the model to be trained comprises:

inputting the sample image to an angle prediction layer to obtain an angle prediction category output by the angle prediction layer;

inputting the angle prediction category and the sample image into a face key point detection layer to obtain a face key point prediction result output by the face key point detection layer;

constructing an angle loss function based on the angle prediction category and the angle category label, and constructing a key point loss function based on the face key point prediction result and the face key point truth value;

and obtaining a total loss function according to the angle loss function and the key point loss function, converging based on the total loss function, and finishing the training.

7. A face key point detection device, comprising:

a data acquisition module, which acquires a face image;

the face key point detection module is used for inputting the face image into a face key point detection model to obtain a face key point detection result output by the face key point detection model;

8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the face keypoint detection method according to any of claims 1 to 6 when executing the program.

9. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the face keypoint detection method according to any one of claims 1 to 6.

10. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, carries out the steps of the face keypoint detection method according to any one of claims 1 to 6.