CN110705355A - Face pose estimation method based on key point constraint - Google Patents

Face pose estimation method based on key point constraint

Info

Publication number
CN110705355A
Authority
CN
China
Prior art keywords
key point
neural network
layer
face
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910817335.5A
Other languages
Chinese (zh)
Inventor
王枫 (Wang Feng)
胡庆浩 (Hu Qinghao)
程健 (Cheng Jian)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Artificial Intelligence Chip Innovation Institute, Institute of Automation, Chinese Academy of Sciences
Original Assignee
Nanjing Artificial Intelligence Chip Innovation Institute, Institute of Automation, Chinese Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Artificial Intelligence Chip Innovation Institute, Institute of Automation, Chinese Academy of Sciences
Priority to CN201910817335.5A
Publication of CN110705355A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a face pose estimation method based on key point constraints. The method comprises: obtaining a portrait to be estimated; establishing a training set; and marking 68 key points on every face in the training set with a preset neural network model. The key points are normalized to obtain the corresponding key point coordinates, which are input into a preset face pose angle detection neural network that outputs the face pose angle values. In this way, the method locates 68 key points on the face image to be detected with the preset key point neural network, and then detects those key points with the preset face pose angle detection neural network to obtain the face pose angle values.

Description

Face pose estimation method based on key point constraint
Technical Field
The invention relates to a face pose estimation method, and belongs to the technical field of image information processing.
Background
Face pose estimation plays an important role in non-rigid registration and three-dimensional reconstruction for intelligent-driving vision, and in enabling new modes of user interaction. Historically, face modeling has followed two main approaches: key-point-based methods and parameterized representation models. In recent years, directly extracting two-dimensional face key points with deep learning tools has become the dominant approach to face pose analysis, owing to its flexibility and robustness under occlusion and large pose variation. The traditional head pose estimation pipeline first obtains two-dimensional key points from the face and then uses an average head model to solve the resulting 2D-to-3D correspondence problem. However, this approach depends entirely on the accuracy of the key point detection and on the construction of the average head model.
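The classical 2D-to-3D step described above can be sketched in a simplified form. The snippet below assumes a weak-perspective (orthographic) camera and recovers the head rotation from 2D key points and a mean 3D head model; the function name `pose_from_mean_model` and the orthographic simplification are illustrative assumptions, since real pipelines usually solve the full perspective-n-point problem.

```python
import numpy as np

def pose_from_mean_model(pts2d, mean3d):
    """Estimate head rotation from (N, 2) image key points and an (N, 3)
    mean 3D head model under a weak-perspective camera: a simplified
    sketch of the classical 2D-to-3D approach, not the patent's method."""
    # Center both point sets to remove translation.
    p2 = pts2d - pts2d.mean(axis=0)
    p3 = mean3d - mean3d.mean(axis=0)
    # Least-squares 2x3 projection A with p2 ~= p3 @ A.T
    A, *_ = np.linalg.lstsq(p3, p2, rcond=None)  # (3, 2)
    A = A.T                                      # (2, 3)
    # Orthonormalize the two projection rows (Gram-Schmidt), then
    # complete the rotation matrix with a cross product.
    r1 = A[0] / np.linalg.norm(A[0])
    r2 = A[1] - (A[1] @ r1) * r1
    r2 /= np.linalg.norm(r2)
    r3 = np.cross(r1, r2)
    return np.stack([r1, r2, r3])
```

With noise-free orthographic projections of the model, the recovered matrix matches the true rotation up to numerical precision; with real detections it degrades exactly as the background paragraph warns.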
Disclosure of Invention
The technical problem to be solved by the invention is as follows: traditional face pose estimation methods struggle to detect fine-grained facial key points accurately, and their estimation of face pose degrades under heavy occlusion and large pose variation.
In order to achieve the purpose, the invention provides the following technical scheme:
the embodiment of the invention provides a face pose estimation method based on key point constraint, which comprises the following steps:
step 1, establishing a training set, wherein the training set comprises a plurality of face samples, and key points are marked on face images in the training set;
step 2, establishing a multitask convolutional neural network model, and locating key points on all face samples in the training set with a preset key point neural network model to obtain 68 key points;
step 3, normalizing the key points of the established training set to obtain the corresponding key point coordinates, inputting the key point coordinates into the established multitask convolutional neural network model, and having the model output a face pose angle value;
step 4, training the convolutional neural network designed in step 2 with the training samples obtained in step 1, finishing the training when the training error reaches the expected value, and obtaining the parameters of the convolutional neural network model;
step 5, testing the samples of step 1 with the model trained in step 4, and outputting the final face pose estimation result.
Further, the multitask convolutional neural network model established in step 2 comprises an input layer, a first fully-connected layer, a second fully-connected layer, and an output layer.
Further, the dimension of the input layer is 1 × 136, the dimension of the first fully-connected layer is 136 × 68, the dimension of the second fully-connected layer is 68 × 3, and the dimension of the output layer is 1 × 3.
Further, the estimation of the face pose specifically includes: normalizing the key points to obtain the corresponding key point coordinates, feeding those coordinates in through the input layer, processing them sequentially through the first and second fully-connected layers, and finally outputting a face pose angle value through the output layer.
Further, the face pose angle value includes a horizontal angle value, a tilt angle value, and a pitch angle value.
Further, the network can output the key points and the Euler angles of the face pose at the same time.
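The layer dimensions stated above admit a minimal sketch. The snippet below builds the 1 x 136 input, 136 x 68 and 68 x 3 fully-connected layers, and 1 x 3 output in plain NumPy; the ReLU activation, the random initialization, and the function names are assumptions, since the patent specifies only the layer sizes.

```python
import numpy as np

def make_pose_mlp(rng):
    """Randomly initialised weights for the network the patent sketches:
    1x136 input -> 136x68 FC -> 68x3 FC -> 1x3 output. Layer sizes come
    from the patent; everything else here is an illustrative assumption."""
    return {
        "W1": rng.normal(scale=0.1, size=(136, 68)),
        "b1": np.zeros(68),
        "W2": rng.normal(scale=0.1, size=(68, 3)),
        "b2": np.zeros(3),
    }

def forward(params, x):
    """x: (1, 136) row of 68 normalised (x, y) key point coordinates.
    Returns a (1, 3) row of pose angles; the ReLU nonlinearity and the
    angle ordering are assumptions, as the text names neither."""
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])  # first FC + ReLU
    return h @ params["W2"] + params["b2"]                # second FC, linear
```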
Compared with the prior art, the invention adopting the technical scheme has the following technical effects:
the method is based on the face pose estimation of the key point constraint, introduces the key point information of the face as the supervision information, optimizes the results of three Euler angles obtained by direct regression, can not only simultaneously carry out key point prediction and face pose estimation, but also has more accurate effect compared with the prior face pose estimation method.
The foregoing description is only an overview of the technical solution of the invention. To make the technical solution clearer and implementable according to this description, a detailed description follows with reference to the preferred embodiments of the invention and the accompanying drawings.
Drawings
FIG. 1 is a schematic flow chart of the estimation method of the present invention;
Detailed Description
The invention is further described below with reference to the accompanying drawings and specific embodiments. It should be noted that, absent conflict, the embodiments and technical features described below can be combined to form new embodiments.
As shown in fig. 1, a face pose estimation method based on key point constraint of the present invention includes:
step 1, establishing a training set, wherein the training set comprises a plurality of face samples, and key points are marked on the face images in the training set;
step 2, establishing a multitask convolutional neural network model comprising an input layer, a first fully-connected layer, a second fully-connected layer, and an output layer; locating key points on all face samples in the training set with a preset key point neural network model to obtain 68 key points; computing a rotation matrix from the normalized 68-point two-dimensional coordinates and converting it into Euler angles; and computing the loss of these key-point-derived Euler angles against the labels together with the loss of the Euler angles regressed directly by the network, the overall pose estimation loss being the sum of the two losses.
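The rotation-matrix-to-Euler-angle conversion and the summed loss of step 2 can be sketched as follows. The ZYX rotation convention and the squared-error per-term loss are assumptions; the patent fixes neither, and which axis maps to pitch, yaw, or roll depends on the chosen convention.

```python
import numpy as np

def rotation_to_euler(R):
    """Decompose a rotation matrix R = Rz(a_z) @ Ry(a_y) @ Rx(a_x) into
    the three angles (a_x, a_y, a_z), in radians. The ZYX convention is
    an assumption; the patent does not fix one."""
    a_y = np.arcsin(-R[2, 0])
    a_x = np.arctan2(R[2, 1], R[2, 2])
    a_z = np.arctan2(R[1, 0], R[0, 0])
    return np.array([a_x, a_y, a_z])

def total_pose_loss(euler_from_kpts, euler_regressed, euler_label):
    """Overall loss = (key-point-derived angles vs label)
                    + (directly regressed angles vs label),
    the sum of two losses as step 2 describes; squared error is an
    assumed choice of per-term loss."""
    l_kpt = np.sum((euler_from_kpts - euler_label) ** 2)
    l_reg = np.sum((euler_regressed - euler_label) ** 2)
    return l_kpt + l_reg
```

Note that `rotation_to_euler` degenerates near a_y = ±90 degrees (gimbal lock), which is one reason pose networks often regress angles directly as well.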
step 3, normalizing the key points of the established training set to obtain the corresponding key point coordinates, inputting the key point coordinates into the established multitask convolutional neural network model, and having the model output face pose angle values corresponding to the three Euler angles: pitch, yaw, and roll.
The estimation of the face pose specifically comprises the following steps: normalize the key points to obtain the corresponding key point coordinates, and feed them into the preset face pose detection neural network model. The key point coordinates enter through the input layer, are processed sequentially by the first and second fully-connected layers, and the face pose angle value is finally output through the output layer. The face pose angle value comprises a horizontal rotation (yaw) value, a tilt (roll) value, and a pitch value: the yaw value measures how far the face is turned left or right, the roll value measures how far the face is tilted, and the pitch value measures how far the face is turned up or down.
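The normalization step above can be sketched as follows, assuming bounding-box normalization into [0, 1]; the patent says only that the key points are normalized, so the exact scheme (and the helper name `normalise_keypoints`) is an assumption.

```python
import numpy as np

def normalise_keypoints(kpts, box):
    """Map (68, 2) pixel key points into [0, 1] relative to the face
    bounding box (x0, y0, x1, y1), then flatten to the 1x136 row the
    input layer expects. The bounding-box scheme is an assumed choice."""
    x0, y0, x1, y1 = box
    out = (kpts - np.array([x0, y0])) / np.array([x1 - x0, y1 - y0])
    return out.reshape(1, -1)  # (1, 136) input row
```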
The network outputs the key points and the Euler angles of the face pose simultaneously; the dimension of the input layer is 1 x 136, the dimension of the first fully-connected layer is 136 x 68, the dimension of the second fully-connected layer is 68 x 3, and the dimension of the output layer is 1 x 3.
step 4, training the convolutional neural network designed in step 2 with the training samples obtained in step 1, finishing the training when the training error reaches the expected value, and obtaining the parameters of the convolutional neural network model.
step 5, testing the samples of step 1 with the model trained in step 4, and outputting the final face pose estimation result.
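The training of step 4, including the stopping rule of finishing when the error reaches an expected value, can be sketched as a plain gradient-descent loop over the two fully-connected layers. Linear activations, a mean-squared-error loss, and the learning-rate and initialization choices are all assumptions made for the sketch; the patent does not specify them.

```python
import numpy as np

def train_pose_net(X, Y, lr=0.02, target_err=1e-4, max_iter=2000):
    """Gradient descent on the 136 -> 68 -> 3 layers, stopping once the
    mean squared error falls below target_err ("error reaches the
    expected value"). X: (n, 136) normalised key points, Y: (n, 3)
    pose-angle labels. Linear layers and MSE are sketch assumptions."""
    rng = np.random.default_rng(0)
    W1 = rng.normal(scale=1.0 / np.sqrt(136), size=(136, 68))
    b1 = np.zeros(68)
    W2 = rng.normal(scale=1.0 / np.sqrt(68), size=(68, 3))
    b2 = np.zeros(3)
    n = len(X)
    err = float("inf")
    for _ in range(max_iter):
        h = X @ W1 + b1                      # first FC layer
        P = h @ W2 + b2                      # second FC layer -> angles
        err = float(np.mean((P - Y) ** 2))
        if err <= target_err:                # training error is acceptable
            break
        g = 2.0 * (P - Y) / (n * Y.shape[1])  # dMSE/dP
        gW2 = h.T @ g
        gb2 = g.sum(axis=0)
        gh = g @ W2.T
        gW1 = X.T @ gh
        gb1 = gh.sum(axis=0)
        W2 -= lr * gW2
        b2 -= lr * gb2
        W1 -= lr * gW1
        b1 -= lr * gb1
    return (W1, b1, W2, b2), err
```

On synthetic data drawn from a linear key-point-to-angle map, this loop fits the labels closely well within the iteration budget.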
The foregoing is merely a preferred embodiment of the invention and is not intended to limit it in any way. Those skilled in the art can readily practice the invention as shown in the drawings and described above, and can use the disclosed conception and specific embodiments as a basis for designing or modifying other structures serving the same purposes without departing from the scope defined by the appended claims. Likewise, any equivalent changes, modifications, or evolutions of the above embodiments that follow the essential technique of the invention remain within the protection scope of its technical solution.

Claims (5)

1. A face pose estimation method based on key point constraint is characterized by comprising the following steps:
step 1, acquiring an image to be detected; establishing a training set, wherein the training set comprises a plurality of face samples, and marking key points on the face images in the training set;
step 2, establishing a multitask convolutional neural network model, and locating key points on all face samples in the training set with a preset key point neural network model to obtain 68 key points;
step 3, normalizing the key points of the established training set to obtain the corresponding key point coordinates, inputting the key point coordinates into the established multitask convolutional neural network model, and having the model output a face pose angle value;
step 4, training the convolutional neural network designed in step 2 with the training samples obtained in step 1, finishing the training when the training error reaches the expected value, and obtaining the parameters of the convolutional neural network model;
step 5, testing the samples of step 1 with the model trained in step 4, and outputting the final face pose angle result.
2. The face pose estimation method based on key point constraint according to claim 1, characterized in that: the multitask convolutional neural network model established in step 2 comprises an input layer, a first fully-connected layer, a second fully-connected layer, and an output layer.
3. The face pose estimation method based on key point constraint according to claim 2, characterized in that: the dimension of the input layer is 1 x 136, the dimension of the first fully-connected layer is 136 x 68, the dimension of the second fully-connected layer is 68 x 3, and the dimension of the output layer is 1 x 3.
4. The face pose estimation method based on key point constraint according to claim 2, characterized in that: the detection of the face pose angle specifically comprises: normalizing the key points to obtain the corresponding key point coordinates, feeding the key point coordinates in through the input layer, processing them sequentially through the first and second fully-connected layers, and finally outputting a face pose angle value through the output layer.
5. The face pose estimation method based on key point constraint according to claim 1, characterized in that: the face pose angle value comprises a horizontal angle value, a tilt angle value, and a pitch angle value.
CN201910817335.5A 2019-08-30 2019-08-30 Face pose estimation method based on key point constraint Pending CN110705355A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910817335.5A CN110705355A (en) 2019-08-30 2019-08-30 Face pose estimation method based on key point constraint

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910817335.5A CN110705355A (en) 2019-08-30 2019-08-30 Face pose estimation method based on key point constraint

Publications (1)

Publication Number Publication Date
CN110705355A 2020-01-17

Family

ID=69194278

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910817335.5A Pending CN110705355A (en) 2019-08-30 2019-08-30 Face pose estimation method based on key point constraint

Country Status (1)

Country Link
CN (1) CN110705355A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112699784A (en) * 2020-12-29 2021-04-23 深圳市普渡科技有限公司 Face orientation estimation method and device, electronic equipment and storage medium
WO2022089360A1 (en) * 2020-10-28 2022-05-05 广州虎牙科技有限公司 Face detection neural network and training method, face detection method, and storage medium
CN118038560A (en) * 2024-04-12 2024-05-14 魔视智能科技(武汉)有限公司 Method and device for predicting face pose of driver

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107122705A (en) * 2017-03-17 2017-09-01 中国科学院自动化研究所 Face critical point detection method based on three-dimensional face model
CN109359526A (en) * 2018-09-11 2019-02-19 深圳大学 A kind of face pose estimation, device and equipment
CN109359537A (en) * 2018-09-14 2019-02-19 杭州宇泛智能科技有限公司 Human face posture angle detecting method neural network based and system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107122705A (en) * 2017-03-17 2017-09-01 中国科学院自动化研究所 Face critical point detection method based on three-dimensional face model
CN109359526A (en) * 2018-09-11 2019-02-19 深圳大学 A kind of face pose estimation, device and equipment
CN109359537A (en) * 2018-09-14 2019-02-19 杭州宇泛智能科技有限公司 Human face posture angle detecting method neural network based and system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022089360A1 (en) * 2020-10-28 2022-05-05 广州虎牙科技有限公司 Face detection neural network and training method, face detection method, and storage medium
CN112699784A (en) * 2020-12-29 2021-04-23 深圳市普渡科技有限公司 Face orientation estimation method and device, electronic equipment and storage medium
WO2022143264A1 (en) * 2020-12-29 2022-07-07 深圳市普渡科技有限公司 Face orientation estimation method and apparatus, electronic device, and storage medium
CN118038560A (en) * 2024-04-12 2024-05-14 魔视智能科技(武汉)有限公司 Method and device for predicting face pose of driver

Similar Documents

Publication Publication Date Title
CN108764048B (en) Face key point detection method and device
CN109325398B (en) Human face attribute analysis method based on transfer learning
CN106599830B (en) Face key point positioning method and device
CN110246181B (en) Anchor point-based attitude estimation model training method, attitude estimation method and system
CN111160269A (en) Face key point detection method and device
CN105740780B (en) Method and device for detecting living human face
WO2017049677A1 (en) Facial key point marking method
CN110705355A (en) Face pose estimation method based on key point constraint
CN106355147A (en) Acquiring method and detecting method of live face head pose detection regression apparatus
CN103324938A (en) Method for training attitude classifier and object classifier and method and device for detecting objects
CN106599810B (en) A kind of head pose estimation method encoded certainly based on stack
CN113361542B (en) Local feature extraction method based on deep learning
CN112613097A (en) BIM rapid modeling method based on computer vision
WO2023151237A1 (en) Face pose estimation method and apparatus, electronic device, and storage medium
WO2022142214A1 (en) Vehicle pose determination method and apparatus, vehicle control method and apparatus, vehicle, and storage medium
CN112818969A (en) Knowledge distillation-based face pose estimation method and system
CN111259735B (en) Single-person attitude estimation method based on multi-stage prediction feature enhanced convolutional neural network
CN110135277B (en) Human behavior recognition method based on convolutional neural network
CN113011401B (en) Face image posture estimation and correction method, system, medium and electronic equipment
CN111028319A (en) Three-dimensional non-photorealistic expression generation method based on facial motion unit
CN110827304A (en) Traditional Chinese medicine tongue image positioning method and system based on deep convolutional network and level set method
CN116249607A (en) Method and device for robotically gripping three-dimensional objects
CN111199207A (en) Two-dimensional multi-human body posture estimation method based on depth residual error neural network
Yin et al. Estimation of the fundamental matrix from uncalibrated stereo hand images for 3D hand gesture recognition
CN110288026A (en) A kind of image partition method and device practised based on metric relation graphics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 211000 floor 3, building 3, Qilin artificial intelligence Industrial Park, 266 Chuangyan Road, Nanjing, Jiangsu

Applicant after: Zhongke Nanjing artificial intelligence Innovation Research Institute

Address before: 211000 3rd floor, building 3, 266 Chuangyan Road, Jiangning District, Nanjing City, Jiangsu Province

Applicant before: NANJING ARTIFICIAL INTELLIGENCE CHIP INNOVATION INSTITUTE, INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES

RJ01 Rejection of invention patent application after publication

Application publication date: 20200117
