CN109902553B - Multi-angle face alignment method based on face pixel difference - Google Patents


Info

Publication number
CN109902553B
CN109902553B (application CN201910003381.1A)
Authority
CN
China
Prior art keywords
face
key points
points
regression
human face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910003381.1A
Other languages
Chinese (zh)
Other versions
CN109902553A (en
Inventor
宫恩来
杭丽君
何远彬
赵兴文
叶锋
丁明旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN201910003381.1A priority Critical patent/CN109902553B/en
Publication of CN109902553A publication Critical patent/CN109902553A/en
Application granted granted Critical
Publication of CN109902553B publication Critical patent/CN109902553B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a multi-angle face alignment method based on facial pixel differences. For faces inclined at different angles, initial key-point positions are predicted from 5 different starting angles, achieving an excellent fit for faces at various inclinations. Pixel differences at different facial positions can, to a certain extent, characterize different regions, with the eye region differing most markedly; a regression-shape selection rule based on maximizing facial pixel differences is therefore proposed, achieving a more accurate face alignment result.

Description

Multi-angle face alignment method based on face pixel difference
Technical Field
The invention relates to the field of face recognition, in particular to a multi-angle face alignment method based on facial pixel differences.
Background
The introduction of deep learning and the maturation of machine learning have driven great progress in computer-vision tasks, improving application technologies across many detection and localization fields. Face alignment is a task of significant research value in these areas, related both to detection and to regression-based localization. It matters greatly as an extension of the face detection task and as a foundation for face calibration and face recognition. Beyond face-recognition research, face alignment underpins many other fields. In expression recognition, for example, it makes it possible to explore the emotions conveyed by human expressions. Likewise, in the many applications with photo-beautification features, face smoothing and beautification, dynamic face-swap effects, and the like all rely on face alignment to obtain feature points or regions of interest on the face before performing their operations. This means a face alignment technique must both regress feature-point shapes accurately enough and run fast enough for the real-time scenarios of many applications.
Face alignment algorithms have many implementation schemes, typically combining a face-detection architecture with a face-alignment technique. Mainstream face-detection schemes are concentrated almost entirely in deep learning. One class is the classic two-stage networks RCNN, Fast R-CNN, and Faster R-CNN, which obtain regions of interest through candidate-box generation schemes such as SS (Selective Search) or an RPN, then feed them into a classification network for scoring. The other class is one-stage networks with higher real-time speed, such as SSD and the YOLO series, which omit candidate-box generation and complete classification and coordinate-box regression directly after extracting features from the whole image, giving the model a better speed-accuracy balance. Deep networks achieve extremely high accuracy in detection; in particular, one-stage networks preserve the coordinate-regression precision of the target box while meeting the speed requirements of real-time application scenarios, cementing deep learning's position in today's target-detection applications.
One line of face-alignment work aligns the face with a deep model, typically implementing face detection and face alignment jointly with various CNN architectures. Such schemes achieve competitive alignment accuracy, but deep models are burdened by huge parameter counts and heavy deep hierarchies; even after model compression and miniaturization, they greatly hinder later integration into hardware. By contrast, most machine-learning-based face-alignment schemes are shallow, easy-to-implement models; in this direction, classic techniques such as the LBF scheme and the SDM optimization strategy all adopt lightweight models far smaller than deep models while achieving feature-point regression accuracy comparable to deep-learning-based face alignment.
In practical applications the face is not always at a fixed angle: some faces tilt left, some tilt right, and few are perfectly centered, while the overall mean shape tends to stay within a few degrees of an upright face. Initializing faces tilted at multiple angles uniformly with the overall mean shape is therefore accurate only for near-upright faces and performs extremely poorly for faces with a pronounced tilt. This indiscriminate initialization makes later regression very difficult, so applications of the model require robustness to tilted and side faces.
Disclosure of Invention
Addressing the shortcomings of the prior art, the invention provides a multi-angle face alignment method based on facial pixel differences.
A face alignment method based on face pixel difference from multiple angles comprises the following steps:
step 1), generating a face-frame model: based on the SSD, a total of eight uniformly distributed feature layers are reselected to perform cascade regression prediction, and a plurality of prediction-frame scales which accord with the face proportion are selected to form a robust model MR-SSD; the selected 8 feature layers are respectively: conv3_3, conv4_3, conv5_3, fc7, conv8_2, conv9_2, conv10_2, conv11_2.
Step 2), initializing key points of the human face at multiple angles: selecting 5 key points of the human face, taking the mean value of the key points of the training set as the initialized coordinates of the 5 points, and rotating the key points +/-30 degrees and +/-60 degrees through affine transformation to form another 4 initialized angles;
step 3), random forest training: randomly selecting a plurality of pairs of pixel points in different radius ranges r of key points of the human face, solving the difference value of the pairs of pixel points, training the difference value as the input of a random forest, combining sparse 0, 1 binary features output by leaf nodes of the random forest to obtain a one-dimensional local binary feature vector, wherein a random forest consisting of M decision trees is trained around each key point;
step 4), global linear regression: obtaining local binary feature vectors of the key points through the step 3), performing global linear regression training on all the features, predicting the deviation of the key points by using a regressor obtained by training, and continuously correcting the coordinates of the predicted points;
step 5), regression-shape selection rule based on facial pixel-difference maximization: after regression prediction, N pixel points around the eye key points of each of the 5 differently initialized angles are selected to compute the mean and the mean square error; the initialization angle with the largest mean square error fits the face best, and the predicted points after regression at that angle are selected as the finally calibrated key points.
Preferably, in step 1), the face prediction frame ratios are respectively: 1:1,1:1.3,1:1.5.
Preferably, in step 2), the key points of the face are selected as the left eye pupil, right eye pupil, nose tip, left mouth corner and right mouth corner respectively; the perpendicular bisector of the line connecting the pupils of the upright face is taken as the 0-degree reference line, the five key points on the 0-degree reference line are taken as the standard shape, tilt to the right relative to the reference line is defined as positive and tilt to the left as negative, and five initialization schemes at angles of 0°, +/-30°, +/-60° are generated.
Preferably, in step 3), a random forest is trained around each key point, and each random forest is composed of M decision trees.
The invention provides a method for back-and-forth prediction of initial key point positions of 5 different angles aiming at faces inclined at different angles in face alignment, so that excellent fitting effect of the faces inclined at different angles is realized, pixel difference of different positions of the faces can be used as representation of different areas to a certain extent, eye difference is most obvious, regression shape selection rules based on maximization of facial pixel difference are provided, and more accurate face alignment effect is realized.
Drawings
FIG. 1 is an overall architecture of a face alignment method;
FIG. 2 shows the feature layers of the original SSD framework on the left and the reselected feature layers on the right;
FIG. 3 is an initialization diagram of 5 key points of the human face from different angles;
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and not restrictive.
as shown in fig. 1, a multi-angle human face alignment method based on facial pixel difference includes the following specific steps:
step 1: as shown in fig. 2, based on the SSD overall architecture, a con3_3 layer added to a lower layer and a conv5_3 layer of a fifth convolution series are selected, conv4_3, fc7, conv8_2, conv9_2, conv10_2 and conv11_2 uniform level features in an additional layer are fused, a fully connected layer at the end of the VGG architecture is cut out, and a final pooled layer is changed into a convolutional layer; and predicting each picture through an MR-SSD model to obtain a face frame.
Step 2: calculating the coordinate mean value of 5 key points by the face frame obtained in the step 1 and a training set to be used as 0-degree initial coordinates of the key points of the predicted face, and then obtaining initial coordinates at +/-30 degrees and +/-60 degrees through affine transformation, wherein the method for calculating the coordinates through affine transformation comprises the following steps:
Figure BDA0001934501010000051
0 DEG initialization coordinates are (X, Y), wherein
Figure BDA0001934501010000052
Figure BDA0001934501010000053
Respectively are the x-axis coordinate mean value of 5 key points in the training set, and y 1-y 5 are respectively the training setThe mean value of y-axis coordinates of 5 key points, theta is a rotation angle, theta is +/-30 degrees and +/-60 degrees, and the obtained U and V are coordinates of 4 rotated key points;
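The affine initialization of step 2 can be sketched as follows; rotating the mean shape about its centroid is an illustrative assumption (the exact rotation center is not spelled out here), and the 5-point mean shape values are made up.

```python
import math

# Sketch of step 2: rotate the 0-degree mean shape (5 key points) by
# +/-30 and +/-60 degrees to form the other four initializations.
def rotate_shape(xs, ys, theta_deg):
    """Return (U, V): key-point coordinates rotated by theta_deg degrees."""
    t = math.radians(theta_deg)
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)  # rotation center (assumed)
    U = [cx + (x - cx) * math.cos(t) - (y - cy) * math.sin(t)
         for x, y in zip(xs, ys)]
    V = [cy + (x - cx) * math.sin(t) + (y - cy) * math.cos(t)
         for x, y in zip(xs, ys)]
    return U, V

# Made-up 5-point mean shape: eyes, nose tip, mouth corners.
mean_x = [30.0, 70.0, 50.0, 35.0, 65.0]
mean_y = [40.0, 40.0, 60.0, 80.0, 80.0]
inits = {0: (mean_x, mean_y)}
for angle in (-60, -30, 30, 60):
    inits[angle] = rotate_shape(mean_x, mean_y, angle)  # 5 initializations total
```

A rigid rotation preserves inter-point distances, so each initialization is the same shape at a different inclination.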
step 201: the coordinates in step 2 are defined as follows: the midperpendicular of the connecting line of the pupils of the median face is taken as a 0-degree datum line, and five key points of the 0-degree datum line are taken as standard shapes. The inclination to the right with respect to the reference line is defined as a positive direction, and the inclination to the left is defined as a negative direction. As shown in fig. 3, five initialization schemes of minus 30 degrees (cross group), minus 60 degrees (pentagram group), plus 30 degrees (square group), plus 60 degrees (triangle group), plus one standard group face (circle) are generated. The five initialization schemes can perform fitting with high coverage on the faces with different inclination angles, and it can be seen that key points of each group of initialization schemes can cover different areas of the face;
step 202: selecting key points of the human face as a left eye pupil, a right eye pupil, a nose tip, a left mouth corner and a right mouth corner respectively;
and step 3: randomly selecting a plurality of pairs of pixel points within different radius ranges r of the initial shape key points, solving the difference value of the pairs of pixel points, taking the difference value as the input of a random forest for training, and combining sparse 0, 1 binary features output by leaf nodes of the random forest to obtain a one-dimensional local binary feature vector;
step 301: the random forest training proceeds in t stages in total; the radius range r shrinks stage by stage, so the key-point regression gradually approaches the true points;
step 302: feature mapping function of pixel difference obtained by random forest training
Figure BDA0001934501010000061
Further acquiring local binary features of the facial pixel difference;
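A toy illustration of the step-3 feature construction: sample pixel pairs within radius r of a key point, take their intensity differences, and encode them through trees as a sparse 0/1 vector. A real implementation trains M decision trees per key point; here each "tree" is a single threshold stump, and all names and thresholds are illustrative.

```python
import random

def pixel_pair_diffs(img, kx, ky, r, n_pairs, rng):
    """img: 2D list of grey values; sample n_pairs pixel-pair differences
    within radius r of key point (kx, ky), clamped to the image bounds."""
    h, w = len(img), len(img[0])
    diffs = []
    for _ in range(n_pairs):
        x1 = min(max(kx + rng.randint(-r, r), 0), w - 1)
        y1 = min(max(ky + rng.randint(-r, r), 0), h - 1)
        x2 = min(max(kx + rng.randint(-r, r), 0), w - 1)
        y2 = min(max(ky + rng.randint(-r, r), 0), h - 1)
        diffs.append(img[y1][x1] - img[y2][x2])
    return diffs

def local_binary_feature(diffs, thresholds):
    """One-hot leaf indicator per stump, concatenated into a sparse 0/1 vector
    (stand-in for the leaf outputs of M trained trees)."""
    vec = []
    for d, t in zip(diffs, thresholds):
        vec += [1, 0] if d <= t else [0, 1]  # left leaf vs right leaf
    return vec
```

Concatenating such leaf codes across all key points yields the one-dimensional local binary feature vector used in step 4.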
Step 4: obtain the local binary features of the key points from step 3, perform global linear-regression training on all features, predict the key-point deviations with the trained regressor, and continually correct the predicted coordinates; the deviation is expressed as:

\Delta S^t = W^t \Phi^t(I, S^{t-1})

where I is the input image matrix, S^{t-1} is the shape at stage t-1, Φ^t is the feature mapping function of this stage, and W^t is the linear regression matrix. The regression stage takes the generated LBF features as input to train the linear regression matrix W^t, yielding the trained model;
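Because the LBF vector is sparse and binary, the product W^t Φ^t reduces to summing the columns of W^t at the active indices; a minimal sketch with made-up shapes and values:

```python
# Sketch of the update Delta S^t = W^t * Phi^t(I, S^{t-1}) for a 0/1 LBF vector.
def apply_global_regression(W, lbf, shape):
    """W: one row per coordinate offset; lbf: 0/1 feature list;
    shape: flat coordinates [x1, y1, x2, y2, ...]. Returns corrected shape."""
    # Matrix-vector product collapses to a column sum over active bits.
    delta = [sum(row[j] for j, bit in enumerate(lbf) if bit) for row in W]
    return [s + d for s, d in zip(shape, delta)]
```

Iterating this update across stages, each with its own W^t and feature mapping, is what "continually correcting the predicted coordinates" amounts to.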
Step 5: after the 5 initialized groups of key points are regressed, they lie closer to the true key-point positions, and the most suitable group is selected as the final prediction using the regression-shape selection rule based on facial pixel-difference maximization: from each of the 5 differently initialized shapes, take the N pixel points in the eye regions as the key statistical regions and compute their mean μ and mean square error σ:

\mu = \frac{1}{N}\sum_{i=1}^{N} x_i

\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}

where x_i are all the pixel points within the designated range of the two eye regions of each initialization scheme and σ is the objective function, representing how far the region's pixels deviate from the mean. Unlike conventional approaches, the algorithm maximizes this objective, finding the region with the strongest pixel variation and hence the optimal regression shape closest to the true eye region. Since the mean square error of the pixels around the eyes is markedly larger than the pixel differences around the remaining groups of key points, the eye region can serve as a feature distinguishing different areas of the face. Training the five initialization schemes with different inclination angles yields five groups of regression predictions, each more accurate than before regression; because each scheme covers the face at a different angle, the scheme with the largest eye-region σ is selected as the final result.
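A small sketch of the step-5 selection rule under simplifying assumptions (square pixel windows around the two eye key points, a grey-level image as nested lists; all names are illustrative):

```python
def eye_region_sigma(img, eyes, radius):
    """Mean square error (std dev) of pixels in square windows around both eyes."""
    h, w = len(img), len(img[0])
    pix = []
    for ex, ey in eyes:
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                x, y = ex + dx, ey + dy
                if 0 <= x < w and 0 <= y < h:
                    pix.append(img[y][x])
    mu = sum(pix) / len(pix)
    return (sum((p - mu) ** 2 for p in pix) / len(pix)) ** 0.5

def select_shape(img, candidates, radius=2):
    """candidates: regressed shapes, each a list of 5 (x, y) key points with
    the eyes at indices 0 and 1. Pick the shape maximizing eye-region sigma."""
    return max(candidates, key=lambda s: eye_region_sigma(img, s[:2], radius))
```

The intuition: a candidate whose "eye" points actually land on the textured eye region sees high pixel variance there, while a misaligned candidate sampling smooth skin sees low variance.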

Claims (4)

1. A multi-angle human face alignment method based on facial pixel difference is characterized in that: the method comprises the following steps:
step 1), generating a model by a face frame: based on the SSD, a total of eight uniformly distributed feature layers are reselected to perform cascade regression prediction, and a plurality of prediction frame scales which accord with the face proportion are selected to form a robustness model MR-SSD;
step 2), initializing key points of the human face at multiple angles: selecting 5 key points of the human face, taking the mean value of the key points of the training set as the initialized coordinates of the 5 points, and rotating the key points +/-30 degrees and +/-60 degrees through affine transformation to form another 4 initialized angles;
step 3), random forest training: randomly selecting a plurality of pairs of pixel points in different radius ranges r of key points of the human face, solving the difference value of the pairs of pixel points, training the difference value as the input of a random forest, combining sparse 0, 1 binary features output by leaf nodes of the random forest to obtain a one-dimensional local binary feature vector, wherein a random forest consisting of M decision trees is trained around each key point;
step 4), global linear regression: obtaining local binary feature vectors of the key points through the step 3), performing global linear regression training on all the features, predicting the deviation of the key points by using a regressor obtained by training, and continuously correcting the coordinates of the predicted points;
step 5), regression-shape selection rule based on facial pixel-difference maximization: after regression prediction, N pixel points around the eye key points of each of the 5 differently initialized angles are selected to compute the mean and the mean square error; the initialization angle with the largest mean square error fits the face best, and the predicted points after regression at that angle are selected as the finally calibrated key points.
2. The method for aligning the human face from multiple angles based on the facial pixel difference as claimed in claim 1, wherein in the step 1), the selected 8 feature layers are respectively: conv3_3, conv4_3, conv5_3, fc7, conv8_2, conv9_2, conv10_2, conv11_2.
3. The method for aligning human faces from multiple angles based on facial pixel differences as claimed in claim 1, wherein in the step 1), the proportions of the human face prediction frames are respectively as follows: 1:1,1:1.3,1:1.5.
4. The method according to claim 1, wherein in step 2), the key points of the face are selected as the left eye pupil, right eye pupil, nose tip, left mouth corner and right mouth corner respectively; the perpendicular bisector of the line connecting the pupils of the upright face is used as the 0-degree reference line, the five key points on the 0-degree reference line are used as the standard shape, tilt to the right relative to the reference line is defined as positive and tilt to the left as negative, and five initialization schemes at angles of 0°, +/-30°, +/-60° are generated.
CN201910003381.1A 2019-01-03 2019-01-03 Multi-angle face alignment method based on face pixel difference Active CN109902553B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910003381.1A CN109902553B (en) 2019-01-03 2019-01-03 Multi-angle face alignment method based on face pixel difference

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910003381.1A CN109902553B (en) 2019-01-03 2019-01-03 Multi-angle face alignment method based on face pixel difference

Publications (2)

Publication Number Publication Date
CN109902553A CN109902553A (en) 2019-06-18
CN109902553B true CN109902553B (en) 2020-11-17

Family

ID=66943485

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910003381.1A Active CN109902553B (en) 2019-01-03 2019-01-03 Multi-angle face alignment method based on face pixel difference

Country Status (1)

Country Link
CN (1) CN109902553B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401257B (en) * 2020-03-17 2022-10-04 天津理工大学 Face recognition method based on cosine loss under non-constraint condition
CN113052064B (en) * 2021-03-23 2024-04-02 北京思图场景数据科技服务有限公司 Attention detection method based on face orientation, facial expression and pupil tracking

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105404861A (en) * 2015-11-13 2016-03-16 中国科学院重庆绿色智能技术研究院 Training and detecting methods and systems for key human facial feature point detection model
CN105760836A (en) * 2016-02-17 2016-07-13 厦门美图之家科技有限公司 Multi-angle face alignment method based on deep learning and system thereof and photographing terminal
CN108108677A (en) * 2017-12-12 2018-06-01 重庆邮电大学 One kind is based on improved CNN facial expression recognizing methods

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10796178B2 (en) * 2016-12-15 2020-10-06 Beijing Kuangshi Technology Co., Ltd. Method and device for face liveness detection
US10417483B2 (en) * 2017-01-25 2019-09-17 Imam Abdulrahman Bin Faisal University Facial expression recognition

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105404861A (en) * 2015-11-13 2016-03-16 中国科学院重庆绿色智能技术研究院 Training and detecting methods and systems for key human facial feature point detection model
CN105760836A (en) * 2016-02-17 2016-07-13 厦门美图之家科技有限公司 Multi-angle face alignment method based on deep learning and system thereof and photographing terminal
CN108108677A (en) * 2017-12-12 2018-06-01 重庆邮电大学 One kind is based on improved CNN facial expression recognizing methods

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
《Digitized Feedforward Compensation Method for High-Power-Density Three-Phase Vienna PFC Converter》;Lijun Hang et al.;《 IEEE Transactions on Industrial Electronics》;20130430;第60卷(第4期);第1512-1519页 *
《Face Alignment via Regressing Local Binary Features》;Shaoqing Ren et al.;《IEEE TRANSACTIONS ON IMAGE PROCESSING》;20160331;第25卷(第3期);第1233-1245页 *
《Facial Landmark Detection by Deep Multi-task Learning》;Zhanpeng Zhang et al.;《ECCV 2014: Computer Vision》;20141231;第94-108页 *
《多角度人脸检测与识别方法研究》;曾建凡;《电子设计工程》;20171103;第25卷(第11期);第41-44页 *

Also Published As

Publication number Publication date
CN109902553A (en) 2019-06-18

Similar Documents

Publication Publication Date Title
CN109816012B (en) Multi-scale target detection method fusing context information
Mishkin et al. Repeatability is not enough: Learning affine regions via discriminability
CN104392223B (en) Human posture recognition method in two-dimensional video image
CN103824050A (en) Cascade regression-based face key point positioning method
CN106980809B (en) Human face characteristic point detection method based on ASM
CN109902553B (en) Multi-angle face alignment method based on face pixel difference
US20200327726A1 (en) Method of Generating 3D Facial Model for an Avatar and Related Device
CN111998862B (en) BNN-based dense binocular SLAM method
Jouaber et al. Nnakf: A neural network adapted kalman filter for target tracking
Robles-Kelly et al. String edit distance, random walks and graph matching
CN112541468A (en) Target tracking method based on dual-template response fusion
CN104091148B (en) A kind of man face characteristic point positioning method and device
CN111820545A (en) Method for automatically generating sole glue spraying track by combining offline and online scanning
Sun et al. Multi-stage refinement feature matching using adaptive ORB features for robotic vision navigation
Xi et al. Learning temporal-correlated and channel-decorrelated Siamese networks for visual tracking
CN113393524A (en) Target pose estimation method combining deep learning and contour point cloud reconstruction
CN116823885A (en) End-to-end single target tracking method based on pyramid pooling attention mechanism
Fanani et al. Keypoint trajectory estimation using propagation based tracking
CN114067128A (en) SLAM loop detection method based on semantic features
CN111339342B (en) Three-dimensional model retrieval method based on angle ternary center loss
CN111144497B (en) Image significance prediction method under multitasking depth network based on aesthetic analysis
CN112614161A (en) Three-dimensional object tracking method based on edge confidence
CN109887012B (en) Point cloud registration method combined with self-adaptive search point set
CN113674332B (en) Point cloud registration method based on topological structure and multi-scale features
CN107194947B (en) Target tracking method with self-adaptive self-correction function

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190618

Assignee: RUIMO INTELLIGENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

Assignor: HANGZHOU DIANZI University

Contract record no.: X2023330000383

Denomination of invention: A Multi angle Face Alignment Method Based on Facial Pixel Difference

Granted publication date: 20201117

License type: Common License

Record date: 20230707