WO2017049677A1 - A method for labeling facial key points - Google Patents

A method for labeling facial key points

Info

Publication number
WO2017049677A1
Authority
WO
WIPO (PCT)
Prior art keywords
key points
rigid body
facial
coordinates
face
Prior art date
Application number
PCT/CN2015/091886
Other languages
English (en)
French (fr)
Inventor
李轩
周剑
徐一丹
龙学军
陆宏伟
晁志超
Original Assignee
成都通甲优博科技有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 成都通甲优博科技有限责任公司
Publication of WO2017049677A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis

Definitions

  • The invention relates to the field of computer vision, and in particular to a method for labeling facial key points.
  • Facial key point annotation is a technique that uses an algorithm to mark key positions, such as the eye corners, the nose tip and the mouth corners, on a face image. Facial key point annotation plays an important role in face recognition, emotion analysis, face tracking and other fields.
  • In view of this, the present invention provides a method for labeling facial key points that can mark the key points on a face image accurately, robustly and in real time.
  • A method for labeling facial key points comprises two processes, a training process and a use process, wherein both the training process and the use process include rigid body evolution and non-rigid body evolution, and the rigid body evolution and the non-rigid body evolution can be iterated multiple times to improve the prediction accuracy of the facial key points.
  • The invention divides the evolution of the coordinates into two steps: rigid body evolution and non-rigid body evolution.
  • The rigid body evolution adjusts the initial coordinates of all key points uniformly, using only affine transformations such as scaling, translation and rotation.
  • The non-rigid body evolution then adjusts the coordinates of each key point independently, so that the coordinates of each key point move further toward the real coordinates.
  • Combining the two is an important innovation: rigid body evolution treats all feature points as one template and, by scaling, translating and rotating that template, that is, by an affine transformation, moves it as a whole toward the true coordinates of the key points. This effectively avoids the interference caused by posture changes, illumination changes and partial occlusion, and keeps the evolution from falling into local extrema.
  • Moreover, the rigid body evolution can be realized by determining only six affine transformation parameters, which greatly reduces the size of the prediction model.
  • Non-rigid body evolution can overcome expression changes and the inherent differences between different faces, further improving the prediction accuracy of the key point coordinates.
  • The invention uses multiple iterations, each taking the previous prediction result as its initial value, which further improves the prediction accuracy.
  • The labeling method specifically includes:
  • Step S1: eliminating the affine transformation difference between the initial template and the real coordinates through the rigid body evolution of the training process;
  • Step S2: independently adjusting each facial key point in the initial template through the non-rigid body evolution of the training process, to improve the positioning accuracy of the facial key points;
  • Step S3: predicting the facial key points according to a linear regression model through the rigid body evolution of the use process, and estimating the new positions of the facial key points;
  • Step S4: in the non-rigid body evolution of the use process, adjusting the coordinates of the facial key points according to the new positions to obtain the prediction result.
  • The present invention uses a machine learning method to estimate the positions of the facial key points from features extracted from the face image.
  • The invention first estimates the mean coordinates of the key points from a large number of training samples, and these mean coordinates are taken as the initial coordinates.
  • The final key point coordinates are obtained by continuously evolving the initial coordinates, and the evolution model is solved by a linear regression algorithm.
  • Step S1 specifically includes:
  • Step S11: placing the initial template at the center of the training picture, and solving the optimal affine transformation parameters between the initial template and the real coordinates.
  • The optimal affine transformation parameters are solved by the formula:

$$\min_{\beta_1,\ldots,\beta_6}\ \sum_{n=1}^{K}\left\|\begin{pmatrix}x_n^{(2)}\\ y_n^{(2)}\end{pmatrix}-\begin{pmatrix}\beta_1&\beta_2\\ \beta_4&\beta_5\end{pmatrix}\begin{pmatrix}x_n^{(1)}\\ y_n^{(1)}\end{pmatrix}-\begin{pmatrix}\beta_3\\ \beta_6\end{pmatrix}\right\|^2\tag{2}$$

  • where K is the number of facial key points, the sets {(x_n^(1), y_n^(1))} and {(x_n^(2), y_n^(2))} are respectively the initial coordinates and the real coordinates of the n-th facial key point, n is a positive integer, and the difference between the initial coordinates and the real coordinates is eliminated by the affine transformation determined by the parameters β1 to β6, which are obtained after solving;
  • Step S1 further includes:
  • Step S12: taking the initial position of each facial key point as the center, intercepting a texture region;
  • Step S13: feature-encoding the texture region with a histogram of oriented gradients operator to obtain a floating point code of length Z;
  • Step S14: arranging the codes of the K facial key points in a predefined order to obtain a feature code of length Z×K;
  • Step S15: performing a normalization operation on the Z×K feature code to obtain normalization parameters, where
  • the mean of the distribution is 0 and the variance is 1.
  • Step S1 further includes:
  • Step S16: after step S15, training the linear regression model according to the formula:

$$\min_{r_m,\,b_m}\ \sum_{i}\left(\beta_m^{i}-r_m^{\mathrm{T}}\phi^{i}-b_m\right)^2\tag{3}$$

  • where β_m^i is the m-th optimal parameter of the i-th sample, φ^i is the normalized histogram of oriented gradients feature code of the i-th sample, and r_m and b_m are respectively the linear projection vector and the offset value of the m-th parameter to be solved; the calculation yields the linear regression model represented by r_m and b_m, and m and i are positive integers.
  • Step S2 specifically includes:
  • Step S21: extracting histogram of oriented gradients features from the initial coordinates, forming a feature vector of length Z×K, and performing the normalization operation;
  • Step S22: performing the non-rigid body evolution training of the linear regression model according to the formula.
  • The non-rigid body evolution training is implemented by the formula:

$$\min_{r_n,\,b_n}\ \sum_{i}\left(\Delta_n^{i}-r_n^{\mathrm{T}}\phi^{i}-b_n\right)^2\tag{4}$$

  • where Δ_n^i is the offset on the X axis or the Y axis between the initial coordinates and the true coordinates of the int(n/2)-th key point of the i-th sample, φ^i is the normalized histogram of oriented gradients feature code of the i-th sample, and r_n and b_n are respectively the linear projection vector and the offset value of the n-th offset to be solved; the calculation yields the linear regression model represented by r_n and b_n.
  • The present invention uses a linear regression algorithm to learn the parameters in the rigid body evolution and the non-rigid body evolution. Compared with other machine learning methods, the linear regression algorithm requires little computation and offers good real-time performance.
  • Step S3 specifically includes: Step S31: adjusting the facial image to a designated pixel size, and placing the initial template in the middle of the face in the facial image;
  • Step S32: extracting the histogram of oriented gradients of the facial image, performing matrix multiplication and vector addition operations, and obtaining the affine transformation parameters;
  • Step S33: estimating the new positions of the facial key points according to the affine transformation parameters.
  • Step S4 specifically includes:
  • Step S41: extracting the corresponding histogram of oriented gradients features according to the new positions estimated in step S33, performing matrix multiplication and vector addition operations, and obtaining 2K offsets;
  • Step S42: adjusting the coordinates of the facial key points according to the offsets.
  • The labeling method further comprises:
  • Step S01: before step S1, normalizing the facial image to a designated pixel size, and marking the real coordinates of the facial key points;
  • Step S02: solving the initial template according to the formula.
  • The initial template S_μ is solved by:

$$S_\mu=\frac{1}{N}\sum_{i=1}^{N}S^{i}\tag{1}$$

  • where S^i denotes the marked real key point coordinates of the i-th training sample and N is the number of training samples.
  • The invention estimates the key point positions progressively, from coarse to fine, avoiding the interference of local extrema with the accuracy, so that the final prediction accuracy and robustness are greatly improved. The computational efficiency of the linear regression algorithm and the mere six parameters of the affine transformation give the present invention advantages in real-time performance and model size.
  • The positioning speed of the present invention can exceed 60 frames/second, and the model size can be controlled within 6 Mb.
  • FIGS. 1a-1b are flowcharts of a method for positioning facial key points according to the present invention.
  • FIG. 3 is a schematic diagram of the rigid body evolution process according to the present invention.
  • FIG. 4 is a schematic diagram of a non-rigid body evolution process of the present invention.
  • Figure 6 is a schematic diagram of the prediction effect of the present invention.
  • The embodiment relates to a method for labeling facial key points using feature extraction plus machine learning, which can significantly improve the positioning accuracy of facial key points.
  • The method mainly comprises the following steps: a: establishing an initial coordinate model of the key points; b: solving the optimal two-dimensional affine transformation parameters from the initial coordinates to the real coordinates according to the real coordinates of the key points; c: extracting features from the image according to the initial coordinates, and establishing through training a linear mapping model between the features and the optimal two-dimensional affine transformation parameters; d: using the trained linear mapping model, solving the affine parameters corresponding to the training samples, and calculating the new coordinates of the initial coordinates after the affine transformation according to the obtained parameters; e: extracting features from the image according to the new coordinates, and establishing through training a linear mapping model between the features and the real coordinates; f: using the trained model, predicting the new positions of the key points.
  • Steps b to f constitute one iteration. To improve accuracy, the embodiment includes multiple iterations, each taking the result of the previous iteration as the new initial value.
  • FIGS. 1a-1b are flowcharts of the facial key point positioning method according to the present invention. The embodiment includes two processes, training and use.
  • In the training process, the initial coordinate values of the key points need to be defined first; these initial values are obtained by averaging the true coordinates of the training samples.
  • FIG. 2 shows the process of solving the mean key point coordinates of the present invention, i.e., the generation of the mean template.
  • For all training samples, the face region is first normalized to a size of 128×128 pixels and the true coordinates of the key points are marked, and then the mean template S_μ (the initial template) is solved according to formula (1).
  • After the mean template is obtained, the rigid body evolution is trained first: the mean template S_μ is placed at the center of the training picture, and Procrustes analysis is used to solve the optimal affine transformation parameters between the initial template and the real positions; the calculation is shown in formula (2).
  • Equation (2) is a typical least squares problem; after solving, the estimated optimal parameters β1 to β6 are obtained.
  • The present invention then uses a linear regression algorithm to learn the mapping from image features to the parameters β1 to β6.
  • A texture region of 19×19 pixels is intercepted with the initial position of each key point as the center, and the HOG operator is used to feature-encode the region, yielding a floating point code of length 144; the codes of the k facial key points are then arranged in a predefined order, resulting in a feature code of length 144×k.
  • To improve stability, after the feature codes of all training samples have been obtained, the feature code set is normalized so that its distribution has a mean of 0 and a variance of 1, and the corresponding normalization parameters are recorded for use in the prediction process.
  • the training of the linear regression model is then performed according to formula (3).
  • where β_m^i is the m-th optimal parameter of the i-th sample, φ^i is the normalized HOG feature code of the i-th sample, and r_m and b_m are respectively the linear projection vector and the offset value of the m-th parameter to be solved.
  • Through the calculation, the linear regression model represented by r_m and b_m can be obtained.
  • Rigid body evolution can eliminate the affine transformation difference between the initial template and the real coordinates, but the inherent differences between different expressions, poses and faces still exist, and these differences need to be eliminated by non-rigid evolution.
  • The training of the non-rigid body evolution is based on the prediction results of the rigid body evolution. Similar to the training of the rigid body evolution, it also extracts Histogram of Oriented Gradients (HOG) features from the initial points, forms a feature vector of length 144×k, and performs the normalization operation. The main difference lies in the training of the linear regression model.
  • the training of non-rigid body evolution is carried out according to formula (4).
  • where Δ_n^i is the offset on the X axis or the Y axis between the initial coordinates and the true coordinates of the int(n/2)-th key point of the i-th sample, φ^i is the normalized HOG feature code of the i-th sample, and r_n and b_n are respectively the linear projection vector and the offset value of the n-th offset to be solved.
  • Through the calculation, the linear regression model represented by r_n and b_n can be obtained.
  • the non-rigid body evolution process can independently adjust the coordinates of each key point to further improve the positioning accuracy.
  • To achieve the best prediction, the present invention includes multiple iterations; each iteration includes a rigid body evolution process and a non-rigid body evolution process, trained as described above, the only difference being that the initial template of each iteration comes from the prediction result of the previous iteration.
  • FIG. 3 is a schematic diagram of the rigid body evolution process of the present invention.
  • In the prediction process, the present invention uses the linear regression models obtained in training to predict the key points. First, the test face is scaled to 128×128 pixels and the mean template is placed in the middle of the face; then the corresponding HOG features are extracted and combined, through matrix multiplication and vector addition, with the vectors r_m and offset values b_m obtained in the rigid body evolution training, giving the six affine transformation parameters of the sample; the new positions of the key points are then calculated from these parameters, as shown in FIG. 3.
  • FIG. 4 is a schematic diagram of a non-rigid body evolution process of the present invention.
  • After the prediction result of the rigid body evolution is obtained, the corresponding HOG features are extracted at the predicted positions and combined, through matrix multiplication and vector addition, with the vectors r_n and offsets b_n obtained in the non-rigid body evolution training, giving the 2k offsets of the sample; the coordinates of the key points are then adjusted according to these offsets, as shown in FIG. 4.
  • FIG. 5 is a schematic diagram of an iterative process of the present invention. Finally, multiple iterations are sequentially performed according to the above process to obtain a final prediction result, which is shown in FIG. 5.
  • Fig. 6 is a schematic view showing the prediction effect of the present invention, and the final effect of the present invention can be seen from Fig. 6.
  • The present invention proposes a real-time, robust, high-precision facial key point localization method, which aims to locate the key points of face images captured under mobile platform conditions by combining feature extraction and machine learning.
  • The invention places low demands on computing performance and on the imaging environment of the picture, and has an advantage in model size, so it can be widely applied on mobile platforms, laying a solid foundation for human-computer interaction, expression analysis, gaze control, fatigue monitoring and other machine vision applications.

Abstract

The present invention relates to the field of computer vision, and in particular to a method for labeling facial key points. A method for labeling facial key points includes two processes, a training process and a use process, wherein both the training process and the use process include rigid body evolution and non-rigid body evolution, and the rigid body evolution and the non-rigid body evolution can be iterated multiple times. Step S1: eliminating the affine transformation difference between the initial template and the real coordinates through the rigid body evolution of the training process. Step S2: independently adjusting each facial key point in the initial template through the non-rigid body evolution of the training process, to improve the positioning accuracy of the facial key points. Step S3: predicting the facial key points according to a linear regression model through the rigid body evolution of the use process, and estimating the new positions of the facial key points. Step S4: in the non-rigid body evolution of the use process, adjusting the coordinates of the facial key points according to the new positions to obtain the prediction result.

Description

A method for labeling facial key points
Technical Field
The present invention relates to the field of computer vision, and in particular to a method for labeling facial key points.
Background Art
Facial key point annotation is a technique that uses an algorithm to mark key positions, such as the eye corners, the nose tip and the mouth corners, on a face image. Facial key point annotation plays an important role in face recognition, emotion analysis, face tracking and other fields.
There are four main metrics for judging a facial key point annotation technique: accuracy, robustness, real-time performance and model size. Known facial key point annotation techniques leave room for improvement on all of these metrics. With the wide use of mobile platforms such as iOS and Android, machine vision applications implemented on mobile platforms, such as face recognition and fatigue monitoring, have broad application prospects. Compared with the traditional "PC host + fixed camera" framework, mobile platforms have weaker computing performance, and the pictures they capture exhibit more complex variations in illumination, posture and occlusion. These conditions place more severe demands on the accuracy, robustness, real-time performance and model size of facial key point annotation techniques; in the prior art, robustness and real-time performance are both low, and facial key point localization algorithms are relatively complex.
Summary of the Invention
In view of the problems of facial key point annotation techniques in the prior art, the present invention provides a method for labeling facial key points that can label the key points on a face image accurately, robustly and in real time.
The present invention adopts the following technical solution:
A method for labeling facial key points, the labeling method including two processes, a training process and a use process, wherein both the training process and the use process include rigid body evolution and non-rigid body evolution, and the rigid body evolution and the non-rigid body evolution can be iterated multiple times to improve the prediction accuracy of the facial key points.
The present invention divides the evolution of the coordinates into two steps, rigid body evolution and non-rigid body evolution. The rigid body evolution adjusts the initial coordinates of all key points uniformly, using only affine transformations such as scaling, translation and rotation. The non-rigid body evolution then adjusts the coordinates of each key point independently, so that the coordinates of each key point move further toward the real coordinates.
Combining rigid body evolution and non-rigid body evolution is an important innovation of the present invention. Rigid body evolution treats all feature points as one template; by scaling, translating and rotating the template, that is, by an affine transformation, the template as a whole approaches the real coordinates of the key points. This effectively avoids the interference caused by posture changes, illumination changes and partial occlusion, and keeps the evolution from falling into local extrema. Moreover, the rigid body evolution can be realized by determining only six affine transformation parameters, which greatly reduces the size of the prediction model. Non-rigid body evolution can overcome expression changes and the inherent differences between different faces, further improving the prediction accuracy of the key point coordinates.
The present invention uses multiple iterations, each taking the previous prediction result as its initial value, which further improves the prediction accuracy.
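For illustration only, the Python sketch below shows one way the iterated rigid/non-rigid evolution described above could be organized. The names are assumptions rather than the patent's own: cascade is a list of trained (R, b) model pairs per iteration (as solved by formulas (3) and (4) below), and feature_fn is a caller-supplied function returning the normalized feature code at the current coordinates.

```python
import numpy as np

def evolve_keypoints(image, mean_template, cascade, feature_fn):
    """Iterated rigid + non-rigid evolution (illustrative sketch).

    mean_template: (K, 2) initial key point coordinates.
    cascade: list of ((Rr, br), (Rn, bn)) pairs, one per iteration;
             Rr @ phi + br yields the six affine parameters beta1..beta6,
             Rn @ phi + bn yields 2K per-point offsets.
    """
    coords = mean_template.copy()
    for (Rr, br), (Rn, bn) in cascade:
        # Rigid body evolution: a single affine transform moves all points.
        beta = Rr @ feature_fn(image, coords) + br
        A = np.array([[beta[0], beta[1]],
                      [beta[3], beta[4]]])
        coords = coords @ A.T + beta[[2, 5]]
        # Non-rigid body evolution: each point is adjusted independently.
        coords = coords + (Rn @ feature_fn(image, coords) + bn).reshape(-1, 2)
    return coords
```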
Preferably, the labeling method specifically includes:
Step S1: eliminating the affine transformation difference between the initial template and the real coordinates through the rigid body evolution of the training process;
Step S2: independently adjusting each facial key point in the initial template through the non-rigid body evolution of the training process, to improve the positioning accuracy of the facial key points;
Step S3: predicting the facial key points according to a linear regression model through the rigid body evolution of the use process, and estimating the new positions of the facial key points;
Step S4: in the non-rigid body evolution of the use process, adjusting the coordinates of the facial key points according to the new positions to obtain the prediction result.
The present invention uses a machine learning method to estimate the positions of the facial key points from features extracted from the face image. The invention first estimates the mean coordinates of the key points from a large number of training samples; the mean coordinates are taken as the initial coordinates, and the final key point coordinates are all obtained by continuously evolving the initial coordinates. The evolution model is solved with a linear regression algorithm.
Preferably, step S1 specifically includes:
Step S11: placing the initial template at the center of the training picture, and solving the optimal affine transformation parameters between the initial template and the real coordinates.
Preferably, in step S11, the optimal affine transformation parameters are solved by the formula:

$$\min_{\beta_1,\ldots,\beta_6}\ \sum_{n=1}^{K}\left\|\begin{pmatrix}x_n^{(2)}\\ y_n^{(2)}\end{pmatrix}-\begin{pmatrix}\beta_1&\beta_2\\ \beta_4&\beta_5\end{pmatrix}\begin{pmatrix}x_n^{(1)}\\ y_n^{(1)}\end{pmatrix}-\begin{pmatrix}\beta_3\\ \beta_6\end{pmatrix}\right\|^2\tag{2}$$

where K is the number of facial key points, the sets {(x_n^(1), y_n^(1))} and {(x_n^(2), y_n^(2))} are respectively the initial coordinates and the real coordinates of the n-th facial key point, and n is a positive integer;
and the difference between the initial coordinates and the real coordinates is eliminated by the affine transformation determined by the parameters β1 to β6, which are obtained after solving.
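Formula (2) is a linear least-squares problem in the six parameters. A minimal NumPy sketch, assuming the six-parameter grouping written above, is given below; it is an illustration, not the patent's own solver.

```python
import numpy as np

def fit_affine(src, dst):
    """Solve min over beta of sum_n ||dst_n - A @ src_n - t||^2, where
    A = [[b1, b2], [b4, b5]] and t = [b3, b6].

    src, dst: (K, 2) arrays of initial and real key point coordinates.
    Returns the six parameters beta1..beta6 as a length-6 vector.
    """
    K = src.shape[0]
    M = np.zeros((2 * K, 6))
    M[0::2, 0:2] = src        # x rows depend on (b1, b2, b3)
    M[0::2, 2] = 1.0
    M[1::2, 3:5] = src        # y rows depend on (b4, b5, b6)
    M[1::2, 5] = 1.0
    beta, *_ = np.linalg.lstsq(M, dst.reshape(-1), rcond=None)
    return beta
```

With the solved β1 to β6, the template can then be mapped by the corresponding affine transform, as in the evolution sketch above.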
Preferably, step S1 further includes:
Step S12: taking the initial position of each facial key point as the center, intercepting a texture region;
Step S13: feature-encoding the texture region with a histogram of oriented gradients operator to obtain a floating point code of length Z;
Step S14: arranging the codes of the K facial key points in a predefined order to obtain a feature code of length Z×K;
Step S15: performing a normalization operation on the Z×K feature code to obtain normalization parameters, wherein,
in the normalization operation, the mean of the distribution is 0 and the variance is 1.
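A minimal sketch of steps S12 to S15 using scikit-image's HOG implementation follows. The patch size and HOG parameters are assumptions chosen only for illustration (with a 19×19 patch, 6×6-pixel cells and 2×2-cell blocks the code length Z happens to come out as 144, matching Embodiment 2 below), and each key point is assumed to lie far enough from the border of the grayscale image for its patch to be cut.

```python
import numpy as np
from skimage.feature import hog

def encode_shape(image, coords, half=9):
    """Steps S12-S14 (sketch): cut a texture patch around each key point,
    HOG-encode it, and concatenate the K codes in a fixed order.
    image: 2-D grayscale array; coords: (K, 2) key point coordinates."""
    codes = []
    for x, y in coords.astype(int):
        patch = image[y - half:y + half + 1, x - half:x + half + 1]
        codes.append(hog(patch, orientations=9,
                         pixels_per_cell=(6, 6),
                         cells_per_block=(2, 2)))  # length Z = 144 here
    return np.concatenate(codes)                   # length Z * K

def fit_normalizer(codes):
    """Step S15 (sketch): per-dimension mean/std over the (N, Z*K) training
    codes, recorded so the same normalization is reused in the use process."""
    mu, sigma = codes.mean(axis=0), codes.std(axis=0) + 1e-8
    return mu, sigma                               # apply as (c - mu) / sigma
```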
Preferably, step S1 further includes:
Step S16: after step S15, training the linear regression model according to the formula:

$$\min_{r_m,\,b_m}\ \sum_{i}\left(\beta_m^{i}-r_m^{\mathrm{T}}\phi^{i}-b_m\right)^2\tag{3}$$

where β_m^i is the m-th optimal parameter of the i-th sample, φ^i is the normalized histogram of oriented gradients feature code of the i-th sample, and r_m and b_m are respectively the linear projection vector and the offset value of the m-th parameter to be solved; the calculation yields the linear regression model represented by r_m and b_m, and m and i are both positive integers.
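Formula (3), like formula (4) below, is an ordinary least-squares fit per output dimension. The sketch below solves all M outputs at once in closed form; the small ridge term is an added numerical stabilizer and an assumption, not part of the patent.

```python
import numpy as np

def train_linear_regression(Phi, Y, ridge=1e-6):
    """Fit Y ~ Phi @ R.T + b by least squares (illustrative sketch).

    Phi: (N, D) normalized feature codes, one row per training sample.
    Y:   (N, M) targets: the six optimal betas per sample for the rigid
         stage (M = 6), or the 2K coordinate offsets for the non-rigid
         stage (M = 2K).
    Returns R of shape (M, D) and b of shape (M,), so y ~ R @ phi + b.
    """
    N, D = Phi.shape
    X = np.hstack([Phi, np.ones((N, 1))])    # absorb the bias term
    A = X.T @ X + ridge * np.eye(D + 1)
    W = np.linalg.solve(A, X.T @ Y)          # (D + 1, M)
    return W[:-1].T, W[-1]
```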
Preferably, step S2 specifically includes:
Step S21: extracting histogram of oriented gradients features from the initial coordinates, forming a feature vector of length Z×K, and performing the normalization operation;
Step S22: performing the non-rigid body evolution training of the linear regression model according to the formula.
Preferably, the non-rigid body evolution training is implemented by the formula:

$$\min_{r_n,\,b_n}\ \sum_{i}\left(\Delta_n^{i}-r_n^{\mathrm{T}}\phi^{i}-b_n\right)^2\tag{4}$$

where Δ_n^i is the offset on the X axis or the Y axis between the initial coordinates and the real coordinates of the int(n/2)-th key point of the i-th sample, φ^i is the normalized histogram of oriented gradients feature code of the i-th sample, and r_n and b_n are respectively the linear projection vector and the offset value of the n-th offset to be solved; the calculation yields the linear regression model represented by r_n and b_n.
The present invention uses a linear regression algorithm to learn the parameters of the rigid body evolution and the non-rigid body evolution. Compared with other machine learning methods, the linear regression algorithm requires little computation and offers good real-time performance.
Preferably, step S3 specifically includes: Step S31: adjusting the facial image to a designated pixel size, and placing the initial template in the middle of the face in the facial image;
Step S32: extracting the histogram of oriented gradients of the facial image, performing matrix multiplication and vector addition operations, and obtaining the affine transformation parameters;
Step S33: estimating the new positions of the facial key points according to the affine transformation parameters.
Preferably, step S4 specifically includes:
Step S41: extracting the corresponding histogram of oriented gradients features according to the new positions estimated in step S33, performing matrix multiplication and vector addition operations, and obtaining 2K offsets;
Step S42: adjusting the coordinates of the facial key points according to the offsets.
Preferably, the labeling method further includes:
Step S01: before step S1, normalizing the facial image to a designated pixel size, and marking the real coordinates of the facial key points;
Step S02: solving the initial template according to the formula.
Preferably, the formula for solving the initial template S_μ is:

$$S_\mu=\frac{1}{N}\sum_{i=1}^{N}S^{i}\tag{1}$$

where S^i denotes the marked real key point coordinates of the i-th training sample and N is the number of training samples.
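Formula (1) is the per-point mean of the marked real coordinates over all normalized training samples. A minimal sketch, assuming the shapes are stacked as an (N, K, 2) array:

```python
import numpy as np

def mean_template(true_shapes):
    """Formula (1) (sketch): S_mu as the mean over N training samples of
    their (K, 2) real key point coordinates, stacked as an (N, K, 2) array."""
    return np.asarray(true_shapes, dtype=float).mean(axis=0)
```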
The beneficial effects of the present invention are as follows:
The present invention estimates the key point positions progressively, from coarse to fine, which avoids the interference of local extrema with the accuracy, so that the final prediction accuracy and robustness are both greatly improved; the computational efficiency of the linear regression algorithm and the mere six parameters of the affine transformation give the present invention advantages in real-time performance and model size. The positioning speed of the present invention can exceed 60 frames/second, and the model size can be kept within 6 Mb.
Brief Description of the Drawings
FIGS. 1a-1b are flowcharts of the facial key point positioning method of the present invention;
FIG. 2 shows the process of solving the mean facial key point coordinates of the present invention;
FIG. 3 is a schematic diagram of the rigid body evolution process of the present invention;
FIG. 4 is a schematic diagram of the non-rigid body evolution process of the present invention;
FIG. 5 is a schematic diagram of the iterative process of the present invention;
FIG. 6 is a schematic diagram of the prediction effect of the present invention.
Detailed Description of the Embodiments
It should be noted that, where no conflict arises, the technical solutions and technical features described below can be combined with one another.
The specific embodiments of the present invention are further described below with reference to the drawings:
Embodiment 1
This embodiment discloses a method for labeling facial key points using feature extraction plus machine learning, which can significantly improve the positioning accuracy of facial key points. The method mainly includes the following steps: a: establishing an initial coordinate model of the key points; b: solving the optimal two-dimensional affine transformation parameters from the initial coordinates to the real coordinates according to the real coordinates of the key points; c: extracting features from the image according to the initial coordinates, and establishing through training a linear mapping model between the features and the optimal two-dimensional affine transformation parameters; d: using the trained linear mapping model, solving the affine parameters corresponding to the training samples, and calculating the new coordinates of the initial coordinates after the affine transformation according to the obtained parameters; e: extracting features from the image according to the new coordinates, and establishing through training a linear mapping model between the features and the real coordinates; f: using the trained model, predicting the new positions of the key points; g: steps b to f constitute one iteration; to improve accuracy, this embodiment includes multiple iterations, each taking the result of the previous iteration as the new initial value.
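As an illustration of how steps a to g could be composed, a hypothetical Python training skeleton follows. It reuses the illustrative helpers sketched earlier (mean_template, fit_affine, train_linear_regression) together with a caller-supplied feature_fn, and the (N, K, 2) data layout is an assumption; it is an outline of the structure, not the embodiment itself.

```python
import numpy as np

def apply_affine(shape, beta):
    """Map a (K, 2) shape by the six-parameter affine transform beta1..beta6."""
    A = np.array([[beta[0], beta[1]], [beta[3], beta[4]]])
    return shape @ A.T + beta[[2, 5]]

def train_cascade(images, true_shapes, n_iters, feature_fn):
    true_shapes = np.asarray(true_shapes, dtype=float)        # (N, K, 2)
    template = mean_template(true_shapes)                     # step a
    shapes = np.repeat(template[None], len(images), axis=0)
    cascade = []
    for _ in range(n_iters):                                  # step g: repeat b-f
        Phi = np.stack([feature_fn(im, s) for im, s in zip(images, shapes)])
        betas = np.stack([fit_affine(s, t)                    # step b
                          for s, t in zip(shapes, true_shapes)])
        Rr, br = train_linear_regression(Phi, betas)          # step c
        shapes = np.stack([apply_affine(s, Rr @ f + br)       # step d
                           for s, f in zip(shapes, Phi)])
        Phi = np.stack([feature_fn(im, s) for im, s in zip(images, shapes)])
        deltas = (true_shapes - shapes).reshape(len(images), -1)
        Rn, bn = train_linear_regression(Phi, deltas)         # step e
        shapes = shapes + np.stack([(Rn @ f + bn).reshape(-1, 2)
                                    for f in Phi])            # step f
        cascade.append(((Rr, br), (Rn, bn)))
    return template, cascade
```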
Embodiment 2
FIGS. 1a-1b are flowcharts of the facial key point positioning method of the present invention. As shown in FIG. 1, this embodiment includes two processes, training and use. In the training process, the initial coordinate values of the key points must first be defined; these initial values are obtained by averaging the real coordinates of the training samples. FIG. 2 shows the process of solving the mean key point coordinates, i.e., the generation of the mean template: for all training samples, the face region is first normalized to a size of 128×128 pixels and the real coordinates of the key points are marked, and then the mean template S_μ (the initial template) is solved according to formula (1).
$$S_\mu=\frac{1}{N}\sum_{i=1}^{N}S^{i}\tag{1}$$
After the mean template is obtained, the rigid body evolution is trained first.
In this embodiment, the mean template S_μ is placed at the center of the training picture, and Procrustes analysis is used to solve the optimal affine transformation parameters between the initial template and the real positions; the calculation is shown in formula (2).
$$\min_{\beta_1,\ldots,\beta_6}\ \sum_{n=1}^{k}\left\|\begin{pmatrix}x_n^{(2)}\\ y_n^{(2)}\end{pmatrix}-\begin{pmatrix}\beta_1&\beta_2\\ \beta_4&\beta_5\end{pmatrix}\begin{pmatrix}x_n^{(1)}\\ y_n^{(1)}\end{pmatrix}-\begin{pmatrix}\beta_3\\ \beta_6\end{pmatrix}\right\|^2\tag{2}$$
where k is the number of facial key points, the sets {(x_n^(1), y_n^(1))} and {(x_n^(2), y_n^(2))} are respectively the initial coordinates and the real coordinates of the n-th facial key point, and k and n are positive integers. The difference between the initial template and the real positions is eliminated by the affine transformation determined by the parameters β1 to β6. Formula (2) is a typical least squares problem; after solving, the estimated optimal parameters β1 to β6 are obtained.
After the parameters are obtained, the present invention uses a linear regression algorithm to learn the mapping from image features to the parameters β1 to β6.
First, a texture region of 19×19 pixels is intercepted with the initial position of each key point as the center, and the HOG operator is used to feature-encode the region, yielding a floating point code of length 144; the codes of the k facial key points are then arranged in a predefined order, finally giving a feature code of length 144×k. To improve stability, after the feature codes of all training samples have been obtained, the set of feature codes is normalized so that its distribution has a mean of 0 and a variance of 1, and the corresponding normalization parameters are recorded for use in the prediction process. The linear regression model is then trained according to formula (3).
$$\min_{r_m,\,b_m}\ \sum_{i}\left(\beta_m^{i}-r_m^{\mathrm{T}}\phi^{i}-b_m\right)^2\tag{3}$$
where β_m^i is the m-th optimal parameter of the i-th sample, φ^i is the normalized HOG feature code of the i-th sample, and r_m and b_m are respectively the linear projection vector and the offset value of the m-th parameter to be solved. Through the calculation, the linear regression model represented by r_m and b_m can be obtained.
Rigid body evolution can eliminate the affine transformation difference between the initial template and the real coordinates, but the inherent differences between different expressions, postures and faces still exist; these differences need to be eliminated by non-rigid body evolution.
The training of the non-rigid body evolution is based on the prediction results of the rigid body evolution. Similar to the training of the rigid body evolution, the training of the non-rigid body evolution also extracts Histogram of Oriented Gradients (HOG) features from the initial points, forms a feature vector of length 144×k, and performs the normalization operation. The main difference lies in the training of the linear regression model. The training of the non-rigid body evolution is carried out according to formula (4).
$$\min_{r_n,\,b_n}\ \sum_{i}\left(\Delta_n^{i}-r_n^{\mathrm{T}}\phi^{i}-b_n\right)^2\tag{4}$$
where Δ_n^i is the offset on the X axis or the Y axis between the initial coordinates and the real coordinates of the int(n/2)-th key point of the i-th sample, φ^i is the normalized HOG feature code of the i-th sample, and r_n and b_n are respectively the linear projection vector and the offset value of the n-th offset to be solved. Through the calculation, the linear regression model represented by r_n and b_n can be obtained. The non-rigid body evolution process can adjust the coordinates of each key point independently, further improving the positioning accuracy.
To achieve the best prediction, the present invention includes multiple iterations; each iteration includes one rigid body evolution process and one non-rigid body evolution process, and their training is the same as described above, the only difference being that the initial template of each iteration comes from the prediction result of the previous iteration.
FIG. 3 is a schematic diagram of the rigid body evolution process of the present invention. As shown in FIG. 3, in the prediction process the present invention uses the linear regression models obtained in training to predict the key points. First, the test face is scaled to 128×128 pixels and the mean template is placed in the middle of the face; then the corresponding HOG features are extracted and combined, through matrix multiplication and vector addition, with the vectors r_m and offset values b_m obtained in the rigid body evolution training, giving the six affine transformation parameters of the sample; the new positions of the key points are then estimated from these parameters. The process is shown in FIG. 3.
FIG. 4 is a schematic diagram of the non-rigid body evolution process of the present invention. As shown in FIG. 4, after the prediction result of the rigid body evolution is obtained, the corresponding HOG features are extracted at the predicted positions and combined, through matrix multiplication and vector addition, with the vectors r_n and offsets b_n obtained in the non-rigid body evolution training, giving the 2k offsets of the sample; the coordinates of the key points are then adjusted according to these offsets. The process is shown in FIG. 4.
FIG. 5 is a schematic diagram of the iterative process of the present invention. Finally, multiple iterations are executed in sequence according to the above process to obtain the final prediction result, as shown in FIG. 5. FIG. 6 is a schematic diagram of the prediction effect of the present invention; the final effect of the invention can be seen from FIG. 6.
In summary, the present invention proposes a real-time, robust and high-precision facial key point positioning method, which aims to locate the key points of face pictures captured under mobile platform conditions by combining feature extraction and machine learning. Compared with known methods, the present invention places low demands on computing performance and on the imaging environment of the picture, and has an advantage in model size; it can therefore be widely applied on mobile platforms, laying a solid foundation for human-computer interaction, expression analysis, gaze control, fatigue monitoring and other machine vision applications.
Typical embodiments of specific structures have been given through the description and the drawings; other variations can also be made based on the spirit of the present invention. Although the above presents preferred embodiments, these contents are not intended as limitations.
Upon reading the above description, various changes and modifications will undoubtedly be apparent to those skilled in the art. Therefore, the appended claims should be regarded as covering all changes and modifications that fall within the true intent and scope of the present invention. Any and all equivalent scopes and contents within the scope of the claims should be regarded as remaining within the intent and scope of the present invention.

Claims (11)

  1. A method for labeling facial key points, characterized in that the labeling method includes two processes, a training process and a use process, wherein both the training process and the use process include rigid body evolution and non-rigid body evolution, and the rigid body evolution and the non-rigid body evolution can be iterated multiple times to improve the prediction accuracy of the facial key points, wherein
    the labeling method specifically includes:
    Step S1: eliminating the affine transformation difference between the initial template and the real coordinates through the rigid body evolution of the training process;
    Step S2: independently adjusting each facial key point in the initial template through the non-rigid body evolution of the training process, to improve the positioning accuracy of the facial key points;
    Step S3: predicting the facial key points according to a linear regression model through the rigid body evolution of the use process, and estimating the new positions of the facial key points;
    Step S4: in the non-rigid body evolution of the use process, adjusting the coordinates of the facial key points according to the new positions to obtain the prediction result.
  2. The method for labeling facial key points according to claim 1, characterized in that step S1 specifically includes:
    Step S11: placing the initial template at the center of the training picture, and solving the optimal affine transformation parameters between the initial template and the real coordinates.
  3. The method for labeling facial key points according to claim 2, characterized in that in step S11, the optimal affine transformation parameters are solved by the formula:

    $$\min_{\beta_1,\ldots,\beta_6}\ \sum_{n=1}^{K}\left\|\begin{pmatrix}x_n^{(2)}\\ y_n^{(2)}\end{pmatrix}-\begin{pmatrix}\beta_1&\beta_2\\ \beta_4&\beta_5\end{pmatrix}\begin{pmatrix}x_n^{(1)}\\ y_n^{(1)}\end{pmatrix}-\begin{pmatrix}\beta_3\\ \beta_6\end{pmatrix}\right\|^2$$

    where K is the number of facial key points, the sets {(x_n^(1), y_n^(1))} and {(x_n^(2), y_n^(2))} are respectively the initial coordinates and the real coordinates of the n-th facial key point, and n is a positive integer;
    and the difference between the initial coordinates and the real coordinates is eliminated by the affine transformation determined by the parameters β1 to β6, which are obtained after solving.
  4. The method for labeling facial key points according to claim 3, characterized in that step S1 further includes:
    Step S12: taking the initial position of each facial key point as the center, intercepting a texture region;
    Step S13: feature-encoding the texture region with a histogram of oriented gradients operator to obtain a floating point code of length Z;
    Step S14: arranging the codes of the K facial key points in a predefined order to obtain a feature code of length Z×K;
    Step S15: performing a normalization operation on the Z×K feature code to obtain normalization parameters, wherein,
    in the normalization operation, the mean of the distribution is 0 and the variance is 1.
  5. The method for labeling facial key points according to claim 4, characterized in that step S1 further includes:
    Step S16: after step S15, training the linear regression model according to the formula:

    $$\min_{r_m,\,b_m}\ \sum_{i}\left(\beta_m^{i}-r_m^{\mathrm{T}}\phi^{i}-b_m\right)^2$$

    where β_m^i is the m-th optimal parameter of the i-th sample, φ^i is the normalized histogram of oriented gradients feature code of the i-th sample, and r_m and b_m are respectively the linear projection vector and the offset value of the m-th parameter to be solved; the calculation yields the linear regression model represented by r_m and b_m, and m and i are both positive integers.
  6. The method for labeling facial key points according to claim 5, characterized in that step S2 specifically includes:
    Step S21: extracting histogram of oriented gradients features from the initial coordinates, forming a feature vector of length Z×K, and performing the normalization operation;
    Step S22: performing the non-rigid body evolution training of the linear regression model according to the formula.
  7. The method for labeling facial key points according to claim 6, characterized in that the non-rigid body evolution training is implemented by the formula:

    $$\min_{r_n,\,b_n}\ \sum_{i}\left(\Delta_n^{i}-r_n^{\mathrm{T}}\phi^{i}-b_n\right)^2$$

    where Δ_n^i is the offset on the X axis or the Y axis between the initial coordinates and the real coordinates of the int(n/2)-th key point of the i-th sample, φ^i is the normalized histogram of oriented gradients feature code of the i-th sample, and r_n and b_n are respectively the linear projection vector and the offset value of the n-th offset to be solved; the calculation yields the linear regression model represented by r_n and b_n.
  8. The method for labeling facial key points according to claim 7, characterized in that step S3 specifically includes:
    Step S31: adjusting the facial image to a designated pixel size, and placing the initial template in the middle of the face in the facial image;
    Step S32: extracting the histogram of oriented gradients of the facial image, performing matrix multiplication and vector addition operations, and obtaining the affine transformation parameters;
    Step S33: estimating the new positions of the facial key points according to the affine transformation parameters.
  9. The method for labeling facial key points according to claim 8, characterized in that step S4 specifically includes:
    Step S41: extracting the corresponding histogram of oriented gradients features according to the new positions estimated in step S33, performing matrix multiplication and vector addition operations, and obtaining 2K offsets;
    Step S42: adjusting the coordinates of the facial key points according to the offsets.
  10. The method for labeling facial key points according to claim 9, characterized in that the labeling method further includes:
    Step S01: before step S1, normalizing the facial image to a designated pixel size, and marking the real coordinates of the facial key points;
    Step S02: solving the initial template according to the formula.
  11. The method for labeling facial key points according to claim 10, characterized in that the formula for solving the initial template S_μ is:

    $$S_\mu=\frac{1}{N}\sum_{i=1}^{N}S^{i}$$

    where S^i denotes the marked real key point coordinates of the i-th training sample and N is the number of training samples.
PCT/CN2015/091886 2015-09-22 2015-11-09 A method for labeling facial key points WO2017049677A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510608688.6 2015-09-22
CN201510608688.6A CN105354531B (zh) 2015-09-22 A method for labeling facial key points

Publications (1)

Publication Number Publication Date
WO2017049677A1 true WO2017049677A1 (zh) 2017-03-30

Family

ID=55330499

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/091886 WO2017049677A1 (zh) 2015-09-22 2015-11-09 一种面部关键点的标注方法

Country Status (2)

Country Link
CN (1) CN105354531B (zh)
WO (1) WO2017049677A1 (zh)


Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106056080B (zh) * 2016-05-30 2019-11-22 中控智慧科技股份有限公司 Visualized biometric information collection device and method
CN108171244A (zh) * 2016-12-07 2018-06-15 北京深鉴科技有限公司 Object recognition method and system
CN106897675B (zh) * 2017-01-24 2021-08-17 上海交通大学 Face liveness detection method combining binocular vision depth features and appearance features
CN107122705B (zh) * 2017-03-17 2020-05-19 中国科学院自动化研究所 Face key point detection method based on a three-dimensional face model
CN107423689B (zh) * 2017-06-23 2020-05-15 中国科学技术大学 Intelligent interactive face key point annotation method
CN108764048B (zh) * 2018-04-28 2021-03-16 中国科学院自动化研究所 Face key point detection method and device
CN109002769A (zh) * 2018-06-22 2018-12-14 深源恒际科技有限公司 Cattle face alignment method and system based on a deep neural network
CN109034095A (zh) * 2018-08-10 2018-12-18 杭州登虹科技有限公司 Face alignment detection method, device and storage medium
CN109635659B (zh) * 2018-11-12 2020-10-30 东软集团股份有限公司 Face key point positioning method and device, storage medium and electronic device
CN110110695B (zh) * 2019-05-17 2021-03-19 北京字节跳动网络技术有限公司 Method and device for generating information
CN111981975B (zh) * 2019-05-22 2022-03-08 顺丰科技有限公司 Object volume measurement method and device, measuring equipment and storage medium


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100561503C (zh) * 2007-12-28 2009-11-18 北京中星微电子有限公司 Method and device for locating and tracking the eye corners and mouth corners of a face
CN103632129A (zh) * 2012-08-28 2014-03-12 腾讯科技(深圳)有限公司 Face feature point positioning method and device
CN103390282B (zh) * 2013-07-30 2016-04-13 百度在线网络技术(北京)有限公司 Image annotation method and device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7903883B2 (en) * 2007-03-30 2011-03-08 Microsoft Corporation Local bi-gram model for object recognition
CN104715227A (zh) * 2013-12-13 2015-06-17 北京三星通信技术研究有限公司 Method and device for locating face key points
CN104268591A (zh) * 2014-09-19 2015-01-07 海信集团有限公司 Facial key point detection method and device
CN104598936A (zh) * 2015-02-28 2015-05-06 北京畅景立达软件技术有限公司 Method for locating facial key points in a face image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GUO, RUIXIONG ET AL.: "Automatic Localization of Facial Key-Points for 3D Face Modeling", JOURNAL OF COMPUTER APPLICATIONS, vol. 30, no. 10, 31 October 2010 (2010-10-31) *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108961149A (zh) * 2017-05-27 2018-12-07 北京旷视科技有限公司 Image processing method, apparatus and system, and storage medium
CN111062400A (zh) * 2018-10-16 2020-04-24 浙江宇视科技有限公司 Target matching method and device
CN111062400B (zh) * 2018-10-16 2024-04-30 浙江宇视科技有限公司 Target matching method and device
RU2770752C1 (ru) * 2018-11-16 2022-04-21 Биго Текнолоджи Пте. Лтд. Method and device for training a face recognition model and a device for determining a face key point
US11922707B2 (en) 2018-11-16 2024-03-05 Bigo Technology Pte. Ltd. Method and apparatus for training face detection model, and apparatus for detecting face key point
CN110084221A (zh) * 2019-05-08 2019-08-02 南京云智控产业技术研究院有限公司 Serialized face key point detection method with relay supervision based on deep learning
CN111241961A (zh) * 2020-01-03 2020-06-05 精硕科技(北京)股份有限公司 Face detection method and device, and electronic equipment
CN111241961B (zh) * 2020-01-03 2023-12-08 北京秒针人工智能科技有限公司 Face detection method and device, and electronic equipment
WO2022197428A1 (en) * 2021-03-15 2022-09-22 Tencent America LLC Methods and systems for constructing facial position map
US11587288B2 (en) 2021-03-15 2023-02-21 Tencent America LLC Methods and systems for constructing facial position map

Also Published As

Publication number Publication date
CN105354531A (zh) 2016-02-24
CN105354531B (zh) 2019-05-21


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15904553

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15904553

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17/09/2018)

122 Ep: pct application non-entry in european phase

Ref document number: 15904553

Country of ref document: EP

Kind code of ref document: A1