WO2022252118A1 - Head posture measurement method and apparatus - Google Patents

Head posture measurement method and apparatus

Info

Publication number
WO2022252118A1
Authority
WO
WIPO (PCT)
Prior art keywords
face
target object
point
parameterized
cloud data
Prior art date
Application number
PCT/CN2021/097701
Other languages
French (fr)
Chinese (zh)
Inventor
刘杨
郭子衡
黄为
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to CN202180001892.5A priority Critical patent/CN113544744A/en
Priority to PCT/CN2021/097701 priority patent/WO2022252118A1/en
Publication of WO2022252118A1 publication Critical patent/WO2022252118A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T7/337Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving reference images or patches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30196Human being; Person
    • G06T2207/30201Face

Definitions

  • the present application relates to the technical field of artificial intelligence, in particular to a head posture measurement method and device.
  • 3D face reconstruction technology is a research hotspot in the field of computer vision and computer graphics.
  • 3D face reconstruction is one of the core technologies in the fields of virtual reality/augmented reality, automatic driving, robotics, etc., and it has great application value in the Driver Monitoring System (DMS).
  • the driver's head data monitored by the DMS can be used to analyze the driver's driving behavior. By analyzing the driver's driving behavior, dangerous driving can be avoided. Therefore, convenient and accurate monitoring of the driver's head data has great application value.
  • a dedicated measuring device is generally used to measure the driver's head data, for example, the Smarteye system.
  • the Smarteye system marks the 3D key points of the face based on a multi-camera setup to establish the head coordinate system.
  • the head pose measurement part of the Smarteye system consists of 4 high-definition infrared cameras. During the measurement, it is necessary to use the 4 high-definition infrared cameras to simultaneously track the 2D key points of the face, and then project the 2D key points into the 3D space to obtain the 3D key points.
  • this method needs to use a checkerboard calibration board for geometric calibration in the system configuration stage, and the operation is complicated. In addition, the system is expensive and cannot be widely used.
  • Another method in the prior art is to monitor the driver's head data based on an optical tracker.
  • the method requires the driver to wear a marking device on the head, and establishes the transformation relationship from the marking point of the marking device to the head coordinate system.
  • this method relies on the stability of the marking equipment and complex coordinate transformation, etc., which is not only easy to introduce errors, but also complicated to operate.
  • the present application provides a head posture measurement method and device, which can obtain head data conveniently and accurately.
  • the first aspect of the present application provides a head posture measurement method, the method comprising:
  • obtaining the face point cloud data of the target object; based on the face point cloud data of the target object and the face two-dimensional key point data of the target object, obtaining the point cloud data of the face key points of the target object; registering the point cloud data of the face key points of the target object with the point cloud data of the face key points in the parameterized face model to obtain first similarity transformation parameters; optimizing the first similarity transformation parameters according to an objective function to obtain second similarity transformation parameters; and determining the head pose of the target object according to the second similarity transformation parameters.
  • The head posture measurement method provided by the present application obtains the first similarity transformation parameters by registering the point cloud data of the face key points in the parameterized face model with the point cloud data of the face key points of the target object, and then uses the objective function to optimize the first similarity transformation parameters. This improves the accuracy of the similarity transformation parameters between the two, makes the fit between the two closer, obtains higher fitting accuracy, and thus yields more accurate head monitoring data.
  • In the technical solution of the present application, there is no need to introduce additional expensive equipment; therefore, the effect of cost saving can also be achieved.
  • the objective function includes a point-to-plane distance function; wherein the point-to-plane distance function is the distance function from a point in the face point cloud data of the target object to the nearest triangular patch in the parameterized face model; the triangular patch is a triangle formed by three adjacent points in the parameterized face model.
  • the point-to-plane distance function includes:
  • D_p2f = Σ_i dist(s_i, f_c(i))²
  • where D_p2f is the point-to-plane distance function; s_i is a point in the face point cloud data of the target object; f_c(i) is the triangular patch in the parameterized face model nearest to s_i; p(s_i, f_c(i)) is the projection point of s_i on f_c(i); dist(s_i, f_c(i)) is taken as ‖s_i − p(s_i, f_c(i))‖ when the projection point lies inside the patch, and otherwise as the distance from s_i to e_j^(i), the point closest to s_i on the jth edge of f_c(i), or to the nearest vertex of f_c(i).
  • In this way, the degree of fit between the face point cloud data of the target object and the point cloud data of the face key points in the parameterized face model can be improved, and higher fitting accuracy can be obtained, thereby obtaining accurate head data.
  • the objective function also includes: a key point projection distance function; wherein the key point projection distance function is a function of the distance from the projection points of the face key points of the parameterized face model on the two-dimensional face image to the face two-dimensional key points on the two-dimensional face image of the target object.
  • the key point projection distance function includes:
  • D_proj = Σ_{i=1}^{n} ‖u_i − v_i‖²
  • where D_proj is the key point projection distance function; u_i is the projection point of the ith face key point of the parameterized face model on the two-dimensional face image; v_i is the corresponding face two-dimensional key point on the two-dimensional face image; n is the number of face key points in the parameterized face model.
  • By minimizing the distance from the projection points to the face two-dimensional key points on the two-dimensional face image of the target object, the degree of fit between the subtle parts of the face (such as the edges of the lips) and the parameterized face model can be improved, making the fitting more accurate.
  • the objective function further includes: a penalty term function of the parameterized face model coefficients; wherein the penalty term is used to constrain the magnitude of the coefficients.
  • the penalty term function of the parameterized face model coefficients includes:
  • E_pri = λ_S·‖S‖² + λ_E·‖E‖² + λ_P·‖P‖²
  • where E_pri is the penalty term function of the parameterized face model coefficients; S is the shape coefficient in the parameterized face model; E is the expression coefficient in the parameterized face model; P is the pose coefficient in the parameterized face model; λ_S is the penalty coefficient of the shape coefficient; λ_E is the penalty coefficient of the expression coefficient; λ_P is the penalty coefficient of the pose coefficient.
  • In this way, the deformation ability of the parameterized face model can be constrained, which reduces the distortions that easily occur when only distance is used as the constraint.
  • obtaining the face point cloud data of the target object includes: obtaining point cloud data of the target object based on the two-dimensional image and the depth image of the target object; extracting a two-dimensional face image of the target object from the two-dimensional image of the target object; and extracting, according to the extracted two-dimensional face image, the point cloud data corresponding to the two-dimensional face image from the point cloud data of the target object.
  • the face two-dimensional image can be obtained simply and quickly.
  • the point cloud data corresponding to the two-dimensional face image can be obtained simply and quickly.
  • the two-dimensional image and the depth image of the target object are obtained by a TOF camera.
  • the point cloud data of the face key points of the target object is obtained by indexing the two-dimensional coordinates corresponding to the face two-dimensional key point data of the target object in the face point cloud data of the target object, so as to obtain the point cloud data of the face key points of the target object.
  • the facial key points of the target object are 51 facial key points.
  • the facial key points of the target object are 68 facial key points.
  • the process of registering the point cloud data of the facial key points of the target object with the point cloud data of the facial key points in the parameterized face model is a process of rigid body transformation.
  • the process of optimizing the first similarity transformation parameters according to the objective function with the first similarity transformation parameters as initial values is a non-rigid body transformation process.
  • the gradient descent method is used to optimize the first similarity transformation parameters according to the objective function with the first similarity transformation parameters as initial values to obtain the second similarity transformation parameters.
  • the quasi-Newton method is used, with the first similarity transformation parameters as initial values, and the first similarity transformation parameters are optimized according to the objective function to obtain the second similarity transformation parameters.
  • determining the head pose of the target object according to the second similarity transformation parameters includes: performing a Rodrigues transformation on the second similarity transformation parameters to obtain Euler angles used to represent the head pose of the target object.
  • the method further includes: determining the concentration of the target object according to the head posture of the target object; and sending an alarm to the target object based on the concentration of the target object.
  • In this way, false alarms can be reduced. A false alarm can be, for example: when the target object's concentration is low, no alarm is issued, which may cause a safety hazard; or, when the target object is highly focused, an alarm is issued, which disturbs the target object's concentration.
  • the second aspect of the present application provides a head posture measurement device, including:
  • the first obtaining module is used to obtain the facial point cloud data of the target object
  • the second acquisition module is used to obtain point cloud data of the face key points of the target object based on the face point cloud data of the target object and the two-dimensional key point data of the face of the target object;
  • the third acquisition module is used to register the point cloud data of the key points of the face of the target object with the point cloud data of the key points of the face in the parameterized face model to obtain the first similar transformation parameters;
  • a fourth acquisition module configured to optimize the first similarity transformation parameters according to an objective function, and obtain second similarity transformation parameters
  • the first determination module is configured to determine the head posture of the target object according to the second similarity transformation parameters.
  • the objective function in the fourth acquisition module includes a point-plane distance function
  • the point-to-plane distance function is the distance function from a point in the face point cloud data of the target object to the nearest triangular patch in the parameterized face model; the triangular patch is a triangle formed by three adjacent vertices in the parameterized face model.
  • the point-to-plane distance function is specifically used for:
  • D_p2f = Σ_i dist(s_i, f_c(i))²
  • where D_p2f is the point-to-plane distance function; s_i is a point in the face point cloud data of the target object; f_c(i) is the triangular patch in the parameterized face model nearest to s_i; p(s_i, f_c(i)) is the projection point of s_i on f_c(i); dist(s_i, f_c(i)) is taken as ‖s_i − p(s_i, f_c(i))‖ when the projection point lies inside the patch, and otherwise as the distance from s_i to e_j^(i), the point closest to s_i on the jth edge of f_c(i), or to the nearest vertex of f_c(i).
  • the objective function in the fourth acquisition module further includes: a key point projection distance function
  • the key point projection distance function is a function of the distance from the projection points of the face key points of the parameterized face model on the two-dimensional face image to the face two-dimensional key points on the two-dimensional face image of the target object.
  • the key point projection distance function is specifically used for:
  • D_proj = Σ_{i=1}^{n} ‖u_i − v_i‖²
  • where D_proj is the key point projection distance function; u_i is the projection point of the ith face key point of the parameterized face model on the two-dimensional face image; v_i is the corresponding face two-dimensional key point on the two-dimensional face image; n is the number of face key points in the parameterized face model.
  • the objective function in the fourth acquisition module also includes:
  • the penalty term function of parameterized face model coefficients is specifically used for:
  • E_pri = λ_S·‖S‖² + λ_E·‖E‖² + λ_P·‖P‖²
  • where E_pri is the penalty term function of the parameterized face model coefficients; S is the shape coefficient in the parameterized face model; E is the expression coefficient in the parameterized face model; P is the pose coefficient in the parameterized face model; λ_S is the penalty coefficient of the shape coefficient; λ_E is the penalty coefficient of the expression coefficient; λ_P is the penalty coefficient of the pose coefficient.
  • the first acquisition module includes:
  • the first acquisition submodule is used to obtain point cloud data of the target object based on the two-dimensional image and the depth image of the target object;
  • the first extraction submodule is used to extract the face two-dimensional image of the target object from the two-dimensional image of the target object;
  • the second extraction sub-module is used to extract the point cloud data corresponding to the two-dimensional face image from the point cloud data of the target object according to the extracted two-dimensional face image.
  • the two-dimensional image and the depth image of the target object are obtained by a TOF camera.
  • the facial key points of the target object are 51 facial key points.
  • the facial key points of the target object are 68 facial key points.
  • the first determining module is specifically configured to: perform a Rodrigues transformation on the second similarity transformation parameters to obtain Euler angles used to represent the head pose of the target object.
  • the second aspect also includes:
  • the second determination module is used to determine the concentration of the target object according to the head posture of the target object
  • the alarm module is configured to send an alarm to the target object based on the concentration of the target object.
  • a third aspect of the present application provides a computing device, including:
  • at least one processor; and at least one memory connected to the processor and storing program instructions which, when executed by the at least one processor, cause the at least one processor to execute the head posture measurement method of any one of the above first aspects.
  • a fourth aspect of the present application provides a computer-readable storage medium, on which program instructions are stored.
  • the program instructions are executed by a computer, the computer executes the head posture measurement method of any one of the above-mentioned first aspects.
  • a fifth aspect of the present application provides a computer program product which, when run on a computing device, causes the computing device to execute the head posture measurement method of any one of the above first aspects.
  • FIG. 1 is a schematic diagram of an application scenario of a head posture measurement method provided in an embodiment of the present application
  • FIG. 2 is a flow chart of a head posture measurement method provided by an embodiment of the present application.
  • FIG. 3 is a flow chart of a method for determining point cloud data of a face provided by an embodiment of the present application
  • FIG. 4 is an example diagram of a method for determining a point-to-plane distance function provided in an embodiment of the present application
  • FIG. 5 is a flowchart of a specific method of the head posture measurement method provided by the embodiment of the present application.
  • FIG. 6 is a schematic diagram of 68 two-dimensional key points of the face provided by the embodiment of the present application.
  • FIG. 7 is a schematic diagram of the face coordinate system of the FLAME model face key points provided by the embodiment of the present application;
  • FIG. 8 is a structural schematic diagram of a driving assistance device provided in an embodiment of the present application.
  • FIG. 9 is a structural schematic diagram of a head posture measurement device provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a computing device provided by an embodiment of the present application.
  • Image data with depth: it includes ordinary RGB color image information and depth information (depth map); the RGB image information and the depth image information are registered, that is, there is a one-to-one correspondence between their pixels.
  • the acquisition of image data with depth can be realized through the RGB-D camera, and the collected image data with depth can be presented in the form of an RGB image frame and a depth image frame, or can be presented in the form of integrated image data. According to the internal parameters of the camera, the transformation between depth information and point cloud coordinates can be realized.
  • z_c·[u, v, 1]ᵀ = [[f_x, 0, u_0], [0, f_y, v_0], [0, 0, 1]] · [R | T] · [x_w, y_w, z_w, 1]ᵀ
  • where u, v are arbitrary pixel coordinates in the image coordinate system, and u_0, v_0 are the center coordinates of the image; f_x, f_y are the focal lengths in pixels from the camera intrinsics; x_w, y_w, z_w represent a three-dimensional coordinate point in the world coordinate system; z_c represents the z-axis value of the camera coordinates, that is, the distance from the target to the camera, which for a TOF camera is the depth value at the [u, v] point; R and T are the 3x3 rotation matrix and the 3x1 translation matrix of the external parameter matrix, respectively.
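  • For illustration, the following minimal Python sketch back-projects a registered depth map into an organized point map using the pinhole relation above; the intrinsics f_x, f_y, u_0, v_0 are assumed to come from the TOF camera calibration, and the extrinsics R, T are taken as identity (i.e., points are expressed in camera coordinates).

```python
import numpy as np

def depth_to_point_map(depth, fx, fy, u0, v0):
    """Back-project a registered depth map (e.g. from a TOF camera) into
    an organized (h, w, 3) point map in camera coordinates (R, T identity)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z_c = depth                      # z_c is the depth value at pixel (u, v)
    x = (u - u0) * z_c / fx          # inverts u = f_x * x / z_c + u_0
    y = (v - v0) * z_c / fy          # inverts v = f_y * y / z_c + v_0
    return np.stack([x, y, z_c], axis=-1)
```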
  • Parametric face model: a face is represented through a standard face (also called an average face, reference face, basic shape face, or statistical face), combined with a shape feature vector, a pose feature vector, or an expression feature vector.
  • the FLAME model is constructed based on the real human body point cloud data in a 3D human body scanning database (for example, the CAESAR database); each real human head mesh is obtained by registering the head data of these real human bodies, and the head mesh contains the entire area of the face and head, thus establishing a real face and head database.
  • the human head mesh is composed of several (such as 5023) vertices and several (such as 9976) triangular faces, together with several (such as 300) shape, several (such as 100) expression, and several (such as 15) pose principal components, so that a parameterized 3D human head model can be determined accordingly.
  • the shape T of FLAME is defined by the coordinates of the vertices constituting the mesh, which can be described as the following formula (1):
  • T = (x_1, y_1, z_1, x_2, ..., x_n, y_n, z_n)  (1)
  • FLAME models the shape and expression separately: the FLAME face model can be described as the combination of a shape part and an expression part (formula (2)). Here, T_0 is the standard face, that is, the average shape part of the face; S_i are the eigenvectors of the covariance matrix, that is, the face shape vector parameters (the above-mentioned shape principal components); q_i is the coefficient corresponding to the ith face shape vector parameter.
  • the modeling of the face shape part (which can be recorded as T(S) in this application) can be expressed as a linear combination of the basic shape T_0 plus n shape vectors S_i, which can be described as the following formula (3):
  • T(S) = T_0 + Σ_{i=1}^{n} q_i·S_i  (3)
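  • As a minimal sketch of formula (3), the shape part is just a linear combination in NumPy; the basis below is random placeholder data (not the actual FLAME model), with sizes matching the description above:

```python
import numpy as np

def shape_blend(T0, S, q):
    """Formula (3): T(S) = T0 + sum_i q_i * S_i.
    T0: flattened mean face (3n,); S: shape basis, one principal
    component per column (3n, k); q: shape coefficients (k,)."""
    return T0 + S @ q

# Illustrative sizes from the FLAME description: 5023 vertices, 300 shapes.
T0 = np.zeros(5023 * 3)
S = 1e-3 * np.random.randn(5023 * 3, 300)   # placeholder basis only
q = np.zeros(300)
vertices = shape_blend(T0, S, q).reshape(-1, 3)   # (5023, 3) mesh vertices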
  • Geometric registration of the 3D face model that is, transforming the 3D face model to the target position, also known as rigid body transformation, or optimization of angle and posture.
  • the 3D position of each vertex of the model is determined.
  • (w_x,k, w_y,k, w_z,k) represents the target position; here, the target position is the 3D coordinates of each key point in the face area. The registration can be written as the similarity transformation w_k = s·R·v_k + t_w, by which the vertices v_k of the entire 3D face model are initially aligned with the point cloud in the camera coordinate system; R indicates the rotation parameters of the three axes, s indicates the scaling parameter, and t_w indicates the translation parameters.
  • a point cloud matching algorithm can be used (point cloud matching is to solve the transformation relationship between two piles of point clouds, that is, to solve the above-mentioned rotation parameters and translation parameters) for geometric registration.
  • Common point cloud matching algorithms include the iterative closest point algorithm (Iterative Closest Point, ICP), the normal distribution transform algorithm (Normal Distribution Transform, NDT), and the iterative dual correspondence algorithm (Iterative Dual Correspondences, IDC); the ICP algorithm is used in the embodiment of this application.
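  • Below is a compact point-to-point ICP sketch with a closed-form similarity step (the Umeyama solution), yielding the scaling, rotation, and translation parameters mentioned above; it is an illustrative implementation of the idea, not the exact solver of the embodiment. With known keypoint correspondences, a single `umeyama` call suffices.

```python
import numpy as np

def umeyama(src, dst):
    """Closed-form similarity transform (s, R, t) minimizing
    sum ||dst_i - (s * R @ src_i + t)||^2 (Umeyama's method)."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)
    U, D, Vt = np.linalg.svd(cov)
    sign = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ sign @ Vt                                  # rotation, det +1
    s = np.trace(np.diag(D) @ sign) / xs.var(0).sum()  # isotropic scale
    t = mu_d - s * R @ mu_s                            # translation
    return s, R, t

def icp(src, dst, iters=20):
    """Point-to-point ICP: alternate brute-force nearest-neighbour
    matching with the closed-form alignment above."""
    s, R, t = 1.0, np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        s, R, t = umeyama(src, dst[d2.argmin(1)])
        cur = s * src @ R.T + t
    return s, R, t
```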
  • the head pose measurement method provided in the embodiment of the present application can be applied to any scene requiring high-precision head pose data.
  • the application scenario may be in an autonomous vehicle (Autonomous Vehicle, AV) or an intelligent driving vehicle.
  • the driver's head posture is obtained through the head posture measurement method provided in the embodiment of the present application, and then analyzed to determine whether the driver's driving behavior is dangerous driving behavior; reminding and warning the driver in time can effectively avoid dangerous driving behavior.
  • the application scenario may also be students attending a class in online teaching.
  • In this scenario, the head pose measurement method provided in the embodiment of the present application can likewise be applied.
  • the collected RGB image and head depth image are transmitted to the local device 30; after receiving the images, the local device 30 processes them to obtain the driver's head pose and stores the obtained head pose in local memory.
  • the image acquisition device 10 includes but not limited to a camera.
  • the local device 30 may include a local computer, a local processing chip, or the like.
  • the RGB image and the head depth image of the driver 20 are collected by the image acquisition device 10.
  • the collected RGB image and head depth image can also be transmitted to the remote server 40; the remote server 40 processes the images to obtain the driver's head posture and stores it in remote memory. The obtained head posture can also be sent back to a local terminal (such as a mobile phone or computer) or returned to a local storage device and the like.
  • the point cloud data of the face key points of the target object is the three-dimensional coordinates corresponding to the face key points of the target object, and may also be called the three-dimensional face key points of the target object;
  • the key points of the face in the parametric face model may also be called the three-dimensional key points in the parametric face model.
  • FIG. 2 is a flow chart of the head posture measurement method provided by the embodiment of the present application.
  • the process mainly includes steps S110-S150, each step will be introduced in sequence below:
  • the process may include steps S111-S113, and each step will be introduced in turn below:
  • S111 Obtain point cloud data of the target object based on the two-dimensional image and the depth image of the target object.
  • the two-dimensional image of the target object may be an RGB image of the target object, or may be a grayscale image of the target object, or the like. It should be noted here that, in the embodiment of the present application, both the 2D image and the depth image of the target object should at least include the face area of the target object.
  • a time-of-flight camera (Time of Flight Camera, TOF camera) can be used to obtain the RGB image and the depth image of the target object.
  • S112 Extract a two-dimensional face image of the target object from the two-dimensional image of the target object.
  • the two-dimensional face image of the target object may be extracted by performing semantic segmentation on the two-dimensional image of the target object.
  • the semantic segmentation is to remove the regions other than the face of the target object in the two-dimensional image, such as the background, hair, and torso.
  • For example, a convolutional neural network (CNN) or a fully convolutional network (FCN) can be used for the semantic segmentation, or a mask region-based convolutional neural network (Mask R-CNN) can be used to perform semantic segmentation on the RGB image of the target object.
  • In step S113, the pixels in the two-dimensional face image of the target object extracted in step S112 are registered with the pixels of the depth image to obtain point cloud data representing the face of the target object, as sketched below.
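  • Assuming the organized point map from the earlier back-projection sketch and a boolean face mask produced by the segmentation step, extracting the face point cloud reduces to a masked lookup:

```python
import numpy as np

def face_point_cloud(point_map, face_mask):
    """Step S113 sketch: keep the 3D points whose pixels fall inside the
    segmented face region and have a valid (positive) depth reading.
    point_map: (h, w, 3) organized point map; face_mask: boolean (h, w)."""
    valid = face_mask & (point_map[..., 2] > 0)
    return point_map[valid]          # (m, 3) face point cloud
```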
  • S120 Acquire point cloud data of facial key points of the target object based on the face point cloud data of the target object and the two-dimensional key point data of the target object's face.
  • the face point cloud data of the target object can be indexed to obtain the point cloud data of the face key points corresponding to the two-dimensional key points of the face of the target object (the 3D key points of the face of the target object).
  • the two-dimensional key points of the face may be extracted from the two-dimensional face image of the target object.
  • the two-dimensional key points of the face may also be extracted from the two-dimensional image of the target object, which is not limited in this embodiment of the present application.
  • For example, an Active Shape Model (ASM) or a cascaded Deep Alignment Network (DAN) can be used to extract the two-dimensional key points of the face.
  • S130 Register the point cloud data of the facial key points of the target object with the point cloud data of the facial key points in the parameterized face model to obtain a first similarity transformation parameter.
  • the parameterized face model can be a FLAME model; the registration can also be called pose fitting, that is, a rigid body transformation is performed on the entire parameterized face model so that the point cloud data of the face key points in the parameterized face model is registered with the point cloud data of the face key points of the target object.
  • iterative closest point algorithm (Iterative Closest Point, ICP) can be used to compare the point cloud data of the face key points of the target object with the point cloud data of the face key points in the parameterized face model Registration, and obtain the first similarity transformation parameters.
  • the first similarity transformation parameters include scaling factor, rotation matrix and translation vector.
  • S140 Optimizing the first similarity transformation parameters according to the objective function to obtain second similarity transformation parameters.
  • the second similarity transformation parameters include scaling factor, rotation matrix and translation vector.
  • the objective function includes a point-plane distance function.
  • the point-plane distance function is a distance function from a point in the face point cloud data of the target object to the nearest triangular patch in the parameterized face model.
  • the triangular patch is a triangle formed by three adjacent vertices in the parameterized face model. For example, referring to Figure 4, point P is a point in the face point cloud data of the target object, and each triangle shown in Figure 4 is a triangular patch formed between adjacent points in the parameterized face model; for example, points a, b, and c form a triangle.
  • Alternatively, the distances from a point in the face point cloud data of the target object to its neighboring triangular patches can be calculated and compared, and the minimum value taken as the distance from that point to the nearest triangular patch in the parameterized face model.
  • When the nearest triangular patch to point P in Figure 4 is not easy to determine, the distance from point P to triangular patch abc and the distance from point P to triangular patch abd can both be calculated, and by comparing the two distance values, the point-to-plane distance function value corresponding to point P is determined.
  • the point-to-plane distance function can be determined as follows:
  • D_p2f = Σ_i dist(s_i, f_c(i))²
  • where D_p2f is the point-to-plane distance function; s_i is a point in the face point cloud data of the target object; f_c(i) is the triangular patch in the parameterized face model nearest to s_i; p(s_i, f_c(i)) is the projection point of s_i on f_c(i); dist(s_i, f_c(i)) is taken as ‖s_i − p(s_i, f_c(i))‖ when the projection point lies inside the patch, and otherwise as the distance from s_i to e_j^(i), the point closest to s_i on the jth edge of f_c(i), or to the nearest vertex of f_c(i).
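  • The sketch below computes the distance from one point to one triangular patch by exactly these cases (projection inside the patch, otherwise the nearest edge point or vertex); evaluating D_p2f then means summing this over all points against their nearest patches:

```python
import numpy as np

def point_triangle_dist(p, a, b, c):
    """Distance from point p to triangle (a, b, c): the projection onto
    the triangle's plane when it falls inside the patch, otherwise the
    closest point on an edge (which covers the vertices via clamping)."""
    n = np.cross(b - a, c - a)
    n = n / np.linalg.norm(n)
    proj = p - np.dot(p - a, n) * n              # projection onto the plane
    def inside(q):
        # q is inside iff it lies on the inner side of all three edges
        for u, v in ((a, b), (b, c), (c, a)):
            if np.dot(np.cross(v - u, q - u), n) < 0:
                return False
        return True
    if inside(proj):
        return np.linalg.norm(p - proj)
    best = np.inf
    for u, v in ((a, b), (b, c), (c, a)):
        t = np.clip(np.dot(p - u, v - u) / np.dot(v - u, v - u), 0.0, 1.0)
        best = min(best, np.linalg.norm(p - (u + t * (v - u))))
    return best
```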
  • the objective function may also include a key point projection distance function.
  • the key point projection distance function is the projection point of the face key point in the parameterized face model on the face two-dimensional image, to the face two-dimensional key point on the face two-dimensional image of the target object function of the distance.
  • When 51 facial feature key points are selected, the key point projection distance function is obtained by projecting these 51 facial feature key points onto the two-dimensional face image of the target object, and calculating the distances between these projection points and the corresponding two-dimensional face key points on the two-dimensional face image of the target object.
  • other representative points may be selected for the face key points, for example, 68 face key points are selected.
  • the key point projection distance function can be determined as follows:
  • D_proj = Σ_{i=1}^{n} ‖u_i − v_i‖²
  • where D_proj is the key point projection distance function; u_i is the projection point of the ith face key point of the parameterized face model on the two-dimensional face image; v_i is the corresponding face two-dimensional key point on the two-dimensional face image; n is the number of face key points in the parameterized face model.
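  • Assuming pinhole intrinsics and model key points already expressed in camera coordinates, D_proj can be evaluated as in this sketch:

```python
import numpy as np

def keypoint_projection_distance(X_model, v_2d, fx, fy, u0, v0):
    """D_proj sketch: project the model's 3D key points X_model (n, 3)
    with the pinhole intrinsics and sum squared distances to the detected
    2D key points v_2d (n, 2)."""
    u = fx * X_model[:, 0] / X_model[:, 2] + u0
    v = fy * X_model[:, 1] / X_model[:, 2] + v0
    proj = np.stack([u, v], axis=1)      # the u_i of the formula above
    return ((proj - v_2d) ** 2).sum()
```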
  • the objective function may also include a penalty term function that parameterizes coefficients of the face model.
  • the penalty term is used to constrain the size of the coefficient.
  • the penalty term function of the parameterized face model coefficients can be determined by the following formula:
  • E_pri = λ_S·‖S‖² + λ_E·‖E‖² + λ_P·‖P‖²
  • where E_pri is the penalty term function of the parameterized face model coefficients; S is the shape coefficient in the parameterized face model; E is the expression coefficient in the parameterized face model; P is the pose coefficient in the parameterized face model; λ_S, λ_E, and λ_P are the penalty coefficients of the shape, expression, and pose coefficients respectively.
  • the gradient descent method can be used, taking the first similarity transformation parameters in step S140 as initial values, to optimize the first similarity transformation parameters based on the objective function in this step until the objective function converges; the similarity transformation parameters corresponding to the convergence of the objective function are used as the second similarity transformation parameters.
  • the quasi-Newton method can also be used, taking the first similarity transformation parameters in step S140 as initial values, to optimize the first similarity transformation parameters based on the objective function in this step until the objective function converges; the similarity transformation parameters corresponding to the convergence of the objective function are used as the second similarity transformation parameters.
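  • As an illustration of the quasi-Newton refinement, the sketch below optimizes only the similarity parameters (rotation vector, translation, log-scale) with SciPy's L-BFGS-B, starting from the ICP result; a plain squared key-point distance stands in for the full objective E = D_p2f + λ_1·D_proj + λ_2·E_pri so the example stays self-contained:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

def refine(src_kpts, dst_kpts, s0, R0, t0):
    """Quasi-Newton refinement of (s, R, t) from the ICP initial values."""
    theta0 = np.concatenate([Rotation.from_matrix(R0).as_rotvec(),
                             t0, [np.log(s0)]])

    def objective(theta):
        R = Rotation.from_rotvec(theta[:3]).as_matrix()
        t, s = theta[3:6], np.exp(theta[6])
        moved = s * src_kpts @ R.T + t
        return ((moved - dst_kpts) ** 2).sum()   # stand-in for E

    res = minimize(objective, theta0, method="L-BFGS-B")  # quasi-Newton
    R = Rotation.from_rotvec(res.x[:3]).as_matrix()
    return np.exp(res.x[6]), R, res.x[3:6]   # refined similarity parameters
```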
  • S150 Determine the head pose of the target object according to the second similarity transformation parameters.
  • As mentioned above, the second similarity transformation parameters include a scaling factor, a rotation matrix, and a translation vector.
  • the rotation matrix in the second similarity transformation parameters may be selected to represent the head pose of the target object.
  • Euler angles may also be obtained by performing a Rodrigues transformation on the rotation matrix in the second similarity transformation parameter, and using the Euler angles to represent the head pose of the target object.
  • the Euler angles may include a pitch angle (pitch), a yaw angle (yaw), a roll angle (roll) and the like.
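  • A sketch of the conversion with OpenCV and SciPy follows; the "xyz" Euler order is an assumption for illustration, since axis conventions vary by system:

```python
import cv2
import numpy as np
from scipy.spatial.transform import Rotation

def head_pose_euler(R):
    """Rodrigues rotation vector plus pitch/yaw/roll Euler angles
    extracted from the optimized rotation matrix R (3x3)."""
    rvec, _ = cv2.Rodrigues(R)                   # rotation vector
    pitch, yaw, roll = Rotation.from_matrix(R).as_euler("xyz", degrees=True)
    return rvec.ravel(), (pitch, yaw, roll)
```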
  • Another embodiment of the present application provides a head posture measurement method, which is basically the same as the head posture measurement method provided in the above-mentioned embodiments, so this embodiment will not repeat the similarities.
  • the difference is that after step S150, it also includes:
  • S160 Determine the concentration of the target object according to the head posture of the target object.
  • the head posture of the target object can be compared with a preset head posture of the target object: when the difference between the two is within a preset range, it means that the concentration of the target object is high; when the difference between the two exceeds the preset range, it means that the concentration of the target object is low.
  • S170 Send an alarm to the target object based on the concentration level of the target object.
  • an alarm may be sent to the target object to prompt the target object to concentrate.
  • the manner of issuing the alarm may be playing music, playing prompt quotations, etc., which are not specifically limited in this embodiment of the present application.
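  • A toy sketch of steps S160-S170 follows; the preset pose and threshold are illustrative assumptions, not values from the embodiment:

```python
# Attentive head pose (degrees) and allowed deviation -- illustrative only.
PRESET = {"pitch": 0.0, "yaw": 0.0, "roll": 0.0}
RANGE = 20.0

def is_focused(pitch, yaw, roll):
    """S160 sketch: concentration is high when every Euler angle stays
    within RANGE of the preset attentive pose."""
    angles = {"pitch": pitch, "yaw": yaw, "roll": roll}
    return all(abs(v - PRESET[k]) <= RANGE for k, v in angles.items())

def monitor(pitch, yaw, roll, alarm=lambda: print("Please stay focused")):
    """S170 sketch: issue an alarm (e.g. play music or a prompt) when
    concentration is low."""
    if not is_focused(pitch, yaw, roll):
        alarm()
```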
  • the head posture measurement method provided by the embodiment of the present application is introduced in detail.
  • the method mainly includes steps S210-S270, each step will be introduced in turn below.
  • S210 Acquire the RGB image and the depth image of the target object. Wherein, both the RGB image and the depth image should at least include the face area of the target object.
  • S220 Perform semantic segmentation on the RGB image of the target object obtained in step S210 to obtain a two-dimensional face image of the target object.
  • S230 Determine face point cloud data of the target object based on the two-dimensional face image of the target object, the depth image of the target object, and the internal reference of the TOF camera.
  • Since the two-dimensional image acquired by the TOF camera has a one-to-one correspondence with each pixel in the depth image, after obtaining the coordinates (u, v) of each pixel and the depth value z_c of that pixel, the point cloud coordinates (x_w, y_w, z_w) of the pixel can be obtained, and the face point cloud data of the target object can be obtained accordingly.
  • S240 Extract the face two-dimensional key points from the RGB image obtained in step S210.
  • FIG. 6 is a schematic diagram of 68 facial two-dimensional key points extracted in this step.
  • the selection of the 68 facial two-dimensional key points follows the face key point annotation convention established by dlib or OpenCV.
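  • For example, the 68 key points can be obtained with dlib's landmark predictor as in this sketch (the model file path is an example and the file must be obtained separately; it is not part of this application):

```python
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def face_keypoints_2d(gray_image):
    """Detect the first face and return its 68 landmarks as (68, 2) pixels."""
    faces = detector(gray_image, 1)
    if not faces:
        return None
    shape = predictor(gray_image, faces[0])
    return np.array([(shape.part(i).x, shape.part(i).y) for i in range(68)])
```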
  • S250 Based on the face point cloud data of the target object obtained in step S230 and the two-dimensional key points of the face obtained in step S240, determine the three-dimensional coordinates corresponding to the two-dimensional key points of the face of the target object. For convenience of description, these 3D coordinates are called 3D key points here.
  • each of the 68 face two-dimensional key points obtained in step S240 is indexed in the face point cloud data of the target object to obtain the three-dimensional key points corresponding to the 68 face two-dimensional key points, that is, to obtain the 3D key points of the face of the target object.
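  • Because the RGB and depth images are registered pixel-to-pixel, the indexing of step S250 reduces to a direct array lookup into the organized point map, as in this sketch:

```python
import numpy as np

def keypoints_3d(point_map, keypoints_2d):
    """Look up the 3D point for each 2D key point (u, v).
    point_map: (h, w, 3) organized point map from step S230;
    keypoints_2d: (68, 2) integer pixel coordinates."""
    u = keypoints_2d[:, 0].astype(int)
    v = keypoints_2d[:, 1].astype(int)
    return point_map[v, u]    # rows indexed by v (image row), then u
```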
  • step S260 Register the 3D face key point set of the target object obtained in step S250 with the FLAME model face key point set to obtain a first similarity transformation parameter.
  • the face coordinate system of the key points of the face of the FLAME model needs to be established.
  • l 1 is the three-dimensional coordinates of the left corner of the left eye
  • l 2 is the three-dimensional coordinates of the right corner of the left eye
  • l 3 is the three-dimensional coordinates of the left corner of the right eye
  • l 4 is the three-dimensional coordinates of the right corner of the right eye
  • l 5 is the three-dimensional coordinates of the left nose
  • l 6 is the three-dimensional coordinates of the right nose
  • l 7 is the three-dimensional coordinates of the left corner of the mouth
  • l 8 is the three-dimensional coordinates of the right mouth corner
  • o is the coordinate origin.
  • the ICP algorithm can be used to initially register the 3D face key point set of the target object and the FLAME model face key point set to obtain the first similarity transformation parameters (s, R, t); Among them, s is the scaling factor, R is the rotation matrix, and t is the translation vector.
  • step S270 Optimizing the first similarity transformation parameters in step S260 to obtain optimized similarity transformation parameters, so as to realize accurate registration of the 3D facial key points of the target object and the facial key points of the FLAME model.
  • an objective function may be used to optimize the first similarity transformation parameters.
  • the objective function E can be established as follows:
  • E = D_p2f + λ_1·D_proj + λ_2·E_pri, with E_pri = λ_S·‖S‖² + λ_E·‖E‖² + λ_P·‖P‖²
  • where D_p2f is the distance between each point in the face point cloud data of the target object in step S230 and the nearest triangular patch of the face formed by the FLAME model; D_proj is the distance between the projections of the face three-dimensional key points of the FLAME model on the two-dimensional image and the detected face two-dimensional key points; E_pri is the penalty term of the FLAME model coefficients; λ_1 is the first weight coefficient; λ_2 is the second weight coefficient.
  • s i is the point in the face point cloud data of the target object
  • p(s_i, f_c(i)) is the projection point of point s_i on the nearest triangular patch f_c(i) in the parameterized face model; e_j^(i) is the point closest to s_i on the jth edge of f_c(i); v_w^(i) is the wth vertex of f_c(i).
  • u_i is the projection point of a three-dimensional key point of the parameterized face model on the two-dimensional face image; v_i is the corresponding two-dimensional key point of the face on the two-dimensional face image; n is the number of three-dimensional key points in the parameterized face model.
  • S is the shape coefficient in the parametric face model
  • E is the expression coefficient in the parametric face model
  • P is the pose coefficient in the parametric face model
  • λ_S is the penalty coefficient of the shape coefficient in the parameterized face model
  • λ_E is the penalty coefficient of the expression coefficient in the parameterized face model
  • λ_P is the penalty coefficient of the pose coefficient in the parameterized face model.
  • A Rodrigues transformation can be performed on the rotation matrix in the similarity transformation parameters obtained by fitting the parameterized face model to the face point cloud data, to obtain the Euler angles used to represent the head pose of the target object; the Euler angles at least include a pitch angle (pitch), a yaw angle (yaw), and a roll angle (roll).
  • the head posture measurement method further includes: determining the concentration of the target object according to the head posture of the target object. Wherein, this step is the same as step S160 in the above embodiment, so it will not be repeated here.
  • S170 An alert is sent to the target object based on the concentration level of the target object.
  • this step is the same as step S170 in the above embodiment, so it will not be repeated here.
  • The embodiment of the present application further provides a driving assistance device, which may be realized by a software system, by a hardware device, or by a combination of a software system and a hardware device.
  • FIG. 8 is only an exemplary structural diagram showing a driving assistance device.
  • the driving assistance device includes a driver's head posture detection module 410 and a driving assistance module 420 .
  • the driver's head posture detection module 410 is used to obtain the driver's head posture using the head posture detection method provided in the above-mentioned embodiments, which will not be described in detail again here.
  • the driving assistance module 420 is used for determining the concentration of the driver based on the posture of the driver's head, and issuing a warning to the driver based on the concentration.
  • the device may be implemented by a software system, may also be implemented by a hardware device, and may also be implemented by a combination of a software system and a hardware device.
  • FIG. 9 is only an exemplary structural diagram showing a head posture measurement device, and the present application does not limit the division of functional modules in the head posture measurement device.
  • the head posture measurement device can be logically divided into multiple modules, each module can have different functions, and the function of each module can be implemented by the processor in the computing device reading and executing instructions stored in the memory.
  • the head posture measurement device includes a first acquisition module 510 , a second acquisition module 520 , a third acquisition module 530 , a fourth acquisition module 540 and a first determination module 550 .
  • the head posture measurement device is used to execute the content described in steps S110-S150 shown in FIG. 2 .
  • a first acquiring module 510 configured to acquire facial point cloud data of the target object.
  • the second acquisition module 520 is configured to acquire point cloud data of key points of the face of the target object based on the point cloud data of the face of the target object and the two-dimensional key point data of the face of the target object.
  • the third acquisition module 530 is configured to register the point cloud data of the facial key points of the target object with the point cloud data of the facial key points in the parameterized face model to obtain the first similarity transformation parameters.
  • the fourth obtaining module 540 is configured to optimize the first similarity transformation parameters according to the objective function to obtain second similarity transformation parameters.
  • the first determining module 550 is configured to determine the head pose of the target object according to the second similarity transformation parameters.
  • the objective function in the fourth acquisition module 540 includes a point-to-plane distance function
  • the point-plane distance function is a distance function from a point in the face point cloud data of the target object to the nearest triangular patch in the parameterized face model;
  • the triangular patch is a triangle formed by three adjacent vertices in the parameterized face model.
  • the point-to-plane distance function includes:
  • D_p2f = Σ_i dist(s_i, f_c(i))²
  • where D_p2f is the point-to-plane distance function; s_i is a point in the face point cloud data of the target object; f_c(i) is the triangular patch in the parameterized face model nearest to s_i; p(s_i, f_c(i)) is the projection point of s_i on f_c(i); dist(s_i, f_c(i)) is taken as ‖s_i − p(s_i, f_c(i))‖ when the projection point lies inside the patch, and otherwise as the distance from s_i to e_j^(i), the point closest to s_i on the jth edge of f_c(i), or to the nearest vertex of f_c(i).
  • the objective function in the fourth acquisition module 540 further includes: a key point projection distance function
  • the key point projection distance function is a function of the distance from the projection points of the face key points of the parameterized face model on the two-dimensional face image to the face two-dimensional key points on the two-dimensional face image of the target object.
  • the key point projection distance function includes:
  • D_proj = Σ_{i=1}^{n} ‖u_i − v_i‖²
  • where D_proj is the key point projection distance function; u_i is the projection point of the ith face key point of the parameterized face model on the two-dimensional face image; v_i is the corresponding face two-dimensional key point on the two-dimensional face image; n is the number of face key points in the parameterized face model.
  • the objective function in the fourth acquisition module 540 also includes:
  • a penalty term function of the parameterized face model coefficients, wherein the penalty term is used to constrain the magnitude of the coefficients.
  • the penalty term function of the parameterized face model coefficients includes:
  • E_pri = λ_S·‖S‖² + λ_E·‖E‖² + λ_P·‖P‖²
  • where E_pri is the penalty term function of the parameterized face model coefficients; S is the shape coefficient in the parameterized face model; E is the expression coefficient in the parameterized face model; P is the pose coefficient in the parameterized face model; λ_S, λ_E, and λ_P are the penalty coefficients of the shape, expression, and pose coefficients respectively.
  • the first acquiring module 510 includes:
  • the first acquisition submodule is used to obtain the point cloud data of the target object based on the two-dimensional image and the depth image of the target object;
  • the first extraction submodule is used to extract the face two-dimensional image of the target object from the two-dimensional image of the target object;
  • the second extraction sub-module is configured to extract point cloud data corresponding to the two-dimensional face image from the point cloud data of the target object according to the extracted two-dimensional face image.
  • the two-dimensional image and the depth image of the target object are obtained by a TOF camera.
  • the facial key points of the target object are 51 facial key points.
  • the facial key points of the target object are 68 facial key points.
  • the first determination module 550 is specifically configured to: perform a Rodrigues transformation on the second similarity transformation parameters to obtain Euler angles used to represent the head pose of the target object.
  • the head posture measurement device also includes:
  • the second determination module is used to determine the concentration of the target object according to the head posture of the target object
  • An alarm module configured to issue an alarm to the target object based on the concentration of the target object.
  • FIG. 10 is a schematic structural diagram of a computing device 900 provided by an embodiment of the present application.
  • the computing device 900 includes: a processor 910 , a memory 920 , and a communication interface 930 .
  • the communication interface 930 in the computing device 900 shown in FIG. 10 can be used to communicate with other devices.
  • the processor 910 may be connected to the memory 920 .
  • the memory 920 can be used to store program codes and data. Therefore, the memory 920 may be a storage unit inside the processor 910, an external storage unit independent of the processor 910, or a combination of a storage unit inside the processor 910 and an external storage unit independent of the processor 910.
  • computing device 900 may further include a bus.
  • the memory 920 and the communication interface 930 may be connected to the processor 910 through a bus.
  • the bus may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (Extended Industry Standard Architecture, EISA) bus or the like.
  • PCI Peripheral Component Interconnect
  • EISA Extended Industry Standard Architecture
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the processor 910 may be a central processing unit (central processing unit, CPU).
  • the processor can also be other general-purpose processors, digital signal processors (digital signal processors, DSPs), application specific integrated circuits (Application specific integrated circuits, ASICs), off-the-shelf programmable gate arrays (field programmable gate arrays, FPGAs) or other Programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the processor 910 adopts one or more integrated circuits for executing related programs, so as to implement the technical solutions provided by the embodiments of the present application.
  • the memory 920 may include read-only memory and random-access memory, and provides instructions and data to the processor 910.
  • a portion of processor 910 may also include non-volatile random access memory.
  • processor 910 may also store device type information.
  • the processor 910 executes the computer-executed instructions in the memory 920 to perform the operation steps of the above method.
  • the computing device 900 may correspond to a corresponding body executing the methods according to the various embodiments of the present application, and the above-mentioned and other operations and/or functions of the modules in the computing device 900 are for realizing the corresponding processes of the methods in the embodiments; for the sake of brevity, they are not repeated here.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • If the functions described above are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application is essentially or the part that contributes to the prior art or the part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium, including Several instructions are used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: a USB flash drive, a mobile hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, or other media that can store program codes.
  • the embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, it is used to perform a head posture measurement method, and the method includes the methods described in the above-mentioned embodiments. at least one of the options.
  • the computer storage medium in the embodiments of the present application may use any combination of one or more computer-readable media.
  • the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or device, or any combination thereof. More specific examples (non-exhaustive list) of computer readable storage media include: electrical connections with one or more leads, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), Erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer readable signal medium may include a data signal carrying computer readable program code in baseband or as part of a carrier wave. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device. .
  • Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Computer program code for performing the operations of the present application may be written in one or more programming languages or combinations thereof, including object-oriented programming languages—such as Java, Smalltalk, C++, and conventional Procedural Programming Language - such as "C" or a similar programming language.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (such as through the Internet using an Internet service provider). connect).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A head posture measurement method, comprising: acquiring face point cloud data of a target object (S110); acquiring point cloud data of face keypoints of the target object on the basis of the face point cloud data of the target object and face two-dimensional keypoint data of the target object (S120); aligning the point cloud data of the face keypoints of the target object with point cloud data of face keypoints in a parameterized face model to obtain a first similarity transform parameter (S130); optimizing the first similarity transform parameter according to an objective function to obtain a second similarity transform parameter (S140); and determining the head posture of the target object according to the second similarity transform parameter (S150).

Description

Head posture measurement method and apparatus

Technical Field
The present application relates to the technical field of artificial intelligence, and in particular to a head posture measurement method and apparatus.
Background Art
3D face reconstruction is a research hotspot in the fields of computer vision and computer graphics. It is one of the core technologies of virtual/augmented reality, autonomous driving, robotics and related fields, and has great application value in driver monitoring systems (Driver Monitoring System, DMS). The driver head data monitored by a DMS can be used to analyze the driver's driving behavior, and such analysis can help prevent dangerous driving. Convenient and accurate monitoring of the driver's head data therefore has great application value.
In the prior art, a dedicated measuring device is generally used to measure the driver's head data, for example the Smarteye system. The Smarteye system marks 3D face keypoints based on a multi-camera rig in order to establish a head coordinate system. Its head pose measurement part consists of four high-definition infrared cameras, which must simultaneously track the 2D face keypoints during measurement; the 2D keypoints are then projected into 3D space to obtain 3D keypoints. However, this method requires geometric calibration with a checkerboard calibration board during system configuration, which is a complicated operation. In addition, the system is expensive and cannot be deployed at scale.
Another prior-art method monitors the driver's head data with an optical tracker. This method requires the driver to wear a marker device on the head and requires establishing the transformation from the marker points of that device to the head coordinate system. However, it depends on the stability of the marker device and on complex coordinate transformations, so it both introduces errors easily and is complicated to operate.
Summary of the Invention
In view of the above problems in the prior art, the present application provides a head posture measurement method and apparatus that can obtain head data conveniently and accurately.
To achieve the above purpose, a first aspect of the present application provides a head posture measurement method, the method comprising:
acquiring face point cloud data of a target object; acquiring point cloud data of face keypoints of the target object based on the face point cloud data of the target object and face two-dimensional keypoint data of the target object; registering the point cloud data of the face keypoints of the target object with point cloud data of face keypoints in a parameterized face model to obtain a first similarity transform parameter; optimizing the first similarity transform parameter according to an objective function to obtain a second similarity transform parameter; and determining the head posture of the target object according to the second similarity transform parameter.
In the head posture measurement method provided by the present application, the point cloud data of the face keypoints in the parameterized face model is registered with the point cloud data of the face keypoints of the target object to obtain a first similarity transform parameter, and the first similarity transform parameter is then optimized with an objective function. This improves the accuracy of the similarity transform between the two point sets, yields a tighter fit and higher fitting precision, and thus produces more accurate head monitoring data. The technical solution of the present application requires no additional expensive equipment, so it also saves cost.
In a possible implementation of the first aspect, the objective function includes a point-to-face distance function, where the point-to-face distance function is the distance from a point in the face point cloud data of the target object to the nearest triangular patch in the parameterized face model; a triangular patch is the triangle formed by three adjacent points in the parameterized face model.
In a possible implementation of the first aspect, the point-to-face distance function includes:

$$D_{2pf} = \sum_{i} \min\Big( \big\| s_i - p(s_i, f_{c(i)}) \big\|^2,\; \min_{j} \big\| s_i - \hat{e}_j^{\,c(i)} \big\|^2,\; \min_{w} \big\| s_i - \hat{v}_w^{\,c(i)} \big\|^2 \Big)$$

where $D_{2pf}$ is the point-to-face distance function, $s_i$ is a point in the face point cloud data of the target object, $p(s_i, f_{c(i)})$ is the projection of the point $s_i$ onto its nearest triangular face $f_{c(i)}$ in the parameterized face model, $\hat{e}_j^{\,c(i)}$ is the point on the $j$-th edge of the nearest triangular face $f_{c(i)}$ that is closest to $s_i$, and $\hat{v}_w^{\,c(i)}$ is the $w$-th vertex of the nearest triangular face $f_{c(i)}$.
By computing the above point-to-face distance, the fit between the face point cloud data of the target object and the parameterized face model can be improved, yielding higher fitting precision and thus accurate head data.
In a possible implementation of the first aspect, the objective function further includes a keypoint projection distance function, where the keypoint projection distance function is a function of the distance from the projections of the face keypoints of the parameterized face model onto the two-dimensional face image to the two-dimensional face keypoints on the two-dimensional face image of the target object.
In a possible implementation of the first aspect, the keypoint projection distance function includes:

$$D_{proj} = \sum_{i=1}^{n} \big\| u_i - v_i \big\|^2$$

where $D_{proj}$ is the keypoint projection distance function, $u_i$ is the projection of a face keypoint of the parameterized face model onto the two-dimensional face image, $v_i$ is the corresponding two-dimensional face keypoint on the two-dimensional face image, and $n$ is the number of face keypoints in the parameterized face model.
As described above, computing the distance from the projections of the model's face keypoints onto the two-dimensional face image to the two-dimensional face keypoints of the target object improves the fit of fine facial regions (for example, the lip contour) to the parameterized face model, making the fitting more precise.
In a possible implementation of the first aspect, the objective function further includes a penalty term function on the coefficients of the parameterized face model, where the penalty term is used to constrain the magnitude of the coefficients.
In a possible implementation of the first aspect, the penalty term function on the coefficients of the parameterized face model includes:

$$E_{pri} = \lambda_S \|S\|^2 + \lambda_E \|E\|^2 + \lambda_P \|P\|^2$$

where $E_{pri}$ is the penalty term function on the coefficients of the parameterized face model, $S$, $E$ and $P$ are the shape, expression and pose coefficients of the parameterized face model, and $\lambda_S$, $\lambda_E$ and $\lambda_P$ are the corresponding penalty coefficients for the shape, expression and pose coefficients.
Adding a penalty term on each coefficient of the parameterized face model constrains the model's deformation capacity and reduces the distortions that easily arise when the model is constrained by distance alone.
In a possible implementation of the first aspect, acquiring the face point cloud data of the target object includes: obtaining point cloud data of the target object based on a two-dimensional image and a depth image of the target object; extracting a two-dimensional face image of the target object from the two-dimensional image of the target object; and, according to the extracted two-dimensional face image, extracting the point cloud data corresponding to the two-dimensional face image from the point cloud data of the target object.
As described above, by obtaining the point cloud data of the target object and the two-dimensional face image of the target object, and then extracting from the point cloud data of the target object the point cloud data corresponding to the two-dimensional face image, the face point cloud data can be obtained simply and quickly.
In a possible implementation of the first aspect, the two-dimensional image and the depth image of the target object are obtained with a TOF camera.
In a possible implementation of the first aspect, acquiring the point cloud data of the face keypoints of the target object based on the face point cloud data of the target object and the two-dimensional face keypoint data of the target object includes: indexing into the face point cloud data of the target object with the two-dimensional coordinates corresponding to the two-dimensional face keypoint data, so as to obtain the point cloud data of the face keypoints of the target object. In a possible implementation of the first aspect, the face keypoints of the target object are 51 face keypoints.
In a possible implementation of the first aspect, the face keypoints of the target object are 68 face keypoints.
In a possible implementation of the first aspect, the process of registering the point cloud data of the face keypoints of the target object with the point cloud data of the face keypoints in the parameterized face model is a rigid transformation process.
In a possible implementation of the first aspect, the process of optimizing the first similarity transform parameter according to the objective function, with the first similarity transform parameter as the initial value, is a non-rigid transformation process.
In a possible implementation of the first aspect, a gradient descent method is used, with the first similarity transform parameter as the initial value, to optimize the first similarity transform parameter according to the objective function and obtain the second similarity transform parameter.
In a possible implementation of the first aspect, a quasi-Newton method is used, with the first similarity transform parameter as the initial value, to optimize the first similarity transform parameter according to the objective function and obtain the second similarity transform parameter. In a possible implementation of the first aspect, determining the head posture of the target object according to the second similarity transform parameter includes: applying a Rodrigues transformation to the second similarity transform parameter to obtain the Euler angles representing the head posture of the target object.
In a possible implementation of the first aspect, the method further includes: determining the concentration level of the target object according to the head posture of the target object; and issuing an alert to the target object based on the concentration level of the target object.
As described above, the head posture obtained with the solution provided by the present application is more accurate, so the concentration level derived from that head posture is likewise more accurate, which makes alerts based on that concentration level better match reality and reduces false alarms. For example, a false alarm may be: no alert is issued when the target object's concentration is low, creating a potential safety hazard; or an alert is issued when the target object's concentration is high, disturbing the target object's concentration.
A second aspect of the present application provides a head posture measurement apparatus, including:

a first acquisition module, configured to acquire face point cloud data of a target object;

a second acquisition module, configured to acquire point cloud data of face keypoints of the target object based on the face point cloud data of the target object and two-dimensional face keypoint data of the target object;

a third acquisition module, configured to register the point cloud data of the face keypoints of the target object with point cloud data of face keypoints in a parameterized face model to obtain a first similarity transform parameter;

a fourth acquisition module, configured to optimize the first similarity transform parameter according to an objective function to obtain a second similarity transform parameter;

a first determination module, configured to determine the head posture of the target object according to the second similarity transform parameter.
In a possible implementation of the second aspect, the objective function in the fourth acquisition module includes a point-to-face distance function, where the point-to-face distance function is the distance function from a point in the face point cloud data of the target object to the nearest triangular patch in the parameterized face model; a triangular patch is the triangle formed by three adjacent vertices in the parameterized face model.
In a possible implementation of the second aspect, the point-to-face distance function is specifically:

$$D_{2pf} = \sum_{i} \min\Big( \big\| s_i - p(s_i, f_{c(i)}) \big\|^2,\; \min_{j} \big\| s_i - \hat{e}_j^{\,c(i)} \big\|^2,\; \min_{w} \big\| s_i - \hat{v}_w^{\,c(i)} \big\|^2 \Big)$$

where $D_{2pf}$ is the point-to-face distance function, $s_i$ is a point in the face point cloud data of the target object, $p(s_i, f_{c(i)})$ is the projection of the point $s_i$ onto its nearest triangular face $f_{c(i)}$ in the parameterized face model, $\hat{e}_j^{\,c(i)}$ is the point on the $j$-th edge of the nearest triangular face $f_{c(i)}$ that is closest to $s_i$, and $\hat{v}_w^{\,c(i)}$ is the $w$-th vertex of the nearest triangular face $f_{c(i)}$.
In a possible implementation of the second aspect, the objective function in the fourth acquisition module further includes a keypoint projection distance function, where the keypoint projection distance function is a function of the distance from the projections of the face keypoints of the parameterized face model onto the two-dimensional face image to the two-dimensional face keypoints on the two-dimensional face image of the target object.
In a possible implementation of the second aspect, the keypoint projection distance function is specifically:

$$D_{proj} = \sum_{i=1}^{n} \big\| u_i - v_i \big\|^2$$

where $D_{proj}$ is the keypoint projection distance function, $u_i$ is the projection of a face keypoint of the parameterized face model onto the two-dimensional face image, $v_i$ is the corresponding two-dimensional face keypoint on the two-dimensional face image, and $n$ is the number of face keypoints in the parameterized face model.
In a possible implementation of the second aspect, the objective function in the fourth acquisition module further includes a penalty term function on the coefficients of the parameterized face model, where the penalty term is used to constrain the magnitude of the coefficients.
In a possible implementation of the second aspect, the penalty term function on the coefficients of the parameterized face model is specifically:

$$E_{pri} = \lambda_S \|S\|^2 + \lambda_E \|E\|^2 + \lambda_P \|P\|^2$$

where $E_{pri}$ is the penalty term function on the coefficients of the parameterized face model, $S$, $E$ and $P$ are the shape, expression and pose coefficients of the parameterized face model, and $\lambda_S$, $\lambda_E$ and $\lambda_P$ are the corresponding penalty coefficients.
In a possible implementation of the second aspect, the first acquisition module includes:

a first acquisition submodule, configured to obtain point cloud data of the target object based on a two-dimensional image and a depth image of the target object;

a first extraction submodule, configured to extract a two-dimensional face image of the target object from the two-dimensional image of the target object;

a second extraction submodule, configured to extract, according to the extracted two-dimensional face image, the point cloud data corresponding to the two-dimensional face image from the point cloud data of the target object.
In a possible implementation of the second aspect, the two-dimensional image and the depth image of the target object are obtained with a TOF camera.
In a possible implementation of the second aspect, the face keypoints of the target object are 51 face keypoints.

In a possible implementation of the second aspect, the face keypoints of the target object are 68 face keypoints.
In a possible implementation of the second aspect, the first determination module is specifically configured to apply a Rodrigues transformation to the second similarity transform parameter to obtain the Euler angles representing the head posture of the target object.
In a possible implementation of the second aspect, the apparatus further includes:

a second determination module, configured to determine the concentration level of the target object according to the head posture of the target object;

an alert module, configured to issue an alert to the target object based on the concentration level of the target object.
A third aspect of the present application provides a computing device, including:

a communication interface;

at least one processor connected to the communication interface; and

at least one memory connected to the processor and storing program instructions that, when executed by the at least one processor, cause the at least one processor to perform the head posture measurement method of any implementation of the first aspect.
A fourth aspect of the present application provides a computer-readable storage medium on which program instructions are stored; the program instructions, when executed by a computer, cause the computer to perform the head posture measurement method of any implementation of the first aspect.
A fifth aspect of the present application provides a computer program product; when the computer program product runs on a computing device, it causes the computing device to perform the head posture measurement method of any implementation of the first aspect.
These and other aspects of the present application will be more clearly understood from the following description of the embodiment(s).
Brief Description of the Drawings
The features of the present application and the relationships between them are further described below with reference to the accompanying drawings. The drawings are exemplary; some features are not shown to scale, and some drawings may omit features that are customary in the field and not essential to the present application, or may additionally show features that are not essential to the present application. The combinations of features shown in the drawings are not intended to limit the present application. Throughout this specification, the same reference numerals denote the same items. The drawings are described as follows:
Fig. 1 is a schematic diagram of an application scenario of the head posture measurement method provided by an embodiment of the present application;

Fig. 2 is a flowchart of a head posture measurement method provided by an embodiment of the present application;

Fig. 3 is a flowchart of a method for determining face point cloud data provided by an embodiment of the present application;

Fig. 4 is an example diagram of the method for determining the point-to-face distance function provided by an embodiment of the present application;

Fig. 5 is a flowchart of a specific implementation of the head posture measurement method provided by an embodiment of the present application;

Fig. 6 is a schematic diagram of the 68 two-dimensional face keypoints provided by an embodiment of the present application;

Fig. 7 is a schematic diagram of the face coordinate system of the face keypoints of the FLAME model provided by an embodiment of the present application;

Fig. 8 is a structural schematic diagram of a driving assistance apparatus provided by an embodiment of the present application;

Fig. 9 is a structural schematic diagram of a head posture measurement apparatus provided by an embodiment of the present application;

Fig. 10 is a schematic diagram of a computing device provided by an embodiment of the present application.
Detailed Description
The words "first", "second", "third", etc., or similar terms such as module A, module B and module C in the specification and claims are used only to distinguish similar objects and do not denote any particular ordering of objects. It should be understood that, where permitted, specific orders or sequences may be interchanged so that the embodiments of the present application described here can be implemented in orders other than those illustrated or described here.
In the following description, the reference numerals denoting steps, such as S110, S120, etc., do not mean that the steps must be executed in that order; where permitted, the order of steps may be interchanged, or steps may be executed simultaneously.
The term "comprising" used in the specification and claims should not be interpreted as being restricted to what is listed thereafter; it does not exclude other elements or steps. It should therefore be interpreted as specifying the presence of the mentioned features, integers, steps or components, without excluding the presence or addition of one or more other features, integers, steps, components or groups thereof. Thus, the expression "a device comprising means A and B" should not be limited to a device consisting of components A and B only.
Reference in this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. The appearances of the phrases "in one embodiment" or "in an embodiment" in various places in this specification therefore do not necessarily all refer to the same embodiment, although they may. Furthermore, in one or more embodiments, the particular features, structures or characteristics may be combined in any suitable manner, as will be apparent to those of ordinary skill in the art from this disclosure.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present application. In case of inconsistency, the meaning given in this specification, or the meaning derived from its content, shall prevail. In addition, the terms used herein are only for the purpose of describing the embodiments of the present application and are not intended to limit the present application.
To describe the technical content of the present application accurately and to facilitate an accurate understanding of the invention, the terms used in this specification are explained or defined as follows before the specific embodiments are described:
1) Image data with depth: data that includes ordinary RGB color image information and depth information (a depth map), where the RGB image information and the depth image information are registered, i.e., their pixels correspond one to one. Image data with depth can be captured with an RGB-D camera, and the captured data can be presented as one RGB image frame plus one depth image frame, or integrated into a single image. Given the camera intrinsics, depth information can be converted to point cloud coordinates.
2) Mapping relationship between the RGB image data and the point cloud data of a TOF camera:
For a point cloud coordinate, i.e. a world coordinate point $M(x_w, y_w, z_w)$, the mapping to an image point $m(u, v)$ is expressed by formula (1):

$$z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} R & T \end{bmatrix} \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} \tag{1}$$

where the 3x3 matrix is the intrinsic matrix of the TOF camera (written here in the standard pinhole form with focal lengths $f_x$, $f_y$), $(u, v)$ is an arbitrary point in the image coordinate system, and $(u_0, v_0)$ are the center coordinates of the image. $(x_w, y_w, z_w)$ is a three-dimensional point in the world coordinate system. $z_c$ is the z-axis value in camera coordinates, i.e. the distance from the target to the camera, which for a TOF camera is the depth value at the point $[u, v]$. $R$ and $T$ are the 3x3 rotation matrix and the 3x1 translation matrix of the extrinsic matrix, respectively.
Setting the extrinsic matrix: since the origin of the world coordinates and the origin of the camera coincide, i.e. there is no rotation or translation, $R$ and $T$ in the extrinsic matrix are given by formula (2):

$$R = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad T = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \tag{2}$$
Since the coordinate origins of the camera coordinate system and the world coordinate system coincide, the same point has the same depth under camera coordinates and world coordinates, i.e. $z_c = z_w$, so formula (1) can be further simplified to formula (3):

$$z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_w \\ y_w \\ z_w \end{bmatrix} \tag{3}$$
From the transformation formula (3) above, the mapping from an image point $[u, v]$ with depth value $z_c$ to the world coordinate point $[x_w, y_w, z_w]$ can be computed as formula (4):

$$x_w = \frac{(u - u_0)\, z_c}{f_x}, \qquad y_w = \frac{(v - v_0)\, z_c}{f_y}, \qquad z_w = z_c \tag{4}$$
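As an illustrative sketch (not part of the original disclosure), formula (4) can be applied to a whole depth map to obtain a per-pixel point cloud; the focal lengths $f_x$, $f_y$ and image center $(u_0, v_0)$ are assumed to be the TOF camera's pinhole intrinsics, and NumPy is assumed:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, u0, v0):
    """Back-project a depth map (z_c per pixel) into world coordinates
    using formula (4); the camera and world frames are assumed to
    coincide, so z_w = z_c."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x_w = (u - u0) * depth / fx
    y_w = (v - v0) * depth / fy
    return np.stack([x_w, y_w, depth], axis=-1)  # shape (h, w, 3)
```

In practice, pixels without a valid depth reading (depth value 0) would be filtered out of the resulting point cloud.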
3) Parameterized face model: a way of representing a face through a standard face (also called the average face, reference face, base shape face, or statistical face) combined with shape feature vectors, pose feature vectors, or expression feature vectors; examples include the 3DMM model and the FLAME model.
4) FLAME model: the FLAME model is built from real human body point cloud data in a 3D human body scan database (for example the Caesar database). The head data of these real bodies is registered to obtain a head mesh for each person; the mesh covers the whole face and head region, thereby establishing a database of real faces and heads. The head mesh consists of a number of vertices (e.g. 5023) and triangular faces (e.g. 9976), and principal component analysis (PCA) is used to obtain a number of shape components (e.g. 300), expression components (e.g. 100) and pose components (e.g. 15), from which a parameterized 3D head model can be determined.
Specifically, in terms of vertex positions, the shape $T$ of FLAME is defined by the coordinates of the vertices $k$ that make up the mesh, which can be written as formula (1):

$$T = (x_1, y_1, z_1, x_2, \ldots, x_n, y_n, z_n) \tag{1}$$
FLAME models shape and expression separately, and the FLAME face model can be written as formula (2):

$$T = T(v; p, q) = T_0 + B_S(q; S) + B_E(p; E) \tag{2}$$

where $T_0$ is the standard face, i.e. the average shape of the face; $B_S(q; S)$ is the face shape blend term, for example $\sum_i q_i S_i$, $i = 1$ to $n$, where the $S_i$ are eigenvectors of the covariance matrix, i.e. the face shape vector parameters (the shape principal components above), and $q$ are the coefficients of the face shape vector parameters; and $B_E(p; E)$ is the facial expression blend term, for example $\sum_i p_i E_i$, $i = 1$ to $I$, where the $E_i$ are eigenvectors of the covariance matrix, i.e. the facial expression vector parameters (the expression principal components above), and $p$ are the coefficients of the facial expression vector parameters.
Thus, the face shape part (denoted $T(S)$ in the present application) is modeled as the base shape $T_0$ plus a linear combination of $n$ shape vectors $S_i$, as in formula (3):

$$T(S) = T_0 + \sum_{i=1}^{n} q_i S_i \tag{3}$$

where $i \in [1, n]$. Since $T_0$ and $S_i$ are provided by FLAME, once the initialized $q_i$ are input, substituting them into formula (3) generates the 3D face model for the face shape part.
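For illustration only, a minimal sketch of formula (3): the base shape plus a coefficient-weighted linear combination of shape components. The array dimensions follow the figures quoted above for FLAME (5023 vertices, 300 shape components); the random values stand in for the actual FLAME data, which is not reproduced here.

```python
import numpy as np

def shape_model(T0, S, q):
    """Formula (3): T(S) = T0 + sum_i q_i * S_i.
    T0: (3N,) flattened vertex coordinates of the standard face.
    S:  (n, 3N) shape principal components.
    q:  (n,) shape coefficients."""
    return T0 + q @ S

T0 = np.zeros(3 * 5023)                     # placeholder standard face
S = 0.01 * np.random.randn(300, 3 * 5023)   # placeholder shape components
q = np.random.randn(300)                    # initialized shape coefficients
face_vertices = shape_model(T0, S, q).reshape(5023, 3)
```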
5) Geometric registration of a 3D face model, i.e. transforming the 3D face model to a target position, also called rigid transformation, or optimization of angle and pose. After the above 3D face model is established, the 3D position of each of its vertices is determined. The coordinates $X_k = (x_k, y_k, z_k)$ of a model vertex $k$ can then be transformed to the target position by a rigid transformation, which can be written as formula (4):
$$\begin{bmatrix} w_{x,k} \\ w_{y,k} \\ w_{z,k} \end{bmatrix} = s\, R \begin{bmatrix} x_k \\ y_k \\ z_k \end{bmatrix} + t_w \tag{4}$$

where $(w_{x,k}, w_{y,k}, w_{z,k})$ is the target position; in the embodiments of the present application the target positions are the 3D coordinates of the keypoints of the face region, and through a number of keypoints the vertices of the whole 3D face model are initially aligned with the point cloud in the camera coordinate system. $R$ denotes the rotation about the three axes (the rotation matrix built from the three-axis rotation parameters), $s$ denotes the scale parameter, and $t_w$ denotes the translation parameter.
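A minimal sketch of applying formula (4) to all model vertices at once (NumPy assumed; the function name is illustrative):

```python
import numpy as np

def apply_similarity(X, s, R, t):
    """Formula (4): map model vertices X (N, 3) to the target positions
    with scale s, rotation matrix R (3, 3) and translation t (3,)."""
    return s * (X @ R.T) + t
```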
A point cloud matching algorithm can be used for the geometric registration (point cloud matching solves the transformation between two point sets, i.e. the above rotation and translation parameters). Common point cloud matching algorithms include the iterative closest point (ICP) algorithm, the normal distribution transform (NDT) algorithm, and the iterative dual correspondences (IDC) algorithm; the embodiments of the present application use the ICP algorithm.
The embodiments of the present application are described in detail below with reference to the accompanying drawings. First, the scenarios to which the head posture measurement method provided by the embodiments of the present application applies are introduced.
The head posture measurement method provided by the embodiments of the present application can be applied in any scenario that requires high-precision head posture data. For example, an application scenario may be an autonomous vehicle (AV) or an intelligent driving vehicle: the driver's head posture is obtained with the head posture measurement method provided by the embodiments of the present application, and the head posture is then analyzed to judge whether the driver's behavior constitutes dangerous driving; reminding and alerting the driver in time can effectively prevent dangerous driving. As another example, an application scenario may be students attending online classes: the students' head postures are obtained with the method provided by the embodiments of the present application, and analyzing them makes it possible to judge the students' attention, alert the students in time, or adjust the teaching content and style so as to improve how attentively the students follow the class.
By way of example, Fig. 1 shows one scenario to which the head posture measurement method provided by the embodiments of the present application applies. An image acquisition apparatus 10 captures an RGB head image and a depth head image of a driver 20 and transmits them to a local device 30; after receiving the images, the local device 30 processes them to obtain the driver's head posture and stores the obtained head posture in local memory. The image acquisition apparatus 10 includes but is not limited to a camera. The local device 30 may include a local computer, a local processing chip, or the like.
As shown in Fig. 1, after the image acquisition apparatus 10 captures the RGB head image and the depth head image of the driver 20, the captured RGB image and depth image may also be transmitted to a remote server 40. After receiving the images, the remote server 40 processes them to obtain the driver's head posture and stores it in remote storage; the obtained head posture may also be sent back to a local terminal (for example a mobile phone or a computer) or to local memory.
Before the head posture measurement method provided by the embodiments of the present application is described in detail, the relationships between the technical terms used in the embodiments are first explained: in the embodiments, the point cloud data of the face keypoints of the target object is the set of three-dimensional coordinates corresponding to the face keypoints of the target object, which may also be called the three-dimensional face keypoints of the target object; the face keypoints in the parameterized face model may also be called the three-dimensional keypoints in the parameterized face model.
A head posture measurement method provided by an embodiment of the present application is described in detail below with reference to the figures.
Fig. 2 is a flowchart of the head posture measurement method provided by an embodiment of the present application. The process mainly includes steps S110-S150, which are introduced in turn below:
S110: Acquire face point cloud data of the target object.
In an optional implementation, as shown in Fig. 3, this process may include steps S111-S113, which are introduced in turn below:
S111: Obtain point cloud data of the target object based on a two-dimensional image and a depth image of the target object.
The two-dimensional image of the target object may be an RGB image of the target object, a grayscale image of the target object, or the like. Note that, in the embodiments of the present application, both the two-dimensional image and the depth image of the target object should at least include the face region of the target object.
In an optional implementation, a time-of-flight camera (TOF camera) may be used to obtain the RGB image and the depth image of the target object.
S112: Extract a two-dimensional face image of the target object from the two-dimensional image of the target object.
In an optional implementation, the two-dimensional face image of the target object may be extracted by performing semantic segmentation on the two-dimensional image of the target object, where the semantic segmentation removes the regions outside the target object's face in the two-dimensional image, such as the background, hair and torso.
In the embodiments of the present application, the semantic segmentation of the RGB image of the target object may be performed with, but is not limited to, a convolutional neural network (CNN), a fully convolutional network (FCN), or a Mask R-CNN.
S113: According to the extracted two-dimensional face image, extract the point cloud data corresponding to the two-dimensional face image from the point cloud data of the target object.
Since there is a one-to-one correspondence between the pixels of the two-dimensional image and the pixels of the depth image, in an optional implementation the pixels of the two-dimensional face image of the target object extracted in step S112 can be registered with the pixels of the depth image, thereby obtaining the point cloud data representing the face of the target object.
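As an illustrative sketch of step S113 (assuming the pixel-wise RGB/depth registration described above, pinhole intrinsics f_x, f_y, u_0, v_0, and a binary face mask produced by the semantic segmentation of step S112):

```python
import numpy as np

def face_point_cloud(depth, face_mask, fx, fy, u0, v0):
    """Keep only pixels labeled as face by the segmentation, then
    back-project them with the camera intrinsics (formula (4) of the
    TOF mapping above)."""
    v, u = np.nonzero(face_mask)                   # face pixel coordinates
    z = depth[v, u]
    u, v, z = u[z > 0], v[z > 0], z[z > 0]         # drop pixels without depth
    x = (u - u0) * z / fx
    y = (v - v0) * z / fy
    return np.stack([x, y, z], axis=-1)            # (M, 3) face point cloud
```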
S120: Acquire the point cloud data of the face keypoints of the target object based on the face point cloud data of the target object and the two-dimensional face keypoint data of the target object.
Specifically, the two-dimensional face keypoints of the target object can be used to index into the face point cloud data of the target object to obtain the point cloud data of the face keypoints corresponding to the two-dimensional face keypoints (the three-dimensional face keypoints of the target object).
In an optional implementation, the two-dimensional face keypoints may be extracted from the two-dimensional face image of the target object. In another optional implementation, the two-dimensional face keypoints may also be extracted from the two-dimensional image of the target object; the embodiments of the present application do not limit this.
In an optional implementation, the two-dimensional face keypoints of the RGB image (two-dimensional image) may be extracted with an active shape model (ASM) method, with a deep learning method, or with a cascaded deep alignment network (DAN).
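A minimal sketch of step S120, assuming the face point cloud is kept in its per-pixel (organized) layout so that the pixel coordinates of a 2D keypoint index its 3D point directly:

```python
import numpy as np

def index_keypoints(point_map, kp_2d):
    """point_map: (h, w, 3) per-pixel point cloud registered to the 2D image.
    kp_2d: (n, 2) detected keypoint pixel coordinates (u, v).
    Returns the (n, 3) 3D face keypoints by direct lookup."""
    u = np.round(kp_2d[:, 0]).astype(int)
    v = np.round(kp_2d[:, 1]).astype(int)
    return point_map[v, u]
```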
S130: Register the point cloud data of the face keypoints of the target object with the point cloud data of the face keypoints in the parameterized face model to obtain the first similarity transform parameter.
In this step, the parameterized face model may be a FLAME model. The registration may also be called pose fitting, i.e. a rigid transformation applied to the whole parameterized face model so that the point cloud data of the face keypoints in the parameterized face model is registered with the point cloud data of the face keypoints of the target object.
In an optional implementation, the iterative closest point (ICP) algorithm may be used to register the point cloud data of the face keypoints of the target object with the point cloud data of the face keypoints in the parameterized face model and obtain the first similarity transform parameter, where the first similarity transform parameter includes a scale factor, a rotation matrix and a translation vector.
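Since the 51 (or 68) keypoints of the target object and of the model correspond one to one, the core alignment that ICP iterates can be solved in closed form. The sketch below uses the Umeyama/Kabsch SVD solution, one standard way to estimate a similarity transform (scale, rotation, translation) from corresponding point sets; it is an illustration, not necessarily the exact solver of the embodiment.

```python
import numpy as np

def estimate_similarity(src, dst):
    """Closed-form (s, R, t) with dst ~ s * R @ src + t, for one-to-one
    corresponding 3D point sets src, dst of shape (n, 3)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    sign = np.sign(np.linalg.det(U) * np.linalg.det(Vt))
    S = np.diag([1.0, 1.0, sign])              # guard against reflections
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / (src_c ** 2).sum() * len(src)
    t = mu_d - s * R @ mu_s
    return s, R, t
```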
S140: Optimize the first similarity transform parameter according to the objective function to obtain the second similarity transform parameter, where the second similarity transform parameter includes a scale factor, a rotation matrix and a translation vector.
In an optional implementation, the objective function includes a point-to-face distance function, where the point-to-face distance function is the distance function from a point in the face point cloud data of the target object to the nearest triangular patch in the parameterized face model. A triangular patch is the triangle formed by three adjacent vertices in the parameterized face model. For example, referring to Fig. 4, point P is a point in the face point cloud data of the target object, and each triangle shown in Fig. 4 is a triangular patch formed between adjacent points in the parameterized face model, such as the triangle formed by points a, b and c.
When the nearest triangular patch is not easy to determine, the distances from a point of the target object's face point cloud data to its neighboring triangular patches can be computed and compared, and the minimum taken as the distance from that point to the nearest triangular patch in the parameterized face model. For example, when the triangular patch nearest to point P in Fig. 4 is not easy to determine, the distance from point P to triangular patch abc and the distance from point P to triangular patch abd can be computed, and the point-to-face distance value of point P is determined by comparing the two.
In the embodiments of the present application, the point-to-face distance function can be determined as:

$$D_{2pf} = \sum_{i} \min\Big( \big\| s_i - p(s_i, f_{c(i)}) \big\|^2,\; \min_{j} \big\| s_i - \hat{e}_j^{\,c(i)} \big\|^2,\; \min_{w} \big\| s_i - \hat{v}_w^{\,c(i)} \big\|^2 \Big)$$

where $D_{2pf}$ is the point-to-face distance function, $s_i$ is a point in the face point cloud data of the target object, $p(s_i, f_{c(i)})$ is the projection of the point $s_i$ onto its nearest triangular face $f_{c(i)}$ in the parameterized face model, $\hat{e}_j^{\,c(i)}$ is the point on the $j$-th edge of the nearest triangular face $f_{c(i)}$ that is closest to $s_i$, and $\hat{v}_w^{\,c(i)}$ is the $w$-th vertex of the nearest triangular face $f_{c(i)}$.
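A minimal sketch of the per-point term of D_2pf (NumPy assumed): the squared distance from a point to one triangular patch is the minimum over the in-plane projection (when it falls inside the triangle), the nearest point on each edge, and the vertices, matching the candidates named above.

```python
import numpy as np

def point_triangle_dist2(p, a, b, c):
    """Squared distance from point p to the triangle with vertices a, b, c."""
    candidates = [a, b, c]                       # the vertices themselves
    for s0, s1 in ((a, b), (b, c), (c, a)):      # nearest point on each edge
        d = s1 - s0
        lam = np.clip(np.dot(p - s0, d) / np.dot(d, d), 0.0, 1.0)
        candidates.append(s0 + lam * d)
    n = np.cross(b - a, c - a)                   # plane normal
    n = n / np.linalg.norm(n)
    proj = p - np.dot(p - a, n) * n              # projection onto the plane
    v0, v1, v2 = b - a, c - a, proj - a          # barycentric inside test
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    w1 = (d11 * d20 - d01 * d21) / denom
    w2 = (d00 * d21 - d01 * d20) / denom
    if w1 >= 0 and w2 >= 0 and w1 + w2 <= 1:
        candidates.append(proj)
    return min(float(np.sum((p - q) ** 2)) for q in candidates)
```

D_2pf is then the sum of this quantity over all points s_i, each paired with its nearest patch f_c(i).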
In an optional implementation, the objective function may further include a keypoint projection distance function, where the keypoint projection distance function is a function of the distance from the projections of the face keypoints of the parameterized face model onto the two-dimensional face image to the two-dimensional face keypoints on the two-dimensional face image of the target object.
For example, among the 5023 vertices of the parameterized face model there are 51 face keypoints (facial-feature keypoints) corresponding to the face. The keypoint projection distance function projects these 51 facial-feature keypoints onto the two-dimensional face image of the target object and computes the distances from the projected points to the corresponding two-dimensional face keypoints on that image. In other embodiments, other representative points may be selected as the face keypoints, for example 68 face keypoints.
Specifically, the keypoint projection distance function can be determined as follows:

$$D_{proj}=\frac{1}{n}\sum_{i=1}^{n}\big\|u_i-v_i\big\|^2$$

where D_proj is the keypoint projection distance function, u_i is the projection of the i-th face keypoint of the parameterized face model onto the two-dimensional face image, v_i is the corresponding two-dimensional face keypoint on that image, and n is the number of face keypoints in the parameterized face model. Note that if there are 51 facial-feature keypoints, then n = 51.
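A small sketch of how such a projection distance can be evaluated is shown below, assuming a pinhole camera with a known intrinsic matrix K; both the calibration source and the mean-of-squared-distances convention are assumptions of this sketch.

```python
import numpy as np

def keypoint_projection_distance(X3d, v2d, K):
    """D_proj: mean squared distance between projected model keypoints
    and detected 2D keypoints.

    X3d: (n, 3) face keypoints of the parameterized model, camera coords.
    v2d: (n, 2) detected 2D keypoints on the face image.
    K:   (3, 3) camera intrinsic matrix (assumed known from calibration).
    """
    proj = (K @ X3d.T).T                  # pinhole projection
    u = proj[:, :2] / proj[:, 2:3]        # divide by depth to get pixels
    return np.mean(np.sum((u - v2d) ** 2, axis=1))
```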
As an optional implementation, the objective function may further include a penalty term function on the parameterized face model coefficients, where the penalty term constrains the magnitude of those coefficients.
Specifically, the penalty term function on the parameterized face model coefficients can be determined as follows:

$$E_{pri}=\lambda_S\|S\|^2+\lambda_E\|E\|^2+\lambda_P\|P\|^2$$

where E_pri is the penalty term function on the parameterized face model coefficients; S, E and P are respectively the shape, expression and pose coefficients of the parameterized face model; and λ_S, λ_E and λ_P are the penalty coefficients of the shape, expression and pose coefficients, respectively.
As an optional implementation, gradient descent can be used: taking the first similarity transformation parameters from step S140 as the initial value, the parameters are optimized against the objective function of this step until the objective function converges, and the similarity transformation parameters at convergence are taken as the second similarity transformation parameters.
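The following sketch illustrates this refinement loop with plain gradient descent and a numerical gradient. The packing of the similarity parameters into a flat vector and all step-size and tolerance values are illustrative assumptions rather than values from the text.

```python
import numpy as np

def refine_similarity(theta0, objective, lr=1e-3, tol=1e-8, max_iter=500):
    """Gradient-descent refinement of the first similarity-transform
    parameters against the objective function of this step.

    theta0 packs the similarity parameters as a flat vector (e.g. scale,
    rotation vector, translation) -- the packing is an assumption here.
    objective maps a parameter vector to the scalar objective value E.
    """
    def num_grad(f, x, eps=1e-6):
        # Central-difference numerical gradient, for simplicity
        g = np.zeros_like(x)
        for k in range(x.size):
            d = np.zeros_like(x)
            d[k] = eps
            g[k] = (f(x + d) - f(x - d)) / (2 * eps)
        return g

    theta, prev = theta0.astype(float), np.inf
    for _ in range(max_iter):
        val = objective(theta)
        if abs(prev - val) < tol:          # objective has converged
            break
        theta -= lr * num_grad(objective, theta)
        prev = val
    return theta                           # second similarity parameters
```

A quasi-Newton solver (for example scipy.optimize.minimize with method="BFGS") can be dropped in for the update step, which is the alternative described next.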
As another optional implementation, a quasi-Newton method can be used instead: again taking the first similarity transformation parameters from step S140 as the initial value, the parameters are optimized against the objective function of this step until it converges, and the similarity transformation parameters at convergence are taken as the second similarity transformation parameters.

S150: Determine the head pose of the target object according to the second similarity transformation parameters.
In this step, the second similarity transformation parameters include a scaling factor, a rotation matrix and a translation vector.
As an optional implementation, the rotation matrix in the second similarity transformation parameters may be used to represent the head pose of the target object.
As another optional implementation, a Rodrigues transformation may be applied to the rotation matrix in the second similarity transformation parameters to obtain Euler angles, which are then used to represent the head pose of the target object. The Euler angles may include a pitch angle, a yaw angle and a roll angle, among others.
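As a sketch, Euler angles can be extracted from the rotation matrix as follows. The ZYX angle convention is an assumption of this sketch (the text does not fix one), and OpenCV's cv2.Rodrigues can be used for the rotation-matrix/rotation-vector conversion itself.

```python
import numpy as np

def rotation_to_euler(R):
    """Head pose as Euler angles (pitch, yaw, roll), in degrees,
    extracted from a 3x3 rotation matrix under a ZYX convention."""
    sy = np.sqrt(R[0, 0] ** 2 + R[1, 0] ** 2)
    if sy > 1e-6:
        pitch = np.arctan2(R[2, 1], R[2, 2])
        yaw = np.arctan2(-R[2, 0], sy)
        roll = np.arctan2(R[1, 0], R[0, 0])
    else:                                  # gimbal-lock case
        pitch = np.arctan2(-R[1, 2], R[1, 1])
        yaw = np.arctan2(-R[2, 0], sy)
        roll = 0.0
    return np.degrees([pitch, yaw, roll])
```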
Another embodiment of the present application provides a head pose measurement method that is essentially the same as the method of the embodiment above, so the common parts are not repeated here. The difference is that, after step S150, the method further includes:
S160: Determine the concentration level of the target object according to the head pose of the target object.
As an optional implementation, the head pose of the target object can be compared with a preset head pose for that object: when the difference between the two is within a preset range, the concentration of the target object is high; when the difference exceeds the preset range, the concentration is low.
S170: Issue an alert to the target object based on the concentration level of the target object.
As an optional implementation, when the concentration of the target object is low, an alert can be issued to prompt the target object to focus. The alert may take the form of playing music, playing a spoken prompt, and so on; the embodiments of this application place no particular restriction on it.
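A minimal sketch of steps S160-S170 might look as follows; the 15-degree threshold and the use of the maximum per-axis deviation are illustrative assumptions, since the text leaves the preset range unspecified.

```python
def check_attention(euler, reference, max_dev_deg=15.0):
    """Compare the measured head pose with the preset pose and alert
    when the deviation leaves the preset range.

    euler, reference: (pitch, yaw, roll) in degrees.
    max_dev_deg: illustrative threshold, not a value from the patent.
    """
    deviation = max(abs(e - r) for e, r in zip(euler, reference))
    focused = deviation <= max_dev_deg
    if not focused:
        print("Attention alert: please focus")  # e.g. play music or a prompt
    return focused
```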
A specific implementation of a head pose measurement method provided by another embodiment of the present application is described in detail below with reference to Figures 5-7.
Referring to the flowchart shown in Figure 5, the head pose measurement method provided by this embodiment mainly includes steps S210-S280, introduced in turn below.
S210: Acquire an RGB image and a depth image of the target object, each of which should include at least the face region of the target object.
S220: Perform semantic segmentation on the RGB image of the target object obtained in step S210 to obtain a two-dimensional face image of the target object.
S230: Determine the face point cloud data of the target object based on the two-dimensional face image of the target object, the depth image of the target object, and the intrinsic parameters of the TOF camera.
In this step, the two-dimensional image acquired by the TOF camera has a one-to-one pixel correspondence with the depth image. That is, once each pixel's coordinates (u, v) and depth value z_c are obtained, the pixel's point cloud coordinates (x_w, y_w, z_w) can be computed with reference to formula (4) above, and the face point cloud data of the target object obtained accordingly.
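The following is a minimal sketch of this back-projection, assuming an undistorted pinhole model with intrinsic matrix K; it stands in for formula (4) without reproducing it exactly.

```python
import numpy as np

def depth_to_point_cloud(depth, K, mask=None):
    """Back-project a depth image into camera-frame 3D points.

    depth: (H, W) depth values from the TOF camera.
    K: 3x3 intrinsic matrix holding fx, fy, cx, cy.
    mask: optional (H, W) boolean face mask from semantic segmentation.
    Lens distortion is ignored in this sketch.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    v, u = np.indices(depth.shape)        # per-pixel row (v) and column (u)
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1)
    keep = (z > 0) if mask is None else (mask & (z > 0))
    return pts[keep].reshape(-1, 3)
```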
S240: Extract the two-dimensional face keypoints from the RGB image of step S210.
Exemplarily, Figure 6 shows the 68 two-dimensional face keypoints extracted in this step, selected following the face keypoint extraction standard established by dlib or OpenCV. In other embodiments, only the 51 facial-feature keypoints may be extracted, omitting the face contour keypoints, i.e. only points 18-68 in Figure 6.
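As an illustration, the 68 landmarks can be obtained with dlib roughly as follows; the predictor model file must be supplied separately, and its path here is an assumption.

```python
import dlib
import cv2

# Landmark extraction following the dlib 68-point standard mentioned above.
# The .dat path is an assumption; the predictor is downloaded separately.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def face_landmarks_68(rgb_image):
    gray = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2GRAY)
    faces = detector(gray, 1)             # upsample once to catch small faces
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    return [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    # For the 51 facial-feature points only, keep indices 17-67
    # (points 18-68 in Figure 6's 1-based numbering).
```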
S250: Based on the face point cloud data obtained in step S230 and the two-dimensional face keypoints obtained in step S240, determine the three-dimensional coordinates corresponding to the two-dimensional face keypoints. For convenience of description, these three-dimensional coordinates are referred to as three-dimensional keypoints.
Specifically, each of the 68 two-dimensional face keypoints obtained in step S240 is indexed into the face point cloud data of the target object to obtain the corresponding three-dimensional keypoint, yielding the three-dimensional face keypoints of the target object.
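Here is a sketch of this indexing step, assuming the point cloud is kept organized on the same pixel grid as the depth image so that a 2D keypoint maps to a 3D point by direct lookup; marking invalid depths as NaN is an assumption of this sketch.

```python
import numpy as np

def lift_keypoints_to_3d(kps_2d, point_map):
    """Index 2D face keypoints into an organized point cloud.

    kps_2d: list of (u, v) pixel coordinates from the landmark detector.
    point_map: (H, W, 3) per-pixel 3D coordinates on the depth-image grid,
    with NaN where no valid depth exists (assumed convention).
    """
    kps_3d = np.array([point_map[v, u] for (u, v) in kps_2d])
    valid = ~np.isnan(kps_3d).any(axis=1)  # drop keypoints without depth
    return kps_3d, valid
```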
S260: Register the set of three-dimensional face keypoints of the target object obtained in step S250 with the set of FLAME model face keypoints to obtain the first similarity transformation parameters.
In this step, the face coordinate system of the FLAME model face keypoints must first be established.
As shown in Figure 7, the head coordinate system is defined over the landmarks l_1-l_8: the origin o is defined as [formula], the x-axis direction as [formula], the y-axis direction as [formula], and the z-axis direction as p_z = x × y.

Here, l_1 is the three-dimensional coordinate of the left corner of the left eye, l_2 that of the right corner of the left eye, l_3 that of the left corner of the right eye, l_4 that of the right corner of the right eye, l_5 that of the left nose wing, l_6 that of the right nose wing, l_7 that of the left mouth corner, l_8 that of the right mouth corner, and o is the coordinate origin.
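Since the exact formulas for the origin and the x/y directions are not recoverable here, the following sketch only illustrates one plausible construction consistent with the stated z = x × y; the origin as the landmark centroid and the axes from eye/mouth directions are assumptions of this sketch, not the patent's definitions.

```python
import numpy as np

def head_coordinate_frame(l):
    """Build a face coordinate frame from the eight landmarks l1..l8.

    l: (8, 3) array ordered as eye corners, nose wings, mouth corners.
    Origin and x/y choices below are assumptions; only z = x cross y
    is stated explicitly in the text.
    """
    l = np.asarray(l)
    o = l.mean(axis=0)                              # assumed origin
    x = l[2:4].mean(axis=0) - l[0:2].mean(axis=0)   # left eye -> right eye (assumption)
    x /= np.linalg.norm(x)
    up = l[0:4].mean(axis=0) - l[6:8].mean(axis=0)  # mouth -> eyes (assumption)
    y = up - (up @ x) * x                           # orthogonalize against x
    y /= np.linalg.norm(y)
    z = np.cross(x, y)                              # as stated: p_z = x × y
    return o, x, y, z
```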
Next, the FLAME model face keypoints are registered with the three-dimensional face keypoints of the target object.
As an optional implementation, the ICP algorithm can be used to coarsely register the three-dimensional face keypoint set of the target object with the FLAME model face keypoint set, yielding the first similarity transformation parameters (s, R, t), where s is the scaling factor, R the rotation matrix and t the translation vector.
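With known keypoint correspondences, a single ICP iteration reduces to a closed-form similarity alignment (Umeyama's method). The sketch below shows that closed form as one way to obtain the coarse (s, R, t); this is an implementation choice, not the patent's prescribed algorithm.

```python
import numpy as np

def umeyama_similarity(src, dst):
    """Closed-form similarity alignment between matched keypoint sets.

    src: (n, 3) FLAME model face keypoints; dst: (n, 3) target keypoints.
    Returns s, R, t such that dst is approximately s * R @ src_i + t.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                       # keep R a proper rotation
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_d - s * R @ mu_s
    return s, R, t
```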
S270: Optimize the first similarity transformation parameters of step S260 to obtain optimized similarity transformation parameters, achieving accurate registration between the three-dimensional face keypoints of the target object and the FLAME model face keypoints.
As an optional implementation, an objective function may be used to optimize the first similarity transformation parameters.
The objective function E can be established as follows:

$$E=D_{p2f}+\alpha D_{proj}+\beta E_{pri}$$

Specifically,

$$D_{p2f}=\sum_{i}\min\Big(\big\|s_i-p(s_i,f_{c(i)})\big\|^2,\ \min_{j}\big\|s_i-e^{c(i)}_{j}\big\|^2,\ \min_{w}\big\|s_i-v^{c(i)}_{w}\big\|^2\Big)$$

$$D_{proj}=\frac{1}{n}\sum_{i=1}^{n}\big\|u_i-v_i\big\|^2$$

$$E_{pri}=\lambda_S\|S\|^2+\lambda_E\|E\|^2+\lambda_P\|P\|^2$$

In the formulas above, D_p2f is the distance between each point of the face point cloud data of the target object from step S230 and its nearest triangular face of the FLAME model; D_proj is the distance from the projections of the FLAME model's three-dimensional face keypoints onto the two-dimensional face image (the RGB image of the target object in S210) to the two-dimensional face keypoints on that image; E_pri is the penalty term on the FLAME model coefficients; α is the first weight coefficient and β the second weight coefficient.

s_i is a point in the face point cloud data of the target object; p(s_i, f_c(i)) is the projection of s_i onto its nearest triangular face f_c(i) in the parameterized face model; e^{c(i)}_j is the point on the j-th edge of f_c(i) closest to s_i; and v^{c(i)}_w is the w-th vertex of f_c(i).

u_i is the projection of the i-th three-dimensional keypoint of the parameterized face model onto the two-dimensional face image; v_i is the corresponding two-dimensional face keypoint on that image; and n is the number of three-dimensional keypoints in the parameterized face model.

S, E and P are respectively the shape, expression and pose coefficients of the parameterized face model, and λ_S, λ_E and λ_P are their respective penalty coefficients.
S280: Obtain the Euler angles representing the head pose of the target object from the similarity transformation parameters optimized in step S270.
As an optional implementation, a Rodrigues transformation can be applied to the rotation matrix among the similarity transformation parameters obtained by fitting the parameterized face model to the face point cloud data, yielding the Euler angles that represent the head pose of the target object. These Euler angles include at least a pitch angle, a yaw angle and a roll angle.
In this embodiment, the head pose measurement method further includes determining the concentration level of the target object according to the head pose of the target object. This step is the same as step S160 of the embodiment above and is not repeated here.
It further includes issuing an alert to the target object based on the concentration level of the target object. This step is the same as step S170 of the embodiment above and is not repeated here.
Another embodiment of the present application provides a driving assistance apparatus, which may be implemented by a software system, by a hardware device, or by a combination of the two.
It should be understood that Figure 8 is only an exemplary structural diagram of a driving assistance apparatus. As shown in Figure 8, the driving assistance apparatus includes a driver head pose detection module 410 and a driving assistance module 420.
Specifically, the driver head pose detection module 410 is configured to obtain the driver's head pose using the head pose detection method provided in the embodiments above; for the specific implementation of this module's function, see the embodiments above, which are not repeated here. The driving assistance module 420 is configured to determine the driver's concentration level from the driver's head pose and to issue an alert to the driver based on that concentration level.
Another embodiment of the present application provides a head pose measurement apparatus, which may be implemented by a software system, by a hardware device, or by a combination of the two.
It should be understood that Figure 9 is only an exemplary structural diagram of a head pose measurement apparatus; this application does not limit the division of functional modules within it. As shown in Figure 9, the head pose measurement apparatus can be logically divided into multiple modules, each with different functions, realized by a processor of a computing device reading and executing instructions in a memory. Exemplarily, the head pose measurement apparatus includes a first acquisition module 510, a second acquisition module 520, a third acquisition module 530, a fourth acquisition module 540 and a first determination module 550. In an optional implementation, the apparatus is configured to perform the content described in steps S110-S150 shown in Figure 2. Specifically: the first acquisition module 510 is configured to acquire the face point cloud data of the target object; the second acquisition module 520 is configured to acquire the point cloud data of the face keypoints of the target object based on the face point cloud data and the two-dimensional face keypoint data of the target object; the third acquisition module 530 is configured to register the point cloud data of the face keypoints of the target object with the point cloud data of the face keypoints of the parameterized face model to obtain the first similarity transformation parameters; the fourth acquisition module 540 is configured to optimize the first similarity transformation parameters according to the objective function to obtain the second similarity transformation parameters; and the first determination module 550 is configured to determine the head pose of the target object according to the second similarity transformation parameters.
Optionally, the objective function in the fourth acquisition module 540 includes a point-to-plane distance function: the distance from a point in the face point cloud data of the target object to the nearest triangular patch of the parameterized face model, a triangular patch being the triangle formed by three adjacent vertices of the parameterized face model.
As an optional implementation, the point-to-plane distance function includes:

$$D_{p2f}=\sum_{i}\min\Big(\big\|s_i-p(s_i,f_{c(i)})\big\|^2,\ \min_{j}\big\|s_i-e^{c(i)}_{j}\big\|^2,\ \min_{w}\big\|s_i-v^{c(i)}_{w}\big\|^2\Big)$$

where D_p2f is the point-to-plane distance function, s_i is a point in the face point cloud data of the target object, p(s_i, f_c(i)) is the projection of s_i onto its nearest triangular face f_c(i) in the parameterized face model, e^{c(i)}_j is the point on the j-th edge of f_c(i) closest to s_i, and v^{c(i)}_w is the w-th vertex of f_c(i).
In some embodiments, the objective function in the fourth acquisition module 540 further includes a keypoint projection distance function: a function of the distance from the projection of a face keypoint of the parameterized face model onto the two-dimensional face image to the corresponding two-dimensional face keypoint on the two-dimensional face image of the target object.
As an optional implementation, the keypoint projection distance function includes:

$$D_{proj}=\frac{1}{n}\sum_{i=1}^{n}\big\|u_i-v_i\big\|^2$$

where D_proj is the keypoint projection distance function, u_i is the projection of a face keypoint of the parameterized face model onto the two-dimensional face image, v_i is the corresponding two-dimensional face keypoint on that image, and n is the number of face keypoints in the parameterized face model.
In some embodiments, the objective function in the fourth acquisition module 540 further includes a penalty term function on the parameterized face model coefficients, where the penalty term constrains the magnitude of those coefficients.
As an optional implementation, the penalty term function on the parameterized face model coefficients includes:

$$E_{pri}=\lambda_S\|S\|^2+\lambda_E\|E\|^2+\lambda_P\|P\|^2$$

where E_pri is the penalty term function on the parameterized face model coefficients; S, E and P are respectively the shape, expression and pose coefficients of the parameterized face model; and λ_S, λ_E and λ_P are their respective penalty coefficients.
As an optional implementation, the first acquisition module 510 includes:

a first acquisition submodule, configured to obtain the point cloud data of the target object based on a two-dimensional image and a depth image of the target object;

a first extraction submodule, configured to extract a two-dimensional face image of the target object from the two-dimensional image of the target object;

a second extraction submodule, configured to extract, from the point cloud data of the target object, the point cloud data corresponding to the extracted two-dimensional face image. In some embodiments, the two-dimensional image and the depth image of the target object are obtained by a TOF camera.
As an optional implementation, the face keypoints of the target object are 51 face keypoints.
As an optional implementation, the face keypoints of the target object are 68 face keypoints.
In some embodiments, the first determination module 550 is specifically configured to perform a Rodrigues transformation on the similarity transformation parameters to obtain the Euler angles representing the head pose of the target object.
In some embodiments, the head pose measurement apparatus further includes:

a second determination module, configured to determine the concentration level of the target object according to the head pose of the target object;

an alert module, configured to issue an alert to the target object based on the concentration level of the target object.
The specific implementation of each functional module in this embodiment can be found in the method embodiments above and is not repeated here.
Figure 10 is a schematic structural diagram of a computing device 900 provided by an embodiment of the present application. The computing device 900 includes a processor 910, a memory 920 and a communication interface 930.
It should be understood that the communication interface 930 of the computing device 900 shown in Figure 10 can be used to communicate with other devices.
The processor 910 may be connected to the memory 920, which can be used to store program code and data. The memory 920 may therefore be a storage unit internal to the processor 910, an external storage unit independent of the processor 910, or a component comprising both an internal storage unit and an independent external storage unit.
Optionally, the computing device 900 may further include a bus, through which the memory 920 and the communication interface 930 connect to the processor 910. The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on.
It should be understood that, in the embodiments of the present application, the processor 910 may be a central processing unit (CPU). The processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor or any conventional processor. Alternatively, the processor 910 may adopt one or more integrated circuits for executing related programs, so as to implement the technical solutions provided by the embodiments of the present application.
The memory 920 may include read-only memory and random-access memory, and provides instructions and data to the processor 910. A portion of the processor 910 may also include non-volatile random-access memory; for example, the processor 910 may also store device type information.
When the computing device 900 runs, the processor 910 executes the computer-executable instructions in the memory 920 to perform the operational steps of the methods above.
It should be understood that the computing device 900 according to the embodiments of the present application may correspond to the respective body executing the methods according to the embodiments of the present application, and that the above and other operations and/or functions of the modules in the computing device 900 respectively implement the corresponding flows of the methods of these embodiments; for brevity, they are not repeated here.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementations should not be regarded as beyond the scope of the present application.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division of the units is only a logical functional division, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, may exist physically separately, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random-access memory (RAM), a magnetic disk or an optical disc.
The embodiments of the present application further provide a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program performs a head pose measurement method comprising at least one of the solutions described in the embodiments above.
The computer storage medium of the embodiments of the present application may adopt any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination thereof. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random-access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device.
Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for performing the operations of the present application may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present application and the technical principles applied. Those skilled in the art will understand that the present application is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made without departing from the protection scope of the present application. Therefore, although the present application has been described in considerable detail through the above embodiments, it is not limited to them and may include more other equivalent embodiments without departing from its concept, all of which fall within the protection scope of the present application.

Claims (29)

  1. A head pose measurement method, comprising:
    acquiring face point cloud data of a target object;
    acquiring point cloud data of face keypoints of the target object based on the face point cloud data and two-dimensional face keypoint data of the target object;
    registering the point cloud data of the face keypoints of the target object with point cloud data of face keypoints of a parameterized face model to obtain first similarity transformation parameters;
    optimizing the first similarity transformation parameters according to an objective function to obtain second similarity transformation parameters;
    determining a head pose of the target object according to the second similarity transformation parameters.
  2. The method according to claim 1, wherein the objective function comprises a point-to-plane distance function;
    wherein the point-to-plane distance function is the distance from a point in the face point cloud data of the target object to the nearest triangular patch of the parameterized face model, a triangular patch being the triangle formed by three adjacent points of the parameterized face model.
  3. The method according to claim 2, wherein the point-to-plane distance function comprises:

    $$D_{p2f}=\sum_{i}\min\Big(\big\|s_i-p(s_i,f_{c(i)})\big\|^2,\ \min_{j}\big\|s_i-e^{c(i)}_{j}\big\|^2,\ \min_{w}\big\|s_i-v^{c(i)}_{w}\big\|^2\Big)$$

    wherein D_p2f is the point-to-plane distance function, s_i is a point in the face point cloud data of the target object, p(s_i, f_c(i)) is the projection of s_i onto its nearest triangular face f_c(i) in the parameterized face model, e^{c(i)}_j is the point on the j-th edge of f_c(i) closest to s_i, and v^{c(i)}_w is the w-th vertex of f_c(i).
  4. The method according to any one of claims 1-3, wherein the objective function further comprises:
    a keypoint projection distance function;
    wherein the keypoint projection distance function is a function of the distance from the projection of a face keypoint of the parameterized face model onto the two-dimensional face image to the corresponding two-dimensional face keypoint on the two-dimensional face image of the target object.
  5. The method according to claim 4, wherein the keypoint projection distance function comprises:

    $$D_{proj}=\frac{1}{n}\sum_{i=1}^{n}\big\|u_i-v_i\big\|^2$$

    wherein D_proj is the keypoint projection distance function, u_i is the projection of a face keypoint of the parameterized face model onto the two-dimensional face image, v_i is the corresponding two-dimensional face keypoint on that image, and n is the number of face keypoints in the parameterized face model.
  6. The method according to any one of claims 1-5, wherein the objective function further comprises:
    a penalty term function on the parameterized face model coefficients, wherein the penalty term is used to constrain the magnitude of the coefficients.
  7. The method according to claim 6, wherein the penalty term function on the parameterized face model coefficients comprises:

    $$E_{pri}=\lambda_S\|S\|^2+\lambda_E\|E\|^2+\lambda_P\|P\|^2$$

    wherein E_pri is the penalty term function on the parameterized face model coefficients; S, E and P are respectively the shape, expression and pose coefficients of the parameterized face model; and λ_S, λ_E and λ_P are the penalty coefficients of the shape, expression and pose coefficients, respectively.
  8. The method according to claim 1, wherein acquiring the face point cloud data of the target object comprises:
    obtaining point cloud data of the target object based on a two-dimensional image and a depth image of the target object;
    extracting a two-dimensional face image of the target object from the two-dimensional image of the target object;
    extracting, from the point cloud data of the target object, the point cloud data corresponding to the extracted two-dimensional face image.
  9. The method according to any one of claims 1-8, wherein the two-dimensional image and the depth image of the target object are obtained by a TOF camera.
  10. The method according to claim 1, wherein the face keypoints of the target object are 51 face keypoints.
  11. The method according to claim 1, wherein the face keypoints of the target object are 68 face keypoints.
  12. The method according to claim 1, wherein determining the head pose of the target object according to the second similarity transformation parameters comprises:
    performing a Rodrigues transformation on the second similarity transformation parameters to obtain Euler angles representing the head pose of the target object.
  13. The method according to any one of claims 1-12, further comprising: determining a concentration level of the target object according to the head pose of the target object;
    issuing an alert to the target object based on the concentration level of the target object.
  14. A head pose measurement apparatus, comprising:
    a first acquisition module, configured to acquire face point cloud data of a target object;
    a second acquisition module, configured to acquire point cloud data of face keypoints of the target object based on the face point cloud data and two-dimensional face keypoint data of the target object;
    a third acquisition module, configured to register the point cloud data of the face keypoints of the target object with point cloud data of face keypoints of a parameterized face model to obtain first similarity transformation parameters;
    a fourth acquisition module, configured to optimize the first similarity transformation parameters according to an objective function to obtain second similarity transformation parameters;
    a first determination module, configured to determine a head pose of the target object according to the second similarity transformation parameters.
  15. The apparatus according to claim 14, wherein the objective function in the fourth acquisition module comprises a point-to-plane distance function;
    wherein the point-to-plane distance function is the distance from a point in the face point cloud data of the target object to the nearest triangular patch of the parameterized face model, a triangular patch being the triangle formed by three adjacent vertices of the parameterized face model.
  16. The apparatus according to claim 15, wherein the point-to-plane distance function is specifically:

    $$D_{p2f}=\sum_{i}\min\Big(\big\|s_i-p(s_i,f_{c(i)})\big\|^2,\ \min_{j}\big\|s_i-e^{c(i)}_{j}\big\|^2,\ \min_{w}\big\|s_i-v^{c(i)}_{w}\big\|^2\Big)$$

    wherein D_p2f is the point-to-plane distance function, s_i is a point in the face point cloud data of the target object, p(s_i, f_c(i)) is the projection of s_i onto its nearest triangular face f_c(i) in the parameterized face model, e^{c(i)}_j is the point on the j-th edge of f_c(i) closest to s_i, and v^{c(i)}_w is the w-th vertex of f_c(i).
  17. The apparatus according to any one of claims 14-16, wherein the objective function in the fourth acquisition module further comprises:
    a keypoint projection distance function;
    wherein the keypoint projection distance function is a function of the distance from the projection of a face keypoint of the parameterized face model onto the two-dimensional face image to the corresponding two-dimensional face keypoint on the two-dimensional face image of the target object.
  18. The apparatus according to claim 17, wherein the keypoint projection distance function is specifically:

    $$D_{proj}=\frac{1}{n}\sum_{i=1}^{n}\big\|u_i-v_i\big\|^2$$

    wherein D_proj is the keypoint projection distance function, u_i is the projection of a face keypoint of the parameterized face model onto the two-dimensional face image, v_i is the corresponding two-dimensional face keypoint on that image, and n is the number of face keypoints in the parameterized face model.
  19. The apparatus according to any one of claims 14-18, wherein the objective function in the fourth acquisition module further comprises:
    a penalty term function on the parameterized face model coefficients, wherein the penalty term is used to constrain the magnitude of the coefficients.
  20. The apparatus according to claim 19, wherein the penalty term function on the parameterized face model coefficients is specifically:

    $$E_{pri}=\lambda_S\|S\|^2+\lambda_E\|E\|^2+\lambda_P\|P\|^2$$

    wherein E_pri is the penalty term function on the parameterized face model coefficients; S, E and P are respectively the shape, expression and pose coefficients of the parameterized face model; and λ_S, λ_E and λ_P are the penalty coefficients of the shape, expression and pose coefficients, respectively.
  21. The apparatus according to claim 14, wherein the first acquisition module comprises:
    a first acquisition submodule, configured to obtain point cloud data of the target object based on a two-dimensional image and a depth image of the target object;
    a first extraction submodule, configured to extract a two-dimensional face image of the target object from the two-dimensional image of the target object;
    a second extraction submodule, configured to extract, from the point cloud data of the target object, the point cloud data corresponding to the extracted two-dimensional face image.
  22. The apparatus according to any one of claims 14-21, wherein the two-dimensional image and the depth image of the target object are obtained by a TOF camera.
  23. The apparatus according to claim 14, wherein the face keypoints of the target object are 51 face keypoints.
  24. The apparatus according to claim 14, wherein the face keypoints of the target object are 68 face keypoints.
  25. The apparatus according to claim 14, wherein the first determination module is specifically configured to perform a Rodrigues transformation on the second similarity transformation parameters to obtain Euler angles representing the head pose of the target object.
  26. The apparatus according to any one of claims 14-25, further comprising:
    a second determination module, configured to determine a concentration level of the target object according to the head pose of the target object;
    an alert module, configured to issue an alert to the target object based on the concentration level of the target object.
  27. A computing device, comprising:
    a communication interface;
    at least one processor connected to the communication interface; and
    at least one memory connected to the processor and storing program instructions which, when executed by the at least one processor, cause the at least one processor to perform the head pose measurement method of any one of claims 1-13.
  28. A computer-readable storage medium having program instructions stored thereon, wherein the program instructions, when executed by a computer, cause the computer to perform the head pose measurement method of any one of claims 1-13.
  29. A computer program product which, when run on a computing device, causes the computing device to perform the head pose measurement method of any one of claims 1-13.
PCT/CN2021/097701 2021-06-01 2021-06-01 Head posture measurement method and apparatus WO2022252118A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180001892.5A CN113544744A (en) 2021-06-01 2021-06-01 Head posture measuring method and device
PCT/CN2021/097701 WO2022252118A1 (en) 2021-06-01 2021-06-01 Head posture measurement method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/097701 WO2022252118A1 (en) 2021-06-01 2021-06-01 Head posture measurement method and apparatus

Publications (1)

Publication Number Publication Date
WO2022252118A1 true WO2022252118A1 (en) 2022-12-08

Family

ID=78092850

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/097701 WO2022252118A1 (en) 2021-06-01 2021-06-01 Head posture measurement method and apparatus

Country Status (2)

Country Link
CN (1) CN113544744A (en)
WO (1) WO2022252118A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117057165B (en) * 2023-10-11 2023-12-22 南京气象科技创新研究院 Model parameter optimization method based on ground meteorological data cluster

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20150031085A (en) * 2013-09-13 2015-03-23 인하대학교 산학협력단 3D face-modeling device, system and method using Multiple cameras
CN105760809A (en) * 2014-12-19 2016-07-13 联想(北京)有限公司 Method and apparatus for head pose estimation
CN109035329A (en) * 2018-08-03 2018-12-18 厦门大学 Camera Attitude estimation optimization method based on depth characteristic
CN110371132A (en) * 2019-06-18 2019-10-25 华为技术有限公司 Driver's adapter tube appraisal procedure and device
CN110909582A (en) * 2018-09-18 2020-03-24 华为技术有限公司 Face recognition method and device
CN111243093A (en) * 2020-01-07 2020-06-05 腾讯科技(深圳)有限公司 Three-dimensional face grid generation method, device, equipment and storage medium
CN111414798A (en) * 2019-02-03 2020-07-14 沈阳工业大学 Head posture detection method and system based on RGB-D image
CN111985384A (en) * 2020-08-14 2020-11-24 深圳地平线机器人科技有限公司 Method and device for acquiring 3D coordinates of face key points and 3D face model

Also Published As

Publication number Publication date
CN113544744A (en) 2021-10-22

Similar Documents

Publication Publication Date Title
Liu et al. Autoshape: Real-time shape-aware monocular 3d object detection
CN108549873B (en) Three-dimensional face recognition method and three-dimensional face recognition system
CN109859305B (en) Three-dimensional face modeling and recognizing method and device based on multi-angle two-dimensional face
JP6681729B2 (en) Method for determining 3D pose of object and 3D location of landmark point of object, and system for determining 3D pose of object and 3D location of landmark of object
US9128530B2 (en) Hand pointing estimation for human computer interaction
WO2017219391A1 (en) Face recognition system based on three-dimensional data
CN112861653A (en) Detection method, system, equipment and storage medium for fusing image and point cloud information
US11195064B2 (en) Cross-modal sensor data alignment
EP3300025A1 (en) Image processing device and image processing method
US20200226392A1 (en) Computer vision-based thin object detection
Song et al. Robust 3D face landmark localization based on local coordinate coding
CN112200056A (en) Face living body detection method and device, electronic equipment and storage medium
WO2022252118A1 (en) Head posture measurement method and apparatus
Huang et al. Vision pose estimation from planar dual circles in a single image
Sahin et al. A learning-based variable size part extraction architecture for 6D object pose recovery in depth images
Wang et al. Fusion Algorithm of Laser-Point Clouds and Optical Images
WO2022246605A1 (en) Key point calibration method and apparatus
CN109166172B (en) Clothing model construction method and device, server and storage medium
CN111709269B (en) Human hand segmentation method and device based on two-dimensional joint information in depth image
Han et al. Deformed landmark fitting for sequential faces
Jin et al. DOPE++: 6D pose estimation algorithm for weakly textured objects based on deep neural networks
CN111145081B (en) Three-dimensional model view projection method and system based on spatial volume characteristics
Königshof et al. Learning-based shape estimation with grid map patches for realtime 3d object detection for automated driving
CN111612912A (en) Rapid three-dimensional reconstruction and optimization method based on Kinect2 camera face contour point cloud model
CN112016495A (en) Face recognition method and device and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 21943487
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE