WO2023001095A1 - Face key point interpolation method and apparatus, computer device, and storage medium - Google Patents


Info

Publication number
WO2023001095A1
PCT/CN2022/106211, CN2022106211W
Authority
WO
WIPO (PCT)
Prior art keywords
human face
target
matrix
face
target area
Prior art date
Application number
PCT/CN2022/106211
Other languages
French (fr)
Chinese (zh)
Inventor
陈文喻
刘更代
王志勇
Original Assignee
百果园技术(新加坡)有限公司
刘更代
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司 and 刘更代
Publication of WO2023001095A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects

Definitions

  • The embodiments of the present application relate to the technical field of computer vision, for example, to a face key point interpolation method, apparatus, computer device and storage medium.
  • Augmented reality (AR), face-driven virtual characters (such as dolls or animals), and other image processing are all based on the user's face data.
  • These face data are two-dimensional projections of a three-dimensional face.
  • If the three-dimensional face model and the three-dimensional face pose are to be calculated accurately from the face keypoints, the fitting process is relatively complicated and computationally expensive, so it is difficult to process in real time on devices with relatively scarce resources.
  • The embodiments of the present application propose a face key point interpolation method, apparatus, computer device and storage medium, to address the problem of reducing the amount of calculation needed to fit face keypoints.
  • An embodiment of the present application provides a face key point interpolation method, including: acquiring two-dimensional first face data, the first face data having two-dimensional first face keypoints; fitting three-dimensional second face data according to the first face keypoints, the second face data having three-dimensional second face keypoints; selecting a local area in the second face data as a target area; and linearly deforming the target area, so that when the second face keypoints in the target area are perspective-projected to two-dimensional third face keypoints, the third face keypoints overlap the first face keypoints corresponding to the second face keypoints in the target area.
  • An embodiment of the present application also provides a face key point interpolation apparatus, including: a two-dimensional face data acquisition module, configured to acquire two-dimensional first face data, the first face data having two-dimensional first face keypoints; a three-dimensional face data fitting module, configured to fit three-dimensional second face data according to the first face keypoints, the second face data having three-dimensional second face keypoints; a target area selection module, configured to select a local area in the second face data as a target area; and a target area deformation module, configured to linearly deform the target area, so that when the second face keypoints in the target area are perspective-projected to two-dimensional third face keypoints, the third face keypoints overlap the first face keypoints corresponding to the second face keypoints in the target area.
  • An embodiment of the present application also provides a computer device, including: at least one processor; and a memory configured to store at least one program which, when executed by the at least one processor, causes the at least one processor to implement the face key point interpolation method described above.
  • An embodiment of the present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the face key point interpolation method described above.
  • Fig. 1 is a flow chart of a face key point interpolation method provided in Embodiment 1 of the present application;
  • FIG. 2 is an example diagram of a target area provided in Embodiment 1 of the present application.
  • FIG. 3A to FIG. 3E are example diagrams of a reprojection provided in Embodiment 1 of the present application.
  • FIG. 4 is a schematic structural diagram of an interpolation device for key points of a face provided in Embodiment 3 of the present application;
  • FIG. 5 is a schematic structural diagram of a computer device provided in Embodiment 4 of the present application.
  • FIG. 1 is a flow chart of a face key point interpolation method provided in Embodiment 1 of the present application.
  • This embodiment is applicable to situations in which face keypoints are interpolated in a linear manner. The method can be performed by a face key point interpolation apparatus, which can be implemented in software and/or hardware and configured in a computer device, for example a mobile terminal (such as a mobile phone or tablet computer) or a wearable device (such as smart glasses or a smart watch).
  • the method includes the following steps.
  • Step 101: obtain two-dimensional first face data.
  • Operating systems such as Android, iOS and HarmonyOS can be installed on the computer device, and users can install the applications they require on these operating systems, such as live-streaming applications, short video applications, beauty applications, conference applications, and so on.
  • The computer device may be configured with one or more cameras, which may be installed on the front of the computer device (front cameras) or on the back of the computer device (rear cameras).
  • These applications can call the camera facing the user to collect image data, perform face detection on the image data, and detect the user's two-dimensional face data in the image data.
  • The face data are represented by two-dimensional face keypoints.
  • The two-dimensional face data are recorded as the first face data, and the two-dimensional face keypoints are recorded as the first face keypoints; that is, the two-dimensional first face data contain two-dimensional first face keypoints.
  • Face detection, also known as face keypoint detection, positioning, or face alignment, refers to locating the key areas of given face data, including the eyebrows, eyes, nose, mouth, facial contour, and so on.
  • Face detection can be performed in the following ways:
  • Hand-crafted features such as Haar features can be extracted, used to train classifiers, and the classifiers used for face detection.
  • Convolutional neural networks with cascaded structures can be used, for example Cascade CNN and Multi-task Cascaded Convolutional Networks (MTCNN).
  • the number of face key points can be set by those skilled in the art according to the actual situation.
  • When real-time requirements are low, relatively dense face keypoints can be detected, such as 1000.
  • When real-time requirements are high, relatively sparse face keypoints can be detected, such as 68, 81 or 106, locating the more salient and important feature points of the face, such as eye keypoints, eyebrow keypoints, nose keypoints, mouth keypoints, contour keypoints, and so on.
  • The camera facing the user is called to collect video data, which has multiple frames of image data.
  • The two-dimensional first face data are tracked across the multiple frames of image data by methods such as Kalman filtering and the optical flow method.
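  • The frame-to-frame tracking mentioned above can be illustrated with a minimal sketch. The following Python snippet (an illustrative assumption, not the patent's implementation) runs one predict/update step of a constant-velocity Kalman filter on a single keypoint coordinate:

```python
import numpy as np

def kalman_step(x, P, z, dt=1.0, q=1e-3, r=1e-1):
    """One predict/update cycle of a constant-velocity Kalman filter
    for a single keypoint coordinate (state = [position, velocity])."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity transition
    H = np.array([[1.0, 0.0]])              # we observe position only
    Q = q * np.eye(2)                       # process noise
    # Predict.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the measured position z.
    S = H @ P @ H.T + r                     # innovation covariance (1x1)
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain (2x1)
    x = x + K @ (z - H @ x)
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.zeros(2), np.eye(2)
for _ in range(20):                         # feed a constant measurement
    x, P = kalman_step(x, P, 1.0)
```

  • After a few frames of a constant measurement, the position estimate settles on the measured value and the velocity estimate decays toward zero, which is the smoothing behavior a tracker relies on.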
  • Let P be the pose of the face data in the image data, consisting of a rotation matrix R, whose X-, Y- and Z-axis components are R 0 , R 1 and R 2 , and a translation vector T, whose X-, Y- and Z-axis components are T 0 , T 1 and T 2 .
  • For a three-dimensional point v, the perspective projection is defined by formula (1).
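  • The exact form of formula (1) is given in the original filing; as a sketch, a standard pinhole perspective projection of a point v under a pose (R, T) might be implemented as follows, with illustrative (assumed) camera intrinsics:

```python
import numpy as np

def perspective_project(v, R, T, fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    """Standard pinhole projection of a 3D point v under pose (R, T).
    The intrinsics fx, fy, cx, cy are illustrative placeholders."""
    x, y, z = R @ v + T                     # rigid transform into camera space
    return np.array([fx * x / z + cx, fy * y / z + cy])

p = perspective_project(np.array([1.0, 2.0, 0.0]),
                        np.eye(3), np.array([0.0, 0.0, 10.0]))
```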
  • The first face data F can be expressed by formula (2), where C 0 is the user's expressionless neutral face data, C exp is the user's expression blendshape deformer, and the remaining parameters are the expression coefficient of the face data and the user's identity vector.
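  • As a sketch of this blendshape representation, with toy dimensions and random placeholder data (the identity vector is omitted and the coefficient name alpha is an assumption), the face data can be assembled as a neutral face plus an expression basis times its coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
n_coords, n_exp = 12, 2                         # toy sizes: 4 vertices, 2 blendshapes
C0 = rng.standard_normal(n_coords)              # neutral, expressionless face
C_exp = rng.standard_normal((n_coords, n_exp))  # expression blendshape deformer
alpha = np.array([0.3, 0.7])                    # expression coefficients (assumed name)

F = C0 + C_exp @ alpha                          # face data under this expression
```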
  • The multiple frames of image data in the video data involve the following quantities: Q is the first face keypoints; P is the pose of the face data; the expression coefficient describes the expression of the face data; π P is the perspective projection defined by formula (1); j indexes the j-th first face keypoint; the P and expression coefficient of the previous frame of image data appear in a smoothing term; a regularization term constrains P and the expression coefficient; and w j is the weight of the j-th first face keypoint.
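  • The shape of this fitting objective can be sketched schematically; the projection function, coefficient names and weights below are stand-ins, not the patent's formula (3):

```python
import numpy as np

def fit_energy(params, prev_params, keypoints_2d, project_fn, weights,
               lam_smooth=0.1, lam_reg=0.01):
    """Schematic tracking objective: a weighted reprojection data term,
    a smoothing term against the previous frame, and a regularizer.
    project_fn(params, j) stands in for the perspective projection of
    the j-th model keypoint; the coefficient names are assumptions."""
    data = sum(w * np.sum((project_fn(params, j) - q) ** 2)
               for j, (q, w) in enumerate(zip(keypoints_2d, weights)))
    smooth = lam_smooth * np.sum((params - prev_params) ** 2)
    reg = lam_reg * np.sum(params ** 2)
    return data + smooth + reg

# Toy check: one keypoint at the origin and a trivial "projection".
E = fit_energy(np.array([1.0, 0.0]), np.array([1.0, 0.0]),
               [np.zeros(2)], lambda p, j: p, [1.0])
```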
  • Step 102: fit three-dimensional second face data according to the first face keypoints.
  • The first face keypoints can be used to fit three-dimensional face data through methods such as the 3D Morphable Face Model (3DMM) or end-to-end 3D face reconstruction, for example Virtual Reality Network (VRNet), Position Map Regression Network (PRNet), and 2D-Assisted Self-Supervised Learning (2DASL). The fitted data are recorded as the second face data.
  • The second face data have three-dimensional face keypoints, which are recorded as the second face keypoints.
  • The second face data can be represented in the form of a mesh (such as triangles) with multiple three-dimensional vertices, some of which are the second face keypoints.
  • The three-dimensional second face keypoints are projected to two-dimensional third face keypoints, so that all of the two-dimensional third face keypoints approach all of the two-dimensional first face keypoints.
  • Step 103: select a local area in the second face data as a target area.
  • After the second face data are projected onto the face data, the third face keypoints are obtained, which fit the entire face as a whole; however, interpolation of all the second face keypoints cannot be guaranteed, and some third face keypoints will deviate from the corresponding first face keypoints. Here, interpolation means that when a three-dimensional second face keypoint is projected to a two-dimensional third face keypoint, the third face keypoint overlaps the corresponding first face keypoint.
  • For example, the third face keypoint 302 deviates from the corresponding first face keypoint 301.
  • The target area covers the facial features, such as the eyes, mouth and eyebrows.
  • The target area can be pre-recorded in a configuration file; when the application program performs business operations, it loads the configuration file and reads from it the target area to be deformed.
  • Step 104: perform linear deformation on the target area, so that when the second face keypoints in the target area are perspective-projected to two-dimensional third face keypoints, the third face keypoints overlap the first face keypoints corresponding to the second face keypoints in the target area.
  • Formula (3) can fit the first face keypoints Q i overall, but cannot achieve complete interpolation.
  • Therefore, the local area (i.e., the target area) of the face data containing the second face keypoints is deformed.
  • Laplacian deformation is a process of encoding and decoding the local detail features of a mesh. Encoding refers to converting the Euclidean-space coordinates of the mesh vertices to Laplacian coordinates.
  • The Laplacian coordinates contain the local details of the mesh, so Laplacian deformation can well preserve those local details.
  • Recovering the Euclidean-space coordinates is essentially a process of solving a linear system, so the Laplacian deformation algorithm is efficient and robust.
  • L is the Laplacian matrix, and the corresponding vector contains the Laplacian coordinates of each vertex.
  • π P is the perspective projection defined by formula (1).
  • One constraint keeps the Laplacian coordinates unchanged before and after deformation.
  • Perspective projection is applied to the second face keypoints, and perspective projection is itself a nonlinear operation. Optimization problem (4) is therefore a nonlinear optimization problem; solving it requires a large amount of calculation and is difficult to do in real time on devices with relatively scarce resources.
  • Fig. 2 defines the eyes as the target area in the face data. The vertices 202 on the boundary of the target area remain unchanged during the linear deformation, while the coordinates of the vertices inside the target area are updated so that these vertices preserve their Laplacian coordinates as far as possible.
  • The third face keypoints obtained after the second face keypoints 203 are perspective-projected onto the face data overlap the corresponding first face keypoints. The vertices 201 outside the target area are used only to calculate the Laplacian coordinates of the boundary vertices 202 and do not participate in the linear deformation.
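  • The "encoding" step can be sketched with uniform weights (the weighting scheme is an assumption; the patent does not spell it out): each vertex is encoded as its position minus the mean of its one-ring neighbors.

```python
import numpy as np

def laplacian_coords(vertices, neighbors):
    """Encode each vertex as its position minus the mean of its one-ring
    neighbors (uniform weights, an assumed scheme). These differential
    coordinates capture the local surface detail around each vertex."""
    L = np.empty_like(vertices)
    for i, nbrs in enumerate(neighbors):
        L[i] = vertices[i] - vertices[nbrs].mean(axis=0)
    return L

verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
adj = [[1, 2], [0, 2], [0, 1]]              # one-ring neighbors per vertex
delta = laplacian_coords(verts, adj)
```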
  • The Laplacian vector after linear deformation is expressed in terms of the following matrices.
  • W is the Laplacian matrix of all vertices in the face data, of size 3m × 3m. W 2 is the second sub-Laplacian matrix, i.e., the columns of W corresponding to the first n vertices, those on the boundary of the target area, of size 3m × 3n. W 1 is the first sub-Laplacian matrix, i.e., the matrix remaining after those boundary columns are removed from W, of size 3m × 3(m−n).
  • U = (u 0,0 , u 0,1 , ..., u 0,n-1 ) is the array of vertices on the boundary of the target area, and V = (v 0,n , v 0,n+1 , ..., v 0,m-1 ) is the array of vertices inside the target area.
  • L 0 is a vector of vertex transformations in the target region.
  • The first reference matrix K i,0 can be defined as follows.
  • Π( ) is the perspective projection based on the parameter K of the camera. Let s 0,i , s ∈ {u, v}, be a vertex before the linear transformation, and s 1,i , s ∈ {u, v}, the corresponding vertex after the linear transformation.
  • the goal of interpolation is to minimize the error of reprojection, that is, to minimize the following function:
  • v 0,i = (v 0,i,x , v 0,i,y , v 0,i,z ) is the vertex after linear deformation.
  • Formula (7) is a nonlinear optimization problem.
  • It is assumed that the Z-axis component of the second face keypoints changes little before and after the Laplacian deformation, that is:
  • Let u 0,i = (u 0,i,x , u 0,i,y , u 0,i,z ) be the second face keypoint before the linear transformation.
  • v 1,i,z = R 2 v 0,i + T 2 is the Z-axis component of the linearly transformed vertex.
  • Formula (8) and formula (9) are then minimized simultaneously.
  • step 104 includes the following steps.
  • Step 1041: set the deformation of the target area, and map the vertices inside the target area to three-dimensional target points.
  • Step 1042: calculate the difference between a first vector and a second vector as a vector difference.
  • The vertices in the target area may be linearly transformed, and the difference between the first vector and the second vector is calculated in the vector space after the linear transformation as the vector difference.
  • The first vector is the vector obtained by transforming the vertices in the target area, and the second vector is the vector obtained by transforming the target points.
  • The first vector L 0 can be obtained as follows.
  • W is the Laplacian matrix of all vertices in the face data.
  • W 1 is the first sub-Laplacian matrix, i.e., the matrix remaining after the columns of the first n vertices, those on the boundary of the target area, are removed from W.
  • W 2 is the second sub-Laplacian matrix, i.e., the columns of W corresponding to the first n vertices on the boundary of the target area.
  • The Laplacian matrix W, the first sub-Laplacian matrix W 1 and the second sub-Laplacian matrix W 2 are initialized once before face tracking starts, based on the vertices of the neutral face. This avoids recalculating W, W 1 and W 2 for each frame of image data, thereby speeding up the linear deformation.
  • Laplacian deformation is performed on the target points to obtain the second vector W 2 U + W 1 V.
  • U = (u 0,0 , u 0,1 , ..., u 0,n-1 ) are the vertices on the boundary of the target area, and V = (v 0,n , v 0,n+1 , ..., v 0,m-1 ) are the vertices inside the target area.
  • The first vector is subtracted from the second vector, giving the vector difference W 1 V − (L 0 − W 2 U).
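  • The block structure above can be sketched with dense stand-in matrices (a real implementation would use sparse storage; the sizes and random values here are placeholders):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 2                        # m vertices in total, n on the boundary
W = rng.standard_normal((m, m))    # stand-in Laplacian matrix (dense toy version)
W2, W1 = W[:, :n], W[:, n:]        # boundary columns / interior columns

U = rng.standard_normal(n)         # fixed boundary vertices
V0 = rng.standard_normal(m - n)    # interior vertices before deformation
L0 = W2 @ U + W1 @ V0              # first vector: Laplacian of the undeformed pose

V = rng.standard_normal(m - n)     # candidate interior target points
diff = W1 @ V - (L0 - W2 @ U)      # vector difference W1 V - (L0 - W2 U)
```

  • Because L 0 already contains the fixed boundary contribution W 2 U, the difference reduces to (W 2 U + W 1 V) − L 0 , i.e., the second vector minus the first.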
  • Step 1043: calculate the difference between the first face keypoints and the third face keypoints corresponding to the second face keypoints in the target area, as a reprojection difference.
  • The target points are projected into the face data to obtain face keypoints, recorded as the third face keypoints; that is, a third face keypoint is the two-dimensional perspective projection of a target point.
  • The reprojection difference is used as a constraint term to control the interpolation of the second face keypoints.
  • The first reference matrix K i,0 can be calculated. The first reference matrix is the product of an identity matrix for the first face keypoints and a parameter of the camera, where the camera is used to collect the first face data.
  • Let u 0,i = (u 0,i,x , u 0,i,y , u 0,i,z ) be the second face keypoint before the linear transformation.
  • R is the rotation matrix
  • T is the translation vector; both the rotation matrix R and the translation vector T are used for the linear transformation.
  • u 1,i,z = R 2 u 0,i + T 2 is the Z-axis component of the second face keypoint after the linear transformation.
  • The first target matrix J i,0 and the second target matrix J i,1 can then be calculated.
  • The second target matrix is subtracted from the product of the first target matrix and the target point, giving the reprojection difference J i,0 v 0,i − J i,1 .
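  • The role of the target matrices, making the projection linear by freezing the depth at its pre-deformation value, can be sketched as follows; the 2×3 intrinsic matrix and the sign conventions are assumptions for illustration, not the patent's definitions of J i,0 and J i,1 :

```python
import numpy as np

def project(v_cam, K):
    """Exact pinhole projection of a camera-space point."""
    return (K @ v_cam) / v_cam[2]

def linearized_project(v, R, T, K, z_fixed):
    """Projection with the depth frozen at z_fixed, which turns the
    nonlinear projection into a linear map J0 @ v + j1 of the vertex v."""
    J0 = (K @ R) / z_fixed      # linear part
    j1 = (K @ T) / z_fixed      # constant part
    return J0 @ v + j1

K = np.array([[500.0, 0.0, 320.0],          # 2x3 intrinsics (placeholder values)
              [0.0, 500.0, 240.0]])
R, T = np.eye(3), np.array([0.0, 0.0, 10.0])
v = np.array([1.0, 2.0, 0.0])
z = (R @ v + T)[2]                          # depth before deformation
p_lin = linearized_project(v, R, T, K, z)
```

  • When the frozen depth equals the true depth, the linearized projection reproduces the exact one, which is why the small-Z-change assumption of formulas (8) and (9) makes the approximation tight.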
  • Step 1044: calculate the distance the target point moves in the Z-axis direction.
  • The distance the target point moves in the Z-axis direction can be calculated and used as a constraint term to limit the movement of the mesh in the Z-axis direction.
  • The product of the Z-axis component R 2 of the rotation matrix and the target point v 0,i is calculated as a first intermediate value.
  • The difference between the Z-axis component u 1,i,z of the linearly transformed second face keypoint and the Z-axis component T 2 of the translation vector is calculated as a second intermediate value.
  • The difference between the first intermediate value and the second intermediate value is the distance the target point moves in the Z-axis direction: R 2 v 0,i − (u 1,i,z − T 2 ).
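  • This constraint term is a direct evaluation; the numeric values below are placeholders chosen only to exercise the expression:

```python
import numpy as np

R2 = np.array([0.0, 0.0, 1.0])     # Z-axis row of the rotation matrix
T2 = 5.0                           # Z component of the translation vector
u0 = np.array([1.0, 2.0, 3.0])     # second face keypoint before deformation
v0 = np.array([1.1, 2.0, 3.05])    # candidate target point (placeholder)

first = R2 @ v0                    # first intermediate value
u1_z = R2 @ u0 + T2                # Z component after the linear transform
second = u1_z - T2                 # second intermediate value
dz = first - second                # movement along the camera Z axis
```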
  • Step 1045: linearly fuse the vector difference, the reprojection difference and the distance into an objective function.
  • The vector difference, the reprojection difference and the distance can be linearly fused to form the objective function.
  • A first weight is configured for the reprojection difference, and a second weight is configured for the distance.
  • The sum of the vector difference, the reprojection difference scaled by the first weight, and the distance scaled by the second weight is calculated as the objective function, where the two scaling coefficients are the first weight and the second weight.
  • Step 1046: take minimization of the objective function as the goal, and solve for the target points.
  • Optimization problem (7) can be approximated by the method of least squares, minimizing the objective function and thereby solving for the coordinates of the target points.
  • A first sparse matrix A and a second sparse matrix b can be constructed.
  • The first sparse matrix A consists of the first sub-Laplacian matrix, the product of the first weight and the first target matrix J i,0 , and the product of the second weight and the Z-axis component R 2 of the rotation matrix.
  • The second sparse matrix b consists of the difference between the first vector L 0 and the product of the second sub-Laplacian matrix W 2 and the boundary vertices U, the product of the first weight and the second target matrix J i,1 , and the product of the second weight and the second intermediate value.
  • The target points are solved from the target relationship by sparse solver methods such as SimplicialLDLT (the direct LDLT-decomposition solver built into Eigen) and the conjugate gradient method (ConjugateGradient).
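  • The conjugate gradient option can be sketched with a minimal hand-rolled CG on the normal equations of a stacked least-squares system (Eigen's SimplicialLDLT and ConjugateGradient are C++ solvers; this Python analogue and its random toy system are illustrative assumptions):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Minimal conjugate gradient for a symmetric positive-definite A,
    analogous to Eigen's ConjugateGradient solver named in the text."""
    x = np.zeros_like(b)
    r = b - A @ x                  # initial residual
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:  # converged
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Normal equations (A_ls^T A_ls) x = A_ls^T b_ls of a toy system.
rng = np.random.default_rng(0)
A_ls = rng.standard_normal((20, 5))
b_ls = rng.standard_normal(20)
x = conjugate_gradient(A_ls.T @ A_ls, A_ls.T @ b_ls)
```

  • For a small symmetric positive-definite system like this, CG converges in at most as many iterations as there are unknowns, which matches the timing advantage over a direct factorization reported below.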
  • The vertices inside the target area can be set as the initial values of the target points.
  • An iterative algorithm can be used to solve the target relationship (11): starting from the initial values, the target points are iteratively updated until the difference between the third face keypoints and the first face keypoints corresponding to the second face keypoints in the target area is less than a preset threshold.
  • Using a preset threshold can improve the calculation speed and reduce the calculation time.
  • This embodiment solves with ConjugateGradient, adopting the initial value V 0 and a threshold of 1e-5. As the table below shows, this strategy takes half the time of the SimplicialLDLT algorithm, with no obvious difference in the result.
  • After testing, this embodiment performs Laplacian deformation on the two target areas of the left eye and the right eye in 1 ms on a mobile phone, which ensures real-time performance.
  • In this embodiment, two-dimensional first face data with two-dimensional first face keypoints are acquired; three-dimensional second face data with three-dimensional second face keypoints are fitted according to the first face keypoints; a local area in the second face data is selected as the target area; and the target area is linearly deformed, so that when the second face keypoints in the target area are perspective-projected to two-dimensional third face keypoints, the third face keypoints overlap the first face keypoints corresponding to the second face keypoints in the target area. The optimization problem arising when fitting the face data is thus turned into a linear optimization problem.
  • The linear optimization problem is relatively simple to handle and its calculation amount is low, which greatly reduces the calculation time, so the method can run in real time on devices with relatively scarce resources.
  • FIG. 4 is a structural block diagram of an interpolation device for key points of a face provided by Embodiment 2 of the present application.
  • the device may include the following modules.
  • The two-dimensional face data acquisition module 401 is configured to acquire two-dimensional first face data, the first face data having two-dimensional first face keypoints.
  • The three-dimensional face data fitting module 402 is configured to fit three-dimensional second face data according to the first face keypoints, the second face data having three-dimensional second face keypoints.
  • The target area selection module 403 is configured to select a local area in the second face data as the target area.
  • The target area deformation module 404 is configured to perform linear deformation on the target area, so that when the second face keypoints in the target area are perspective-projected to two-dimensional third face keypoints, the third face keypoints overlap the first face keypoints corresponding to the second face keypoints in the target area.
  • The second face data are represented in the form of a mesh with a plurality of three-dimensional vertices, some of which are the second face keypoints.
  • The target area deformation module 404 includes: a vertex mapping module, configured to deform the target area and map the vertices inside the target area to three-dimensional target points; a vector difference calculation module, configured to calculate the difference between a first vector and a second vector as a vector difference, where the first vector is the vector converted from the vertices in the target area and the second vector is the vector converted from the target points; a reprojection difference calculation module, configured to calculate the difference between the third face keypoints and the first face keypoints corresponding to the second face keypoints in the target area as a reprojection difference, where a third face keypoint is the two-dimensional face keypoint obtained by perspective projection of a target point; a moving distance calculation module, configured to calculate the distance the target points move in the Z-axis direction; a linear fusion module, configured to linearly fuse the vector difference, the reprojection difference and the distance into an objective function; and a target point solving module, configured to take minimization of the objective function as the goal and solve for the target points.
  • The vector difference calculation module is configured to: perform a Laplacian transformation on the vertices to obtain the first vector; perform a Laplacian transformation on the target points to obtain the second vector; and subtract the first vector from the second vector to obtain the vector difference.
  • The reprojection difference calculation module is configured to: calculate a first reference matrix, the first reference matrix being the product of an identity matrix for the first face keypoints corresponding to the second face keypoints in the target area and a parameter of the camera, the camera being configured to collect the first face data; calculate a second reference matrix, the second reference matrix being the ratio of the first reference matrix to the Z-axis component of the linearly transformed target point; calculate a first target matrix and a second target matrix, the first target matrix being the product of the second reference matrix and the rotation matrix, and the second target matrix being the inverse of the product of the second reference matrix and the translation vector, both of which are used for the linear transformation; and subtract the second target matrix from the product of the first target matrix and the target point to obtain the reprojection difference.
  • The moving distance calculation module is configured to: calculate the product of the Z-axis component of the rotation matrix and the target point as a first intermediate value; calculate the difference between the Z-axis component of the linearly transformed second face keypoint in the target area and the Z-axis component of the translation vector as a second intermediate value; and calculate the difference between the first intermediate value and the second intermediate value as the distance the target point moves in the Z-axis direction.
  • The linear fusion module is configured to: configure a first weight for the reprojection difference; configure a second weight for the distance; and calculate the sum of the vector difference, the reprojection difference with the first weight, and the distance with the second weight as the objective function.
  • The target point solving module is configured to construct a first sparse matrix A and a second sparse matrix b. The first sparse matrix includes the first sub-Laplacian matrix, the product of the first weight and the first target matrix J i,0 , and the product of the second weight and the Z-axis component of the rotation matrix, where the first sub-Laplacian matrix is the matrix remaining after the columns of the vertices on the boundary of the target area are removed from the Laplacian matrix of all vertices. The second sparse matrix includes the difference between the first vector and the product of the second sub-Laplacian matrix and the vertices on the boundary of the target area, the product of the first weight and the second target matrix J i,1 , and the product of the second weight and the second intermediate value, where the second sub-Laplacian matrix is the first n columns of the Laplacian matrix of all vertices, corresponding to the vertices on the boundary of the target area.
  • The target point solving module is configured to: set the vertices inside the target area as the initial values of the target points; and iteratively update the target points until the difference between the third face keypoints and the corresponding first face keypoints is less than a preset threshold.
  • The two-dimensional face data acquisition module 401 includes: a video data acquisition module, configured to call a camera to collect video data, the video data having multiple frames of image data; and a face tracking module, configured to track the two-dimensional first face data in the multiple frames of image data.
  • the face key point interpolation apparatus provided in the embodiments of the present application can execute the face key point interpolation method provided in any embodiment of the present application, and has functional modules corresponding to the execution of the method.
  • FIG. 5 is a schematic structural diagram of a computer device provided in Embodiment 3 of the present application.
  • FIG. 5 shows a block diagram of an exemplary computer device 12 suitable for implementing embodiments of the present application.
  • the computer device 12 shown in FIG. 5 is only an example, and should not limit the functions and scope of use of this embodiment of the present application.
  • computer device 12 takes the form of a general-purpose computing device.
  • Components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the various system components (including the system memory 28 and the processing unit 16).
  • Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.
  • these architectures include but are not limited to Industry Standard Architecture (Industry Standard Architecture, ISA) bus, Micro Channel Architecture (Micro Channel Architecture, MCA) bus, Enhanced ISA bus, Video Electronics Standards Association (Video Electronics Standards Association, VESA) local bus and peripheral component interconnection (Peripheral Component Interconnection, PCI) bus.
  • Computer device 12 includes a variety of computer system readable media. These media can be any available media that can be accessed by computer device 12 and include both volatile and nonvolatile media, removable and non-removable media.
  • System memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32 .
  • Computer device 12 may also include other removable/non-removable, volatile/nonvolatile computer system storage media.
  • storage system 34 may be configured to read from and write to non-removable, non-volatile magnetic media (not shown in Figure 5, commonly referred to as a "hard drive").
  • disk drives for reading from and writing to removable non-volatile magnetic disks (such as "floppy disks"), and optical disc drives for reading from and writing to removable non-volatile optical discs (such as Compact Disc Read-Only Memory (CD-ROM), Digital Video Disc Read-Only Memory (DVD-ROM), or other optical media) may also be provided.
  • each drive may be connected to bus 18 via one or more data media interfaces.
  • Memory 28 may include at least one program product having a set (eg, at least one) of program modules configured to perform the functions of various embodiments of the present application.
  • a program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment.
  • the program modules 42 generally perform the functions and/or methods of the embodiments described herein.
  • the computer device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices. Such communication can be performed through an input/output (I/O) interface 22.
  • the computer device 12 can also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 20.
  • as shown in FIG. 5, the network adapter 20 communicates with the other modules of the computer device 12 via the bus 18. It should be appreciated that, although not shown in FIG. 5, other hardware and/or software modules may be used in conjunction with the computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
  • the processing unit 16 executes a variety of functional applications and data processing by running the programs stored in the system memory 28 , such as realizing the interpolation method of key points of the face provided by the embodiment of the present application.
  • Embodiment 4 of the present application also provides a computer-readable storage medium.
  • a computer program is stored on the computer-readable storage medium.
  • when the computer program is executed by a processor, the multiple processes of the above interpolation method for face key points are implemented. To avoid repetition, details are not described here again.
  • a computer-readable storage medium may include, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (non-exhaustive list) of computer-readable storage media include: electrical connections with one or more leads, portable computer disks, hard disks, RAM, Read-Only Memory (ROM), Erasable Programmable Read-Only Memory (EPROM) or flash memory, optical fiber, CD-ROM, optical storage device, magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.


Abstract

A face key point interpolation method and apparatus, a computer device, and a storage medium. The method comprises: acquiring two-dimensional first face data (101), the first face data comprising two-dimensional first face key points; fitting three-dimensional second face data according to the first face key points (102), the second face data comprising three-dimensional second face key points; selecting a local area in the second face data as a target area (103); and performing linear deformation on the target area, so that when the second face key points in the target area are subjected to perspective projection to two-dimensional third face key points, the third face key points overlap with the first face key points corresponding to the second face key points in the target area (104).

Description

Interpolation Method and Apparatus for Face Key Points, Computer Device, and Storage Medium
This application claims priority to the Chinese patent application No. 202110836320.0 filed with the China Patent Office on July 23, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the technical field of computer vision, and for example to an interpolation method and apparatus for face key points, a computer device, and a storage medium.
Background
Image processing such as augmented reality (AR) (e.g., letting a user try on a hat or glasses, adding a beard, etc.) and driving virtual characters (e.g., dolls, animals) with a human face is performed on the basis of the user's face data.
Such face data is a two-dimensional projection of a three-dimensional face. To meet business needs, a three-dimensional face model and a three-dimensional face pose are accurately calculated from the face key points in the face data. This is a process of fitting face key points; it is relatively complicated to handle and computationally expensive, making real-time processing difficult on devices with scarce resources.
Summary
The embodiments of the present application provide an interpolation method and apparatus for face key points, a computer device, and a storage medium, to solve the problem of how to reduce the amount of computation for fitting face key points.
An embodiment of the present application provides an interpolation method for face key points, including: acquiring two-dimensional first face data, the first face data containing two-dimensional first face key points; fitting three-dimensional second face data according to the first face key points, the second face data containing three-dimensional second face key points; selecting a local area in the second face data as a target area; and performing linear deformation on the target area, so that when the second face key points in the target area are perspective-projected to two-dimensional third face key points, the third face key points overlap with the first face key points corresponding to the second face key points in the target area.
An embodiment of the present application further provides an interpolation apparatus for face key points, including: a two-dimensional face data acquisition module, configured to acquire two-dimensional first face data, the first face data containing two-dimensional first face key points; a three-dimensional face data fitting module, configured to fit three-dimensional second face data according to the first face key points, the second face data containing three-dimensional second face key points; a target area selection module, configured to select a local area in the second face data as a target area; and a target area deformation module, configured to perform linear deformation on the target area, so that when the second face key points in the target area are perspective-projected to two-dimensional third face key points, the third face key points overlap with the first face key points corresponding to the second face key points in the target area.
An embodiment of the present application further provides a computer device, including: at least one processor; and a memory configured to store at least one program which, when executed by the at least one processor, causes the at least one processor to implement the interpolation method for face key points described above.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the interpolation method for face key points described above.
Description of Drawings
FIG. 1 is a flowchart of an interpolation method for face key points provided in Embodiment 1 of the present application;
FIG. 2 is an example diagram of a target area provided in Embodiment 1 of the present application;
FIG. 3A to FIG. 3E are example diagrams of reprojection provided in Embodiment 1 of the present application;
FIG. 4 is a schematic structural diagram of an interpolation apparatus for face key points provided in Embodiment 3 of the present application;
FIG. 5 is a schematic structural diagram of a computer device provided in Embodiment 4 of the present application.
Detailed Description
The present application is described below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments described here are only intended to explain the present application, not to limit it. In addition, for ease of description, the drawings show only the parts relevant to the present application rather than the entire structure.
Embodiment One
FIG. 1 is a flowchart of an interpolation method for face key points provided in Embodiment 1 of the present application. This embodiment is applicable to the case of interpolating face key points in a linear manner. The method may be performed by an interpolation apparatus for face key points, which may be implemented by software and/or hardware and configured in a computer device, for example, a mobile terminal (such as a mobile phone or a tablet computer) or a wearable device (such as smart glasses or a smart watch). The method includes the following steps.
Step 101: Acquire two-dimensional first face data.
A computer device may run an operating system such as Android, iOS, or HarmonyOS, on which users can install the applications they need, for example, live-streaming applications, short-video applications, beauty applications, and conference applications.
The computer device may be equipped with one or more cameras, which may be installed on the front of the computer device (also called front cameras) or on the back of the computer device (also called rear cameras).
In business operations such as AR and face-driven virtual characters, these applications can call a camera facing the user to collect image data and perform face detection on the image data, thereby detecting the user's two-dimensional face data in the image data. The face data is represented by two-dimensional face key points. For ease of distinction, the two-dimensional face data is recorded as first face data, and the two-dimensional face key points are recorded as first face key points; that is, the two-dimensional first face data contains two-dimensional first face key points.
Face detection, also known as face key point detection, localization, or face alignment, refers to locating the key regions of a face in given face data, including the eyebrows, eyes, nose, mouth, facial contour, and so on.
Face detection can be performed in the following ways:
1. Manually extract features, such as Haar features, train a classifier with the features, and use the classifier for face detection.
2. Inherit face detection from general object detection algorithms; for example, use Faster R-CNN to detect faces.
3. Use convolutional neural networks with a cascade structure, for example, Cascade CNN and Multi-task Cascaded Convolutional Networks (MTCNN).
It should be noted that the number of face key points can be set by those skilled in the art according to the actual situation. For static image processing, the real-time requirement is low, and relatively dense face key points (e.g., 1000) can be detected; besides locating the important feature points of the face, they can also accurately describe the contours of the facial features. For live streaming and the like, the real-time requirement is high, and relatively sparse face key points (e.g., 68, 81, or 106) can be detected to locate the more obvious and important feature points on the face (such as eye key points, eyebrow key points, nose key points, mouth key points, and contour key points), so as to reduce the processing load and processing time. The embodiments of the present application do not limit this.
Exemplarily, a camera facing the user is called to collect video data containing multiple frames of image data, and the two-dimensional first face data is tracked in the multiple frames of image data by methods such as Kalman filtering and optical flow.
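As a minimal sketch of the per-frame tracking step, the following replaces the Kalman-filter or optical-flow tracker with simple exponential smoothing of detected key points; the function name, smoothing coefficient, and coordinates are illustrative assumptions, not from the disclosure:

```python
import numpy as np

def smooth_keypoints(frames, alpha=0.6):
    """Exponentially smooth per-frame 2D key points: a simplified stand-in
    for the Kalman-filter / optical-flow tracking mentioned above."""
    smoothed, prev = [], None
    for pts in frames:
        pts = np.asarray(pts, dtype=float)
        prev = pts if prev is None else alpha * pts + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed

# Two key points tracked over three frames (illustrative coordinates).
frames = [[[10.0, 10.0], [20.0, 20.0]],
          [[12.0, 10.0], [21.0, 20.0]],
          [[14.0, 11.0], [22.0, 21.0]]]
tracked = smooth_keypoints(frames)
```

The first frame passes through unchanged; later frames blend the new detection with the running estimate, which damps detector jitter between frames.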
For perspective projection, let P be the pose of the face data in the image data. Under perspective projection, the pose of the face data is:

P = {R, T}

where R is the rotation matrix

R = (R_0; R_1; R_2)

with R_0, R_1, and R_2 the X-axis, Y-axis, and Z-axis components (rows) of the rotation matrix, and T = (T_0, T_1, T_2) is the translation vector, with T_0, T_1, and T_2 the X-axis, Y-axis, and Z-axis components of the translation vector.
Given the parameters K of the camera (e.g., intrinsic and extrinsic parameters), a three-dimensional point v is perspective-projected as:

Π_P(v) = ( f_x · (R_0 v + T_0) / (R_2 v + T_2) + c_x , f_y · (R_1 v + T_1) / (R_2 v + T_2) + c_y )   (1)

where f_x and f_y are the focal lengths and (c_x, c_y) is the principal point in K.
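The projection of formula (1) can be sketched as follows, assuming a standard pinhole intrinsic matrix K with focal lengths f_x, f_y and principal point (c_x, c_y); all numeric values are illustrative:

```python
import numpy as np

def perspective_project(K, R, T, v):
    """Pi_P(v) from formula (1): rigid transform by the pose P = {R, T},
    then pinhole division by the Z component R_2 v + T_2."""
    p = R @ v + T                          # (R_0 v + T_0, R_1 v + T_1, R_2 v + T_2)
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    return np.array([fx * p[0] / p[2] + cx,
                     fy * p[1] / p[2] + cy])

K = np.array([[500.0,   0.0, 320.0],       # assumed pinhole intrinsics
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                              # identity rotation for the example
T = np.array([0.0, 0.0, 5.0])
uv = perspective_project(K, R, T, np.array([0.1, -0.2, 0.0]))
```

With an identity rotation and a depth of 5, the sample point lands at pixel (330, 220).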
In the process of face tracking, the first face data F can be expressed as:

F = F(α, δ) = C_0 + C_exp δ   (2)

where C_0 is the user's neutral, expressionless face data, C_exp is the user's expression shape fusion deformer (blendshape basis), δ is the expression of the face data, and α is the user's identity vector.
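Formula (2) is a linear blendshape evaluation, which can be sketched as follows; the toy dimensions and coefficients are illustrative:

```python
import numpy as np

def face_from_expression(C0, Cexp, delta):
    """Formula (2): F = C_0 + C_exp * delta, i.e. the neutral face plus a
    linear combination of expression blendshape offsets."""
    return C0 + Cexp @ delta

# Toy face of 2 vertices (6 stacked coordinates) and 3 expression blendshapes.
C0 = np.zeros(6)
Cexp = np.eye(6)[:, :3]           # each blendshape moves one coordinate by 1
delta = np.array([1.0, 0.5, 0.0])
F = face_from_expression(C0, Cexp, delta)
```

Because the mapping from δ to F is linear, evaluating a new expression is a single matrix-vector product.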
In the process of face tracking, given the user's identity vector α, the multiple frames of image data in the video data contain the following data:

{Q | P, δ}

where Q denotes the first face key points, P is the pose of the face data, and δ is the expression of the face data.
In the process of face tracking, the coordinate descent method can be used to solve the following optimization equation to obtain P and δ:

(P, δ) = argmin( Σ_j w_j ‖Π_P(C_0 + C_exp δ)_j − Q_j‖² + γ_0 ‖(P, δ) − (P_{−1}, δ_{−1})‖ + γ ‖(P, δ)‖ )   (3)

where Π_P is the perspective projection defined by formula (1), j indexes the j-th first face key point, (P_{−1}, δ_{−1}) are the P and δ of the previous frame of image data, γ_0 ‖(P, δ) − (P_{−1}, δ_{−1})‖ is the smoothing term, γ ‖(P, δ)‖ is the regularization term, and w_j is the weight of the j-th first face key point.
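As a sketch of one coordinate-descent step for formula (3): with the pose P held fixed and the projection linearized around the current estimate, the δ update reduces to regularized linear least squares. The matrices below are illustrative stand-ins for the linearized key-point term, and squared penalties are used for the smoothing and regularization terms so the step is closed-form (the formula above writes unsquared norms):

```python
import numpy as np

def expression_step(A, b, delta_prev, gamma0=0.1, gamma=0.01):
    """One delta update with the pose fixed: minimize
    ||A d - b||^2 + gamma0 ||d - delta_prev||^2 + gamma ||d||^2
    by solving its normal equations."""
    n = A.shape[1]
    lhs = A.T @ A + (gamma0 + gamma) * np.eye(n)
    rhs = A.T @ b + gamma0 * delta_prev
    return np.linalg.solve(lhs, rhs)

A = np.array([[1.0, 0.0],         # illustrative linearized key-point Jacobian
              [0.0, 2.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 2.0])     # illustrative key-point residuals
delta = expression_step(A, b, delta_prev=np.zeros(2))
```

Alternating such a δ step with a pose step (δ fixed) is the coordinate-descent pattern the tracking stage relies on.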
Step 102: Fit three-dimensional second face data according to the first face key points.
In this embodiment, the first face key points can be used to fit three-dimensional face data by methods such as the 3D Morphable Face Model (3DMM) and end-to-end 3D face reconstruction (e.g., Virtual Reality Network (VRNet), Position Map Regression Network (PRNet), and 2D-Assisted Self-Supervised Learning (2DASL)). The fitted face data is recorded as second face data, and it has three-dimensional face key points, recorded as second face key points.
Generally, as shown in FIG. 2, the second face data can be represented in the form of a mesh (e.g., of triangles). The mesh has multiple three-dimensional vertices, some of which are second face key points.
In the process of fitting the second face data, the three-dimensional second face key points are projected to two-dimensional third face key points, so that all the two-dimensional third face key points approach all the two-dimensional first face key points.
Step 103: Select a local area in the second face data as a target area.
In the process of fitting the second face data, the third face key points are obtained after the second face data is projected onto the face data. The whole face can be fitted as a whole; however, interpolation of all the second face key points cannot be guaranteed, and some third face key points will deviate from the corresponding first face key points. Here, interpolation means that when a three-dimensional second face key point is projected to a two-dimensional third face key point, the third face key point overlaps with the corresponding first face key point.
For example, as shown in FIG. 3A, for the face key points representing the facial contour, when the three-dimensional second face key points are projected to the two-dimensional third face key points 302, the third face key points 302 deviate from the corresponding first face key points 301.
In order to interpolate this part of the second face key points, a local area in the second face data can be selected according to business requirements as a target area to be deformed. That is, after fitting the second face data, one more local deformation is performed, so that the specified second face key points can coincide with the corresponding first face key points under perspective projection.
Generally, the target area covers the facial features, for example, the eyes, mouth, and eyebrows. The target area can be recorded in a configuration file in advance; when the application performs business operations, the configuration file is loaded and the target area to be deformed is read from it.
Step 104: Perform linear deformation on the target area, so that when the second face key points in the target area are perspective-projected to two-dimensional third face key points, the third face key points overlap with the first face key points corresponding to the second face key points in the target area.
Formula (3) can fit the first face key points as a whole, but cannot achieve complete interpolation, that is:

Π_P(C_0 + C_exp δ)_j ≠ Q_j
In the process of face tracking, it is sometimes desired to interpolate some first face key points, such as the first face key points of the eyes, so that three-dimensional special effects can be applied to the eyes.
Suppose J is the set of second face key points to be interpolated. Appropriately increasing the weights w_j, j ∈ J, of these second face key points can improve their fitting degree, but interpolation still cannot be guaranteed. If the weights w_j (j ∈ J) of the second face key points are set too large, the result becomes unstable and distorted face data is reconstructed.
In order to interpolate the second face key points J, the local area of the face data containing the second face key points J (i.e., the target area) is deformed.
Taking Laplacian deformation as an example, Laplacian deformation is a process of encoding and decoding the local detail features of a mesh. Encoding refers to the conversion of the Euclidean space coordinates of the mesh vertices to Laplacian coordinates; the Laplacian coordinates contain the local detail features of the mesh, so Laplacian deformation can well preserve the local details of the mesh. Decoding refers to recovering the Euclidean space coordinates from the differential coordinates, which is essentially a process of solving a linear system; therefore, the Laplacian deformation algorithm is efficient and robust.
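The encoding step described above can be sketched with a uniform (umbrella) Laplacian on a toy mesh; the mesh connectivity and weighting are illustrative assumptions:

```python
import numpy as np

def uniform_laplacian(n, edges):
    """Uniform (umbrella) Laplacian L = I - D^-1 A: L @ V gives each vertex's
    offset from the average of its neighbors, i.e. its Laplacian coordinate."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.eye(n) - np.diag(1.0 / A.sum(axis=1)) @ A

# A tiny 4-vertex path "mesh": encoding captures the local bump at vertex 1.
edges = [(0, 1), (1, 2), (2, 3)]
V = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.5, 0.0],    # raised above the line through its neighbors
              [2.0, 0.0, 0.0],
              [3.0, 0.0, 0.0]])
L = uniform_laplacian(4, edges)
delta = L @ V                     # Laplacian coordinates (the encoding step)
```

Decoding is the reverse: fix some vertices, keep delta as the target, and solve the resulting linear system for the remaining coordinates.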
Suppose V is the set of all vertices in the target area. In the process of deforming the target area by Laplacian deformation, the Laplacian coordinates are kept unchanged while ensuring that the second face key points J are interpolated:
argmin_V ( ‖LV − Δ‖ + γ Σ_{j∈J} ‖Π_P(C_0 + C_exp δ)_j − Q_j‖² )   (4)

where L is the Laplacian matrix, Δ contains the Laplacian coordinates corresponding to each vertex, and Π_P is the perspective projection defined by formula (1).
The first optimization term, ‖LV − Δ‖, keeps the Laplacian coordinates unchanged before and after the deformation; the second optimization term, Σ_{j∈J} ‖Π_P(C_0 + C_exp δ)_j − Q_j‖², performs the interpolation of the second face key points. Here the second face key points undergo perspective projection, and perspective projection itself is a nonlinear operation. Therefore, optimization problem (4) is a nonlinear optimization problem; solving it requires a large amount of computation, which is difficult to perform in real time on devices with scarce resources.
In this embodiment, when interpolating the second face key points, a linear optimization method is used to deform the target area.
Exemplarily, FIG. 2 defines the eyes as the target area in the face data. The vertices 202 on the boundary of the target area remain unchanged during the linear deformation. The vertices inside the target area have their coordinates updated during the linear deformation, so that they keep their Laplacian coordinates as much as possible while the third face key points obtained by projecting the second face key points 203 onto the face data overlap with the corresponding first face key points. The vertices 201 outside the target area are used to compute the Laplacian coordinates of the vertices 202 on the boundary of the target area and do not participate in the linear deformation.
By definition, the Laplacian vector after the linear deformation is:

W_2 U + W_1 V
where W is the Laplacian matrix of all vertices in the face data, of size 3m×3m; W_1 is the first sub-Laplacian matrix, that is, the matrix remaining after removing, from the Laplacian matrix W of all vertices, the first n columns corresponding to the vertices on the boundary of the target area, of size 3m×3(m−n); and W_2 is the second sub-Laplacian matrix, that is, the first n columns of the Laplacian matrix W of all vertices corresponding to the vertices on the boundary of the target area, of size 3m×3n.
U = (u_{0,0}, u_{0,1}, …, u_{0,n−1}) is the array of vertices on the boundary of the target area, and V = (v_{0,n}, v_{0,n+1}, …, v_{0,m−1}) is the array of vertices inside the target area.
In order to keep the Laplacian vector unchanged while interpolating the second face key points, the following needs to be optimized:

min ‖W_1 V − (L_0 − W_2 U)‖   (5)

where L_0 is the vector converted from the vertices in the target area (the Laplacian vector before the deformation).
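A toy instance of minimization (5): the boundary vertices U are held fixed, and the interior vertices V are solved in least squares so that the deformed Laplacian vector stays close to L_0. The graph, its Laplacian weights, and the coordinates are illustrative, and the example is one-dimensional for brevity:

```python
import numpy as np

# 4-vertex path graph; vertices 0 and 3 form the fixed boundary array U,
# vertices 1 and 2 form the free interior array V.
W = np.array([[ 1.0, -1.0,  0.0,  0.0],
              [-0.5,  1.0, -0.5,  0.0],
              [ 0.0, -0.5,  1.0, -0.5],
              [ 0.0,  0.0, -1.0,  1.0]])   # uniform Laplacian of the path
V0 = np.array([0.0, 1.0, 2.0, 3.0])        # coordinates before deformation
L0 = W @ V0                                 # Laplacian vector to be preserved

U = np.array([0.0, 3.5])                    # boundary after moving vertex 3
W2 = W[:, [0, 3]]                           # columns of the boundary vertices
W1 = W[:, [1, 2]]                           # columns of the interior vertices

# Solve min || W1 V - (L0 - W2 U) ||, i.e. formula (5).
V, *_ = np.linalg.lstsq(W1, L0 - W2 @ U, rcond=None)
```

Moving one boundary vertex pulls the interior vertices along in a way that best preserves the original Laplacian coordinates, which is exactly the behavior wanted for the target area.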
Suppose B is the index set of the first face key points, and q_i = (q_{i,x}, q_{i,y}), i ∈ B, are the known first face key points. With the parameters of the camera

K = ( f_x, 0, c_x ; 0, f_y, c_y ; 0, 0, 1 )

the following first reference matrix K_{i,0} can be defined:

K_{i,0} = ( f_x, 0, c_x − q_{i,x} ; 0, f_y, c_y − q_{i,y} )
According to formula (1), Π(·) is the perspective projection based on the parameters K of the camera. Let s_{0,i}, s ∈ {u, v}, be a vertex before the linear transformation, and apply the transformation to obtain the vertex s_{1,i} = R s_{0,i} + T, s ∈ {u, v}; then

Π(s_{0,i}) = ( f_x · s_{1,i,x} / s_{1,i,z} + c_x , f_y · s_{1,i,y} / s_{1,i,z} + c_y )
对于i∈B,s∈{u,v},结合第一参考矩阵K i,0,重新将第二人脸关键点进行投射投影(即重投影)的误差可以改写为: For i∈B,s∈{u,v}, combined with the first reference matrix K i,0 , the error of re-projecting the second face key points (that is, reprojection) can be rewritten as:
Figure PCTCN2022106211-appb-000008
Figure PCTCN2022106211-appb-000008
The goal of the interpolation is to minimize the reprojection error, that is, to minimize the following function:

min Σ_{i∈B} ‖Π(v_{1,i}) − q_i‖   (6)

v_{0,i}=(v_{0,i,x}, v_{0,i,y}, v_{0,i,z}) is the vertex after the linear (Laplacian) deformation, and v_{1,i}=R v_{0,i}+T is its position after the linear transformation.
Combining formula (5) and formula (6), the goal of the interpolation is to solve the following function:

min ‖W_1 V − (L_0 − W_2 U)‖ + Σ_{i∈B} ‖Π(v_{1,i}) − q_i‖   (7)
Formula (7) is a nonlinear optimization problem. To keep the reprojection error of the mesh obtained by face tracking small enough, it can be assumed that the Z-axis component of each second face key point changes very little before and after the Laplacian deformation, that is:

v_{1,i,z} ≈ u_{1,i,z}
Before the linear deformation, let u_{0,i}=(u_{0,i,x}, u_{0,i,y}, u_{0,i,z}) be a second face key point before the linear transformation, and u_{1,i}=(u_{1,i,x}, u_{1,i,y}, u_{1,i,z})=R u_{0,i}+T be the second face key point after the linear transformation. Let v_{0,i}=(v_{0,i,x}, v_{0,i,y}, v_{0,i,z}) be a vertex before the deformation, v_{1,i}=R v_{0,i}+T=(v_{1,i,x}, v_{1,i,y}, v_{1,i,z}) be the vertex after the linear transformation, and v_{1,i,z}=R_2 v_{0,i}+T_2 be the Z-axis component of the vertex after the linear transformation, where R_2 is the Z (third) row of the rotation matrix R and T_2 is the Z component of the translation vector T.
Likewise, u_{1,i,z}=R_2 u_{0,i}+T_2 is the Z-axis component of the second face key point after the linear transformation.
If formula (7) is approximated in the following way, a distorted result is obtained:

Π(v_{1,i}) − q_i ≈ K_{i,0}(R v_{0,i} + T)   (8)

min ‖W_1 V − (L_0 − W_2 U)‖ + Σ_{i∈B} ‖K_{i,0}(R v_{0,i} + T)‖   (9)
According to formula (6):

K_{i,0}(R v_{0,i} + T) = v_{1,i,z}(Π(v_{1,i}) − q_i)

so minimizing ‖K_{i,0}(R v_{0,i}+T)‖ is minimizing ‖v_{1,i,z}(Π(v_{1,i})−q_i)‖.
Therefore, formula (8) and formula (9) minimize ‖Π(v_{1,i})−q_i‖ and ‖v_{1,i,z}‖ simultaneously; that is, while making the reprojected third face key points coincide with the first face key points corresponding to the second face key points in the target area, they also minimize ‖v_{1,i,z}‖, which causes the distortion.
As shown in Figure 3A, if the second face key points of the eye are interpolated in this way, so that the reprojected third face key points coincide with the corresponding first face key points, both the frontal view of the right eye shown in Figure 3C and the top view of the right eye shown in Figure 3E linearly approximate the right eye incorrectly, resulting in distortion.
In an embodiment of the present application, step 104 includes the following steps.
Step 1041: set a deformation of the target area that maps the vertices inside the target area to three-dimensional target points.
In this embodiment, the target area in the face data can be deformed, e.g., by Laplacian deformation, so that in three-dimensional space the vertices inside the target area are mapped to target points v_{0,i}=(v_{0,i,x}, v_{0,i,y}, v_{0,i,z}); the target points are the unknowns to be solved.
Step 1042: calculate the difference between the first vector and the second vector as the vector difference.
In this embodiment, a linear transformation can be applied to the vertices in the target area, and the difference between the first vector and the second vector is calculated in the vector space after the linear transformation as the vector difference.
The first vector is the vector converted from the vertices in the target area, and the second vector is the vector converted from the target points.
Exemplarily, applying Laplacian deformation to the vertices in the target area gives the first vector L_0:

L_0 = WP = W_2 P_2 + W_1 P_1

W is the Laplacian matrix of all vertices in the face data; W_1 is the first sub-Laplacian matrix, i.e., the matrix that remains after removing from W the first n columns, which correspond to the vertices on the boundary of the target area; W_2 is the second sub-Laplacian matrix, i.e., those first n columns of W.
P=(p_0, p_1, …, p_{m−1}) are the vertices of the face data in the target area, where the first n vertices are those located on the boundary of the target area; P_1=(p_n, p_{n+1}, …, p_{m−1}) are the vertices located inside the target area, and P_2=(p_0, p_1, …, p_{n−1}) are the vertices located on the boundary of the target area.
The Laplacian matrix W, the first sub-Laplacian matrix W_1 and the second sub-Laplacian matrix W_2 are initialized before the whole face tracking starts, based on the vertices of the neutral face. This avoids recomputing W, W_1 and W_2 for every frame of image data and thus speeds up the linear deformation.
Applying Laplacian deformation to the target points gives the second vector W_2 U + W_1 V.
U=(u_{0,0}, u_{0,1}, …, u_{0,n−1}) are the vertices located on the boundary of the target area, and V=(v_{0,n}, v_{0,n+1}, …, v_{0,m−1}) are the vertices located inside the target area.
Subtracting the second vector from the first vector gives the vector difference W_1 V − (L_0 − W_2 U).
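As a minimal numeric sketch of steps 1041 and 1042 (numpy, with a random stand-in Laplacian and toy vertex counts rather than the patent's actual face mesh), the column split of W and the vector difference can be written as:

```python
import numpy as np

def split_laplacian(W, n):
    """Split the full Laplacian W (3m x 3m) into W2 (first 3n columns,
    boundary vertices) and W1 (remaining columns, interior vertices)."""
    W2 = W[:, :3 * n]
    W1 = W[:, 3 * n:]
    return W1, W2

# Toy target area: m = 4 vertices, of which n = 1 lies on the boundary.
m, n = 4, 1
rng = np.random.default_rng(0)
W = rng.standard_normal((3 * m, 3 * m))   # stand-in Laplacian, built once before tracking
W1, W2 = split_laplacian(W, n)

# Tracked vertex coordinates, boundary block P2 first, interior block P1 after.
P2 = rng.standard_normal(3 * n)
P1 = rng.standard_normal(3 * (m - n))
P = np.concatenate([P2, P1])

L0 = W @ P                        # first vector: Laplacian of the tracked vertices
U, V = P2, P1                     # boundary stays fixed; V is the interior candidate
diff = W1 @ V - (L0 - W2 @ U)     # vector difference of formula (5)
```

When V equals the tracked interior block itself, the difference is zero; the optimization then seeks the V that keeps this Laplacian as unchanged as possible while also satisfying the reprojection constraints.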
Step 1043: calculate the difference between the first face key points corresponding to the second face key points in the target area and the third face key points, as the reprojection difference.
In this embodiment, projecting the target points into the face data yields face key points, denoted third face key points; that is, a third face key point is the two-dimensional face key point obtained by perspective projection of a target point. The difference between the first face key points corresponding to the second face key points in the target area and the third face key points is calculated and denoted the reprojection difference. The reprojection difference serves as a constraint term used to control the interpolation of the second face key points.
In implementation, let q_i=(q_{i,x}, q_{i,y}), i∈B, be the first face key points; the first reference matrix K_{i,0} can then be calculated. The first reference matrix is the product of the identity matrix augmented with the first face key point and the parameters of the camera, where the camera is used to collect the first face data:

K_{i,0} = [1 0 −q_{i,x}; 0 1 −q_{i,y}] K

where [1 0 −q_{i,x}; 0 1 −q_{i,y}] is the identity matrix augmented with the key point, and K is the matrix of camera parameters.
The second reference matrix K_i is then calculated; the second reference matrix is the ratio between the first reference matrix and the Z-axis component of the linearly transformed target point:

K_i = K_{i,0} / u_{1,i,z}

(by the assumption above, v_{1,i,z} ≈ u_{1,i,z}, so the known component u_{1,i,z} is used).
Before the linear deformation, let u_{0,i}=(u_{0,i,x}, u_{0,i,y}, u_{0,i,z}) be a second face key point before the linear transformation, and u_{1,i}=(u_{1,i,x}, u_{1,i,y}, u_{1,i,z})=R u_{0,i}+T be the second face key point after the linear transformation, where R is the rotation matrix and T is the translation vector, i.e., both R and T are used for the linear transformation; u_{1,i,z}=R_2 u_{0,i}+T_2 is the Z-axis component of the second face key point after the linear transformation.
At this point, the first target matrix J_{i,0} and the second target matrix J_{i,1} can be calculated.
The first target matrix is the product of the second reference matrix and the rotation matrix, J_{i,0}=K_i R, and the second target matrix is the negative of the product of the second reference matrix and the translation vector, J_{i,1}=−K_i T.
Subtracting the second target matrix from the product of the first target matrix and the target point gives the reprojection difference J_{i,0} v_{0,i} − J_{i,1}.
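The matrices of step 1043 can be sketched as follows. The pinhole camera matrix K (focal lengths and principal point) and the [I | −q_i] construction of the first reference matrix are illustrative assumptions standing in for the patent's camera parameters:

```python
import numpy as np

def reference_and_target_matrices(q, K, R, T, u0):
    """Build K_{i,0}, K_i, J_{i,0}, J_{i,1} for one key point q."""
    # First reference matrix: identity augmented with -q, times camera matrix K.
    Iq = np.array([[1.0, 0.0, -q[0]],
                   [0.0, 1.0, -q[1]]])
    Ki0 = Iq @ K
    # Z component of the linearly transformed key point: u_{1,z} = R_2 u0 + T_2.
    u1z = R[2] @ u0 + T[2]
    # Second reference matrix: ratio of K_{i,0} to that Z component.
    Ki = Ki0 / u1z
    # Target matrices: J_{i,0} = K_i R, J_{i,1} = -K_i T.
    return Ki0, Ki, Ki @ R, -(Ki @ T)

# Illustrative camera intrinsics, pose and points.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R, T = np.eye(3), np.array([0.0, 0.0, 5.0])
u0 = np.array([0.1, -0.2, 1.0])          # key point before the linear transformation
q = np.array([330.0, 250.0])             # matching 2D first face key point
Ki0, Ki, Ji0, Ji1 = reference_and_target_matrices(q, K, R, T, u0)

v0 = np.array([0.05, -0.1, 1.2])         # a candidate target point
reproj_diff = Ji0 @ v0 - Ji1             # equals K_i (R v0 + T)
```

The first assertion below checks the rewriting of the reprojection error, K_{i,0} v_1 = v_{1,z}(Π(v_1) − q); the second checks that J_{i,0} v_0 − J_{i,1} is indeed K_i(R v_0 + T).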
Step 1044: calculate the distance the target point moves along the Z axis.
In this embodiment, the distance the target point moves along the Z axis can be calculated and used as a constraint term to limit the movement of the mesh along the Z axis.
In implementation, the product of the Z-axis component R_2 of the rotation matrix and the target point v_{0,i} is calculated as the first intermediate value; the difference between the Z-axis component u_{1,i,z} of the linearly transformed second face key point and the Z-axis component T_2 of the translation vector is calculated as the second intermediate value; and the difference between the first intermediate value and the second intermediate value is the distance the target point moves along the Z axis, R_2 v_{0,i} − (u_{1,i,z} − T_2).
Step 1045: linearly fuse the vector difference, the reprojection difference and the distance into the objective function.
In this embodiment, the vector difference, the reprojection difference and the distance can be linearly fused and set as the objective function.
In implementation, a first weight is assigned to the reprojection difference and a second weight to the distance, and the sum of the vector difference, the weighted reprojection difference and the weighted distance is calculated as the objective function, expressed as follows:

min ‖W_1 V − (L_0 − W_2 U)‖ + α Σ_{i∈B} ‖J_{i,0} v_{0,i} − J_{i,1}‖ + β Σ_{i∈B} ‖R_2 v_{0,i} − (u_{1,i,z} − T_2)‖

α is the first weight, and β is the second weight.
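The linear fusion of step 1045 can be sketched as a single function (a minimal sketch with hypothetical weights; the inputs are assumed to come from the previous steps, and the keypoint quantities are passed per point for clarity):

```python
import numpy as np

def objective(V, W1, L0, W2, U, terms, R, T, alpha=1.0, beta=0.5):
    """Linear fusion of step 1045: vector difference plus weighted reprojection
    difference plus weighted Z-axis distance. `terms` is a list of
    (Ji0, Ji1, v0, u1z) tuples, one per key point."""
    lap = np.linalg.norm(W1 @ V - (L0 - W2 @ U))                       # vector difference
    reproj = sum(np.linalg.norm(Ji0 @ v0 - Ji1) for Ji0, Ji1, v0, _ in terms)
    zdist = sum(abs(R[2] @ v0 - (u1z - T[2])) for _, _, v0, u1z in terms)
    return lap + alpha * reproj + beta * zdist

# Sanity check: a perfectly satisfied system gives objective value 0.
W1, L0, W2, U = np.eye(3), np.zeros(3), np.zeros((3, 3)), np.zeros(3)
terms = [(np.zeros((2, 3)), np.zeros(2), np.zeros(3), 0.0)]
val = objective(np.zeros(3), W1, L0, W2, U, terms, np.eye(3), np.zeros(3))
```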
Step 1046: solve for the target points with the goal of minimizing the objective function.
In this embodiment, the optimization problem (7) can be approximated by the least-squares method, minimizing the objective function to solve for the coordinates of the target points. Let the array of vertices (target points) be V=(v_{0,n}, v_{0,n+1}, …, v_{0,m−1}); the solution process is then expressed as:

V = argmin ( ‖W_1 V − (L_0 − W_2 U)‖ + α Σ_{i∈B} ‖J_{i,0} v_{0,i} − J_{i,1}‖ + β Σ_{i∈B} ‖R_2 v_{0,i} − (u_{1,i,z} − T_2)‖ )
As shown in Figure 3A, when the second face key points of the eye are interpolated in this way, so that the reprojected third face key points coincide with the corresponding first face key points, both the frontal view of the right eye shown in Figure 3C and the top view of the right eye shown in Figure 3E linearly approximate the right eye correctly, without producing distortion.
In implementation, a first sparse matrix A and a second sparse matrix b can be constructed.
The first sparse matrix A comprises the first sub-Laplacian matrix, the products of the first weight α and the first target matrices J_{i,0}, and the products of the second weight β and the Z-axis component R_2 of the rotation matrix; A is expressed as the block-row stack:

A = [ W_1 ; α J_{i,0}, i∈B ; β R_2, i∈B ]
The second sparse matrix b comprises the difference obtained by subtracting from the first vector L_0 the product of the second sub-Laplacian matrix W_2 and the boundary vertices U, the products of the first weight α and the second target matrices J_{i,1}, and the products of the second weight β and the second intermediate values (u_{1,i,z} − T_2); b is expressed as the corresponding stack:

b = [ L_0 − W_2 U ; α J_{i,1}, i∈B ; β (u_{1,i,z} − T_2), i∈B ]
A target relationship is set; the target relationship is a linear equation in which the product of the first sparse matrix and the target points equals the second sparse matrix, i.e., AV = b (11).
The target points can then be solved based on the target relationship by a sparse solver such as SimplicialLDLT (the built-in direct solver provided by Eigen for direct LDLT decomposition) or the conjugate gradient method (ConjugateGradient).
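A minimal sketch of assembling the stacked system AV=b and solving it with a sparse solver. SciPy's sparse `spsolve` on the normal equations plays the role of Eigen's SimplicialLDLT here, and the toy block sizes are illustrative:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_target_points(A, b):
    """Least-squares solve of the stacked system A V = b through the normal
    equations A^T A V = A^T b (the role a direct LDLT solver plays)."""
    AtA = (A.T @ A).tocsc()
    Atb = A.T @ b
    return spla.spsolve(AtA, Atb)

# Toy stacked system: Laplacian rows, reprojection rows, Z-distance row.
rng = np.random.default_rng(1)
W1_rows = sp.csr_matrix(rng.standard_normal((6, 3)))   # W_1 block
J_rows = sp.csr_matrix(rng.standard_normal((2, 3)))    # alpha * J_{i,0} block
z_row = sp.csr_matrix(rng.standard_normal((1, 3)))     # beta * R_2 block
A = sp.vstack([W1_rows, J_rows, z_row]).tocsr()

V_true = np.array([0.3, -0.7, 1.1])
b = A @ V_true                      # consistent right-hand side
V = solve_target_points(A, b)       # recovers V_true exactly in this case
```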
Considering that the interior vertices V_0=(u_{0,n}, u_{0,n+1}, …, u_{0,m−1}) of the target area obtained by tracking the face data are already close to the target points, the vertices V_0 inside the target area can be set as the initial values of the target points. The goal of the linear deformation in this embodiment is to make the reprojected third face key points coincide with the corresponding first face key points, i.e., Π(v_{1,i})=q_i, i∈B. In practice, it is sufficient on the face data to keep a certain error (e.g., within 1 pixel) between Π(v_{1,i}) and q_i. That is, an iterative algorithm can be used to solve the target relationship (11) with a relatively large threshold, iteratively updating the target points on the basis of the initial values until the difference between the third face key points and the first face key points corresponding to the second face key points in the target area is smaller than the preset threshold, thereby increasing the computation speed and reducing the computation time.
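The warm start with V_0 and a loose stopping threshold can be sketched with SciPy's conjugate-gradient solver on the normal equations (an illustrative stand-in for Eigen's ConjugateGradient; the toy system below is hypothetical):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_with_warm_start(A, b, V0):
    """Conjugate-gradient solve of A^T A V = A^T b, warm-started from the
    tracked interior vertices V0 (scipy's default stopping tolerance is
    1e-5, matching the threshold used in the embodiment)."""
    AtA = (A.T @ A).tocsr()
    Atb = A.T @ b
    V, info = spla.cg(AtA, Atb, x0=V0)
    return V, info              # info == 0 signals convergence

rng = np.random.default_rng(2)
A = sp.csr_matrix(rng.standard_normal((9, 3)))
V_true = np.array([0.2, 0.5, -0.4])
b = A @ V_true
V0 = V_true + 0.01 * rng.standard_normal(3)   # tracked vertices near the optimum
V, info = solve_with_warm_start(A, b, V0)
```

Starting close to the solution lets the iteration terminate after very few steps, which is the source of the speed-up reported in the table below.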
Exemplarily, this embodiment uses ConjugateGradient for the solution, with the initial value V_0 and a threshold of 1e-5. As can be seen from the table below, this strategy takes half the time of the SimplicialLDLT algorithm, and there is no obvious difference in the results.
Figure PCTCN2022106211-appb-000024 (table comparing the time consumption of SimplicialLDLT and ConjugateGradient)
Tests show that performing Laplacian deformation on the two target areas of the left eye and the right eye in this embodiment takes 1 ms on a mobile phone, which guarantees real-time performance.
In this embodiment, two-dimensional first face data is acquired, the first face data having two-dimensional first face key points; three-dimensional second face data is fitted according to the first face key points, the second face data having three-dimensional second face key points; a local region of the second face data is selected as the target area; and the target area is linearly deformed so that, when the second face key points in the target area are perspective-projected to two-dimensional third face key points, the third face key points overlap the first face key points corresponding to the second face key points in the target area. The optimization problem arising when fitting the face data is thereby adjusted into a linear optimization problem, which is simpler to handle and has a lower computational cost; this greatly reduces the computation time and enables real-time processing on devices with relatively scarce resources.
It should be noted that, for simplicity of description, the method embodiments are all expressed as series of action combinations. However, those skilled in the art should know that the embodiments of the present application are not limited by the described order of actions, because according to the embodiments of the present application some steps may be performed in other orders or simultaneously. Those skilled in the art should also know that the embodiments described in the specification are all optional embodiments, and the actions involved are not necessarily required by the embodiments of the present application.
Embodiment Two
Figure 4 is a structural block diagram of an apparatus for interpolating face key points provided in Embodiment 2 of the present application. The apparatus may include the following modules.
The two-dimensional face data acquisition module 401 is configured to acquire two-dimensional first face data, the first face data having two-dimensional first face key points.
The three-dimensional face data fitting module 402 is configured to fit three-dimensional second face data according to the first face key points, the second face data having three-dimensional second face key points.
The target area selection module 403 is configured to select a local region of the second face data as the target area.
The target area deformation module 404 is configured to linearly deform the target area so that, when the second face key points in the target area are perspective-projected to two-dimensional third face key points, the third face key points overlap the first face key points corresponding to the second face key points in the target area.
In an embodiment of the present application, the second face data is represented in the form of a mesh, the mesh having a plurality of three-dimensional vertices, some of which are the second face key points.
The target area deformation module 404 includes: a vertex mapping module, configured to set a deformation of the target area that maps the vertices inside the target area to three-dimensional target points; a vector difference calculation module, configured to calculate the difference between a first vector and a second vector as the vector difference, the first vector being the vector converted from the vertices in the target area and the second vector being the vector converted from the target points; a reprojection difference calculation module, configured to calculate the difference between the first face key points corresponding to the second face key points in the target area and the third face key points as the reprojection difference, a third face key point being the two-dimensional face key point obtained by perspective projection of a target point; a movement distance calculation module, configured to calculate the distance the target point moves along the Z axis; a linear fusion module, configured to linearly fuse the vector difference, the reprojection difference and the distance into the objective function; and a target point solving module, configured to solve for the target points with the goal of minimizing the objective function.
In an embodiment of the present application, the vector difference calculation module is configured to: apply Laplacian deformation to the vertices to obtain the first vector; apply Laplacian deformation to the target points to obtain the second vector; and subtract the second vector from the first vector to obtain the vector difference.
In an embodiment of the present application, the reprojection difference calculation module is configured to: calculate the first reference matrix, the first reference matrix being the product of the identity matrix augmented with the first face key point corresponding to the second face key point in the target area and the parameters of the camera, the camera being configured to collect the first face data; calculate the second reference matrix, the second reference matrix being the ratio between the first reference matrix and the Z-axis component of the linearly transformed target point; calculate the first target matrix and the second target matrix, the first target matrix being the product of the second reference matrix and the rotation matrix, and the second target matrix being the negative of the product of the second reference matrix and the translation vector, both the rotation matrix and the translation vector being used for the linear transformation; and subtract the second target matrix from the product of the first target matrix and the target point to obtain the reprojection difference.
In an embodiment of the present application, the movement distance calculation module is configured to: calculate the product of the Z-axis component of the rotation matrix and the target point as the first intermediate value; calculate the difference between the Z-axis component of the linearly transformed second face key point in the target area and the Z-axis component of the translation vector as the second intermediate value; and calculate the difference between the first intermediate value and the second intermediate value as the distance the target point moves along the Z axis.
In an embodiment of the present application, the linear fusion module is configured to: assign a first weight to the reprojection difference; assign a second weight to the distance; and calculate the sum of the vector difference, the reprojection difference weighted by the first weight and the distance weighted by the second weight as the objective function.
In an embodiment of the present application, the target point solving module is configured to: construct a first sparse matrix A and a second sparse matrix b, the first sparse matrix comprising the first sub-Laplacian matrix, the products of the first weight and the first target matrices J_{i,0}, and the products of the second weight and the Z-axis component of the rotation matrix, where the first sub-Laplacian matrix is the matrix that remains after removing from the Laplacian matrix of all vertices the first n columns corresponding to the vertices on the boundary of the target area, and the second sparse matrix comprising the differences obtained by subtracting from the first vector the products of the second sub-Laplacian matrix and the vertices on the boundary of the target area, the products of the first weight and the second target matrices J_{i,1}, and the products of the second weight and the second intermediate values, where the second sub-Laplacian matrix is the matrix formed by the first n columns of the Laplacian matrix of all vertices, corresponding to the vertices on the boundary of the target area; set a target relationship, the target relationship being that the product of the first sparse matrix and the target points equals the second sparse matrix; and solve for the target points based on the target relationship.
In an embodiment of the present application, the target point solving module is configured to: set the vertices inside the target area as the initial values of the target points; and iteratively update the target points on the basis of the initial values until the difference between the third face key points and the first face key points corresponding to the third face key points is smaller than a preset threshold.
In an embodiment of the present application, the two-dimensional face data acquisition module 401 includes: a video data collection module, configured to invoke a camera to collect video data, the video data containing multiple frames of image data; and a face tracking module, configured to track the two-dimensional first face data in the multiple frames of image data.
The apparatus for interpolating face key points provided in the embodiments of the present application can execute the method for interpolating face key points provided in any embodiment of the present application, and has the functional modules corresponding to executing the method.
Embodiment Three
Figure 5 is a schematic structural diagram of a computer device provided in Embodiment 3 of the present application. Figure 5 shows a block diagram of an exemplary computer device 12 suitable for implementing the embodiments of the present application. The computer device 12 shown in Figure 5 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
As shown in Figure 5, the computer device 12 takes the form of a general-purpose computing device. The components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting the various system components (including the system memory 28 and the processing unit 16).
The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 12 includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer device 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 30 and/or a cache memory 32. The computer device 12 may also include other removable/non-removable, volatile/non-volatile computer-system storage media. By way of example only, the storage system 34 may be configured to read from and write to a non-removable, non-volatile magnetic medium (not shown in Figure 5, commonly called a "hard disk drive"). Although not shown in Figure 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disc drive for reading from and writing to a removable non-volatile optical disc (e.g., a Compact Disc Read-Only Memory (CD-ROM), a Digital Video Disc Read-Only Memory (DVD-ROM) or other optical media) may be provided. In these cases, each drive may be connected to the bus 18 through one or more data-media interfaces. The memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to perform the functions of the embodiments of the present application.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described in the present application.
Computer device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with computer device 12, and/or with any device (e.g., a network card, a modem, etc.) that enables computer device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. Computer device 12 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 20. As shown in FIG. 5, network adapter 20 communicates with the other modules of computer device 12 via bus 18. It should be understood that, although not shown in FIG. 5, other hardware and/or software modules may be used in conjunction with computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
Processing unit 16 executes a variety of functional applications and data processing by running the programs stored in system memory 28, for example, implementing the method for interpolating face key points provided by the embodiments of the present application.
Embodiment 4
Embodiment 4 of the present application further provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, the computer program implements the processes of the above method for interpolating face key points; to avoid repetition, details are not described here again.
The computer-readable storage medium may include, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a RAM, a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device.

Claims (11)

  1. A method for interpolating face key points, comprising:
    acquiring two-dimensional first face data, the first face data containing two-dimensional first face key points;
    fitting three-dimensional second face data according to the first face key points, the second face data having three-dimensional second face key points;
    selecting a local area in the second face data as a target area;
    performing linear deformation on the target area such that, when the second face key points in the target area are perspective-projected onto two-dimensional third face key points, the third face key points overlap the first face key points corresponding to the second face key points in the target area.
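The overlap condition in claim 1 can be checked with a standard pinhole projection: a deformed 3D key point is perspective-projected and compared with its 2D counterpart. A minimal sketch in NumPy; the intrinsics `K`, rotation `R`, and translation `t` are illustrative values, not taken from the patent:

```python
import numpy as np

def perspective_project(points_3d, K, R, t):
    """Project Nx3 points through a pinhole camera: uv = K(Rp + t) / z."""
    cam = points_3d @ R.T + t        # camera-space coordinates
    uvw = cam @ K.T                  # homogeneous image coordinates
    return uvw[:, :2] / uvw[:, 2:3]  # perspective divide by depth (Z)

K = np.array([[500.0, 0.0, 160.0],
              [0.0, 500.0, 120.0],
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
pts = np.array([[0.0, 0.0, 0.0],     # a deformed 3D key point
                [0.1, -0.2, 0.3]])
uv = perspective_project(pts, K, R, t)
# The deformation succeeds when uv overlaps the 2D first face key points.
```

The projected `uv` of the deformed target area plays the role of the "third face key points" in the claim.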
  2. The method according to claim 1, wherein the second face data is represented as a mesh, the mesh has a plurality of three-dimensional vertices, and some of the vertices are the second face key points;
    the performing linear deformation on the target area such that, when the second face key points in the target area are perspective-projected onto two-dimensional third face key points, the third face key points overlap the first face key points corresponding to the second face key points in the target area comprises:
    setting a deformation of the target area that maps the vertices inside the target area to three-dimensional target points;
    computing a difference between a first vector and a second vector as a vector difference, the first vector being a vector transformed from the vertices in the target area, and the second vector being a vector transformed from the target points;
    computing a difference between the first face key points corresponding to the second face key points in the target area and the third face key points as a reprojection difference, the third face key points being the two-dimensional face key points obtained by perspective-projecting the target points;
    computing a distance by which the target points move in the Z-axis direction;
    linearly fusing the vector difference, the reprojection difference, and the distance into an objective function;
    solving for the target points with the goal of minimizing the objective function.
  3. The method according to claim 2, wherein the computing a difference between a first vector and a second vector as a vector difference comprises:
    performing Laplacian deformation on the vertices in the target area to obtain the first vector;
    performing Laplacian deformation on the target points to obtain the second vector;
    subtracting the second vector from the first vector to obtain the vector difference.
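The vector difference in claim 3 compares Laplacian (differential) coordinates before and after deformation. A toy sketch with a uniform-weight graph Laplacian; the patent does not specify the weighting scheme, so umbrella weights are an assumption (cotangent weights would be a common alternative for meshes):

```python
import numpy as np

def graph_laplacian(adj):
    """Uniform-weight graph Laplacian L = D - A (assumed weighting)."""
    return np.diag(adj.sum(axis=1)) - adj

adj = np.array([[0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]], dtype=float)     # a single triangle
verts = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])          # vertices in the target area
targets = verts + np.array([0.0, 0.0, 0.1])  # candidate target points

L = graph_laplacian(adj)
vector_difference = L @ verts - L @ targets  # first vector minus second vector
# A pure translation changes no local shape, so the difference is zero here.
```

Zero vector difference means the deformation preserved the local surface detail, which is exactly what the Laplacian term rewards.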
  4. The method according to claim 2, wherein the computing a difference between the first face key points corresponding to the second face key points in the target area and the third face key points as a reprojection difference comprises:
    computing a first reference matrix, the first reference matrix being the product of an identity matrix containing the first face key points corresponding to the second face key points in the target area and the parameters of a camera, the camera being configured to capture the first face data;
    computing a second reference matrix, the second reference matrix being the ratio of the first reference matrix to the Z-axis component of the linearly transformed target points;
    computing a first target matrix and a second target matrix, the first target matrix being the product of the second reference matrix and a rotation matrix, and the second target matrix being the negation of the product of the second reference matrix and a translation vector, where both the rotation matrix and the translation vector are used for the linear transformation;
    subtracting the second target matrix from the product of the first target matrix and the target points to obtain the reprojection difference.
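One plausible reading of claim 4 is the standard linearized reprojection residual K(Rp + t)/z compared with the observed 2D key point. The claim folds the observed key point into the first reference matrix; the sketch below subtracts it explicitly for clarity, which is an interpretive choice, and all numeric values are illustrative:

```python
import numpy as np

K = np.array([[500.0, 0.0, 160.0],
              [0.0, 500.0, 120.0],
              [0.0, 0.0, 1.0]])            # camera parameters (assumed)
R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
p = np.array([0.0, 0.0, 0.0])             # candidate target point
x_obs = np.array([160.0, 120.0])          # corresponding 2D first key point

z = (R @ p + t)[2]      # Z-axis component of the linearly transformed point
M = K[:2] / z           # "second reference matrix": camera rows over depth
M1 = M @ R              # first target matrix
M2 = -(M @ t)           # second target matrix: negated product with t
reproj_diff = (M1 @ p - M2) - x_obs   # claim 4's expression vs. the 2D point
```

Dividing by the current depth `z` turns the perspective projection into a linear expression in `p`, which is what makes the sparse solve of claim 7 possible.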
  5. The method according to claim 2, wherein the computing a distance by which the target points move in the Z-axis direction comprises:
    computing the product of the Z-axis component of a rotation matrix and the target points as a first intermediate value;
    computing the difference between the Z-axis component of the linearly transformed second face key points in the target area and the Z-axis component of a translation vector as a second intermediate value;
    computing the difference between the first intermediate value and the second intermediate value as the distance by which the target points move in the Z-axis direction.
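After the translation cancels, claim 5's distance reduces to the gap between the rotated depth of the target point and that of the original key point. A sketch under assumed `R` and `t`:

```python
import numpy as np

R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
v = np.array([0.1, 0.2, 0.3])    # original second face key point (3D)
p = np.array([0.1, 0.2, 0.45])   # candidate target point

first = R[2] @ p                 # Z row of the rotation matrix times the target
second = (R @ v + t)[2] - t[2]   # transformed depth minus the translation's Z
z_distance = first - second      # how far the point moved along camera Z
```

Penalizing this quantity keeps the deformation from sliding the face along the camera axis, where a perspective projection alone would not constrain it.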
  6. The method according to claim 2, wherein the linearly fusing the vector difference, the reprojection difference, and the distance into an objective function comprises:
    assigning a first weight to the reprojection difference;
    assigning a second weight to the distance;
    computing the sum of the vector difference, the reprojection difference weighted by the first weight, and the distance weighted by the second weight as the objective function.
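Claim 6's linear fusion is a weighted sum of the three terms. The squared-norm penalty below is an assumption — the claim fixes only the weighting, not the norm:

```python
import numpy as np

def objective(vector_diff, reproj_diff, z_dist, w1, w2):
    """E = ||vector_diff||^2 + w1 * ||reproj_diff||^2 + w2 * ||z_dist||^2."""
    return (np.sum(np.square(vector_diff))
            + w1 * np.sum(np.square(reproj_diff))
            + w2 * np.sum(np.square(z_dist)))

# Toy values: zero shape term, unit reprojection error, half-unit depth shift.
E = objective(np.zeros(3), np.array([1.0, -1.0]), np.array([0.5]),
              w1=2.0, w2=4.0)
```

Raising `w1` pulls the projection harder onto the detected 2D key points; raising `w2` anchors the depth, at the cost of shape fidelity from the Laplacian term.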
  7. The method according to any one of claims 2-6, wherein the solving for the target points with the goal of minimizing the objective function comprises:
    constructing a first sparse matrix and a second sparse matrix, wherein the first sparse matrix includes a first sub-Laplacian matrix, the product of the first weight and the first target matrix, and the product of the second weight and the Z-axis component of the rotation matrix, the first sub-Laplacian matrix being the matrix remaining after removing, from the Laplacian matrix of all vertices, the first n columns corresponding to the vertices on the boundary of the target area; and the second sparse matrix includes the difference obtained by subtracting, from the first vector, the product of a second sub-Laplacian matrix and the vertices on the boundary of the target area, the product of the first weight and the second target matrix, and the product of the second weight and the second intermediate value, the second sub-Laplacian matrix being the first n columns, corresponding to the vertices on the boundary of the target area, of the Laplacian matrix of all vertices;
    setting a target relationship, the target relationship being that the product of the first sparse matrix and the objective function equals the second sparse matrix;
    solving for the target points based on the target relationship.
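Claim 7 stacks the three terms into one sparse system A x = b (the "target relationship") and solves for the unknown target-point coordinates x in least squares. The tiny matrices below are placeholders for the sub-Laplacian, weighted-projection, and weighted-depth blocks, not the patent's actual assembly:

```python
import numpy as np
from scipy.sparse import csr_matrix, vstack
from scipy.sparse.linalg import lsqr

L_free = csr_matrix([[2.0, -1.0],
                     [-1.0, 2.0]])     # first sub-Laplacian block (interior)
A_rep = csr_matrix([[1.5, 0.0]])       # first weight times a projection row
A_z = csr_matrix([[0.0, 0.5]])         # second weight times a rotation-Z row
A = vstack([L_free, A_rep, A_z]).tocsr()   # first sparse matrix
b = np.array([1.0, 0.0, 1.5, 0.25])        # stacked right-hand side

x = lsqr(A, b)[0]                   # least-squares solution for the unknowns
residual_grad = A.T @ (A @ x - b)   # normal-equation residual, ~0 at optimum
```

Sparse least squares keeps the solve tractable even when the target area contains thousands of vertices, since each Laplacian row touches only a vertex's neighbors.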
  8. The method according to claim 7, wherein the solving for the target points based on the target relationship comprises:
    setting the vertices inside the target area as initial values of the target points;
    iteratively updating the target points on the basis of the initial values until the difference between the third face key points and the first face key points corresponding to the second face key points in the target area is less than a preset threshold.
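Claim 8's loop — initialize the target point at the interior vertex, then update until the projection lands within a threshold of the 2D key point — might look like the sketch below. The Gauss-Newton-style step rule here is illustrative; the patent's actual per-iteration update is the sparse solve of claim 7:

```python
import numpy as np

K = np.array([[500.0, 0.0, 160.0],
              [0.0, 500.0, 120.0],
              [0.0, 0.0, 1.0]])           # assumed camera parameters
R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
observed = np.array([170.0, 110.0])       # first (2D) face key point

def project(p):
    cam = R @ p + t
    return (K @ cam)[:2] / cam[2]

p = np.array([0.0, 0.0, 0.0])   # initial value: the interior vertex itself
threshold = 1e-4
for _ in range(50):
    err = project(p) - observed
    if np.linalg.norm(err) < threshold:   # claim 8's stopping criterion
        break
    depth = (R @ p + t)[2]
    p[:2] -= err * depth / K[0, 0]   # Gauss-Newton step in x, y (fx == fy)
final_error = np.linalg.norm(project(p) - observed)
```

Because the projection is re-linearized at each iterate, a few iterations normally suffice to drive the reprojection error below the threshold.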
  9. An apparatus for interpolating face key points, comprising:
    a two-dimensional face data acquisition module, configured to acquire two-dimensional first face data, the first face data containing two-dimensional first face key points;
    a three-dimensional face data fitting module, configured to fit three-dimensional second face data according to the first face key points, the second face data having three-dimensional second face key points;
    a target area selection module, configured to select a local area in the second face data as a target area;
    a target area deformation module, configured to perform linear deformation on the target area such that, when the second face key points in the target area are perspective-projected onto two-dimensional third face key points, the third face key points overlap the first face key points corresponding to the second face key points in the target area.
  10. A computer device, comprising:
    at least one processor; and
    a memory configured to store at least one program,
    wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the method for interpolating face key points according to any one of claims 1-8.
  11. A computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the method for interpolating face key points according to any one of claims 1-8 is implemented.
PCT/CN2022/106211 2021-07-23 2022-07-18 Face key point interpolation method and apparatus, computer device, and storage medium WO2023001095A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110836320.0 2021-07-23
CN202110836320.0A CN113362231A (en) 2021-07-23 2021-07-23 Interpolation method and device for key points of human face, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2023001095A1 true WO2023001095A1 (en) 2023-01-26

Family

ID=77540217

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/106211 WO2023001095A1 (en) 2021-07-23 2022-07-18 Face key point interpolation method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN113362231A (en)
WO (1) WO2023001095A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117593493A (en) * 2023-09-27 2024-02-23 书行科技(北京)有限公司 Three-dimensional face fitting method, three-dimensional face fitting device, electronic equipment and storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113362231A (en) * 2021-07-23 2021-09-07 百果园技术(新加坡)有限公司 Interpolation method and device for key points of human face, computer equipment and storage medium
CN114037814B (en) * 2021-11-11 2022-12-23 北京百度网讯科技有限公司 Data processing method, device, electronic equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764180A (en) * 2018-05-31 2018-11-06 Oppo广东移动通信有限公司 Face identification method, device, electronic equipment and readable storage medium
CN108765351A (en) * 2018-05-31 2018-11-06 Oppo广东移动通信有限公司 Image processing method, device, electronic equipment and storage medium
CN110533777A (en) * 2019-08-01 2019-12-03 北京达佳互联信息技术有限公司 Three-dimensional face images modification method, device, electronic equipment and storage medium
US20210089836A1 (en) * 2019-09-24 2021-03-25 Toyota Research Institute, Inc. Systems and methods for training a neural keypoint detection network
CN113362231A (en) * 2021-07-23 2021-09-07 百果园技术(新加坡)有限公司 Interpolation method and device for key points of human face, computer equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9881203B2 (en) * 2013-08-29 2018-01-30 Nec Corporation Image processing device, image processing method, and program
CN109146769A (en) * 2018-07-24 2019-01-04 北京市商汤科技开发有限公司 Image processing method and device, image processing equipment and storage medium
CN109685873B (en) * 2018-12-14 2023-09-05 广州市百果园信息技术有限公司 Face reconstruction method, device, equipment and storage medium


Also Published As

Publication number Publication date
CN113362231A (en) 2021-09-07

Similar Documents

Publication Publication Date Title
WO2023001095A1 (en) Face key point interpolation method and apparatus, computer device, and storage medium
CN111028330B (en) Three-dimensional expression base generation method, device, equipment and storage medium
Jeni et al. Dense 3D face alignment from 2D video for real-time use
Yu et al. Direct, dense, and deformable: Template-based non-rigid 3d reconstruction from rgb video
US11348314B2 (en) Fast and deep facial deformations
US9792479B2 (en) Geometry tracking
CN111161395B (en) Facial expression tracking method and device and electronic equipment
CN113327278B (en) Three-dimensional face reconstruction method, device, equipment and storage medium
EP3992919B1 (en) Three-dimensional facial model generation method and apparatus, device, and medium
US8854376B1 (en) Generating animation from actor performance
CN113313085B (en) Image processing method and device, electronic equipment and storage medium
JP6207210B2 (en) Information processing apparatus and method
WO2020207177A1 (en) Image augmentation and neural network training method and apparatus, device and storage medium
JP2013242757A (en) Image processing apparatus, image processing method, and computer program
CN111899159B (en) Method, device, apparatus and storage medium for changing hairstyle
Dinev et al. User‐guided lip correction for facial performance capture
CN112714337A (en) Video processing method and device, electronic equipment and storage medium
Kryvonos et al. Information technology for the analysis of mimic expressions of human emotional states
CN110008873B (en) Facial expression capturing method, system and equipment
US11734889B2 (en) Method of gaze estimation with 3D face reconstructing
CN113628322A (en) Image processing method, AR display live broadcast method, AR display equipment, AR display live broadcast equipment and storage medium
Jian et al. Realistic face animation generation from videos
Peng et al. Geometrical consistency modeling on b-spline parameter domain for 3d face reconstruction from limited number of wild images
US20230237753A1 (en) Dynamic facial hair capture of a subject
WO2023169023A1 (en) Expression model generation method and apparatus, device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22845260

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE