CN113902852A - Face three-dimensional reconstruction method and device, electronic equipment and storage medium - Google Patents

Face three-dimensional reconstruction method and device, electronic equipment and storage medium

Info

Publication number
CN113902852A
Authority
CN
China
Prior art keywords
point cloud
cloud data
face
dimensional
coordinates
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111212305.5A
Other languages
Chinese (zh)
Inventor
刘炫鹏
杨国基
刘致远
刘云峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Zhuiyi Technology Co Ltd
Original Assignee
Shenzhen Zhuiyi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Zhuiyi Technology Co Ltd filed Critical Shenzhen Zhuiyi Technology Co Ltd
Priority to CN202111212305.5A
Publication of CN113902852A
Legal status: Pending

Classifications

    • G: PHYSICS
      • G06: COMPUTING; CALCULATING OR COUNTING
        • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
          • G06T7/00: Image analysis
            • G06T7/30: Determination of transform parameters for the alignment of images, i.e. image registration
              • G06T7/33: Image registration using feature-based methods
                • G06T7/344: Image registration using feature-based methods involving models
            • G06T7/70: Determining position or orientation of objects or cameras
              • G06T7/73: Determining position or orientation using feature-based methods
                • G06T7/75: Determining position or orientation using feature-based methods involving models
            • G06T7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
            • G06T7/90: Determination of colour characteristics
          • G06T2207/00: Indexing scheme for image analysis or image enhancement
            • G06T2207/10: Image acquisition modality
              • G06T2207/10028: Range image; Depth image; 3D point clouds
            • G06T2207/30: Subject of image; Context of image processing
              • G06T2207/30196: Human being; Person
                • G06T2207/30201: Face
              • G06T2207/30244: Camera pose

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a three-dimensional face reconstruction method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring face RGB images collected by monocular cameras at different fixed poses; for each face RGB image, determining point cloud data of a plurality of three-dimensional points of a three-dimensional face according to the pixel coordinates of the pixel points in the image; performing point cloud registration according to the point cloud data corresponding to each face RGB image and face reference point cloud data to obtain registration point cloud data; and performing curved surface reconstruction according to the registration point cloud data to obtain a three-dimensional face model. In the embodiment of the invention, face reference point cloud data is added as a shape reference in the point cloud registration process, which greatly reduces shape deviation and yields reliable three-dimensional reconstruction accuracy from low-precision point cloud data. A consumer-grade monocular camera array can therefore complete reliable three-dimensional face reconstruction without high-precision synchronized acquisition, reducing the difficulty of three-dimensional face reconstruction and facilitating the application of three-dimensional face reconstruction.

Description

Face three-dimensional reconstruction method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for three-dimensional face reconstruction, an electronic device, and a storage medium.
Background
At the present stage, three-dimensional reconstruction of a real human face on the market is typically performed by building a monocular camera array and exploiting the parallax principle.
However, such a monocular camera array requires all cameras to be synchronously controlled to shoot at the same moment, and requires a fixed shooting site in order to reduce the number of camera calibrations and improve calibration precision; these requirements are relatively strict.
Disclosure of Invention
In order to solve the technical problem or at least partially solve the technical problem, the present application provides a method, an apparatus, an electronic device and a storage medium for three-dimensional reconstruction of a human face.
In a first aspect, the present application provides a method for three-dimensional reconstruction of a human face, including:
acquiring face RGB images acquired by a plurality of monocular cameras with different fixed poses;
for each face RGB image, determining point cloud data of a plurality of three-dimensional points of a three-dimensional face based on the pixel coordinates of the pixel points in the face RGB image;
performing point cloud registration based on the point cloud data corresponding to each face RGB image and preset face reference point cloud data to obtain registration point cloud data;
and carrying out curved surface reconstruction based on the registration point cloud data to obtain a human face three-dimensional model.
Optionally, determining point cloud data of a plurality of three-dimensional points in a three-dimensional face based on pixel coordinates of each pixel point in the RGB image of the face includes:
determining three-dimensional coordinates of a plurality of three-dimensional points in the three-dimensional face based on a plurality of pixel points in the face RGB image;
and removing, from the plurality of three-dimensional points, the three-dimensional points lying outside the face region of the three-dimensional face together with their three-dimensional coordinates, to obtain the point cloud data of the three-dimensional face.
Optionally, determining three-dimensional coordinates of a plurality of three-dimensional points in a three-dimensional face based on a plurality of pixel points in the RGB image of the face includes:
inputting the human face RGB image into a preset monocular depth model so that the monocular depth model outputs a depth image;
and respectively converting the pixel coordinates of a plurality of pixel points in the depth image into the three-dimensional coordinates of the corresponding three-dimensional points.
Optionally, performing point cloud registration based on the point cloud data corresponding to each face RGB image and preset face reference point cloud data to obtain registration point cloud data, including:
converting the point cloud data corresponding to the plurality of face RGB images into a target coordinate system to obtain a plurality of first conversion point cloud data, wherein the first conversion point cloud data comprises: three-dimensional coordinates of a plurality of three-dimensional points in the target coordinate system;
extracting face reference point cloud data from the plurality of first conversion point cloud data, wherein the face reference point cloud data comprises: reference coordinates of a plurality of three-dimensional points when the face is in the frontal face pose;
performing pose alignment between the remaining point cloud data, other than the face reference point cloud data, among the plurality of first conversion point cloud data and the face reference point cloud data to obtain a plurality of second conversion point cloud data;
and performing point cloud fusion on the plurality of second conversion point cloud data and the face reference point cloud data to obtain the registration point cloud data.
Optionally, converting the point cloud data corresponding to the plurality of RGB face images into a target coordinate system, including:
calculating centroid coordinates of a plurality of the point cloud data;
converting the plurality of point cloud data into a mass center coordinate system with the mass center coordinate as an origin to obtain first converted point cloud data, wherein the first converted point cloud data comprises: three-dimensional coordinates of a plurality of three-dimensional points in the centroid coordinate system.
Optionally, calculating centroid coordinates of a plurality of the point cloud data comprises:
acquiring three-dimensional coordinates of all three-dimensional points in the point cloud data corresponding to the plurality of face RGB images;
and calculating the average value of the three-dimensional coordinates of all the three-dimensional points in the point cloud data to obtain the centroid coordinate.
Optionally, converting a plurality of point cloud data into a centroid coordinate system with the centroid coordinate as an origin to obtain first converted point cloud data, including:
subtracting the three-dimensional coordinates of each three-dimensional point in the plurality of point cloud data from the centroid coordinates to obtain first conversion point cloud data, wherein the first conversion point cloud data comprises: three-dimensional coordinates of a plurality of three-dimensional points in the centroid coordinate system.
Optionally, performing pose alignment on remaining point cloud data in the plurality of first conversion point cloud data except the face reference point cloud data and the face reference point cloud data to obtain a plurality of second conversion point cloud data, including:
calculating the corresponding relation between the three-dimensional coordinates of the three-dimensional points in the residual point cloud data and the reference coordinates of the three-dimensional points in the face reference point cloud data to obtain a nearest point pair set;
calculating a rotation matrix and an offset vector according to the nearest point pair set;
converting the remaining point cloud data into intermediate point cloud data in the frontal face pose according to the rotation matrix and the offset vector;
calculating a point cloud conversion error between the intermediate point cloud data and the face reference point cloud data;
if the point cloud conversion error is smaller than a preset error threshold value, determining the intermediate point cloud data as the second conversion point cloud data;
and if the point cloud conversion error is greater than or equal to the preset error threshold, returning to the step of calculating the correspondence between the three-dimensional coordinates of the plurality of three-dimensional points in the remaining point cloud data and the reference coordinates of the plurality of three-dimensional points in the face reference point cloud data to obtain a nearest point pair set, until the number of repetitions exceeds a preset count threshold, and then determining the intermediate point cloud data as the second conversion point cloud data.
Optionally, calculating a rotation matrix and an offset vector according to the set of nearest point pairs includes:
constructing a decomposition matrix according to the nearest point pair set;
and carrying out singular value decomposition on the decomposition matrix to obtain the rotation matrix and the offset vector.
Optionally, point cloud fusion is performed on the plurality of second conversion point cloud data and the face reference point cloud data to obtain the registration point cloud data, including:
merging the plurality of second conversion point cloud data and the human face reference point cloud data to obtain merged point cloud data;
down-sampling the merged point cloud data to obtain sampled point cloud data;
calculating a shape error value between the sampling point cloud data and the face reference point cloud data;
determining whether the shape error value is less than a preset error threshold;
if the shape error value is smaller than the preset error threshold value, determining whether the shape information of the sampling point cloud data is within a preset normal value range;
if the shape information of the sampling point cloud data is within a preset normal value range, determining the merged point cloud data as the registration point cloud data;
and if the shape information of the sampled point cloud data is within a preset abnormal value range, adjusting the preset error threshold and returning to the step of down-sampling the merged point cloud data to obtain sampled point cloud data, until the shape information of the sampled point cloud data is within the preset normal value range, and then determining the merged point cloud data as the registration point cloud data.
Optionally, the method further comprises:
extracting color values of a plurality of three-dimensional points from the face RGB image to obtain color data;
performing curved surface reconstruction based on the registration point cloud data to obtain a human face three-dimensional model, comprising:
and performing curved surface reconstruction based on the registration point cloud data and the color data to obtain a human face three-dimensional model.
Optionally, extracting color values of a plurality of three-dimensional points from the RGB image of the human face to obtain color data, including:
acquiring the size of a depth image corresponding to the face RGB image;
scaling the size of the face RGB image to a target size, wherein the target size is the size of the depth image;
and determining the color values of the pixel coordinates corresponding to the three-dimensional coordinates of the three-dimensional points as the color data.
In a second aspect, the present application provides a human face three-dimensional reconstruction apparatus, including:
the acquisition module is used for acquiring face RGB images acquired by a plurality of monocular cameras with different fixed poses;
the determining module is used for determining point cloud data of a plurality of three-dimensional points in a three-dimensional face based on pixel coordinates of all pixel points in the face RGB images aiming at each face RGB image;
the registration module is used for carrying out point cloud registration on the basis of the point cloud data corresponding to each face RGB image and preset face reference point cloud data to obtain registration point cloud data;
and the reconstruction module is used for carrying out curved surface reconstruction on the basis of the registration point cloud data to obtain a human face three-dimensional model.
Optionally, the determining module includes:
the first determining unit is used for determining three-dimensional coordinates of a plurality of three-dimensional points in a three-dimensional face based on a plurality of pixel points in the face RGB image;
and the removing unit is used for removing, from the plurality of three-dimensional points, the three-dimensional points lying outside the face region of the three-dimensional face together with their three-dimensional coordinates, to obtain the point cloud data of the three-dimensional face.
Optionally, the first determining unit includes:
the input subunit is used for inputting the human face RGB image into a preset monocular depth model so as to enable the monocular depth model to output a depth image;
and the first conversion subunit is used for respectively converting the pixel coordinates of a plurality of pixel points in the depth image into the three-dimensional coordinates of the corresponding three-dimensional points.
Optionally, the registration module comprises:
a conversion unit, configured to convert the point cloud data corresponding to the plurality of face RGB images into a target coordinate system to obtain a plurality of first conversion point cloud data, where the first conversion point cloud data includes: three-dimensional coordinates of a plurality of three-dimensional points in the target coordinate system;
an extracting unit, configured to extract face reference point cloud data from the plurality of first conversion point cloud data, where the face reference point cloud data includes: reference coordinates of a plurality of three-dimensional points when the face is in the frontal face pose;
the pose alignment unit is used for performing pose alignment between the remaining point cloud data, other than the face reference point cloud data, among the plurality of first conversion point cloud data and the face reference point cloud data to obtain a plurality of second conversion point cloud data;
and the point cloud fusion unit is used for carrying out point cloud fusion on the plurality of second conversion point cloud data and the human face reference point cloud data to obtain the registration point cloud data.
Optionally, the conversion unit includes:
a first calculating subunit, configured to calculate centroid coordinates of the plurality of point cloud data;
a second conversion subunit, configured to convert the plurality of point cloud data into a centroid coordinate system using the centroid coordinate as an origin to obtain first converted point cloud data, where the first converted point cloud data includes: three-dimensional coordinates of a plurality of three-dimensional points in the centroid coordinate system.
Optionally, the first computing subunit is further configured to:
acquiring three-dimensional coordinates of all three-dimensional points in the point cloud data corresponding to the plurality of face RGB images;
and calculating the average value of the three-dimensional coordinates of all the three-dimensional points in the point cloud data to obtain the centroid coordinate.
Optionally, the second conversion subunit is further configured to:
subtracting the three-dimensional coordinates of each three-dimensional point in the plurality of point cloud data from the centroid coordinates to obtain first conversion point cloud data, wherein the first conversion point cloud data comprises: three-dimensional coordinates of a plurality of three-dimensional points in the centroid coordinate system.
Optionally, the pose alignment unit comprises:
the second calculating subunit is used for calculating the corresponding relation between the three-dimensional coordinates of the three-dimensional points in the residual point cloud data and the reference coordinates of the three-dimensional points in the face reference point cloud data to obtain a nearest point pair set;
a third computing subunit, configured to compute a rotation matrix and an offset vector according to the closest point pair set;
a third conversion subunit, configured to convert the remaining point cloud data into intermediate point cloud data in the frontal face pose according to the rotation matrix and the offset vector;
the fourth calculating subunit is used for calculating a point cloud conversion error between the intermediate point cloud data and the face reference point cloud data;
the first determining subunit is configured to determine the intermediate point cloud data as the second converted point cloud data if the point cloud conversion error is smaller than a preset error threshold;
and if the point cloud conversion error is greater than or equal to the preset error threshold, the step of calculating the correspondence between the three-dimensional coordinates of the plurality of three-dimensional points in the remaining point cloud data and the reference coordinates of the plurality of three-dimensional points in the face reference point cloud data to obtain a nearest point pair set is performed again, until the number of repetitions exceeds a preset count threshold, and the intermediate point cloud data is then determined as the second conversion point cloud data.
Optionally, the third computing subunit is further configured to:
constructing a decomposition matrix according to the nearest point pair set;
and carrying out singular value decomposition on the decomposition matrix to obtain the rotation matrix and the offset vector.
Optionally, the point cloud fusion unit includes:
a merging subunit, configured to merge the plurality of second conversion point cloud data with the face reference point cloud data to obtain merged point cloud data;
the down-sampling subunit is used for carrying out down-sampling on the merged point cloud data to obtain sampled point cloud data;
the fifth calculating subunit is used for calculating a shape error value between the sampling point cloud data and the face reference point cloud data;
a second determining subunit, configured to determine whether the shape error value is smaller than a preset error threshold;
a third determining subunit, configured to determine whether shape information of the sampled point cloud data is within a preset normal value range if the shape error value is smaller than the preset error threshold;
a fourth determining subunit, configured to determine the merged point cloud data as the registered point cloud data if the shape information of the sampled point cloud data is within a preset normal value range;
and the adjustment determining subunit is configured to, if the shape information of the sampled point cloud data is within a preset abnormal value range, adjust the preset error threshold and repeat the step of down-sampling the merged point cloud data to obtain sampled point cloud data, until the shape information of the sampled point cloud data is within the preset normal value range, and then determine the merged point cloud data as the registration point cloud data.
Optionally, the apparatus further comprises:
the extraction module is used for extracting color values of the three-dimensional points from the face RGB image to obtain color data;
the reconstruction module comprises:
and the curved surface reconstruction unit is used for carrying out curved surface reconstruction on the basis of the registration point cloud data and the color data to obtain a human face three-dimensional model.
Optionally, the extraction module comprises:
the acquisition unit is used for acquiring the size of a depth image corresponding to the face RGB image;
the scaling unit is used for scaling the size of the face RGB image to a target size, and the target size is the size of the depth image;
and a second determining unit configured to determine color values of pixel coordinates corresponding to three-dimensional coordinates of the plurality of three-dimensional points as the color data.
In a third aspect, the present application provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete mutual communication through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing the human face three-dimensional reconstruction method of any one of the first aspect when executing the program stored in the memory.
In a fourth aspect, the present application provides a computer-readable storage medium on which a program of the three-dimensional face reconstruction method is stored, and the program, when executed by a processor, implements the steps of the three-dimensional face reconstruction method according to any one of the first aspect.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages:
according to the embodiment of the invention, face RGB images acquired by a plurality of monocular cameras with different fixed poses are firstly acquired, then point cloud data of a plurality of three-dimensional points in a three-dimensional face are determined based on pixel coordinates of all pixel points in the face RGB images aiming at each face RGB image, point cloud registration is carried out based on the point cloud data corresponding to each face RGB image and preset face reference point cloud data to obtain registration point cloud data, and finally curved surface reconstruction can be carried out based on the registration point cloud data to obtain a face three-dimensional model.
In the embodiment of the invention, preset face reference point cloud data is added as a shape reference in the point cloud registration process, which greatly reduces shape deviations such as irregular bulges or depressions and abnormal nose shapes, and achieves reliable three-dimensional reconstruction accuracy from low-precision point cloud data. In other words, a consumer-grade monocular camera array can complete reliable three-dimensional face reconstruction without high-precision synchronized acquisition, which reduces the difficulty of three-dimensional face reconstruction and facilitates the application of three-dimensional face reconstruction.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
Fig. 1 is a flowchart of a three-dimensional face reconstruction method according to an embodiment of the present application;
fig. 2 is a schematic view of an arrangement manner of a plurality of monocular cameras provided in an embodiment of the present application;
fig. 3 is a schematic diagram of a curved surface reconstruction algorithm provided in an embodiment of the present application;
fig. 4 is a schematic diagram of a face region of a three-dimensional face according to an embodiment of the present application;
fig. 5 is a structural diagram of a three-dimensional face reconstruction device according to an embodiment of the present application;
fig. 6 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
When a monocular camera array is currently used for three-dimensional reconstruction, all cameras need to be synchronously controlled to shoot at the same moment, and a fixed shooting site is needed to reduce the number of camera calibrations and improve calibration precision, so the requirements are strict. In view of this, the embodiments of the present application provide a three-dimensional face reconstruction method and apparatus, an electronic device, and a storage medium. The three-dimensional face reconstruction method can be applied to a computer, and the computer can be connected to a plurality of monocular cameras and control them to acquire face RGB images.
As shown in fig. 1, the method for reconstructing a human face in three dimensions may include the following steps:
Step S101, acquiring face RGB images acquired by a plurality of monocular cameras with different fixed poses;
In the embodiment of the invention, a pose refers to a position and an orientation in the shooting space, and each monocular camera acquires an RGB image of the face region of the modeling subject.
Because the face is a three-dimensional object, its raised portions can occlude other parts, so monocular cameras are required to acquire data at different fixed poses. In the embodiment of the invention, because the modeling subject is a human face, the cameras and the subject are kept in the same horizontal plane; to overcome occlusion, the cameras must shoot and capture the face from different directions, and the more capture viewpoints there are, the better the precision of the subsequent operations.
Taking 3 monocular cameras as an example, as shown in fig. 2, the cameras are evenly distributed within a range of at most 120 degrees, with the subject's frontal face direction as the central axis. During data acquisition, the cameras do not need to capture with high-precision synchronization; the errors this introduces are handled in the point cloud registration.
Step S102, for each face RGB image, determining point cloud data of a plurality of three-dimensional points of a three-dimensional face based on the pixel coordinates of the pixel points in the face RGB image;
In the embodiment of the invention, the face RGB image comprises a plurality of pixel points, and each pixel point has corresponding two-dimensional pixel coordinates in the image. The three-dimensional face is constructed from the face region of the modeling subject; since it is three-dimensional, the coordinates of its three-dimensional points are three-dimensional, and the point cloud data comprises the three-dimensional coordinates of a plurality of three-dimensional points of the three-dimensional face. The two-dimensional pixel points in the face RGB image correspond one-to-one to the three-dimensional points of the three-dimensional face.
In this step, a preset two-dimensional-to-three-dimensional homogeneous coordinate conversion method may be used to convert the two-dimensional pixel coordinates of each pixel point in the face RGB image into the three-dimensional coordinates of the corresponding three-dimensional point of the three-dimensional face, so as to obtain the point cloud data.
Step S103, point cloud registration is carried out on the basis of the point cloud data corresponding to each face RGB image and preset face reference point cloud data to obtain registration point cloud data;
In the embodiment of the invention, the face region of the modeling subject is shot by a plurality of monocular cameras with fixed poses, and the point cloud data generated from the face RGB image of the subject's frontal face acquired by one of the monocular cameras serves as the preset face reference point cloud data.
In this step, point cloud data corresponding to a plurality of face RGB images may be merged to obtain complete face point cloud data, and point cloud registration is performed on the complete face point cloud data and the face reference point cloud data to obtain registration point cloud data.
And step S104, performing curved surface reconstruction based on the registration point cloud data to obtain a human face three-dimensional model.
In this step, after the desired complete face point cloud (i.e. the registration point cloud data) is obtained, curved surface reconstruction is required to restore its geometric shape and turn it into a three-dimensional face model. Taking Delaunay triangulation as an example, one possible curved surface reconstruction algorithm is shown in fig. 3.
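As an illustration only, the following is a minimal Python sketch of one possible curved surface reconstruction, assuming the registration point cloud data is an (N, 3) numpy array of a roughly front-facing face; it triangulates the (x, y) projection with Delaunay triangulation, in the spirit of fig. 3, and is not the patent's exact algorithm.

```python
import numpy as np
from scipy.spatial import Delaunay

def reconstruct_surface(points: np.ndarray) -> np.ndarray:
    """2.5D Delaunay reconstruction: triangulate the (x, y) projection
    of a front-facing point cloud and reuse the triangle indices as
    mesh faces. `points` is an (N, 3) array of registered points."""
    tri = Delaunay(points[:, :2])  # triangulate in the projection plane
    return tri.simplices           # (M, 3) vertex indices of the faces
```

The (M, 3) face array together with the original points then defines the triangle mesh of the three-dimensional face model.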
According to the embodiment of the invention, face RGB images acquired by a plurality of monocular cameras with different fixed poses are firstly acquired, then point cloud data of a plurality of three-dimensional points in a three-dimensional face are determined based on pixel coordinates of all pixel points in the face RGB images aiming at each face RGB image, point cloud registration is carried out based on the point cloud data corresponding to each face RGB image and preset face reference point cloud data to obtain registration point cloud data, and finally curved surface reconstruction can be carried out based on the registration point cloud data to obtain a face three-dimensional model.
In the embodiment of the invention, the preset human face datum point cloud data is added as the shape reference in the point cloud registration process, so that the shape deviation, such as irregular bulges or depressions, abnormal nose shape and the like, is greatly reduced, the reliable three-dimensional reconstruction precision obtained by the low-precision point cloud data is realized, namely, the consumer-grade monocular camera array can be used for completing the reliable three-dimensional human face reconstruction without high-precision synchronous acquisition, the three-dimensional human face reconstruction difficulty is reduced, and the application of the three-dimensional human face weight is facilitated.
In another embodiment of the present invention, the step S102 of determining point cloud data of a plurality of three-dimensional points in a three-dimensional face based on pixel coordinates of each pixel point in the RGB image of the face includes:
step 201, determining three-dimensional coordinates of a plurality of three-dimensional points in a three-dimensional face based on a plurality of pixel points in the face RGB image;
in this step, a preset homogeneous coordinate conversion method from two-dimensional to three-dimensional may be used to convert two-dimensional pixel coordinates of a plurality of pixel points in the RGB image of the human face into three-dimensional coordinates of corresponding three-dimensional points in the three-dimensional human face.
Step 202, removing three-dimensional points outside the face area of the three-dimensional face and three-dimensional coordinates of the removed three-dimensional points from the plurality of three-dimensional points to obtain point cloud data of the three-dimensional face.
In this step, three-dimensional points that do not belong to the face region can be deleted manually with point cloud editing software. The face region is defined as the area bounded by the left ear root (excluding the left ear), the hairline, the right ear root (excluding the right ear), and the chin extension; the face region is illustrated in fig. 4.
In another embodiment of the present invention, the step 201 determines three-dimensional coordinates of a plurality of three-dimensional points in a three-dimensional face based on a plurality of pixel points in the RGB image of the face, including:
step 301, inputting the face RGB image into a preset monocular depth model, so that the monocular depth model outputs a depth image;
In this step, an existing monocular depth model (e.g., a deep neural network model of the MiDaS series) may be used, with the face RGB image collected by each monocular camera as the model input and the corresponding Depth image as the model output. The advantage of this method is that a Depth image can be obtained from a monocular image containing no spatial structure information; the drawback is that the obtained Depth image has low precision and can be used for three-dimensional reconstruction only with the point cloud registration method designed in this solution.
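As a hedged illustration of this step, the sketch below loads a MiDaS model through torch.hub and predicts a depth map for one face RGB image; the repository and entry-point names follow the public intel-isl/MiDaS project and should be checked against its documentation, and the file name face.jpg is a placeholder.

```python
import cv2
import torch

# Load a small MiDaS model and its matching input transform from torch.hub.
model = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
model.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.small_transform

img = cv2.cvtColor(cv2.imread("face.jpg"), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    prediction = model(transform(img))      # (1, H', W') at model resolution
depth = prediction.squeeze().cpu().numpy()  # relative (inverse) depth map
```

Note that MiDaS predicts relative inverse depth, so the map must be rescaled to usable z_c values before back-projection; this low absolute precision is exactly the error the point cloud registration is designed to absorb.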
Step 302, respectively converting the pixel coordinates of a plurality of pixel points in the depth image into the three-dimensional coordinates of the corresponding three-dimensional points.
Since the Depth image only provides the planar position in the pixel coordinate system and the depth information z_c, each pixel point must be converted, using the monocular camera parameters, into a three-dimensional point in the world coordinate system so as to recover its spatial position and scale. The homogeneous conversion from the Depth-image pixel coordinates (u_m, v_m) to the world coordinates (x_w, y_w, z_w) is as follows:

z_c · [u_m, v_m, 1]^T = K_m · [R_w2m | t_w2m] · [x_w, y_w, z_w, 1]^T

[x_w, y_w, z_w, 1]^T = T_w2m^(-1) · [z_c · K_m^(-1) · [u_m, v_m, 1]^T ; 1]

In the formulas, K_m is the intrinsic parameter matrix of the monocular camera, of size 3x3, which can be obtained by reading the module interface or by manual calibration; T_w2m is the extrinsic parameter matrix of the monocular camera, of size 4x4, the Euclidean transformation matrix from the world coordinate system to the monocular camera, composed of a rotation matrix R_w2m of size 3x3 and an offset vector t_w2m, both of which can be obtained by manual calibration.
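A minimal numpy sketch of this back-projection, assuming K_m and T_w2m have already been calibrated and the depth map holds z_c per pixel (function and variable names are illustrative):

```python
import numpy as np

def pixels_to_world(depth: np.ndarray, K_m: np.ndarray, T_w2m: np.ndarray) -> np.ndarray:
    """Back-project every pixel (u_m, v_m) with depth z_c into world
    coordinates (x_w, y_w, z_w). depth: (H, W) z_c values; K_m: 3x3
    intrinsics; T_w2m: 4x4 world-to-camera Euclidean transform."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x HW
    cam = np.linalg.inv(K_m) @ (pix * depth.reshape(1, -1))  # camera coordinates
    cam_h = np.vstack([cam, np.ones((1, cam.shape[1]))])     # homogeneous 4 x HW
    world = np.linalg.inv(T_w2m) @ cam_h                     # invert world -> camera
    return world[:3].T                                       # (HW, 3) points
```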
In another embodiment of the present invention, the step S103 of performing point cloud registration based on the point cloud data corresponding to each of the RGB face images and preset face reference point cloud data to obtain registration point cloud data includes:
step 401, converting the point cloud data corresponding to the plurality of face RGB images into a target coordinate system to obtain a plurality of first converted point cloud data.
In an embodiment of the present invention, the first converted point cloud data includes: three-dimensional coordinates of a plurality of three-dimensional points in the target coordinate system; the target coordinate system may refer to a centroid coordinate system having a centroid of the point cloud data as a coordinate origin.
In this step, the point cloud data corresponding to the plurality of face RGB images may be converted into a target coordinate system, respectively, to obtain a plurality of first converted point cloud data.
Step 402, extracting human face reference point cloud data from the plurality of first conversion point cloud data.
In the embodiment of the present invention, the point cloud data of the human face reference includes: reference coordinates of a plurality of three-dimensional points when the face is in the face-positive posture; the point cloud data generated based on the face RGB images acquired by each monocular camera can be provided with different identifications, correspondingly, the first conversion point cloud data obtained based on the point cloud data conversion also has different identifications, and further, the first conversion point cloud data from different face RGB images can be distinguished according to different identifications.
In this step, one first conversion point cloud data may be selected from the plurality of first conversion point cloud data according to the identification as the face reference point cloud data, and in the selection, the first conversion point cloud data derived from the monocular camera facing the modeling object frontal face may be used as the face reference point cloud data.
Step 403, performing pose alignment on remaining point cloud data except the face reference point cloud data in the plurality of first conversion point cloud data and the face reference point cloud data to obtain a plurality of second conversion point cloud data;
In the embodiment of the present invention, the first conversion point cloud data other than the face reference point cloud data constitute the remaining point cloud data. The face RGB images acquired by the monocular cameras other than the one facing the subject's frontal face may not show a complete face, because protruding facial features cause occlusion, and the three-dimensional coordinates of the three-dimensional points in point cloud data generated from face RGB images acquired at different poses may deviate from one another. Therefore, each of the remaining point cloud data needs to be pose-aligned with the face reference point cloud data to obtain a plurality of second conversion point cloud data.
And 404, performing point cloud fusion on the plurality of second conversion point cloud data and the face reference point cloud data to obtain the registration point cloud data.
Generally, the point cloud registration process comprises two stages: pose alignment and point cloud fusion. During fusion, a certain amount of shape correction can be performed through algorithm design, and the more point clouds used for fusion, the higher the upper limit of precision. Because the point clouds obtained with consumer-grade monocular cameras in this solution carry large errors, the method of referring to the face reference point cloud is adopted in the point cloud fusion stage to further reduce the shape error.
In another embodiment of the present invention, the step 401 of converting the point cloud data corresponding to the RGB face images into a target coordinate system includes:
Step 501, calculating the centroid coordinates of the plurality of point cloud data;
in an embodiment of the present invention, the first conversion point cloud data includes: three-dimensional coordinates of a plurality of three-dimensional points in the centroid coordinate system.
In this step, three-dimensional coordinates of all three-dimensional points in the point cloud data corresponding to the plurality of face RGB images may be obtained, and an average value of the three-dimensional coordinates of all three-dimensional points in the plurality of point cloud data may be calculated to obtain a centroid coordinate.
In practical application, the point cloud data can be regarded as a set C of point clouds of three-dimensional points. For a point cloud c_i in the set C, its centroid mu_i is the average of the spatial coordinates of all its points, where N_i is the number of points contained in c_i and p_j is the spatial coordinate of the j-th point:

mu_i = (1 / N_i) · sum_{j=1}^{N_i} p_j
step 502, converting the plurality of point cloud data into a centroid coordinate system with the centroid coordinate as an origin to obtain first converted point cloud data.
In this step, the three-dimensional coordinates of each three-dimensional point in the plurality of point cloud data may be subtracted from the centroid coordinates to obtain the first converted point cloud data, where the first converted point cloud data includes: three-dimensional coordinates of a plurality of three-dimensional points in the centroid coordinate system.
Each point p_j in the point cloud c_i is reduced by the centroid mu_i to obtain the new point cloud c'_i:

c'_i = { p_j − mu_i | p_j ∈ c_i } = { p'_j }

In this way all point clouds are unified into a centroid coordinate system whose origin is the centroid, which makes it convenient to compute the rotation matrix R and the offset vector t with the SVD algorithm. In the embodiment of the invention, the set of point clouds in the centroid coordinate system can be denoted C_mu.
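A short sketch of steps 501-502 under the per-cloud reading of the formulas above, with each point cloud given as an (N_i, 3) numpy array:

```python
import numpy as np

def to_centroid_frames(clouds):
    """For each point cloud c_i, compute its centroid mu_i as the mean
    of its point coordinates and subtract it, so every cloud lies in a
    centroid coordinate system with the centroid as origin."""
    centered, centroids = [], []
    for c in clouds:
        mu = c.mean(axis=0)        # mu_i: average of all point coordinates
        centered.append(c - mu)    # c'_i = { p_j - mu_i }
        centroids.append(mu)
    return centered, centroids
```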
In another embodiment of the present invention, the step 403 performs pose alignment on the remaining point cloud data in the plurality of first converted point cloud data except for the face reference point cloud data and the face reference point cloud data to obtain a plurality of second converted point cloud data, including:
Step 601, calculating the correspondence between the three-dimensional coordinates of the plurality of three-dimensional points in the remaining point cloud data and the reference coordinates of the plurality of three-dimensional points in the face reference point cloud data to obtain a nearest point pair set;
In the embodiment of the invention, the face reference point cloud data can be regarded as a pose reference point cloud P, and the remaining point cloud data as a point cloud Q. A KD-tree data structure is built on the pose reference point cloud P, storing the spatial coordinates of all of its points. Meanwhile, n points are randomly selected from the point cloud Q (Q ∈ C_mu), the nearest neighbour of each selected point is found in the point cloud P by means of the KD-tree, and the nearest point pair set H used in the current iteration is constructed:
H = {(p_i, q_i) | p_i ∈ P, q_i ∈ Q, 1 ≤ i ≤ n}
the method and the device use the nearest point pair to restrict the number of points needing to be calculated in the set H, so that the registration is accelerated.
Step 602, calculating a rotation matrix and an offset vector according to the closest point pair set;
in this step, a decomposition matrix may be constructed according to the set of the nearest point pairs, and singular value decomposition may be performed on the decomposition matrix to obtain the rotation matrix and the offset vector.
A matrix W of size 3x3 is constructed for SVD. Its calculation formula is as follows, where N_H is the size of the set H, (p_i, q_i) are the nearest point pairs in H, mu_p and mu_q are the centroid coordinates of the pose reference point cloud P and the current point cloud Q respectively, and sigma_1, sigma_2, sigma_3 are the singular values:

W = sum_{i=1}^{N_H} (p_i − mu_p)(q_i − mu_q)^T = U · diag(sigma_1, sigma_2, sigma_3) · V^T

By performing SVD on the matrix W, the rotation matrix R and the offset vector t are obtained from the following equations:

R = U · V^T

t = mu_p − R · mu_q
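The following sketch estimates R and t from the paired points by SVD, in the spirit of the formulas above; the reflection guard is a standard addition not spelled out in the text:

```python
import numpy as np

def rigid_transform(p: np.ndarray, q: np.ndarray):
    """Estimate R and t from paired points (p_i, q_i) so that
    p ≈ R q + t, via SVD of the 3x3 decomposition matrix W."""
    mu_p, mu_q = p.mean(axis=0), q.mean(axis=0)
    W = (p - mu_p).T @ (q - mu_q)   # sum of (p_i - mu_p)(q_i - mu_q)^T
    U, _, Vt = np.linalg.svd(W)
    R = U @ Vt                      # R = U V^T
    if np.linalg.det(R) < 0:        # guard against a reflection solution
        U[:, -1] *= -1
        R = U @ Vt
    t = mu_p - R @ mu_q             # t = mu_p - R mu_q
    return R, t
```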
Step 603, converting the remaining point cloud data into intermediate point cloud data in the frontal face pose according to the rotation matrix and the offset vector;
After the rotation matrix R and the offset vector t for the point cloud Q are obtained, the point cloud Q can be transformed by the following formula, and the spatial pose of the resulting Q' is close to that of the pose reference point cloud P, i.e. close to the frontal face pose:
Q′={Rq+t|q∈Q}
step 604, calculating a point cloud conversion error between the intermediate point cloud data and the face reference point cloud data;
The pose difference between the converted point cloud Q' and the reference point cloud P is calculated as the point cloud conversion error L:

L = (1/n) · sum_{i=1}^{n} || p_i − (R · q_i + t) ||^2
step 605, if the point cloud transformation error is smaller than a preset error threshold, determining the intermediate point cloud data as the second conversion point cloud data;
During pose alignment, if the point cloud conversion error is smaller than the preset error threshold, the intermediate point cloud data can be determined as second conversion point cloud data, and point cloud fusion can then be performed based on the second conversion point cloud data.
Step 606, if the point cloud conversion error is greater than or equal to the preset error threshold, the step of calculating the correspondence between the three-dimensional coordinates of the plurality of three-dimensional points in the remaining point cloud data and the reference coordinates of the plurality of three-dimensional points in the face reference point cloud data to obtain a nearest point pair set is performed again, until the number of repetitions exceeds a preset count threshold, and the intermediate point cloud data is then determined as the second conversion point cloud data.
If the point cloud conversion error is greater than or equal to the preset error threshold, steps 601 to 604 may be re-executed until the number of repetitions exceeds the preset count threshold, at which point the intermediate point cloud data is determined as the second conversion point cloud data.
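Steps 601 to 606 together follow the iterative closest point (ICP) pattern; the sketch below chains the helpers sketched earlier, with placeholder thresholds:

```python
import numpy as np

def align_to_reference(P, Q, err_thresh=1e-4, max_iters=50, n=2000):
    """Pose-align Q to the reference cloud P: re-pair, re-estimate
    (R, t), transform, and stop when the conversion error L drops
    below err_thresh or the iteration budget runs out."""
    for _ in range(max_iters):
        p_pair, q_pair = nearest_point_pairs(P, Q, min(n, len(Q)))
        R, t = rigid_transform(p_pair, q_pair)
        Q = Q @ R.T + t  # apply q' = R q + t to every point (row vectors)
        err = np.mean(np.sum((p_pair - (q_pair @ R.T + t)) ** 2, axis=1))
        if err < err_thresh:
            break
    return Q  # second conversion point cloud data
```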
In another embodiment of the present invention, performing point cloud fusion on a plurality of second conversion point cloud data and the face reference point cloud data to obtain the registration point cloud data includes:
step 701, merging a plurality of second conversion point cloud data and the face reference point cloud data to obtain merged point cloud data;
step 702, down-sampling the merged point cloud data to obtain sampled point cloud data;
Step 703, calculating a shape error value between the sampled point cloud data and the face reference point cloud data;
In the embodiment of the present invention, the Chamfer Distance may be used as the concrete measure of the shape difference E (i.e. the shape error value). Its calculation formula is as follows, where N_m and N_B are the sizes of the down-sampled point cloud m and the face reference point cloud B respectively. The first term is the average of the minimum distances from each point of the point cloud m to the reference point cloud B, and the second term is the average of the minimum distances from each point of the reference point cloud B to the point cloud m; the two terms together form the chamfer distance in 3D space, and the smaller the value, the smaller the shape difference between the two point clouds:

E = (1/N_m) · sum_{x ∈ m} min_{y ∈ B} ||x − y|| + (1/N_B) · sum_{y ∈ B} min_{x ∈ m} ||x − y||
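A direct sketch of the chamfer distance with KD-tree nearest-neighbour queries; whether plain or squared distances are averaged is a convention choice, and plain Euclidean distances are assumed here:

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(m: np.ndarray, B: np.ndarray) -> float:
    """Symmetric chamfer distance between the down-sampled cloud m and
    the reference cloud B: the mean nearest-neighbour distance in each
    direction, summed."""
    d_m2B, _ = cKDTree(B).query(m)  # min distance from each point of m to B
    d_B2m, _ = cKDTree(m).query(B)  # min distance from each point of B to m
    return d_m2B.mean() + d_B2m.mean()
```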
Step 704, determining whether the shape error value is smaller than a preset error threshold value;
step 705, if the shape error value is smaller than the preset error threshold, determining whether the shape information of the sampling point cloud data is within a preset normal value range;
In this step, whether the shape of the point cloud data is normal can be confirmed manually by visual inspection, and whether the shape information of the sampled point cloud data is within the preset normal value range can then be determined according to the manual selection operation.
Step 706, if the shape information of the sampled point cloud data is within a preset normal value range, determining the merged point cloud data as the registered point cloud data;
Step 707, if the shape information of the sampled point cloud data is within a preset abnormal value range, the preset error threshold is adjusted and the step of down-sampling the merged point cloud data to obtain sampled point cloud data is performed again, until the shape information of the sampled point cloud data is within the preset normal value range, and the merged point cloud data is then determined as the registration point cloud data.
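For step 702, a simple voxel-grid down-sampler is one possible choice (a numpy-only sketch; a voxel down-sampling routine from a point cloud library would serve equally well):

```python
import numpy as np

def voxel_downsample(points: np.ndarray, voxel: float) -> np.ndarray:
    """Keep one representative point (the average) per occupied voxel
    of edge length `voxel`, reducing the merged cloud's density."""
    keys = np.floor(points / voxel).astype(np.int64)       # voxel index per point
    _, inv = np.unique(keys, axis=0, return_inverse=True)  # group points by voxel
    counts = np.bincount(inv)
    out = np.zeros((counts.size, 3))
    for dim in range(3):
        out[:, dim] = np.bincount(inv, weights=points[:, dim]) / counts
    return out
```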
In yet another embodiment of the present invention, the method further comprises:
step 801, extracting color values of a plurality of three-dimensional points from the face RGB image to obtain color data;
In this step, the size of the depth image corresponding to the face RGB image may be obtained, the face RGB image is scaled to a target size, namely the size of the depth image, and the color values at the pixel coordinates corresponding to the three-dimensional coordinates of the plurality of three-dimensional points are determined as the color data.
Because the Depth image is predicted from the RGB image, the two can share one set of camera parameters. The RGB image only needs to be scaled to the size of the Depth image, no coordinate system conversion is needed, and the color value of each three-dimensional point is read directly from the RGB image at the pixel coordinates (u_m, v_m) corresponding to that point.
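A sketch of this colour lookup, assuming the (u_m, v_m) pixel coordinates of the three-dimensional points are kept as an (N, 2) integer array from the back-projection step:

```python
import cv2
import numpy as np

def extract_colors(rgb: np.ndarray, depth_shape, pixel_coords: np.ndarray) -> np.ndarray:
    """Scale the RGB image to the Depth image's size, then read the
    colour at each three-dimensional point's pixel coordinate (u_m, v_m)."""
    h, w = depth_shape
    rgb_small = cv2.resize(rgb, (w, h))  # match the Depth resolution
    u, v = pixel_coords[:, 0], pixel_coords[:, 1]
    return rgb_small[v, u]               # (N, 3) colour values
```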
Step S104 of performing curved surface reconstruction based on the registration point cloud data to obtain a human face three-dimensional model includes:
and performing curved surface reconstruction based on the registration point cloud data and the color data to obtain a human face three-dimensional model.
In the embodiment of the invention, to make the effect easy to observe, curved surface reconstruction can be performed based on both the registration point cloud data and the color data, yielding a colored human face three-dimensional model.
In another embodiment of the present invention, there is further provided a three-dimensional face reconstruction apparatus, as shown in fig. 5, including:
the acquisition module 11 is used for acquiring face RGB images acquired by a plurality of monocular cameras with different fixed poses;
a determining module 12, configured to determine, for each face RGB image, point cloud data of a plurality of three-dimensional points in a three-dimensional face based on pixel coordinates of each pixel point in the face RGB image;
the registration module 13 is configured to perform point cloud registration based on the point cloud data corresponding to each face RGB image and preset face reference point cloud data to obtain registration point cloud data;
and the reconstruction module 14 is used for performing curved surface reconstruction based on the registration point cloud data to obtain a human face three-dimensional model.
Optionally, the determining module includes:
the first determining unit is used for determining three-dimensional coordinates of a plurality of three-dimensional points in a three-dimensional face based on a plurality of pixel points in the face RGB image;
and the removing unit is used for removing, from the plurality of three-dimensional points, the three-dimensional points lying outside the face region of the three-dimensional face together with their three-dimensional coordinates, to obtain the point cloud data of the three-dimensional face.
Optionally, the first determining unit includes:
the input subunit is used for inputting the human face RGB image into a preset monocular depth model so as to enable the monocular depth model to output a depth image;
and the first conversion subunit is used for respectively converting the pixel coordinates of a plurality of pixel points in the depth image into the three-dimensional coordinates of the corresponding three-dimensional points.
Optionally, the registration module comprises:
a conversion unit, configured to convert the point cloud data corresponding to the plurality of face RGB images into a target coordinate system to obtain a plurality of first conversion point cloud data, where the first conversion point cloud data includes: three-dimensional coordinates of a plurality of three-dimensional points in the target coordinate system;
an extracting unit, configured to extract face reference point cloud data from the plurality of first conversion point cloud data, where the face reference point cloud data includes: reference coordinates of a plurality of three-dimensional points when the face is in the frontal face pose;
a posture alignment unit, used for performing posture alignment between the face reference point cloud data and the remaining point cloud data, namely the first conversion point cloud data other than the face reference point cloud data, to obtain a plurality of second conversion point cloud data;
and the point cloud fusion unit is used for carrying out point cloud fusion on the plurality of second conversion point cloud data and the human face reference point cloud data to obtain the registration point cloud data.
Optionally, the conversion unit includes:
a first calculating subunit, configured to calculate centroid coordinates of the plurality of point cloud data;
a second conversion subunit, configured to convert the plurality of point cloud data into a centroid coordinate system using the centroid coordinates as the origin to obtain first conversion point cloud data, where the first conversion point cloud data includes: three-dimensional coordinates of a plurality of three-dimensional points in the centroid coordinate system.
Optionally, the first computing subunit is further configured to:
acquiring three-dimensional coordinates of all three-dimensional points in the point cloud data corresponding to the plurality of face RGB images;
and calculating the average value of the three-dimensional coordinates of all the three-dimensional points in the point cloud data to obtain the centroid coordinate.
Optionally, the second conversion subunit is further configured to:
subtracting the three-dimensional coordinates of each three-dimensional point in the plurality of point cloud data from the centroid coordinates to obtain first conversion point cloud data, wherein the first conversion point cloud data comprises: three-dimensional coordinates of a plurality of three-dimensional points in the centroid coordinate system.
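A minimal sketch of these two subunits (the function name and NumPy usage are illustrative assumptions):

    import numpy as np

    def to_centroid_frame(clouds):
        # clouds: list of (N_i, 3) arrays, one per face RGB image.
        all_points = np.concatenate(clouds, axis=0)
        centroid = all_points.mean(axis=0)  # centroid coordinates, shape (3,)
        # Subtracting the centroid converts every cloud into the centroid
        # coordinate system that uses the centroid as its origin.
        return [cloud - centroid for cloud in clouds], centroid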
Optionally, the posture alignment unit comprises:
a second calculating subunit, used for calculating the correspondence between the three-dimensional coordinates of the three-dimensional points in the remaining point cloud data and the reference coordinates of the three-dimensional points in the face reference point cloud data, to obtain a nearest point pair set;
a third computing subunit, configured to compute a rotation matrix and an offset vector according to the closest point pair set;
a third conversion subunit, configured to convert the remaining point cloud data into intermediate point cloud data in the frontal posture according to the rotation matrix and the offset vector;
the fourth calculating subunit is used for calculating a point cloud conversion error between the intermediate point cloud data and the face reference point cloud data;
the first determining subunit is configured to determine the intermediate point cloud data as the second converted point cloud data if the point cloud conversion error is smaller than a preset error threshold;
and if the point cloud conversion error is greater than or equal to the preset error threshold, the step of calculating the correspondence between the three-dimensional coordinates of the plurality of three-dimensional points in the remaining point cloud data and the reference coordinates of the plurality of three-dimensional points in the face reference point cloud data to obtain a nearest point pair set is executed again, iterating until the intermediate point cloud data can be determined as the second conversion point cloud data.
Optionally, the third computing subunit is further configured to:
constructing a decomposition matrix according to the nearest point pair set;
and carrying out singular value decomposition on the decomposition matrix to obtain the rotation matrix and the offset vector.
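The rotation matrix and offset vector can be computed with the standard SVD solution (the Kabsch step used inside ICP), which is consistent with the decomposition-matrix description above. A sketch, assuming src and dst hold the matched nearest point pairs row by row:

    import numpy as np

    def rigid_transform(src, dst):
        # Cross-covariance ("decomposition") matrix of the centered point pairs.
        src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
        H = (src - src_c).T @ (dst - dst_c)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T                 # rotation matrix
        if np.linalg.det(R) < 0:       # guard against a reflection solution
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = dst_c - R @ src_c          # offset vector
        return R, t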
Optionally, the point cloud fusion unit includes:
a merging subunit, configured to merge the plurality of second conversion point cloud data with the face reference point cloud data to obtain merged point cloud data;
the down-sampling subunit is used for carrying out down-sampling on the merged point cloud data to obtain sampled point cloud data;
the fifth calculating subunit is used for calculating a shape error value between the sampling point cloud data and the face reference point cloud data;
a second determining subunit, configured to determine whether the shape error value is smaller than a preset error threshold;
a third determining subunit, configured to determine whether shape information of the sampled point cloud data is within a preset normal value range if the shape error value is smaller than the preset error threshold;
a fourth determining subunit, configured to determine the merged point cloud data as the registered point cloud data if the shape information of the sampled point cloud data is within a preset normal value range;
and an adjustment determining subunit, used for adjusting the preset error threshold if the shape information of the sampled point cloud data is within a preset abnormal value range, and returning to the step of down-sampling the merged point cloud data to obtain sampled point cloud data, iterating until the shape information of the sampled point cloud data is within the preset normal value range, whereupon the merged point cloud data is determined as the registration point cloud data.
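A simplified sketch of this fusion loop follows; the voxel size, the initial threshold, the use of mean nearest-neighbour distance as the shape error, and the collapsing of the shape-information range check into a single test are all illustrative assumptions:

    import numpy as np
    from scipy.spatial import cKDTree

    def fuse(second_clouds, reference, voxel=0.002, error_threshold=0.01):
        merged = np.concatenate(list(second_clouds) + [reference], axis=0)
        tree = cKDTree(reference)
        while True:
            # Voxel-grid down-sampling: keep one point per occupied voxel.
            keys = np.floor(merged / voxel).astype(np.int64)
            _, idx = np.unique(keys, axis=0, return_index=True)
            sampled = merged[idx]
            shape_error = tree.query(sampled)[0].mean()  # vs. the reference cloud
            if shape_error < error_threshold:
                return merged          # accepted as the registration point cloud data
            error_threshold *= 1.5     # adjust the preset error threshold and retry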
Optionally, the apparatus further comprises:
the extraction module is used for extracting color values of the three-dimensional points from the face RGB image to obtain color data;
the reconstruction module comprises:
and the curved surface reconstruction unit is used for carrying out curved surface reconstruction on the basis of the registration point cloud data and the color data to obtain a human face three-dimensional model.
Optionally, the extraction module comprises:
the acquisition unit is used for acquiring the size of a depth image corresponding to the face RGB image;
the scaling unit is used for scaling the size of the face RGB image to a target size, and the target size is the size of the depth image;
and a second determining unit configured to determine color values of pixel coordinates corresponding to three-dimensional coordinates of the plurality of three-dimensional points as the color data.
In another embodiment of the present invention, an electronic device is further provided, which includes a processor 1110, a communication interface 1120, a memory 1130, and a communication bus 1140, wherein the processor 1110, the communication interface 1120, and the memory 1130 communicate with one another through the communication bus 1140;
the memory 1130 is used for storing a computer program;
the processor 1110 is used for implementing the face three-dimensional reconstruction method according to any of the foregoing method embodiments when executing the computer program stored in the memory 1130.
In the electronic device provided by the embodiment of the invention, by executing the program stored in the memory, the processor acquires face RGB images acquired by monocular cameras with different fixed poses; then, for each face RGB image, determines point cloud data of a plurality of three-dimensional points in a three-dimensional face based on the pixel coordinates of each pixel point in the face RGB image; performs point cloud registration based on the point cloud data corresponding to each face RGB image and preset face reference point cloud data to obtain registration point cloud data; and finally performs curved surface reconstruction based on the registration point cloud data to obtain a human face three-dimensional model.
In the embodiment of the invention, the preset face reference point cloud data is added as a shape reference during point cloud registration, which greatly reduces shape deviations such as irregular bulges or depressions and abnormal nose shapes, so that reliable three-dimensional reconstruction accuracy is obtained from low-precision point cloud data. In other words, reliable three-dimensional face reconstruction can be completed with a consumer-grade monocular camera array without high-precision synchronous acquisition, which reduces the difficulty of three-dimensional face reconstruction and facilitates its application.
The communication bus 1140 mentioned in the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus 1140 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 6, but this is not intended to represent only one bus or type of bus.
The communication interface 1120 is used for communication between the electronic device and other devices.
The memory 1130 may include a Random Access Memory (RAM), and may also include a non-volatile memory, such as at least one magnetic disk storage device. Optionally, the memory may also be at least one storage device located remotely from the foregoing processor.
The processor 1110 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, a computer-readable storage medium is further provided, on which a program of a three-dimensional face reconstruction method is stored, which when executed by a processor implements the steps of the three-dimensional face reconstruction method according to any one of the foregoing method embodiments.
It is noted that, in this document, relational terms such as "first" and "second" may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (15)

1. A human face three-dimensional reconstruction method is characterized by comprising the following steps:
acquiring face RGB images acquired by a plurality of monocular cameras with different fixed poses;
aiming at each face RGB image, determining point cloud data of a plurality of three-dimensional points in a three-dimensional face based on pixel coordinates of each pixel point in the face RGB image;
performing point cloud registration based on the point cloud data corresponding to each face RGB image and preset face reference point cloud data to obtain registration point cloud data;
and carrying out curved surface reconstruction based on the registration point cloud data to obtain a human face three-dimensional model.
2. The method of claim 1, wherein determining point cloud data of a plurality of three-dimensional points in a three-dimensional face based on pixel coordinates of each pixel point in the RGB image of the face comprises:
determining three-dimensional coordinates of a plurality of three-dimensional points in the three-dimensional face based on a plurality of pixel points in the face RGB image;
and removing, from the plurality of three-dimensional points, the three-dimensional points outside the face region of the three-dimensional face together with their three-dimensional coordinates, to obtain the point cloud data of the three-dimensional face.
3. The method of claim 2, wherein determining three-dimensional coordinates of a plurality of three-dimensional points in a three-dimensional face based on a plurality of pixel points in the RGB image of the face comprises:
inputting the human face RGB image into a preset monocular depth model so that the monocular depth model outputs a depth image;
and respectively converting the pixel coordinates of a plurality of pixel points in the depth image into the three-dimensional coordinates of the corresponding three-dimensional points.
4. The method for reconstructing a human face in three dimensions according to claim 1, wherein performing point cloud registration based on the point cloud data corresponding to each of the RGB human face images and preset human face reference point cloud data to obtain registration point cloud data, comprises:
converting the point cloud data corresponding to the plurality of face RGB images into a target coordinate system to obtain a plurality of first conversion point cloud data, wherein the first conversion point cloud data comprises: three-dimensional coordinates of a plurality of three-dimensional points in the target coordinate system;
extracting the face reference point cloud data from the plurality of first conversion point cloud data, wherein the face reference point cloud data comprises: reference coordinates of a plurality of three-dimensional points when the face is in a frontal posture;
carrying out posture alignment between the face reference point cloud data and the remaining point cloud data, namely the first conversion point cloud data other than the face reference point cloud data, to obtain a plurality of second conversion point cloud data;
and performing point cloud fusion on the plurality of second conversion point cloud data and the face reference point cloud data to obtain the registration point cloud data.
5. The method of claim 4, wherein converting the point cloud data corresponding to the plurality of face RGB images into a target coordinate system comprises:
calculating centroid coordinates of a plurality of the point cloud data;
and converting the plurality of point cloud data into a centroid coordinate system with the centroid coordinates as the origin to obtain first conversion point cloud data, wherein the first conversion point cloud data comprises: three-dimensional coordinates of a plurality of three-dimensional points in the centroid coordinate system.
6. The method of claim 5, wherein calculating the centroid coordinates of the plurality of point cloud data comprises:
acquiring three-dimensional coordinates of all three-dimensional points in the point cloud data corresponding to the plurality of face RGB images;
and calculating the average value of the three-dimensional coordinates of all the three-dimensional points in the point cloud data to obtain the centroid coordinate.
7. The three-dimensional reconstruction method of a human face according to claim 5, wherein converting the plurality of point cloud data into a centroid coordinate system with the centroid coordinates as the origin to obtain first conversion point cloud data comprises:
subtracting the three-dimensional coordinates of each three-dimensional point in the plurality of point cloud data from the centroid coordinates to obtain first conversion point cloud data, wherein the first conversion point cloud data comprises: three-dimensional coordinates of a plurality of three-dimensional points in the centroid coordinate system.
8. The method of claim 4, wherein performing posture alignment between the face reference point cloud data and the remaining point cloud data of the plurality of first conversion point cloud data other than the face reference point cloud data to obtain a plurality of second conversion point cloud data comprises:
calculating the correspondence between the three-dimensional coordinates of the three-dimensional points in the remaining point cloud data and the reference coordinates of the three-dimensional points in the face reference point cloud data to obtain a nearest point pair set;
calculating a rotation matrix and an offset vector according to the nearest point pair set;
converting the remaining point cloud data into intermediate point cloud data in the frontal posture according to the rotation matrix and the offset vector;
calculating a point cloud conversion error between the intermediate point cloud data and the face reference point cloud data;
if the point cloud conversion error is smaller than a preset error threshold value, determining the intermediate point cloud data as the second conversion point cloud data;
and if the point cloud conversion error is greater than or equal to the preset error threshold, executing again the step of calculating the correspondence between the three-dimensional coordinates of the plurality of three-dimensional points in the remaining point cloud data and the reference coordinates of the plurality of three-dimensional points in the face reference point cloud data to obtain a nearest point pair set, iterating until the intermediate point cloud data can be determined as the second conversion point cloud data.
9. The method of claim 8, wherein calculating a rotation matrix and an offset vector from the set of nearest point pairs comprises:
constructing a decomposition matrix according to the nearest point pair set;
and carrying out singular value decomposition on the decomposition matrix to obtain the rotation matrix and the offset vector.
10. The method of claim 4, wherein performing point cloud fusion on the plurality of second conversion point cloud data and the face reference point cloud data to obtain the registration point cloud data comprises:
merging the plurality of second conversion point cloud data and the human face reference point cloud data to obtain merged point cloud data;
down-sampling the merged point cloud data to obtain sampled point cloud data;
calculating a shape error value between the sampling point cloud data and the face reference point cloud data;
determining whether the shape error value is less than a preset error threshold;
if the shape error value is smaller than the preset error threshold value, determining whether the shape information of the sampling point cloud data is within a preset normal value range;
if the shape information of the sampling point cloud data is within a preset normal value range, determining the merged point cloud data as the registration point cloud data;
and if the shape information of the sampled point cloud data is within a preset abnormal value range, adjusting the preset error threshold and returning to the step of down-sampling the merged point cloud data to obtain sampled point cloud data, iterating until the shape information of the sampled point cloud data is within the preset normal value range, whereupon the merged point cloud data is determined as the registration point cloud data.
11. The three-dimensional reconstruction method of human face according to claim 1, characterized in that the method further comprises:
extracting color values of a plurality of three-dimensional points from the face RGB image to obtain color data;
performing curved surface reconstruction based on the registration point cloud data to obtain a human face three-dimensional model, comprising:
and performing curved surface reconstruction based on the registration point cloud data and the color data to obtain a human face three-dimensional model.
12. The method of claim 11, wherein extracting color values of a plurality of three-dimensional points from the RGB image of the human face to obtain color data comprises:
acquiring the size of a depth image corresponding to the face RGB image;
scaling the size of the face RGB image to a target size, wherein the target size is the size of the depth image;
and determining the color values of the pixel coordinates corresponding to the three-dimensional coordinates of the three-dimensional points as the color data.
13. A three-dimensional reconstruction apparatus for a human face, comprising:
an acquisition module, used for acquiring face RGB images acquired by a plurality of monocular cameras with different fixed poses;
a determining module, used for determining, for each face RGB image, point cloud data of a plurality of three-dimensional points in a three-dimensional face based on the pixel coordinates of each pixel point in the face RGB image;
the registration module is used for carrying out point cloud registration on the basis of the point cloud data corresponding to each face RGB image and preset face reference point cloud data to obtain registration point cloud data;
and the reconstruction module is used for carrying out curved surface reconstruction on the basis of the registration point cloud data to obtain a human face three-dimensional model.
14. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
a processor for implementing the method of three-dimensional reconstruction of a human face according to any one of claims 1 to 12 when executing a program stored in a memory.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a program of a three-dimensional reconstruction method of a human face, which program, when being executed by a processor, carries out the steps of the three-dimensional reconstruction method of a human face according to any one of claims 1 to 12.
CN202111212305.5A 2021-10-18 2021-10-18 Face three-dimensional reconstruction method and device, electronic equipment and storage medium Pending CN113902852A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111212305.5A CN113902852A (en) 2021-10-18 2021-10-18 Face three-dimensional reconstruction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111212305.5A CN113902852A (en) 2021-10-18 2021-10-18 Face three-dimensional reconstruction method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113902852A true CN113902852A (en) 2022-01-07

Family

ID=79192622

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111212305.5A Pending CN113902852A (en) 2021-10-18 2021-10-18 Face three-dimensional reconstruction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113902852A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023155532A1 (en) * 2022-02-17 2023-08-24 上海商汤智能科技有限公司 Pose detection method, apparatus, electronic device, and storage medium
WO2024098322A1 (en) * 2022-11-10 2024-05-16 杭州小肤科技有限公司 Method for synthesizing 3d human face model by using optical reflection

Similar Documents

Publication Publication Date Title
CN109509226B (en) Three-dimensional point cloud data registration method, device and equipment and readable storage medium
CN109697688B (en) Method and device for image processing
US10217293B2 (en) Depth camera-based human-body model acquisition method and network virtual fitting system
WO2018119889A1 (en) Three-dimensional scene positioning method and device
CN111598993B (en) Three-dimensional data reconstruction method and device based on multi-view imaging technology
CN113902851A (en) Face three-dimensional reconstruction method and device, electronic equipment and storage medium
EP3528209A1 (en) Method and device for determining external parameter of stereoscopic camera
CN109544606B (en) Rapid automatic registration method and system based on multiple Kinects
CN113902852A (en) Face three-dimensional reconstruction method and device, electronic equipment and storage medium
EP3186787A1 (en) Method and device for registering an image to a model
CN111862299A (en) Human body three-dimensional model construction method and device, robot and storage medium
CN113902853A (en) Face three-dimensional reconstruction method and device, electronic equipment and storage medium
WO2024007478A1 (en) Three-dimensional human body modeling data collection and reconstruction method and system based on single mobile phone
CN109785322B (en) Monocular human body posture estimation network training method, image processing method and device
JP2021513175A (en) Data processing methods and devices, electronic devices and storage media
CN113902781A (en) Three-dimensional face reconstruction method, device, equipment and medium
US20190082173A1 (en) Apparatus and method for generating a camera model for an imaging system
CN110462685B (en) Three-dimensional model reconstruction method and system
CN113192179A (en) Three-dimensional reconstruction method based on binocular stereo vision
CN112200157A (en) Human body 3D posture recognition method and system for reducing image background interference
CN111951335A (en) Method, device, processor and image acquisition system for determining camera calibration parameters
CN115457176A (en) Image generation method and device, electronic equipment and storage medium
CN113902849A (en) Three-dimensional face model reconstruction method and device, electronic equipment and storage medium
CN112365589B (en) Virtual three-dimensional scene display method, device and system
CN111145266B (en) Fisheye camera calibration method and device, fisheye camera and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination