CN110009725B - Face reconstruction method based on multiple RGB images

Info

Publication number: CN110009725B
Application number: CN201910168988.5A
Authority: CN
Inventors: 任重; 张诗禹
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2019-03-06
Filing date: 2019-03-06
Publication date: 2021-04-09
Anticipated expiration: 2039-03-06
Also published as: CN110009725A

Abstract

The invention discloses a face reconstruction method based on multiple RGB images that can reconstruct a high-quality face model. To address the excessive highlight removal and color distortion of traditional algorithms, the invention builds on the joint bilateral filtering algorithm and adds pixel classification and a compensation function, which improves the result and eliminates the black-spot phenomenon. The invention uses coarse-to-fine face geometric reconstruction: a method based on a three-dimensional deformation model first recovers the overall shape of the face, and a shape-from-shading method then recovers the facial details, yielding a high-quality face geometric model. To address misalignment among the multiple images, the invention adopts a block-based texture mapping technique, which solves the problem of blurred texture. To address the low efficiency of the block matching operation, a KDTree is used to optimize the search space, achieving a speedup of at least 5 times.

Description

Face reconstruction method based on multiple RGB images

Technical Field

The invention relates to a face reconstruction technology in computer graphics, in particular to a face reconstruction method based on a plurality of RGB images.

Background

Face reconstruction has long been an important research direction in computer graphics. It aims to reconstruct the face model that most closely resembles the face in an image, and face reconstruction technology plays a key role in fields such as games, movies, and social entertainment. The highlight removal algorithm based on joint bilateral filtering can effectively remove highlights from object images, but for face images with complex texture it produces black spots because the maximum diffuse reflection chromaticity is estimated inaccurately. Face geometric reconstruction based on a three-dimensional deformation model can reconstruct the overall face geometry simply and robustly, but it cannot reconstruct face shapes that are not contained in the original data, and it lacks geometric detail. Face geometric reconstruction based on shape from shading can effectively reconstruct facial details, but its results are unstable because the problem is inherently ill-posed. Traditional methods generate the face texture map through projection fusion, which leads to blurred texture. The present method improves the estimation accuracy of the maximum diffuse reflection chromaticity through pixel classification and a compensation function, and thereby effectively removes highlights from the face image. It recovers the overall shape and the geometric details of the face together through coarse-to-fine face geometric reconstruction, solves the texture blurring problem through a block-based texture mapping technique, and uses a KDTree to accelerate the block matching process, finally generating a high-quality face model.
For the highlight removal algorithm based on joint bilateral filtering, see "Yang Q, Wang S, Ahuja N. Real-time specular highlight removal using bilateral filtering [C]// European Conference on Computer Vision. Springer, Berlin, Heidelberg, 2010: 87-100". For the pixel classification and compensation function, see "Gao R X, Li X Y. ... highlight removal using bilateral filtering [J]. Journal of Image and Graphics, 2018, 23(1): 0009-0017". For the three-dimensional deformation model, see "Blanz V, Vetter T. A morphable model for the synthesis of 3D faces [C]// Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques. 1999: 187-194". For the shape-from-shading technique, see "Zhang R, Tsai P S, Cryer J E, et al. Shape-from-shading: a survey [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999, 21(8): 690-706". For the block-based texture mapping technique, see "Bi S, Kalantari N K, Ramamoorthi R. Patch-based optimization for image-based texture mapping [J]. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2017), 2017, 36(4)". For KDTree-based block matching acceleration, see "He K, Sun J. Computing nearest-neighbor fields via propagation-assisted KD-trees [C]// 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012".

Disclosure of Invention

In view of the shortcomings of the prior art, the invention aims to provide a face reconstruction method that reconstructs a high-quality face model from multiple RGB images.

The purpose of the invention is achieved by the following technical solution: a face reconstruction method based on multiple RGB images, comprising the following steps:

(1) Maximum diffuse reflection chromaticity estimation: calculate the average minimum intensity value of the pixels in the image. Pixels whose minimum intensity value is less than or equal to the product of the average minimum intensity value and an empirical threshold are called diffuse reflection pixels; the remaining pixels are called specular reflection pixels. For diffuse reflection pixels, the maximum diffuse reflection chromaticity is estimated by the maximum chromaticity; for specular reflection pixels, it is estimated by the compensation function.

(2) Highlight removal: iteratively perform joint bilateral filtering on the maximum chromaticity map, guided by the estimated maximum diffuse reflection chromaticity; take the larger of the filtered maximum chromaticity and the original maximum chromaticity as the maximum diffuse reflection chromaticity, and compute the highlight-removed image.

(3) Coarse-grained face geometric reconstruction: detect two-dimensional face feature points on the images with the face alignment algorithm DDE, optimize the coefficients of the three-dimensional deformation model FaceWarehouse under the constraints of the feature-point projection error and of identical face identity coefficients across the multiple images, and interpolate to obtain a coarse-grained face mesh.

(4) Fine-grained face geometric reconstruction: using the Lambertian surface irradiance equation with second-order harmonic approximation, iteratively optimize the illumination, surface color, and surface normal vector parameters under the constraint of surface normal vector integrability. Then optimize a depth map from the surface normal vectors and organize it into a fine-grained face mesh.

(5) Block-based texture mapping: construct a block-based energy equation from the bidirectional similarity and the multi-view image continuity, optimize the aligned images, and generate a complete, high-quality face texture map through projection fusion.

(6) KDTree-based block matching acceleration: represent each block as a 24-dimensional vector through the Walsh-Hadamard transform, build a KDTree over the candidate block vectors by iteratively performing a median split on the dimension with the largest value difference among the candidate blocks, and use a propagation-assisted search strategy to accelerate block matching and improve matching accuracy.

The beneficial effects of the invention are as follows. The method improves the accuracy of maximum diffuse reflection chromaticity estimation through pixel classification and a compensation function, and robustly removes highlights from the face image through joint bilateral filtering. The invention combines the three-dimensional deformation model with the shape-from-shading technique, so that the face geometry closely matches the image in both overall shape and local detail, and a high-quality face mesh is reconstructed. The invention aligns the images with a block-based energy, which solves the problem of blurred texture, and uses a KDTree to optimize the search space, which greatly accelerates the block matching algorithm. The face reconstruction technique provided by the invention is efficient, robust, easy to implement, and low in cost, and can effectively reconstruct a high-quality face model.

Drawings

FIG. 1 is a flow chart of the process for face reconstruction according to the present invention;

FIG. 2 is a schematic diagram of the feature point update of the face contour, in which (a) is the feature line of the face contour, (b) is the vertex with the smallest dot product of the vertex normal vector and the sight line direction on each contour line, and (c) is the new contour feature point;

FIG. 3 is a schematic diagram of a block-based energy equation;

FIG. 4 is the first 16 bases of the Walsh-Hadamard transform;

FIG. 5 is a schematic diagram of a propagation-assisted KDTree search strategy.

Detailed Description

The invention is further described with reference to the following drawings and detailed description.

As shown in fig. 1, the face reconstruction technique based on multiple RGB images of the present invention includes the following steps:

1. Maximum diffuse reflection chromaticity estimation.

Calculate the average minimum intensity value of the pixels in the image. Pixels whose minimum intensity value is less than or equal to the product of the average minimum intensity value and an empirical threshold are called diffuse reflection pixels; the remaining pixels are called specular reflection pixels. For diffuse reflection pixels, the maximum diffuse reflection chromaticity is estimated by the maximum chromaticity; for specular reflection pixels, it is estimated by equation (1).

Equation (1) involves the compensation function and the estimated maximum diffuse reflection chromaticity, where k is the number of diffuse reflection pixels, $J_c$ is the intensity value of channel c, and $J_{\min}$ and $J_{\max}$ are the minimum intensity value and the maximum intensity value, respectively.
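The classification rule above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: it assumes that the "minimum intensity value" of a pixel is the smallest of its three channel values and that the maximum chromaticity is the largest channel divided by the channel sum. The function name `classify_pixels` and the flat list-of-tuples image layout are hypothetical, and the compensation function of equation (1) is not reproduced.

```python
def classify_pixels(image, threshold):
    """Label each RGB pixel as diffuse (True) or specular (False).

    image: list of (r, g, b) tuples with non-negative intensities.
    threshold: the empirical threshold from the text (value chosen by the caller).
    Returns (labels, max_chroma), where max_chroma[i] = max_c J_c / sum_c J_c.
    """
    # Assumption: the minimum intensity of a pixel is its smallest channel value.
    j_min = [min(pixel) for pixel in image]
    mean_min = sum(j_min) / len(j_min)

    # Diffuse pixels: minimum intensity <= average minimum intensity * threshold.
    labels = [m <= mean_min * threshold for m in j_min]

    max_chroma = []
    for pixel in image:
        total = sum(pixel)
        # A black pixel has no defined chromaticity; fall back to neutral 1/3.
        max_chroma.append(max(pixel) / total if total > 0 else 1.0 / 3.0)
    return labels, max_chroma


if __name__ == "__main__":
    pixels = [(200, 120, 100), (210, 130, 105), (250, 245, 240)]
    labels, chroma = classify_pixels(pixels, threshold=1.0)
    print(labels, chroma)
```

For diffuse pixels the returned maximum chromaticity would serve directly as the estimate; specular pixels would instead go through the compensation function.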

2. Highlight removal.

The following operations are performed iteratively: joint bilateral filtering is applied to the maximum chromaticity map, guided by the estimated maximum diffuse reflection chromaticity; for each pixel, the larger of the filtered maximum chromaticity and the original maximum chromaticity is carried into the next iteration; the iteration ends when the maximum chromaticity of every pixel changes by less than a threshold between two successive iterations. The maximum chromaticity after the final filtering is taken as the maximum diffuse reflection chromaticity and substituted into equation (2) to compute the highlight-removed image.

In equation (2), $\Lambda_{\max}$ is the maximum diffuse reflection chromaticity and $J^D$ is the highlight-removed image.
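The iteration and the final separation can be illustrated on a one-dimensional row of pixels. The sketch below makes several assumptions. The filter is a plain Gaussian joint bilateral filter with hypothetical parameters (`radius`, `sigma_s`, `sigma_r`). The separation step uses the dichromatic model with white illumination from the cited work of Yang et al., in which the per-channel specular term is $(J_{\max} - \Lambda_{\max}\sum_c J_c)/(1 - 3\Lambda_{\max})$; this may differ in form from equation (2). All function names are illustrative.

```python
import math


def joint_bilateral_1d(values, guide, radius=2, sigma_s=2.0, sigma_r=0.05):
    """Smooth `values`; weights come from spatial distance and guide similarity."""
    n = len(values)
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)
                         - ((guide[i] - guide[j]) ** 2) / (2 * sigma_r ** 2))
            num += w * values[j]
            den += w
        out.append(num / den)  # den >= 1 because the j == i term has weight 1
    return out


def refine_max_chromaticity(sigma_max, guide, tol=1e-3, max_iter=100):
    """Filter, keep the per-pixel maximum, and stop once the change is below tol."""
    current = list(sigma_max)
    for _ in range(max_iter):
        filtered = joint_bilateral_1d(current, guide)
        updated = [max(f, c) for f, c in zip(filtered, current)]
        if max(abs(u - c) for u, c in zip(updated, current)) < tol:
            return updated
        current = updated
    return current


def remove_highlight(pixel, lam_max):
    """Subtract the specular term; assumes white illumination and lam_max > 1/3."""
    total = sum(pixel)
    specular = (max(pixel) - lam_max * total) / (1.0 - 3.0 * lam_max)
    specular = max(specular, 0.0)  # never add light back
    return tuple(c - specular for c in pixel)


if __name__ == "__main__":
    # A pixel with diffuse color (120, 60, 20) plus 30 units of white highlight.
    print(remove_highlight((150, 90, 50), lam_max=0.6))
```

Because each iteration keeps the per-pixel maximum, the chromaticity map can only grow and is bounded by its largest value, so the loop settles; `max_iter` is only a safety guard.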

3. Coarse-grained face geometric reconstruction.

Firstly, two-dimensional face feature points are detected with the face alignment algorithm DDE. The three-dimensional feature points on the three-dimensional deformation model FaceWarehouse are then projected into image space under an ideal pinhole camera model, and the distance between the projected feature points and the detected two-dimensional feature points is minimized; the multiple images are optimized jointly under the constraint that their face identity coefficients are identical. The face contour feature points, whose vertex indices are not fixed, are updated after each iteration as follows: on each pre-labeled face contour line, the vertex with the smallest dot product between its normal vector and the sight line direction is found; all selected vertices are then projected into image space, and the one closest to the two-dimensional feature point is chosen as the new face contour feature point, as shown in FIG. 2. Finally, the coarse-grained face mesh is calculated by interpolating in FaceWarehouse with the optimized face identity and expression coefficients.
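The contour feature point update can be sketched as follows. The data layout (each contour line as a list of `(vertex_index, position, normal)` triples in camera space), the function names, and the camera parameters `focal` and `center` are all hypothetical. The selection rule follows the text literally: smallest dot product, then nearest projection.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def project(position, focal, center):
    """Ideal pinhole projection of a camera-space point with z > 0."""
    x, y, z = position
    return (focal * x / z + center[0], focal * y / z + center[1])


def update_contour_landmark(contour_lines, view_dir, target_2d, focal, center):
    """Return the vertex index chosen as the new contour feature point.

    contour_lines: list of lines, each a list of (vertex_index, position, normal).
    view_dir: sight line direction.
    target_2d: the detected two-dimensional feature point.
    """
    # Step 1: per contour line, the vertex whose normal has the smallest
    # dot product with the sight line direction.
    candidates = [min(line, key=lambda v: dot(v[2], view_dir))
                  for line in contour_lines]

    # Step 2: among the candidates, the one whose projection is closest
    # to the detected two-dimensional feature point.
    def squared_distance(vertex):
        u, v = project(vertex[1], focal, center)
        return (u - target_2d[0]) ** 2 + (v - target_2d[1]) ** 2

    return min(candidates, key=squared_distance)[0]


if __name__ == "__main__":
    lines = [
        [(0, (0.0, 0.0, 5.0), (0.0, 0.0, 1.0)), (1, (1.0, 0.0, 5.0), (1.0, 0.0, 0.0))],
        [(2, (0.0, 1.0, 5.0), (0.0, 0.0, 1.0)), (3, (0.0, 2.0, 5.0), (0.0, 1.0, 0.0))],
    ]
    print(update_contour_landmark(lines, (0.0, 0.0, 1.0), (418.0, 242.0),
                                  focal=500.0, center=(320.0, 240.0)))
```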

4. Fine-grained face geometric reconstruction.

The Lambertian surface irradiance equation with second-order harmonic approximation is given by equation (3),

where p denotes the surface color and the other two quantities denote the illumination and the surface normal vector, respectively.
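As an illustration of a second-order harmonic shading model, the sketch below evaluates irradiance as surface color times the inner product of nine illumination coefficients with nine basis functions of the unit normal. The specific basis, with its normalization constants folded into the illumination coefficients, is a common convention and an assumption here; it is not necessarily the exact form of equation (3). The names `sh_basis` and `irradiance` are hypothetical.

```python
def sh_basis(normal):
    """Nine second-order harmonic basis values for a unit normal (nx, ny, nz).

    Normalization constants are assumed to be absorbed into the lighting
    coefficients, which is a convention chosen for this sketch.
    """
    nx, ny, nz = normal
    return [
        1.0,                   # order 0
        nx, ny, nz,            # order 1
        nx * ny, nx * nz, ny * nz,
        nx * nx - ny * ny,
        3.0 * nz * nz - 1.0,   # order 2
    ]


def irradiance(surface_color, light, normal):
    """Shaded intensity: surface color times the harmonic lighting term."""
    shading = sum(l * y for l, y in zip(light, sh_basis(normal)))
    return surface_color * shading


if __name__ == "__main__":
    light = [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(irradiance(0.8, light, (0.0, 0.0, 1.0)))
```

With this form, each of the three unknowns (illumination, surface color, normal) enters a simple expression when the other two are held fixed, which is what makes alternating optimization convenient.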

Firstly, each parameter is initialized: the coarse-grained face geometry is projected into image space to obtain a normal vector for each pixel, which initializes the surface normal vectors, and the Gaussian-filtered image initializes the surface color. Then illumination, surface color, and surface normal vectors are optimized iteratively and alternately, with a regularization term on the integrability of the surface normal vectors, to obtain the surface normal vectors required for the fine-grained face geometry. Finally, an energy equation is constructed from the ways in which the depth map represents the normal vectors, the depth map is optimized under the constraints of a difference regularization term and a Gaussian regularization term, and the result is organized into the final fine-grained face mesh.
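The depth optimization relies on expressing normal vectors in terms of the depth map. One common way to do this, assumed here for illustration only, uses forward differences under an orthographic view with unit pixel spacing, giving a normal proportional to $(-\partial z/\partial x, -\partial z/\partial y, 1)$. The energy terms actually used by the method are not reproduced, and the function name is hypothetical.

```python
import math


def normals_from_depth(depth):
    """Unit normals from a depth map z[y][x] via forward differences.

    Assumes an orthographic view and unit pixel spacing. The last row and
    column are skipped because they have no forward neighbour.
    """
    height, width = len(depth), len(depth[0])
    normals = [[None] * (width - 1) for _ in range(height - 1)]
    for y in range(height - 1):
        for x in range(width - 1):
            dz_dx = depth[y][x + 1] - depth[y][x]
            dz_dy = depth[y + 1][x] - depth[y][x]
            length = math.sqrt(dz_dx * dz_dx + dz_dy * dz_dy + 1.0)
            normals[y][x] = (-dz_dx / length, -dz_dy / length, 1.0 / length)
    return normals


if __name__ == "__main__":
    ramp = [[float(x) for x in range(3)] for _ in range(3)]  # plane z = x
    print(normals_from_depth(ramp)[0][0])
```

A depth map fitted so that these derived normals match the optimized normals is integrable by construction, which is one reason to enforce integrability earlier in the normal optimization.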

5. Block-based texture mapping.

The block-based energy equation, illustrated in FIG. 3, is constructed from the bidirectional similarity of equation (4) and the multi-view image continuity of equation (5).

Here L denotes the number of pixels in a block, S denotes an input image, T denotes an aligned image, M denotes a texture image, D denotes the distance between blocks, α and w denote weights, N denotes the number of images, and $x_{i\to j}$ denotes the projection of a pixel at view i to view j.
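To make the bidirectional similarity term concrete, here is a one-dimensional sketch. It sums a completeness term (every source block should have a close match in the target) and a coherence term (every target block should have a close match in the source). Weighting the completeness term by α and dividing by the block size L are assumptions modeled on the cited patch-based texture mapping work, since equation (4) is not reproduced here. D is taken to be the sum of squared differences, and all function names are hypothetical.

```python
def blocks(signal, size):
    """All contiguous blocks of the given size in a 1D signal."""
    return [signal[i:i + size] for i in range(len(signal) - size + 1)]


def block_distance(a, b):
    """Assumed block distance D: sum of squared differences."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def bidirectional_similarity(source, target, size, alpha):
    """Completeness (source -> target) plus coherence (target -> source)."""
    s_blocks = blocks(source, size)
    t_blocks = blocks(target, size)
    completeness = sum(min(block_distance(s, t) for t in t_blocks)
                       for s in s_blocks)
    coherence = sum(min(block_distance(t, s) for s in s_blocks)
                    for t in t_blocks)
    return (alpha * completeness + coherence) / size


if __name__ == "__main__":
    print(bidirectional_similarity([1, 2, 3, 4], [1, 2], size=2, alpha=2))
```

In the example the coherence term is zero because the only target block also occurs in the source, while the completeness term penalizes the source blocks that the target fails to represent.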

The aligned images and texture images are initialized to the input images, and the final aligned images are generated by iterating a two-stage optimization of alignment and reconstruction. The aligned images of all views are then projected onto the object geometry, and the different color values projected onto each vertex are combined by weighted averaging to generate a complete, high-quality face texture map.
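The final fusion step amounts to a per-vertex weighted average of the colors seen from each view. The sketch below assumes each vertex comes with a list of `(color, weight)` samples; how the weights are chosen (for example from viewing angle) is not specified in the text and is left to the caller. The function name is hypothetical.

```python
def fuse_vertex_color(samples):
    """Weighted average of the colors projected onto one vertex.

    samples: list of ((r, g, b), weight) pairs, one per view that sees the vertex.
    Returns None when no view contributes a positive total weight.
    """
    total_weight = sum(weight for _, weight in samples)
    if total_weight <= 0:
        return None
    return tuple(
        sum(color[k] * weight for color, weight in samples) / total_weight
        for k in range(3)
    )


if __name__ == "__main__":
    print(fuse_vertex_color([((100, 0, 0), 1.0), ((200, 0, 0), 3.0)]))
```

Averaging only blurs the texture when the views disagree, which is why the images are aligned before this step.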

6. KDTree based block matching acceleration.

Firstly, in the YCbCr color space, the Y channel is reduced in dimension with the first 16 bases of the Walsh-Hadamard transform and the Cb and Cr channels with the first 4 bases each, so that each block is represented as a 24-dimensional vector; the Walsh-Hadamard bases are shown in FIG. 4. Then a KDTree over the candidate block vectors is built by iteratively performing a median split on the dimension with the largest value difference among the candidate blocks. Finally, the nearest candidate block is found with a propagation-assisted KDTree search strategy, as shown in FIG. 5, which greatly accelerates block matching and improves matching accuracy.
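The descriptor and the tree construction can be sketched as follows, under stated assumptions. The two-dimensional Walsh-Hadamard bases are ordered by increasing total sequency, which may not match the ordering in FIG. 4. "Largest value difference" is read as the largest spread (maximum minus minimum). The search shown is an ordinary exact backtracking search, not the propagation-assisted strategy of the method. All names are hypothetical.

```python
def hadamard(n):
    """Hadamard matrix of order n (a power of two), rows sorted by sequency."""
    rows = [[1]]
    while len(rows) < n:
        rows = [r + r for r in rows] + [r + [-v for v in r] for r in rows]

    def sign_changes(row):
        return sum(1 for a, b in zip(row, row[1:]) if a != b)

    return sorted(rows, key=sign_changes)


def wht_coefficients(block, k):
    """Project a square block onto the first k 2D Walsh-Hadamard bases."""
    n = len(block)
    h = hadamard(n)
    order = sorted(((u, v) for u in range(n) for v in range(n)),
                   key=lambda p: (p[0] + p[1], p[0]))  # assumed ordering
    return [sum(h[u][y] * h[v][x] * block[y][x]
                for y in range(n) for x in range(n))
            for u, v in order[:k]]


def block_descriptor(y, cb, cr):
    """24-dimensional vector: 16 Y coefficients, 4 Cb and 4 Cr coefficients."""
    return wht_coefficients(y, 16) + wht_coefficients(cb, 4) + wht_coefficients(cr, 4)


def build_kdtree(vectors, indices=None, leaf_size=2):
    """Median split on the dimension with the largest spread."""
    if indices is None:
        indices = list(range(len(vectors)))
    if len(indices) <= leaf_size:
        return {"leaf": indices}

    def spread(d):
        values = [vectors[i][d] for i in indices]
        return max(values) - min(values)

    dim = max(range(len(vectors[0])), key=spread)
    ordered = sorted(indices, key=lambda i: vectors[i][dim])
    mid = len(ordered) // 2
    return {"dim": dim, "split": vectors[ordered[mid]][dim],
            "left": build_kdtree(vectors, ordered[:mid], leaf_size),
            "right": build_kdtree(vectors, ordered[mid:], leaf_size)}


def nearest(tree, vectors, query, best=None):
    """Exact nearest neighbour; returns (squared_distance, index)."""
    if "leaf" in tree:
        for i in tree["leaf"]:
            d = sum((a - b) ** 2 for a, b in zip(vectors[i], query))
            if best is None or d < best[0]:
                best = (d, i)
        return best
    diff = query[tree["dim"]] - tree["split"]
    near, far = ((tree["left"], tree["right"]) if diff < 0
                 else (tree["right"], tree["left"]))
    best = nearest(near, vectors, query, best)
    if diff * diff < best[0]:  # the far side may still hold a closer vector
        best = nearest(far, vectors, query, best)
    return best


if __name__ == "__main__":
    candidates = [[0, 0], [10, 1], [20, 0], [30, 1], [5, 5]]
    tree = build_kdtree(candidates)
    print(nearest(tree, candidates, [19, 0.2]))
```

Splitting on the widest dimension keeps the cells roughly cubic, so a query descends to a leaf whose members are already close. A propagation-assisted search would additionally reuse the matches already found for neighbouring blocks.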

Claims

1. A face reconstruction method based on a plurality of RGB images is characterized by comprising the following steps:

(1) maximum diffuse reflection chromaticity estimation: calculating the average minimum intensity value of the pixels in the image, wherein pixels whose minimum intensity value is less than or equal to the product of the average minimum intensity value and an empirical threshold are called diffuse reflection pixels, and the remaining pixels are called specular reflection pixels; for the diffuse reflection pixels, the maximum diffuse reflection chromaticity is estimated by the maximum chromaticity, and for the specular reflection pixels, the maximum diffuse reflection chromaticity is estimated by a compensation function;

(2) highlight removal: iteratively performing joint bilateral filtering on the maximum chromaticity map, guided by the estimated maximum diffuse reflection chromaticity, and removing highlights from the image by taking the larger of the filtered maximum chromaticity and the original maximum chromaticity as the maximum diffuse reflection chromaticity;

(3) coarse-grained face geometric reconstruction: detecting two-dimensional face feature points on the images with a face alignment algorithm, optimizing the coefficients of the three-dimensional deformation model under the constraints of the feature-point projection error and of identical face identity coefficients across the multiple images, and interpolating to obtain a coarse-grained face mesh;

(4) fine-grained face geometric reconstruction: using a Lambertian surface irradiance equation with second-order harmonic approximation, iteratively optimizing the illumination, surface color, and surface normal vector parameters under the constraint of surface normal vector integrability; optimizing a depth map from the surface normal vectors, and organizing it into a fine-grained face mesh;

(5) block-based texture mapping: constructing a block-based energy equation from the bidirectional similarity and the multi-view image continuity, optimizing the aligned images, and generating a complete, high-quality face texture map through projection fusion;

(6) KDTree-based block matching acceleration: representing blocks as low-dimensional vectors through the Walsh-Hadamard transform, organizing the candidate block vectors with a KDTree, and using a propagation-assisted search strategy to accelerate block matching and improve matching accuracy.