CN112069923B - 3D face point cloud reconstruction method and system - Google Patents
Abstract
The invention discloses a 3D face point cloud reconstruction method and system. The method comprises four stages: initial point cloud establishment, point cloud registration, point cloud resampling and point cloud enhancement. In the point cloud registration stage, the coefficients required for registration are generated dynamically by neural network learning, so the generated point cloud data are more accurate and do not depend too heavily on point-pair correspondences. Meanwhile, the feature matrices of two adjacent point cloud frames are produced by network downsampling, and the Euclidean distance between the two feature matrices is computed, which reduces the computational load on the point clouds to the point that they can essentially be processed in real time. In addition, point cloud resampling further reduces the noise of the reconstructed point cloud, and point cloud enhancement improves its robustness, so that the result is better suited to the subsequent surface reconstruction.
Description
Technical Field
The invention relates to the technical field of 3D face reconstruction, and in particular to a 3D face point cloud reconstruction method and system.
Background
With the development of China's economy and the continued rise in urbanization, automatic driver-monitoring systems have become an important direction for modern vehicle-mounted intelligent systems. As an important component of such systems, face recognition is evolving from traditional 2D image recognition toward 3D deep-learning approaches. In 3D face recognition and related tasks, reconstructing a point cloud from the 3D face data is an important preprocessing step. Key techniques for point cloud reconstruction from 3D data have therefore become a research hotspot in related fields at home and abroad.
As a front-end task for 3D face recognition and related tasks, 3D point cloud reconstruction largely avoids the over-dependence of 2D face recognition on imaging quality, improving both the accuracy of the face recognition task and its robustness to the scene environment. However, conventional point cloud reconstruction methods are computationally expensive and time-consuming; moreover, they rely too heavily on the ICP algorithm, that is, on the correspondence between point pairs, and cannot handle point clouds with missing correspondences or only partial visibility, so they cannot be applied in real scenes.
Disclosure of Invention
The invention aims to provide a 3D face point cloud reconstruction method that remedies the above technical defects, so as to improve the accuracy of 3D face recognition while effectively reducing the computational load.
Another object of the present invention is to provide a 3D face point cloud reconstruction system that likewise improves the accuracy of 3D face recognition and effectively reduces the computational load.
In order to achieve the above purpose, the invention discloses a 3D face point cloud reconstruction method, which comprises the following steps:
1) Select two adjacent frames of face images from a video stream comprising multiple frames of face images, and process them to obtain the initial point clouds S_T and S_{T+1} corresponding to the two frames;
2) Transform the initial point cloud S_T with a predefined initial transformation matrix {R_i, T_i} to obtain S′_T, concatenate and fuse S′_T with S_{T+1}, and feed the fused result into a coefficient prediction network, based on a convolutional neural network and comprising convolutional layers and a max-pooling layer, so as to obtain a set of iteration coefficient matrices (α, β);
3) Feed S′_T and S_{T+1} separately into a first feature extraction network based on a convolutional neural network, which downsamples S′_T and S_{T+1} to obtain two feature matrices F_T and F_{T+1};
4) Obtain an initial registration matrix M_0 from a calculation model whose input parameters are the iteration coefficient matrices (α, β) and the feature matrices F_T and F_{T+1};
5) Normalize the initial registration matrix M_0 to obtain the final registration matrix M_T;
6) Perform singular value decomposition on M_T with a decomposition algorithm to obtain a transformation matrix {R_T, T_T}, and update {R_i, T_i} with {R_T, T_T} as the initial value for the next frame's point cloud transformation.
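The rigid transformation applied in step 2), and re-applied each frame with the matrix updated in step 6), can be sketched as follows. The function name and the representation of {R_i, T_i} as a 3×3 rotation matrix plus a 3-vector are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def apply_rigid_transform(points, R, T):
    """Apply a rigid transform {R, T} to an N x 3 point cloud: p' = R @ p + T."""
    return points @ R.T + T

# Example: rotate a point 90 degrees about the z-axis, then lift it by 1.
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])
moved = apply_rigid_transform(np.array([[1.0, 0.0, 0.0]]), Rz, np.array([0.0, 0.0, 1.0]))
# moved is [[0., 1., 1.]]
```

Updating the initial value for the next frame then amounts to replacing (R, T) with the pair recovered in step 6).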
Compared with the prior art, the 3D face point cloud reconstruction method comprises initial point cloud establishment and point cloud registration, wherein after the initial point cloud is established and subjected to primary coordinate transformation, point clouds of adjacent frames are selected, point clouds are fused, the fused point clouds are sent into a convolutional neural network to obtain a group of coefficient matrixes alpha and beta, meanwhile, the point clouds of the selected adjacent frames are respectively sent into another convolutional neural network to be subjected to network downsampling, point cloud data are reduced to obtain two groups of feature matrixes, the obtained coefficient matrixes and the feature matrixes are subjected to mixed calculation, so that an initial registration matrix is calculated, then the initial registration matrix is optimally processed, a final registration matrix is obtained, then a transformation matrix is obtained according to the final registration matrix, and the initial transformation matrix is updated by using the obtained transformation matrix, so that the initial transformation matrix is used as an initial value of the point cloud change of the next frame until all frames of face image data are processed; therefore, in the 3D face point cloud reconstruction process, the required iteration coefficient is not preset manually, but is obtained from the self-learning of the neural network, so that the accuracy of the final point cloud reconstruction can be effectively improved, in addition, the initial point cloud is subjected to network downsampling through the first feature extraction network, and the calculated amount in the point cloud reconstruction and the curved surface reconstruction can be effectively reduced.
Preferably, the calculation model of the initial registration matrix M 0 is:
Where Δ F is the euclidean distance between F T and F T+1.
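The quantity ΔF is directly computable. Interpreting "the Euclidean distance between the two feature matrices" as the Frobenius norm of their difference is an assumption here, since the patent does not spell out which matrix norm is meant:

```python
import numpy as np

def feature_distance(F_T, F_T1):
    """Euclidean (Frobenius) distance between two feature matrices of equal shape."""
    return np.linalg.norm(F_T - F_T1)

# Two 2x2 feature matrices differing by 1 in every entry: distance sqrt(4) = 2.
delta_F = feature_distance(np.ones((2, 2)), np.zeros((2, 2)))
```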
Preferably, in step 1), the initial point clouds are obtained from the two adjacent frames of face images as follows:
Perform preliminary cropping on the face images, where each face image comprises an RGB map and a depth map: input the two selected adjacent frames into a face detection network based on a convolutional neural network to obtain the corresponding 2D face boxes;
Based on the 2D face boxes, align the RGB map and depth map coordinate systems of each of the two selected face images, thereby obtaining the initial point clouds S_T and S_{T+1} corresponding to the two face images.
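One common way to realize the step from an aligned RGB/depth pair to an initial point cloud is pinhole back-projection. The intrinsics (fx, fy, cx, cy) and the pinhole model itself are assumptions for this sketch, since the patent does not specify the camera model:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an H x W depth map into an (H*W) x 3 point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# A flat 2x2 depth map at distance 1, with unit focal length and origin at (0, 0).
cloud = depth_to_point_cloud(np.ones((2, 2)), 1.0, 1.0, 0.0, 0.0)
```

Because the RGB and depth pixels are in one-to-one correspondence, the RGB values can be attached to the resulting points row by row.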
Preferably, in step 1), after the initial point clouds S_T and S_{T+1} are established, the method further includes cropping the face images again:
Taking the center of the 2D face box as the midpoint, crop the initial point clouds S_T and S_{T+1} according to a depth threshold; the depth threshold may be a preset prior value or the average of the depth data in the preliminarily cropped face image.
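The re-cropping step can be sketched as a simple depth mask around the face-center depth; storing depth as the third coordinate and the function name are assumptions of this sketch:

```python
import numpy as np

def crop_by_depth(points, center_depth, threshold):
    """Keep only points whose depth is within `threshold` of the face-center depth."""
    mask = np.abs(points[:, 2] - center_depth) <= threshold
    return points[mask]

pts = np.array([[0.0, 0.0, 1.0],   # near the face-center depth: kept
                [0.0, 0.0, 5.0]])  # background: clipped away
face_only = crop_by_depth(pts, center_depth=1.0, threshold=0.5)
```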
Preferably, when the face images are preliminarily cropped, key points within the 2D face box can be obtained from the face detection network; the 3D face point cloud reconstruction method then further comprises resampling the registered point cloud data:
7) Randomly select a point from the key points in the 2D face box as the initial point P_0(x_0, y_0, z_0) and, taking P_0 as the center, apply farthest point sampling to the points within a certain threshold range to obtain a sampling set S_1 = {P_i | i ∈ (0, 1, 2, …, N)};
8) Repeat step 7) for the remaining key points to obtain a sampling set S = {S_j | j ∈ (0, 1, 2, …, M)}, where M is the number of key points;
9) Cluster the sample points outside the set S by k-means to obtain k groups of sets, where k > 1;
10) Taking the set S as the initial point set, sample each of the k groups to obtain the final sampling point set S′ = {S_t | t ∈ (0, 1, 2, …, M+k)}.
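The farthest point sampling of step 7) admits a compact greedy implementation. This sketch covers only the FPS core; the grouping of steps 9) and 10) can be delegated to any standard k-means routine, and the fixed start index stands in for the randomly chosen key point:

```python
import numpy as np

def farthest_point_sampling(points, n_samples, start_idx=0):
    """Greedily pick n_samples points, each maximizing its distance
    to the set already chosen (the furthest point sampling of step 7)."""
    chosen = [start_idx]
    dists = np.linalg.norm(points - points[start_idx], axis=1)
    for _ in range(n_samples - 1):
        nxt = int(np.argmax(dists))  # farthest point from the chosen set
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(points - points[nxt], axis=1))
    return points[chosen]

line = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
sampled = farthest_point_sampling(line, 3)  # picks the extremes first
```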
Preferably, the method further comprises point cloud enhancement of the resampled point cloud data:
Perform point-by-point feature extraction on the sampling point set S′ through a second feature extraction network based on a convolutional neural network, adding a weight factor μ during feature extraction to obtain feature-enhanced point cloud data f_t; μ takes a preset value greater than 1 when the extracted feature data point is key point data, and takes the value 1 when it is non-key point data;
Feed the point cloud data f_t into two different sub-networks of the same convolutional neural network, and obtain two different sets of parameters T_S and D_P by linear-layer regression of the convolutional neural network;
Update f_t according to the following formula:
f_t = T_S × f_t + D_P
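The weighting and the affine update above can be put together in a minimal sketch. Here μ, T_S and D_P are placeholder scalars rather than network outputs, a simplification of the patent's sub-network regression:

```python
import numpy as np

def enhance_point_cloud(points, is_keypoint, mu=1.5, T_S=1.0, D_P=0.0):
    """Weight key points by mu (> 1), leave other points at weight 1,
    then apply the regressed update f_t = T_S * f_t + D_P."""
    weights = np.where(is_keypoint, mu, 1.0)[:, None]
    f_t = weights * points
    return T_S * f_t + D_P

pts = np.array([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
enhanced = enhance_point_cloud(pts, np.array([True, False]), mu=2.0)
# key point row doubled, non-key point row unchanged: [[2,2,2],[2,2,2]]
```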
The invention also discloses a 3D face point cloud reconstruction system, which comprises an initial point cloud establishment unit and a point cloud registration unit. The initial point cloud establishment unit comprises an initial point cloud establishing module and a coordinate transformation module; the point cloud registration unit comprises an iteration coefficient generation module, a first feature extraction module, a calculation module, an optimization module and a first updating module.
The initial point cloud establishing module processes two adjacent frames of face images selected from the video stream to obtain the initial point clouds S_T and S_{T+1} corresponding to the two frames.
The coordinate transformation module transforms the initial point cloud S_T with the predefined initial transformation matrix {R_i, T_i} to obtain S′_T.
The iteration coefficient generation module concatenates and fuses S′_T and S_{T+1} and feeds the result into a coefficient prediction network, based on a convolutional neural network and comprising convolutional layers and a max-pooling layer, to obtain a set of iteration coefficient matrices (α, β).
The first feature extraction module feeds S′_T and S_{T+1} separately into a first feature extraction network based on a convolutional neural network, which downsamples S′_T and S_{T+1} to obtain two feature matrices F_T and F_{T+1}.
The calculation module obtains an initial registration matrix M_0 from a calculation model whose input parameters are the iteration coefficient matrices (α, β) and the feature matrices F_T and F_{T+1}.
The optimization module normalizes M_0 with an optimization algorithm to obtain the final registration matrix M_T.
The first updating module performs singular value decomposition on M_T with a decomposition algorithm to obtain a transformation matrix {R_T, T_T}, and updates {R_i, T_i} with {R_T, T_T} as the initial value for the next frame's point cloud transformation.
Preferably, the initial point cloud establishment unit further includes a preliminary cropping module. Each face image comprises an RGB map and a depth map; the preliminary cropping module performs face region detection on the two selected adjacent frames with a face detection network based on a convolutional neural network to obtain 2D face boxes. The initial point cloud establishing module may then align the RGB map and depth map coordinate systems of the two selected face images based on the 2D face boxes produced by the preliminary cropping module, obtaining the initial point clouds S_T and S_{T+1} corresponding to the two face images.
Preferably, the initial point cloud establishment unit further includes a re-cropping module; after the initial point clouds S_T and S_{T+1} are established, the re-cropping module crops them according to a depth threshold, taking the center of the 2D face box as the midpoint. The depth threshold may be a preset prior value or the average of the depth data in the preliminarily cropped face image.
Preferably, the system further comprises a point cloud resampling unit, which comprises a first sampling module, a grouping module and a second sampling module.
The first sampling module randomly selects a point from the key points in the 2D face box as the initial point P_0(x_0, y_0, z_0) and, taking P_0 as the center, applies farthest point sampling to the points within a certain threshold range to obtain a sampling set S_1 = {P_i | i ∈ (0, 1, 2, …, N)}; repeating this for the remaining key points yields the sampling set S = {S_j | j ∈ (0, 1, 2, …, M)}, where M is the number of key points, which are obtained through the face detection network.
The grouping module groups the sample points outside the set S by k-means clustering into k groups of sets, where k > 1.
The second sampling module samples the k groups with the set S as the initial point set, obtaining the final sampling point set S′ = {S_t | t ∈ (0, 1, 2, …, M+k)}.
Preferably, the system further comprises a point cloud enhancement unit, which comprises a second feature extraction module, a parameter generation module and a second updating module.
The second feature extraction module performs point-by-point feature extraction on the sampling point set S′ through a second feature extraction network based on a convolutional neural network, adding a weight factor μ during feature extraction to obtain feature-enhanced point cloud data f_t; μ takes a preset value greater than 1 when the extracted feature data point is key point data, and takes the value 1 when it is non-key point data.
The parameter generation module feeds the point cloud data f_t generated by the second feature extraction module into two different sub-networks of the same convolutional neural network, and obtains two different sets of parameters T_S and D_P by linear-layer regression of the convolutional neural network.
The second updating module updates f_t according to the following formula:
f_t = T_S × f_t + D_P
The invention also discloses a 3D face point cloud reconstruction system comprising one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for executing the above 3D face point cloud reconstruction method.
In addition, the invention also discloses a computer-readable storage medium comprising a computer program for testing, the computer program being executable by a processor to carry out the above 3D face point cloud reconstruction method.
Drawings
Fig. 1 is a flow chart of a method for reconstructing a 3D face point cloud according to an embodiment of the present invention.
Fig. 2 is a schematic flow chart of initial point cloud establishment and point cloud registration in an embodiment of the invention.
Fig. 3 is a schematic flow chart of point cloud resampling in an embodiment of the invention.
Fig. 4 is a schematic structural diagram of a 3D face point cloud reconstruction system according to an embodiment of the present invention.
Detailed Description
In order to describe the technical content, structural features, objects and effects of the present invention in detail, the following description is given in conjunction with the embodiments and the accompanying drawings.
As shown in fig. 1 and fig. 2, the invention discloses a 3D face point cloud reconstruction method that comprises two stages: initial point cloud establishment and point cloud registration.
1. The initial point cloud establishment includes the steps of:
S1) Acquire a video stream comprising multiple frames of face images through a 3D video device (such as a TOF camera), randomly select two adjacent frames of face images, and process each of them to obtain the initial point clouds S_T and S_{T+1} corresponding to the two frames.
2. Point cloud registration is the process of unifying point cloud data from different viewing angles into a designated coordinate system through rigid transformations such as rotation and translation. This transformation requires a rotation matrix R and a translation matrix T; the source point cloud is brought into coincidence with the target point cloud through the R, T transformation. Specifically, referring to fig. 1 and 2, the point cloud registration in this embodiment includes the following steps:
S2) Transform the initial point cloud S_T with a predefined initial transformation matrix {R_i, T_i} to obtain S′_T, then concatenate and fuse S′_T with S_{T+1} and feed the fused result into a coefficient prediction network, based on a convolutional neural network and comprising convolutional layers and a max-pooling layer, to obtain a set of iteration coefficient matrices (α, β). It should be noted that the specific roles of the coefficient matrices (α, β) are common general knowledge in the art: α is an iteration stop-condition parameter and β is an iteration decay parameter. In the prior art these two parameters are preset manually and simply looked up when needed; in this embodiment they are generated dynamically by a neural network (the coefficient prediction network) learning from the point cloud data, so the coefficient values differ in each iteration.
S3) Feed S′_T and S_{T+1} separately into a first feature extraction network based on a convolutional neural network, which downsamples S′_T and S_{T+1} to obtain two feature matrices F_T and F_{T+1}. Regarding network downsampling, note that a convolutional neural network processes data by both upsampling and downsampling: downsampling (also called subsampling) chiefly shrinks an image to fit the display area, producing a thumbnail of the corresponding image, while upsampling (also called image interpolation) is chiefly used to enlarge images.
S4) Obtain an initial registration matrix M_0 from a calculation model whose input parameters are the iteration coefficient matrices (α, β) obtained in step S2) and the feature matrices F_T and F_{T+1} obtained in step S3);
S5) Normalize M_0 to obtain the final registration matrix M_T;
S6) Perform singular value decomposition on M_T with a decomposition algorithm (such as the SVD algorithm) to obtain a transformation matrix {R_T, T_T}, and update {R_i, T_i} with {R_T, T_T} as the initial value for the next frame's point cloud transformation, until the face images of all frames have been processed.
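A standard way to recover a rotation and translation from matched point sets via singular value decomposition is the Kabsch procedure; the sketch below shows that procedure as one plausible realization of the decomposition step (the patent applies SVD to M_T directly, so the cross-covariance construction here is an assumption):

```python
import numpy as np

def rigid_transform_svd(src, dst):
    """Recover {R, T} with dst ~= R @ src + T via SVD (Kabsch algorithm)."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    T = dst_mean - R @ src_mean
    return R, T

# Sanity check: recover a known 90-degree rotation about z plus a translation.
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
T_true = np.array([1.0, 2.0, 3.0])
src = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
R_est, T_est = rigid_transform_svd(src, src @ R_true.T + T_true)
```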
Compared with the prior art, this face reconstruction process differs in two main respects. First, the iteration coefficients (α, β) required during point cloud reconstruction are obtained by network self-learning, which effectively improves the accuracy of the final point cloud reconstruction and the subsequent surface reconstruction, avoids excessive reliance on the correspondence between the point pairs of the two face images, and lowers the quality requirements on the initial point cloud. Second, the initial point clouds are downsampled by a convolutional neural network (the first feature extraction network), which effectively reduces the computational load of point cloud reconstruction and surface reconstruction and thus lightens the hardware load.
Preferably, the calculation model of the initial registration matrix M_0 is:
Where ΔF is the Euclidean distance between F_T and F_{T+1}.
In the above embodiment, each face image acquired by the TOF camera comprises an RGB map and a depth map. The RGB map is a planar color image; the depth map is an image or image channel containing information about the distance from the viewpoint to the surfaces of scene objects. A depth map resembles a grayscale image, except that each pixel value is the actual distance from the camera's photosensitive sensor to the object. The RGB map and depth map acquired by a TOF camera are typically paired, so their pixels are in one-to-one correspondence. Since the RGB map contains some background information, step S1) further includes preliminarily cropping the face images before aligning the RGB map and depth map coordinate systems of the two frames:
Input the two selected adjacent frames of face images into a face detection network based on a convolutional neural network to obtain the corresponding 2D face boxes;
Based on the 2D face boxes, align the RGB map and depth map coordinate systems of each of the two selected face images to obtain the initial point clouds S_T and S_{T+1} corresponding to the two face images, thereby cutting away some background information.
Through this preliminary cropping, points beyond the face region are treated by default as background and clipped away, and the initial point cloud is built only over the face region.
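The preliminary cropping amounts to restricting both maps to the detected 2D face box; the (x0, y0, x1, y1) box convention used below is an assumption:

```python
import numpy as np

def crop_face_region(image, box):
    """Crop an RGB or depth image to a 2D face box (x0, y0, x1, y1);
    everything outside the box is treated as background and discarded."""
    x0, y0, x1, y1 = box
    return image[y0:y1, x0:x1]

img = np.arange(16).reshape(4, 4)           # stand-in for a depth map
face = crop_face_region(img, (1, 1, 3, 3))  # central 2x2 region
```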
Since the detection result based on the RGB map carries no depth information, the initial point cloud still contains some background. Therefore, in step S1), after the initial point clouds S_T and S_{T+1} are established, the face images may be cropped again, specifically:
Crop the initial point clouds S_T and S_{T+1} according to a depth threshold, taking the center of the 2D face box as the midpoint. The depth threshold may be a preset prior value or the average of the depth data in the preliminarily cropped face image.
After registration, the reconstructed point cloud still contains noise, so it can be further optimized by resampling. As shown in fig. 1 and 3, the 3D face point cloud reconstruction method therefore further comprises a point cloud resampling stage, which specifically includes the following steps:
S7) Randomly select a point from the key points in the 2D face box as the initial point P_0(x_0, y_0, z_0) and, taking P_0 as the center, apply farthest point sampling to the points within a certain threshold range to obtain a sampling set S_1 = {P_i | i ∈ (0, 1, 2, …, N)}. The key points in the face box are the convex and concave points of the face, such as the nose tip, mouth and eyes, and can be output by the face detection network during the preliminary cropping of the face images;
S8) Repeat step S7) for the remaining key points to obtain a sampling set S = {S_j | j ∈ (0, 1, 2, …, M)}, where M is the number of key points;
S9) Cluster the sample points outside the set S by k-means to obtain k groups of sets, where k > 1;
S10) Taking the set S as the initial point set, sample each of the k groups to obtain the final sampling point set S′ = {S_t | t ∈ (0, 1, 2, …, M+k)}.
In the resampling process, a preliminary sample is taken by farthest point sampling centered on the facial key points; this preliminary sample then serves as the initial point set, and the remaining points are sampled a second time after k-means clustering. This removes noise in the point cloud as far as possible while preserving the accuracy of the sampled point cloud data.
In addition, in order to further improve the robustness of the reconstructed point cloud, the 3D face point cloud reconstruction method further comprises the step of carrying out point cloud enhancement on resampled point cloud data:
Perform point-by-point feature extraction on the sampling point set S′ through a second feature extraction network based on a convolutional neural network, adding a weight factor μ during feature extraction to obtain feature-enhanced point cloud data f_t; μ takes a preset value greater than 1 when the extracted feature data point P_i (P_i ∈ S′) is key point data, and takes the value 1 when P_i is non-key point data, i.e. f_t ← μ × P_i;
Feed the point cloud data f_t into two different sub-networks of the same convolutional neural network, and obtain two different sets of parameters T_S and D_P by linear-layer regression of the convolutional neural network;
Update f_t according to the following formula:
f_t = T_S × f_t + D_P.
In addition, the parameters T_S and D_P in this embodiment carry no fixed physical meaning; they are simply one way of processing the point cloud data f_t, and when updating f_t, D_P may instead be multiplied with f_t, i.e. f_t = f_t × D_P + T_S.
In this point cloud enhancement method, the point cloud data are updated by self-learning, while key point data and non-key point data are given different weights and treated differently, which effectively improves the robustness of the resulting point cloud data set {f_t}.
In summary, the 3D face point cloud reconstruction method disclosed by the invention comprises four stages: initial point cloud establishment, point cloud registration, point cloud resampling and point cloud enhancement. In the registration stage, the coefficients required for registration are generated dynamically by neural network learning, so the generated point cloud data are more accurate and do not depend too heavily on point-pair correspondences. Meanwhile, the feature matrices of two adjacent point cloud frames are produced by network downsampling, and the Euclidean distance between them is computed, which reduces the computational load to the point that the point clouds can essentially be processed in real time. In addition, point cloud resampling further reduces the noise of the reconstructed point cloud, and point cloud enhancement improves its robustness, making the result better suited to the subsequent surface reconstruction.
The invention also discloses a 3D face point cloud reconstruction system, shown in fig. 4, which comprises an initial point cloud establishment unit, a point cloud registration unit, a point cloud resampling unit and a point cloud enhancement unit.
The initial point cloud establishment unit comprises a basic data acquisition module and an initial point cloud establishing module. The point cloud registration unit comprises a coordinate transformation module, an iteration coefficient generation module, a first feature extraction module, a calculation module, an optimization module and a first updating module.
And the initial point cloud establishing module is used for processing two adjacent frames of face images selected from the video stream to obtain initial point clouds S T、ST+1 corresponding to the two frames of face images respectively.
The coordinate transformation module is used for transforming the initial point cloud S T through a pre-defined initial transformation matrix { R i,Ti }, and obtaining S ′ T.
The iteration coefficient generation module is used for splicing and fusing the S ′ T and the S T+1 and then sending the fused S ′ T and the S T+1 into a coefficient prediction network which is based on a convolutional neural network and comprises a convolutional layer and a maximum pooling layer so as to obtain a group of iteration coefficient matrixes (alpha, beta);
The first feature extraction module is used for sending S'_T and S_{T+1} respectively into a first feature extraction network based on a convolutional neural network; the first feature extraction network performs network downsampling on S'_T and S_{T+1} so as to obtain two groups of feature matrices F_T and F_{T+1}.
The calculation module is used for obtaining an initial registration matrix M_0 through a calculation model, wherein the input parameters of the calculation model are the group of iteration coefficient matrices (α, β) and the feature matrices F_T and F_{T+1}.
The optimization module is used for carrying out normalization processing on M_0 through an optimization algorithm so as to obtain a final registration matrix M_T.
The first updating module is used for performing singular value decomposition on M_T by a decomposition algorithm to obtain a transformation matrix {R_T, T_T}, and updating {R_i, T_i} with {R_T, T_T} to serve as the initial value of the next frame's point cloud transformation.
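Recovering a rigid transform {R_T, T_T} from a registration matrix via singular value decomposition follows the standard Procrustes/Kabsch construction. A minimal NumPy sketch, under the assumption (not spelled out in the patent) that each row weight of M_T encodes a soft correspondence between the two clouds:

```python
import numpy as np

def rigid_transform_from_correspondence(src, dst, weights):
    """Kabsch-style estimate of rotation R and translation T aligning src
    to dst, given per-pair correspondence weights."""
    w = weights / weights.sum()
    mu_s = (w[:, None] * src).sum(axis=0)              # weighted centroid of src
    mu_d = (w[:, None] * dst).sum(axis=0)              # weighted centroid of dst
    H = (src - mu_s).T @ (w[:, None] * (dst - mu_d))   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))             # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    T = mu_d - R @ mu_s
    return R, T

# Sanity check: recover a known rotation about z plus a translation.
rng = np.random.default_rng(0)
src = rng.standard_normal((100, 3))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
dst = src @ R_true.T + np.array([0.1, -0.2, 0.05])
R, T = rigid_transform_from_correspondence(src, dst, np.ones(100))
```

The reflection guard (`d`) is the standard fix that keeps the returned matrix a proper rotation (determinant +1) even for degenerate or noisy correspondence sets.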
Preferably, the initial point cloud establishing unit further includes a preliminary clipping module, wherein the face image includes an RGB image and a depth image, and the preliminary clipping module is configured to perform face region detection on the two selected adjacent frames of face images by a face detection network based on a convolutional neural network, so as to obtain a 2D face frame. Based on the 2D face frames clipped by the preliminary clipping module, the initial point cloud establishing module may align the RGB image of each of the two selected face images with the depth map coordinate system, so as to obtain the initial point clouds S_T and S_{T+1} corresponding to the two face images.
Further, the initial point cloud establishing unit further includes a re-clipping module; once the initial point clouds S_T and S_{T+1} are established, the re-clipping module is configured to clip them according to a certain depth threshold with the center of the 2D face frame as the midpoint. The depth threshold may be a preset prior value or the average value of the depth data in the preliminarily clipped face image.
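The re-clipping step amounts to a simple depth mask around the face-frame centre. A sketch, assuming the point cloud is an (N, 3) array and the threshold is expressed in the same metric units as the depth channel (the second option described above would set the threshold from the mean depth of the clipped region):

```python
import numpy as np

def crop_point_cloud(cloud, center, depth_threshold):
    """Keep only points whose depth (z) lies within depth_threshold
    of the face-frame centre's depth; center = (x0, y0, z0)."""
    mask = np.abs(cloud[:, 2] - center[2]) <= depth_threshold
    return cloud[mask]

cloud = np.array([[0.0, 0.0, 0.50],
                  [0.1, 0.2, 0.55],
                  [0.0, 0.1, 1.80],   # background point, clipped away
                  [0.2, 0.0, 0.48]])
face = crop_point_cloud(cloud, center=(0.05, 0.05, 0.5), depth_threshold=0.1)
```

This discards background geometry behind the head, which is the main source of spurious points after the RGB/depth alignment.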
The point cloud resampling unit comprises a first sampling module, a grouping module and a second sampling module.
The first sampling module is used for randomly selecting a point from the key points in the 2D face frame as an initial point P_0(x_0, y_0, z_0), taking the point P_0 as a center, and applying the farthest point sampling method to the points within a certain threshold range to obtain a sampling set S_1 = {P_i | i ∈ (0, 1, 2, …, N)}, and thereby a sampling set S = {S_j | j ∈ (0, 1, 2, …, M)} over all key points; wherein M is the number of key points, and the key points are obtained through the face detection network.
and the grouping module is used for grouping the sample points outside the set S through k-means clustering to obtain k groups of sets, wherein k is greater than 1.
The second sampling module is used for taking the set S as an initial point set and sampling the k groups of sets respectively to obtain a final sampling point set S' = {S_t | t ∈ (0, 1, 2, …, M+k)}.
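The farthest point sampling used by the first sampling module can be sketched as follows; for brevity this sketch samples from the whole cloud, omitting the threshold-range restriction around each key point:

```python
import numpy as np

def farthest_point_sampling(points, n_samples, start_idx=0):
    """Iteratively pick the point farthest from the already-chosen set,
    starting from start_idx (the key point P_0 in the method above)."""
    chosen = [start_idx]
    dist = np.linalg.norm(points - points[start_idx], axis=1)
    for _ in range(n_samples - 1):
        nxt = int(np.argmax(dist))         # farthest point from the chosen set
        chosen.append(nxt)
        # each point's distance to the chosen set shrinks as points are added
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return points[chosen]

rng = np.random.default_rng(0)
pts = rng.standard_normal((500, 3))
sample = farthest_point_sampling(pts, 32)
```

Farthest point sampling spreads the samples evenly over the surface, which is why it preserves face geometry better than uniform random subsampling at the same budget.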
The point cloud enhancement unit comprises a second feature extraction module, a parameter generation module and a second updating module.
The second feature extraction module is configured to perform point-by-point feature extraction on the sampling point set S' through a second feature extraction network based on a convolutional neural network, adding a weight factor μ in the feature extraction process to obtain feature-enhanced point cloud data f_t, where μ takes a preset value greater than 1 when the extracted feature data point is key-point data, and μ takes the value 1 when the extracted feature data point is non-key-point data.
The parameter generation module is used for sending the point cloud data f_t generated by the second feature extraction module into two different sub-networks of the same convolutional neural network respectively, and obtaining two groups of different parameters T_S and D_P through linear layer regression of the convolutional neural network.
The second updating module is used for updating f_t according to the following formula:

f_t = T_S × f_t + D_P
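The enhancement update f_t = T_S × f_t + D_P, together with the key-point weight factor μ, can be sketched as below. The shapes of T_S and D_P are assumptions (the description leaves them unspecified), so a per-dimension scale and offset stand in for the regressed parameters:

```python
import numpy as np

def enhance_features(features, is_keypoint, mu=1.5):
    """Apply the weight factor mu (> 1) to key-point features, then the
    affine update f_t = T_S * f_t + D_P with placeholder parameters
    standing in for the values a trained network would regress."""
    weights = np.where(is_keypoint[:, None], mu, 1.0)  # mu > 1 only on key points
    f_t = features * weights
    T_S = np.full(features.shape[1], 1.1)              # placeholder regressed scale
    D_P = np.full(features.shape[1], 0.01)             # placeholder regressed offset
    return T_S * f_t + D_P

feats = np.ones((4, 8))
keyp = np.array([True, False, False, True])
out = enhance_features(feats, keyp)
```

The effect is that key-point features are amplified relative to the rest of the cloud before the learned affine correction, which is what gives the reconstruction its robustness around facial landmarks.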
Regarding the working principle and workflow of the 3D face point cloud reconstruction system, reference is made to the detailed description of the 3D face point cloud reconstruction method above, which is not repeated here.
In addition, the invention also discloses another 3D face point cloud reconstruction system, which comprises one or more processors, a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by the one or more processors, and the programs comprise instructions for executing the 3D face point cloud reconstruction method.
In addition, the invention also discloses a computer readable storage medium, which comprises a computer program for testing, wherein the computer program can be executed by a processor to finish the 3D face point cloud reconstruction method.
The foregoing description of the preferred embodiments is not intended to limit the scope of the invention, which is defined by the claims that follow.
Claims (12)
1. The 3D face point cloud reconstruction method is characterized by comprising the following steps of:
1) Selecting two adjacent frames of face images from a video stream comprising multiple frames of face images, and processing them to obtain initial point clouds S_T and S_{T+1} corresponding to the two frames of face images respectively;
2) Transforming the initial point cloud S_T through a predefined initial transformation matrix {R_i, T_i} to obtain S'_T, splicing and fusing S'_T and S_{T+1}, and sending the fused result into a coefficient prediction network, based on a convolutional neural network and comprising a convolutional layer and a maximum pooling layer, so as to obtain a group of iteration coefficient matrices (α, β);
3) Sending S'_T and S_{T+1} respectively into a first feature extraction network based on a convolutional neural network, wherein the first feature extraction network performs network downsampling on S'_T and S_{T+1} so as to obtain two groups of feature matrices F_T and F_{T+1};
4) Obtaining an initial registration matrix M_0 through a calculation model, wherein the input parameters of the calculation model are the group of iteration coefficient matrices (α, β) and the feature matrices F_T and F_{T+1};
5) Normalizing the initial registration matrix M_0 to obtain a final registration matrix M_T;
6) Performing singular value decomposition on M_T by a decomposition algorithm to obtain a transformation matrix {R_T, T_T}, and updating {R_i, T_i} with {R_T, T_T} to serve as the initial value of the next frame's point cloud transformation;
The calculation model of the initial registration matrix M_0 is:

where ΔF is the Euclidean distance between F_T and F_{T+1}.
2. The 3D face point cloud reconstruction method according to claim 1, wherein in step 1), the method of obtaining the initial point clouds from the two adjacent frames of face images includes:

Performing preliminary clipping on the face images, wherein each face image comprises an RGB image and a depth image, and inputting the two selected adjacent frames of face images into a face detection network based on a convolutional neural network, so as to obtain the corresponding 2D face frames;

Based on the 2D face frames, aligning the RGB image of each of the two selected face images with the depth map coordinate system respectively, so as to obtain the initial point clouds S_T and S_{T+1} corresponding to the two face images.
3. The 3D face point cloud reconstruction method according to claim 2, wherein in step 1), after the initial point clouds S_T and S_{T+1} are established, the method further comprises a step of clipping the face images again:

Taking the center of the 2D face frame as the midpoint, clipping the initial point clouds S_T and S_{T+1} according to a certain depth threshold; the depth threshold may be a preset prior value or the average value of the depth data in the preliminarily clipped face image.
4. The 3D face point cloud reconstruction method according to claim 2, wherein key points on the 2D face frame are also obtained through the face detection network when the face image is preliminarily clipped; the 3D face point cloud reconstruction method further comprises the step of resampling the registered point cloud data:
7) Randomly selecting a point from the key points in the 2D face frame as an initial point P_0(x_0, y_0, z_0), taking the point P_0 as a center, and applying the farthest point sampling method to the points within a certain threshold range to obtain a sampling set S_1 = {P_i | i ∈ (0, 1, 2, …, N)}; wherein N is the number of samples;
8) Repeating step 7) for the remaining key points to obtain a sampling set S = {S_j | j ∈ (0, 1, 2, …, M)}; wherein M is the number of key points;
9) Grouping the sample points outside the set S through k-means clustering to obtain k groups of sets, wherein k > 1;
10) Taking the set S as an initial point set, and sampling the k groups of sets respectively to obtain a final sampling point set S' = {S_t | t ∈ (0, 1, 2, …, M+k)}.
5. The method for reconstructing a 3D face point cloud of claim 4, further comprising the step of performing point cloud enhancement on the resampled point cloud data:
Performing point-by-point feature extraction on the sampling point set S' through a second feature extraction network based on a convolutional neural network, adding a weight factor μ in the feature extraction process to obtain feature-enhanced point cloud data f_t, wherein μ takes a preset value greater than 1 when the extracted feature data point is key-point data, and μ takes the value 1 when the extracted feature data point is non-key-point data;
Sending the point cloud data f_t respectively into two different sub-networks of the same convolutional neural network, and obtaining two groups of different parameters T_S and D_P through linear layer regression of the convolutional neural network;

Updating f_t according to the following formula:

f_t = T_S × f_t + D_P.
6. The 3D face point cloud reconstruction system is characterized by comprising an initial point cloud establishment unit and a point cloud registration unit; the initial point cloud establishing unit comprises an initial point cloud establishing module; the point cloud registration unit comprises a coordinate transformation module, an iteration coefficient generation module, a first feature extraction module, a calculation module, an optimization module and a first updating module;
the initial point cloud establishing module is used for processing two adjacent frames of face images selected from the video stream to obtain initial point clouds S_T and S_{T+1} corresponding to the two frames of face images respectively;
The coordinate transformation module is used for transforming the initial point cloud S_T through a predefined initial transformation matrix {R_i, T_i} so as to obtain S'_T;
The iteration coefficient generation module is used for splicing and fusing S'_T and S_{T+1} and sending the fused result into a coefficient prediction network, based on a convolutional neural network and comprising a convolutional layer and a maximum pooling layer, so as to obtain a group of iteration coefficient matrices (α, β);
The first feature extraction module is configured to send S'_T and S_{T+1} to a first feature extraction network based on a convolutional neural network, where the first feature extraction network is configured to perform network downsampling on S'_T and S_{T+1} to obtain two sets of feature matrices F_T and F_{T+1};
The calculation module is used for deriving an initial registration matrix M_0 through a calculation model, wherein the input parameters of the calculation model are the group of iteration coefficient matrices (α, β) and the feature matrices F_T and F_{T+1};
the optimization module is used for carrying out normalization processing on M_0 through an optimization algorithm so as to obtain a final registration matrix M_T;
The first updating module is configured to perform singular value decomposition on M_T by a decomposition algorithm to obtain a transformation matrix {R_T, T_T}, and update {R_i, T_i} with {R_T, T_T} as the initial value of the next frame's point cloud transformation.
7. The 3D face point cloud reconstruction system according to claim 6, wherein the initial point cloud establishing unit further includes a preliminary clipping module, the face image includes an RGB image and a depth image, and the preliminary clipping module is configured to perform face region detection on the two selected adjacent frames of face images by the face detection network based on the convolutional neural network, so as to obtain a 2D face frame; based on the 2D face frames clipped by the preliminary clipping module, the initial point cloud establishing module may align the RGB image of each of the two selected face images with the depth map coordinate system, so as to obtain the initial point clouds S_T and S_{T+1} corresponding to the two face images.
8. The 3D face point cloud reconstruction system according to claim 7, wherein the initial point cloud establishing unit further includes a re-clipping module; once the initial point clouds S_T and S_{T+1} are established, the re-clipping module is configured to clip them according to a certain depth threshold with the center of the 2D face frame as the midpoint; the depth threshold may be a preset prior value or the average value of the depth data in the preliminarily clipped face image.
9. The 3D face point cloud reconstruction system of claim 7, further comprising a point cloud resampling unit comprising a first sampling module, a grouping module, and a second sampling module;
The first sampling module is configured to randomly select a point from the key points in the 2D face frame as an initial point P_0(x_0, y_0, z_0), and, taking the point P_0 as a center, obtain a sampling set S_1 = {P_i | i ∈ (0, 1, 2, …, N)} by the farthest point sampling method, so as to obtain a sampling set S = {S_j | j ∈ (0, 1, 2, …, M)}; wherein N is the number of samples, M is the number of key points, and the key points are obtained through a face detection network;
The grouping module is used for grouping the sample points outside the set S through k-means clustering to obtain k groups of sets, wherein k > 1;
The second sampling module is configured to sample the k groups of sets with the set S as an initial point set, so as to obtain a final sampling point set S' = {S_t | t ∈ (0, 1, 2, …, M+k)}.
10. The 3D face point cloud reconstruction system of claim 9, further comprising a point cloud enhancement unit comprising a second feature extraction module, a parameter generation module, and a second update module;
The second feature extraction module is configured to perform point-by-point feature extraction on the sampling point set S' through a second feature extraction network based on a convolutional neural network, adding a weight factor μ in the feature extraction process to obtain feature-enhanced point cloud data f_t, where μ takes a preset value greater than 1 when the extracted feature data point is key-point data, and μ takes the value 1 when the extracted feature data point is non-key-point data;
The parameter generating module is configured to send the point cloud data f_t generated by the second feature extraction module into two different sub-networks of the same convolutional neural network respectively, and obtain two different sets of parameters T_S and D_P through linear layer regression of the convolutional neural network;
the second updating module is used for updating f_t according to the following formula:

f_t = T_S × f_t + D_P.
11. A 3D face point cloud reconstruction system, comprising:
One or more processors;
A memory;
and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs comprising instructions for performing the 3D face point cloud reconstruction method of any of claims 1 to 5.
12. A computer readable storage medium comprising a computer program for testing, the computer program being executable by a processor to perform the 3D face point cloud reconstruction method of any of claims 1 to 5.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202010834329.3A (CN112069923B) | 2020-08-18 | | 3D face point cloud reconstruction method and system
Publications (2)

Publication Number | Publication Date
---|---
CN112069923A | 2020-12-11
CN112069923B | 2024-07-12
Citations (2)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN109242951A | 2018-08-06 | 2019-01-18 | 宁波盈芯信息科技有限公司 | A real-time three-dimensional face reconstruction method
CN109360267A | 2018-09-29 | 2019-02-19 | 杭州蓝芯科技有限公司 | A fast three-dimensional reconstruction method for thin objects
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| CB02 | Change of applicant information | Address after: Room 101, No. 1, East Ring 3rd Street, Jitiagang, Huangjiang Town, Dongguan City, Guangdong Province, 523000; Applicant after: Guangdong Zhengyang Sensor Technology Co.,Ltd. Address before: 523000 Jitigang Village, Huangjiang Town, Dongguan City, Guangdong Province; Applicant before: DONGGUAN ZHENGYANG ELECTRONIC MECHANICAL Co.,Ltd.
| GR01 | Patent grant |