WO2022205762A1 - Three-dimensional human body reconstruction method, apparatus, device, and storage medium - Google Patents

Three-dimensional human body reconstruction method, apparatus, device, and storage medium

Info

Publication number
WO2022205762A1
Authority
WO
WIPO (PCT)
Prior art keywords
human body
model
dimensional
image
target
Application number
PCT/CN2021/115537
Other languages
English (en)
French (fr)
Inventor
宋勃宇
邓又铭
刘文韬
钱晨
Original Assignee
深圳市慧鲤科技有限公司
Application filed by 深圳市慧鲤科技有限公司
Publication of WO2022205762A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Definitions

  • the present disclosure relates to image processing technology, and in particular to a three-dimensional human body reconstruction method, apparatus, device, and storage medium.
  • 3D human body reconstruction is an important problem in the field of computer vision and computer graphics.
  • the reconstructed human digital model has important applications in many fields, such as anthropometric measurements, virtual fitting, virtual anchors, custom design of game characters, and virtual reality social interaction.
  • how to project the human body in the real world into the virtual world to obtain a three-dimensional human body digital model is an important issue.
  • the digital reconstruction of a three-dimensional human body is very complicated, requiring the scanner to scan continuously around the target from multiple angles, leaving no blind spots; even then, the reconstruction quality still leaves room for improvement.
  • the embodiments of the present disclosure provide at least a three-dimensional human body reconstruction method, apparatus, device, and storage medium.
  • a first aspect provides a three-dimensional human body reconstruction method, the method including: fitting human body parameters of a parameterized human body template based on a human body image of a target human body to obtain a first three-dimensional mesh model; performing three-dimensional human body reconstruction based on image features extracted from the human body image to determine a second three-dimensional mesh model; fusing the first three-dimensional mesh model and the second three-dimensional mesh model to obtain an initial three-dimensional model, where the initial three-dimensional model represents the geometry of the target human body; and performing human body texture reconstruction according to the initial three-dimensional model and the human body image to obtain a textured three-dimensional human body model corresponding to the target human body.
  • the human body parameters of the parameterized human body template are fitted based on the human body image of the target human body to obtain the first three-dimensional mesh model by: fitting the human body parameters of the parameterized human body template based on an RGBD human body image of the target human body to obtain an initial parameterized model; extracting depth information of the region where the target human body is located in the human body image; identifying contour information of the target human body in the human body image; and geometrically deforming the initial parameterized model based on the depth information and the contour information to obtain the first three-dimensional mesh model.
  • the fitting of the human body parameters of the parameterized human body template based on the human body image of the target human body includes: fitting the human body parameters of the parameterized human body template based on the frontal human body image of the target human body to obtain a frontal reconstruction result; fitting the human body parameters of the parameterized human body template based on the back human body image of the target human body to obtain a back reconstruction result; and fusing the frontal reconstruction result and the back reconstruction result.
  • performing human body texture reconstruction according to the initial three-dimensional model and the human body image to obtain a textured three-dimensional human body model corresponding to the target human body includes: performing human body texture reconstruction according to the initial three-dimensional model, the frontal human body image of the target human body, and the back human body image of the target human body, to obtain a textured three-dimensional human body model corresponding to the target human body.
  • performing human body texture reconstruction according to the initial three-dimensional model, the frontal human body image of the target human body, and the back human body image of the target human body, to obtain a textured three-dimensional human body model corresponding to the target human body, includes: mapping the frontal human body image and the back human body image onto the initial three-dimensional model to obtain a texture-filled three-dimensional human body model of the target human body.
  • the obtaining of the initial three-dimensional model by fusing the first three-dimensional mesh model and the second three-dimensional mesh model includes: obtaining an upper body model corresponding to the upper body region from the second three-dimensional mesh model, where the upper body region is a region including at least the head of the target human body; and replacing the corresponding part of the first three-dimensional mesh model with the upper body model to obtain the initial three-dimensional model.
  • the method further includes: performing geometric reconstruction on a local part of the target human body based on the frontal human body image of the target human body to obtain a local three-dimensional mesh model; and the fusing of the first three-dimensional mesh model and the second three-dimensional mesh model to obtain the initial three-dimensional model includes: fusing the local three-dimensional mesh model, the first three-dimensional mesh model, and the second three-dimensional mesh model to obtain the initial three-dimensional model.
  • the fusing of the local three-dimensional mesh model, the first three-dimensional mesh model, and the second three-dimensional mesh model to obtain the initial three-dimensional model includes: obtaining an upper body model corresponding to the upper body region from the second three-dimensional mesh model, where the upper body region is a region including at least the head of the target human body; and replacing the corresponding part of the first three-dimensional mesh model with the upper body model to obtain the initial three-dimensional model.
  • the performing of geometric reconstruction on the local part of the target human body based on the frontal human body image of the target human body to obtain a local three-dimensional mesh model includes: performing feature extraction on the frontal human body image of the target human body to obtain a third image feature; and determining the local three-dimensional mesh model according to the third image feature and a three-dimensional topology template of the local part.
  • the performing of three-dimensional human body reconstruction based on the image features extracted from the human body image of the target human body and determining the second three-dimensional mesh model includes: performing three-dimensional reconstruction on the frontal human body image of the target human body through a first deep neural network branch to obtain a first human body model; performing three-dimensional reconstruction on a partial image within the frontal human body image through a second deep neural network branch to obtain a second human body model, where the partial image includes a local region of the target human body; fusing the first human body model and the second human body model to obtain a fused human body model; and performing mesh processing on the fused human body model to obtain the second three-dimensional mesh model.
  • a three-dimensional human body reconstruction apparatus includes: a parametric reconstruction module, configured to fit the human body parameters of the parameterized human body template based on the human body image of the target human body to obtain a first three-dimensional mesh model of the target human body; an image feature reconstruction module, configured to perform three-dimensional human body reconstruction based on image features extracted from the human body image of the target human body to determine a second three-dimensional mesh model of the target human body; a fusion module, configured to fuse the first three-dimensional mesh model and the second three-dimensional mesh model of the target human body to obtain an initial three-dimensional model; and a texture reconstruction module, configured to perform human body texture reconstruction according to the initial three-dimensional model and the human body image to obtain a textured three-dimensional human body model corresponding to the target human body.
  • the parametric reconstruction module is specifically configured to: fit the human body parameters of the parameterized human body template based on the human body image of the target human body to obtain an initial parameterized model; extract depth information of the region where the target human body is located in the human body image; identify contour information of the target human body in the human body image; and geometrically deform the initial parameterized model based on the depth information and the contour information to obtain the first three-dimensional mesh model.
  • when the parametric reconstruction module fits the human body parameters of the parameterized human body template based on the human body image of the target human body, it is configured to: fit the human body parameters of the parameterized human body template based on the frontal human body image of the target human body to obtain a frontal reconstruction result; fit the human body parameters of the parameterized human body template based on the back human body image of the target human body to obtain a back reconstruction result; and fuse the frontal reconstruction result and the back reconstruction result.
  • the texture reconstruction module is specifically configured to: perform human body texture reconstruction according to the initial three-dimensional model and the frontal and back human body images of the target human body, so as to obtain a textured three-dimensional human body model corresponding to the target human body.
  • when the fusion module fuses the first three-dimensional mesh model and the second three-dimensional mesh model of the target human body to obtain the initial three-dimensional model, it is configured to: obtain an upper body model corresponding to the upper body region of the target human body from the second three-dimensional mesh model, where the upper body region is a region including at least the head of the target human body; and replace the corresponding part of the first three-dimensional mesh model with the upper body model to obtain the initial three-dimensional model.
  • the apparatus further includes a local reconstruction module, configured to perform geometric reconstruction on a local part of the target human body based on the frontal human body image of the target human body to obtain a local three-dimensional mesh model; when the fusion module fuses the first three-dimensional mesh model and the second three-dimensional mesh model of the target human body to obtain the initial three-dimensional model, it is configured to fuse the local three-dimensional mesh model, the first three-dimensional mesh model, and the second three-dimensional mesh model to obtain the initial three-dimensional model.
  • the image feature reconstruction module is specifically configured to: perform three-dimensional reconstruction on the frontal human body image of the target human body through the first deep neural network branch to obtain a first human body model; perform three-dimensional reconstruction on a partial image within the frontal human body image through the second deep neural network branch to obtain a second human body model, where the partial image includes a local region of the target human body; fuse the first human body model and the second human body model to obtain a fused human body model; and perform mesh processing on the fused human body model to obtain the second three-dimensional mesh model of the target human body.
  • an electronic device includes a memory and a processor, where the memory is configured to store computer-readable instructions and the processor is configured to invoke the computer instructions to implement the method described in any embodiment of the present disclosure.
  • a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, implements the method described in any embodiment of the present disclosure.
  • a computer program product including a computer program, which implements the method described in any embodiment of the present disclosure when the computer program is executed by a processor.
  • with the three-dimensional human body reconstruction method, apparatus, device, and storage medium provided by the embodiments of the present disclosure, when performing three-dimensional human body reconstruction on a target human body, human body reconstruction based on a parameterized human body template is combined with human body reconstruction performed directly from human body images without using a template, so that the reconstructed three-dimensional human body model both ensures the robustness of body shapes such as the torso and limbs and improves the realism and accuracy of the upper body region; moreover, because the method can reconstruct from a small number of human body images of the target human body, it reduces the user's cooperation cost and makes three-dimensional human body reconstruction easier.
  • FIG. 1 shows a flowchart of a three-dimensional human body reconstruction method provided by at least one embodiment of the present disclosure
  • FIG. 2 shows a schematic diagram of a parametric human reconstruction provided by at least one embodiment of the present disclosure
  • FIG. 3 shows a schematic diagram of a three-dimensional human body reconstruction provided by at least one embodiment of the present disclosure
  • FIG. 4 shows a schematic diagram of another three-dimensional human body reconstruction provided by at least one embodiment of the present disclosure
  • FIG. 5 shows a schematic diagram of model fusion provided by at least one embodiment of the present disclosure
  • FIG. 6 shows a schematic diagram of a three-dimensional human body reconstruction apparatus provided by at least one embodiment of the present disclosure
  • FIG. 7 shows a schematic diagram of a three-dimensional human body reconstruction apparatus provided by at least one embodiment of the present disclosure.
  • 3D human body reconstruction has important applications in many fields, including but not limited to the following application scenarios.
  • the realism of some virtual reality application scenarios can be enhanced through 3D human reconstruction.
  • the 3D human body model obtained by 3D human body reconstruction can be imported into the game data to complete the generation of the personalized character.
  • the production of science fiction movies currently requires the use of various technologies such as green screens and motion capture. The hardware equipment is expensive and the overall process is time-consuming and complicated. Obtaining a virtual three-dimensional human body model through three-dimensional human body reconstruction can simplify the process and save resources.
  • 3D human body reconstruction aims to obtain a 3D human body model with as good a reconstruction effect as possible. For example, in scenarios such as virtual cloud conferences or AR virtual interaction, the reconstructed 3D human body model faces higher demands for realism and immersion.
  • FIG. 1 shows a flowchart of a three-dimensional human body reconstruction method provided by at least one embodiment of the present disclosure.
  • the method can include:
  • step 100 based on the human body image of the target human body, the human body parameters of the parameterized human body template are fitted to obtain a first three-dimensional mesh model of the target human body.
  • the target human body is the subject of the 3D human body reconstruction. For example, if a 3D human body model is to be reconstructed for user Xiao Zhang, Xiao Zhang can be called the target human body, and the reconstructed 3D human body model is based on Xiao Zhang's body, bearing high similarity to his figure, appearance, clothing, and hairstyle.
  • the human body image of the target human body may be a full-body frontal photo of the target human body.
  • the human body image may be an RGB color image.
  • the acquisition cost of this RGB format image is low.
  • it is not necessary to use high-cost equipment such as a depth-of-field camera during image acquisition, and it can be acquired by ordinary shooting equipment.
  • the human body image may also be an RGBD image; using an image in RGBD format can further improve the reconstruction quality of the three-dimensional human body model.
  • the embodiments of the present disclosure do not limit the number of human body images used in human body reconstruction.
  • one or two human images can be used to fit and reconstruct the human parameters of the parameterized human template.
  • the parameterized human body template is a human body topology pre-defined before model reconstruction.
  • the template may have initialized human body parameters, and the posture and body shape of the human body are predefined by the initialized human body parameters.
  • the human body parameters may include posture parameters, body shape parameters, and the like; these parameters may represent the human body's state in terms of movement posture, height, build, and head-to-body proportion.
  • the initial human body parameters in the parameterized human body template can be fitted, and the values of these human body parameters can be re-determined to obtain a three-dimensional model of the human body.
  • parametric human body reconstruction is performed based on the human body image of the target human body, which can be called structured reconstruction.
  • the parameterized human body reconstruction is the human body geometry reconstruction based on the parameterized human body template, which is to calculate a set of optimal parameters for the human body parameters in the template, so that the reconstructed human body posture and body shape are as consistent as possible with those in the human body image.
  • the specific process of reconstruction will not be described in detail here, and it can be performed according to any parametric human reconstruction process known to those skilled in the art.
  • a fitted parametric human body model can be obtained, which can also be called a three-dimensional mesh model.
  • the three-dimensional mesh model, as a three-dimensional mesh (Mesh) representing the geometric shape of the human body, may consist of a number of vertices and faces.
  • the three-dimensional mesh obtained by the parametric human body reconstruction method may be referred to as the first three-dimensional mesh model.
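The structured fitting idea, finding template parameters that best explain the image, can be sketched in a few lines. The following is a hedged toy illustration, not the disclosure's actual optimizer: a hypothetical linear "body template" with 5 joints and 2 shape directions is fitted to observed 2D keypoints by gradient descent on the reprojection error.

```python
import numpy as np

# Hedged sketch of parametric fitting: a toy linear "body template"
# (5 joints, 2 shape directions) stands in for a full parameterized
# human body template. All names and sizes here are hypothetical.
rng = np.random.default_rng(0)
template = rng.normal(size=(5, 3))        # rest-pose joint positions
shape_basis = rng.normal(size=(2, 5, 3))  # linear shape directions

def joints(beta):
    """Deform the template by shape parameters beta."""
    return template + np.tensordot(beta, shape_basis, axes=1)

def project(pts3d):
    """Orthographic projection onto the image plane (x, y)."""
    return pts3d[:, :2]

# Synthetic "observed" 2D keypoints generated from known parameters.
beta_true = np.array([0.7, -0.3])
obs2d = project(joints(beta_true))

# Fit beta by gradient descent on the squared reprojection error --
# the same find-the-best-parameters idea as the structured fitting.
beta = np.zeros(2)
for _ in range(1000):
    resid = project(joints(beta)) - obs2d                    # (5, 2)
    grad = np.tensordot(shape_basis[:, :, :2], resid,
                        axes=([1, 2], [0, 1]))               # dE/dbeta
    beta -= 0.03 * grad

print(np.round(beta, 3))  # should land near beta_true
```

A production fitter would optimize pose as well as shape and add priors, but the loop above captures the core of "calculate a set of optimal parameters so the reconstructed posture and body shape match the image".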
  • Fig. 2 takes the parametric human body reconstruction from two images of the target human body as an example.
  • the front human body image 21 and the back human body image 22 of the target human body can be acquired.
  • parametric human body reconstruction is performed respectively.
  • the frontal reconstruction result 23 is obtained by fitting the human body parameters of the parameterized human body template based on the frontal human body image 21
  • the backside reconstruction result 24 is obtained by fitting the human body parameters of the parameterized human body template based on the back human body image 22 .
  • the front reconstruction result 23 and the back reconstruction result 24 are fused to obtain a fusion reconstruction result 25 .
  • the fusion reconstruction result 25 may be the first three-dimensional mesh model of the target human body.
  • the bone structure and skin weight of the target human body can also be output.
  • the first three-dimensional mesh model obtained by parametric human body reconstruction is relatively robust, with essentially no abnormal torso or limbs; that is, the reconstructed body shapes such as the torso and limbs remain as consistent as possible with those in the human body images.
  • step 102 three-dimensional human body reconstruction is performed based on the image features extracted from the human body image of the target human body, and a second three-dimensional mesh model of the target human body is determined.
  • a three-dimensional human body reconstruction method different from that in step 100 is adopted.
  • the reconstruction method in this step does not use the predefined human body topology, but directly performs three-dimensional reconstruction based on the human body image.
  • This reconstruction method can also be called unstructured reconstruction, or no fixed topology reconstruction.
  • FIG. 3 illustrates a method for reconstructing a 3D mesh model from a single human body image.
  • a single human body image 31 of the target human body can be input into the first deep neural network branch 32 for three-dimensional reconstruction.
  • the first deep neural network branch 32 may include a global feature sub-network 321 and a first fitting sub-network 322 .
  • the features of the human body image 31 may be extracted through the global feature sub-network 321 to obtain high-level image features of the human body image 31, and the high-level image features may be referred to as first image features.
  • the global feature sub-network 321 may be an Hourglass convolutional network.
  • the first image feature is input to the first fitting sub-network 322, and the first fitting sub-network 322 can predict whether each voxel block in the three-dimensional space belongs to the interior of the target human body according to the first image feature.
  • the first fitting sub-network 322 may be a multilayer perceptron structure.
  • the first fitting sub-network 322 outputs a first human body model that already includes each three-dimensional voxel block located inside the target human body.
  • mesh processing of the first human body model can then be performed, for example by applying the Marching Cubes algorithm to the first human body model in the voxel space, to obtain another three-dimensional mesh model of the target human body.
  • the other three-dimensional mesh model may be referred to as a second three-dimensional mesh model of the target body.
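The voxel-occupancy stage can be illustrated with a toy example. In this hedged sketch the per-voxel inside/outside decision comes from a closed-form sphere rather than the fitting sub-network, and a simple 6-neighbor test stands in for the Marching Cubes surface extraction:

```python
import numpy as np

# Hedged sketch of the voxel-occupancy stage: a closed-form sphere
# replaces the network's per-voxel prediction, and a neighbor test
# stands in for Marching Cubes.
n = 32
ax = np.linspace(-1.5, 1.5, n)
x, y, z = np.meshgrid(ax, ax, ax, indexing="ij")
occ = (x**2 + y**2 + z**2) < 1.0          # True = voxel inside the "body"

# Surface voxels: occupied voxels with at least one empty 6-neighbor.
pad = np.pad(occ, 1, constant_values=False)
interior = (
    pad[2:, 1:-1, 1:-1] & pad[:-2, 1:-1, 1:-1]
    & pad[1:-1, 2:, 1:-1] & pad[1:-1, :-2, 1:-1]
    & pad[1:-1, 1:-1, 2:] & pad[1:-1, 1:-1, :-2]
)
surface = occ & ~interior
print(int(occ.sum()), int(surface.sum()))
```

A real pipeline would run Marching Cubes over this grid to turn the surface voxels into a triangle mesh with vertices and faces.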
  • the bone structure of the target human body can also be obtained, and the skin weight is determined according to the bone structure and the second three-dimensional mesh model.
  • step 104 the first three-dimensional mesh model and the second three-dimensional mesh model of the target body are fused to obtain an initial three-dimensional model.
  • the first three-dimensional mesh model reconstructed in step 100 and the second three-dimensional mesh model reconstructed in step 102 may be fused, and the model obtained after fusion may be called an initial three-dimensional model.
  • the initial 3D model is also a 3D mesh model.
  • the upper body model corresponding to the upper body region of the target human body may be obtained from the second three-dimensional mesh model.
  • the upper body region is a region including at least the head of the target human body.
  • the upper body region may be the part of the human body above the shoulders, in which case the upper body model can capture the hair shape, i.e., the upper body model has finer detail.
  • the frontal human body image of the target human body can be input into a pre-trained key point detection model, and a plurality of key points in the upper body region can be determined through the key point detection model.
  • the first-model key points corresponding to those key points in the first three-dimensional mesh model of the target human body, and the second-model key points corresponding to them in the upper body model, can be determined respectively according to the coordinates of the detected key points.
  • based on the coordinates of each first-model key point, the coordinates of each second-model key point, and the camera extrinsic parameters used when shooting the human body image, the transformation relationship between the first three-dimensional mesh model and the upper body model can be calculated.
  • the upper body model is then used to replace the corresponding part in the first three-dimensional mesh model based on the transformation relationship.
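The transformation estimated from corresponding key points is typically a similarity transform. The following hedged sketch recovers scale, rotation, and translation from synthetic correspondences with the classic Umeyama/Kabsch closed-form solution; the keypoints are hypothetical stand-ins for detected landmarks on the two meshes:

```python
import numpy as np

def umeyama(src, dst):
    """Least-squares similarity transform (s, R, t) with dst ~ s*R@src + t.

    Classic Umeyama/Kabsch closed-form solution; src and dst are
    (N, 3) corresponding keypoint coordinates.
    """
    mu_s, mu_d = src.mean(0), dst.mean(0)
    cs, cd = src - mu_s, dst - mu_d
    cov = cd.T @ cs / len(src)
    U, S, Vt = np.linalg.svd(cov)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # fix reflection
    R = U @ D @ Vt
    var_s = (cs ** 2).sum() / len(src)
    s = np.trace(np.diag(S) @ D) / var_s
    t = mu_d - s * R @ mu_s
    return s, R, t

# Hypothetical keypoints on the first (parametric) mesh and their
# counterparts on the upper body model, related by a known transform
# so the recovery can be checked.
rng = np.random.default_rng(1)
kp_first = rng.normal(size=(6, 3))
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
s_true, t_true = 1.3, np.array([0.2, -0.1, 0.5])
kp_upper = s_true * kp_first @ R_true.T + t_true

s, R, t = umeyama(kp_first, kp_upper)
print(round(float(s), 3))  # recovers the scale 1.3
```

With the transform in hand, the upper body model can be brought into the first mesh's coordinate frame before the replacement step.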
  • the initial 3D model obtained after fusion has the following advantages.
  • the upper body area is the area above the shoulder of the target human body.
  • on the one hand, the part of the initial 3D model below the shoulders is obtained by parametric human body reconstruction, which ensures the robustness of the torso and limbs: no abnormally shaped torso or limbs appear in the initial three-dimensional model, i.e., the body shape is reconstructed well and the pose and figure remain as consistent as possible with the human body image; on the other hand, the part of the initial 3D model above the shoulders is reconstructed without a fixed topology (that is, without a predefined human body template), so it includes shapes such as hair and simulates the target human body more realistically. The resulting initial 3D model therefore both reflects details such as hair and maintains the shape robustness of the torso and limbs, giving a good overall human body reconstruction.
  • step 106 reconstruct the human body texture of the target human body according to the initial three-dimensional model and the human body image, and obtain a textured three-dimensional human body model of the target human body.
  • the reconstruction of the human body texture can be performed according to the frontal human body image and the back human body image.
  • the frontal human body image may represent the frontal texture of the target human body.
  • the back human body image may represent the back texture of the target human body.
  • the frontal human body image and the back human body image may be mapped onto the initial three-dimensional model of the target human body to obtain a texture-filled three-dimensional human body model of the target human body.
  • interpolation can be used to fill in the texture, so as to obtain a three-dimensional human body model of the target human body with colored texture.
  • the three-dimensional human body model may be a three-dimensional mesh model with a colored appearance.
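Per-vertex texturing from a front and a back image can be sketched as follows. This is a simplified, hedged illustration assuming an orthographic camera on the +z axis and known vertex normals; the images and vertices are synthetic stand-ins:

```python
import numpy as np

# Hedged sketch of per-vertex texturing: sample the front image for
# vertices whose normals face the camera, the back image otherwise.
H = W = 8
front = np.full((H, W, 3), 200, np.uint8)  # stand-in frontal photo
back = np.full((H, W, 3), 50, np.uint8)    # stand-in back photo

verts = np.array([[0.2, 0.3, 0.9],         # x, y in [0, 1]; z is depth
                  [0.7, 0.6, -0.9]])
normals = np.array([[0.0, 0.0, 1.0],       # faces the camera
                    [0.0, 0.0, -1.0]])     # faces away

def sample(img, uv):
    """Nearest-neighbor color lookup at normalized uv coordinates."""
    px = np.clip((uv * [W - 1, H - 1]).astype(int), 0, W - 1)
    return img[px[:, 1], px[:, 0]]

facing = normals[:, 2] > 0                 # normal toward the camera?
colors = np.where(facing[:, None],
                  sample(front, verts[:, :2]),
                  sample(back, verts[:, :2]))
print(colors[:, 0])  # front-facing vertex gets 200, back-facing gets 50
```

Regions seen by neither image (e.g. under the arms) would then be filled by interpolation over the mesh, as described above.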
  • when performing three-dimensional human body reconstruction on the target human body, the method of this embodiment combines human body reconstruction based on a parameterized human body template with human body reconstruction performed directly from human body images without using a template, so that the reconstructed three-dimensional human body model both ensures the robustness of body shapes such as the torso and limbs and improves the realism and accuracy of the upper body region; and because only a small number of human body images are needed, it reduces the user's cooperation cost and makes 3D human body reconstruction easier.
  • the first three-dimensional mesh model may continue to be geometrically deformed, so that the model produces a reasonable clothing texture.
  • the human body image used for parametric human body reconstruction may be an RGBD image, and depth information of the region where the target human body is located and contour information of the target human body on the human body image may be extracted according to the human body image. And based on the depth information and the outline information, geometric deformation is performed on the model obtained by fitting the human body parameters based on the parameterized human body template to simulate the texture of clothing.
  • the model obtained by fitting the human body parameters to the parameterized human body template may be called an initial parameterized model, and then the above-mentioned geometric deformation is performed based on the initial parameterized model to obtain the first three-dimensional mesh model.
  • the above-mentioned first three-dimensional mesh model, which simulates the clothing texture, may then be fused with the second three-dimensional mesh model.
  • the obtained initial three-dimensional model not only has better clothing texture and is more realistic, but also is more robust in terms of the body shape of the human body.
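The depth-driven deformation can be illustrated in its crudest form: snapping each vertex's depth to the observed depth at its projected pixel. A real implementation would add the contour term and smoothness (e.g. Laplacian) regularization omitted here; all values are synthetic:

```python
import numpy as np

# Crudest hedged sketch of depth-driven deformation: pull each vertex's
# z coordinate onto the RGBD depth value at its projected pixel.
H = W = 4
depth = np.full((H, W), 0.8)               # observed RGBD depth map
verts = np.array([[0.1, 0.1, 1.0],         # x, y in [0, 1]; z is depth
                  [0.9, 0.5, 1.2]])

px = np.clip((verts[:, :2] * [W - 1, H - 1]).astype(int), 0, W - 1)
verts[:, 2] = depth[px[:, 1], px[:, 0]]    # snap z onto the depth map
print(verts[:, 2])
```

The clothing wrinkles captured in the depth map are what let the deformed parametric mesh "simulate the texture of clothing" geometrically.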
  • the human body reconstruction that is not based on the parameterized human body template can be further refined to improve the geometric accuracy of the reconstructed second three-dimensional mesh model of the target human body.
  • a second deep neural network branch 41 is added.
  • the second deep neural network branch 41 may include a local feature sub-network 411 and a second fitting sub-network 412 .
  • An image 42 of a local area can be extracted from the human body image 31 of the target human body, and the second deep neural network branch 41 is used for three-dimensional reconstruction of the local image 42 .
  • the human body image 31 may be a frontal human body image
  • the above-mentioned partial image 42 may be an image including a partial area of the target human body, for example, the partial area may be an area above the shoulder of the human body.
  • while the first human body model is reconstructed through the first deep neural network branch 32, the local image 42 is input into the second deep neural network branch 41, where the local feature sub-network 411 performs feature extraction on the local image 42 to obtain a second image feature; a second human body model is then obtained through the second fitting sub-network 412 based on the second image feature and an intermediate feature output by the first fitting sub-network 322.
  • the intermediate feature may be a feature output by a part of the network structure in the first fitting sub-network 322 .
  • the outputs of some of the fully-connected layers may be input to the second fitting sub-network 412 as the intermediate features.
  • the structure of the second deep neural network branch 41 may be substantially the same as the structure of the first deep neural network branch 32 .
  • the global feature sub-network 321 in the first deep neural network branch 32 may include four blocks, each of which may include a certain number of convolutional layers, pooling layers and other feature extraction layers, while the local feature sub-network 411 in the second deep neural network branch 41 may include at least one of the above-mentioned blocks.
  • the first human body model and the second human body model may be fused to obtain a fused human body model, and mesh processing is then performed on the fused human body model to obtain a second three-dimensional mesh model of the target human body.
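The two-branch arrangement above can be sketched as follows. This is a minimal illustration, assuming toy fully-connected layers and made-up feature sizes; the actual sub-networks 321/322/411/412 and their dimensions are not specified at this level of detail in the disclosure:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_layer(x, w):
    # One fully-connected layer with ReLU, a stand-in for a fitting sub-network layer.
    return np.maximum(w @ x, 0.0)

# Stand-ins for learned weights (hypothetical sizes).
w1_a, w1_b = rng.standard_normal((64, 128)), rng.standard_normal((32, 64))
w2 = rng.standard_normal((32, 128 + 64))

def first_branch(global_feat):
    # First fitting sub-network: two FC layers; the first layer's output
    # is exposed as the "intermediate feature" fed to the second branch.
    intermediate = mlp_layer(global_feat, w1_a)
    out = mlp_layer(intermediate, w1_b)
    return out, intermediate

def second_branch(local_feat, intermediate):
    # The second fitting sub-network consumes local-image features plus the
    # intermediate feature coming from the first branch.
    return mlp_layer(np.concatenate([local_feat, intermediate]), w2)

global_feat = rng.standard_normal(128)   # from global feature sub-network 321
local_feat = rng.standard_normal(128)    # from local feature sub-network 411
first_model, inter = first_branch(global_feat)
second_model = second_branch(local_feat, inter)
```

The point of the sketch is only the wiring: the second branch conditions on an intermediate activation of the first, rather than on its final output.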
  • the reconstruction effect on the local area of the target human body is improved.
  • the human skeleton structure of the target human body can also be obtained.
  • the skin weight may be calculated according to the above-mentioned second three-dimensional mesh model and the human skeleton structure.
  • the human skeleton structure and the second three-dimensional mesh model obtained above can be input into a deep learning network, and the skin weight of the model can be automatically obtained through the deep learning network.
  • the attribute feature corresponding to each vertex in the second three-dimensional mesh model may be first generated according to the second three-dimensional mesh model and the human skeleton structure.
  • the attribute feature may be constructed by using the spatial positional relationship between each vertex in the second three-dimensional mesh model and the human skeleton structure.
  • the attribute features of a vertex may include the following four features: 1) the position coordinates of the vertex; 2) the position coordinates of the K bone joint points closest to the vertex; 3) the volumetric geodesic distance between the vertex and each of the K bone joint points; and 4) the angle between the bones on which those bone joint points are located.
  • after the attribute feature of each vertex in the second three-dimensional mesh model is obtained, the attribute feature of each vertex and the adjacency relationship features between the vertices can be input into the spatial graph convolution attention network in the deep learning network.
  • before these features are fed into the spatial graph convolutional attention network, they can be transformed into hidden-layer features through a multi-layer perceptron.
  • the weight with which each of the above-mentioned K skeleton joint points influences each vertex can then be predicted from the hidden-layer features, and a subsequent multi-layer perceptron in the deep learning network can normalize these weights so that, for any given vertex, the influence weights of the K bone joint points on that vertex sum to 1.
  • the final weight with which each bone joint point affects each vertex of the second three-dimensional mesh model is the skin weight of that vertex.
  • the method of this embodiment can obtain the human skeleton structure from the human body image of the target human body and then automatically calculate the skin weights from the skeleton structure and the reconstructed second three-dimensional mesh model, so that skin weights can be generated automatically and quickly, which is more convenient for driving the model.
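The per-vertex attribute construction and the final weight normalization described above can be sketched as below. Plain Euclidean distance stands in for the volumetric geodesic distance, and the joint positions, the value of K and the raw weights are illustrative assumptions:

```python
import math

def vertex_attributes(vertex, joints, k=4):
    # Distances to all joints (Euclidean here, as a stand-in for the
    # volumetric geodesic distance described in the text).
    dists = [(math.dist(vertex, j), j) for j in joints]
    dists.sort(key=lambda p: p[0])
    nearest = dists[:k]
    return {
        "position": vertex,
        "joint_positions": [j for _, j in nearest],  # K nearest joint coordinates
        "joint_distances": [d for d, _ in nearest],  # distances to those joints
    }

def normalize_weights(raw):
    # Final step from the text: the per-vertex influence weights over the
    # K nearest joints are normalized so they sum to 1.
    s = sum(raw)
    return [w / s for w in raw]

# Toy skeleton with five joints (hypothetical coordinates).
joints = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0),
          (0.0, 0.0, 2.0), (3.0, 3.0, 3.0)]
attrs = vertex_attributes((0.1, 0.2, 0.0), joints, k=4)
weights = normalize_weights([0.5, 1.5, 1.0, 1.0])  # raw network predictions (assumed)
```

The normalized `weights` are what the text calls the skin weights of the vertex with respect to its K nearest joints.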
  • the second three-dimensional mesh model reconstructed in the embodiment of the present disclosure is not based on the parameterized human body template, but is directly reconstructed based on the human body image.
  • the details of the local parts of the target body may be blurred.
  • the local part may be a human face, and the face is usually an area that the user pays more attention to. Therefore, the embodiments of the present disclosure can also separately perform geometric reconstruction on the local parts of the target human body.
  • a local part of the target human body can be geometrically reconstructed to obtain a local three-dimensional mesh model.
  • feature extraction is performed on the frontal body image of the target body to obtain third image features; the local 3D mesh model is determined according to the third image features and the 3D topology template of the local part.
  • the third image feature and the above-mentioned three-dimensional topology template of the local part may be input into a graph convolution network to obtain the local three-dimensional mesh model.
  • the single human body image of the target human body may be a frontal human body image, and the reconstruction of the human face may adopt fine reconstruction with a fixed topology.
  • the semantic structure of human faces is consistent across individuals, so a 3D face with a fixed topology can be used as a template, which may be called a 3D topology template.
  • the template includes a plurality of vertices, each vertex corresponds to a face semantics, for example, one vertex represents the tip of the nose, and the other vertex represents the corner of the eye.
  • each vertex position of the above face template can be obtained by regression through a deep neural network.
  • the deep neural network may include a deep convolutional network and a graph convolutional network
  • the frontal body image of the target body may be input into the deep convolutional network to extract third image features.
  • the third image feature and the 3D topology template of the face are used as the input of the graph convolution network
  • the 3D mesh model of a face output by the graph convolution network is obtained as a local 3D mesh model.
  • the local 3D mesh model is thus closer to the face of the target human body.
  • the input of the deep convolutional network can also be an image region containing a face captured from a human body image of the target human body.
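A toy version of the fixed-topology face regression might look like the following. The four-vertex ring topology, the single graph-convolution step and all weight shapes are assumptions made for illustration; a real face template has thousands of vertices and the network weights are learned:

```python
import numpy as np

rng = np.random.default_rng(1)

def graph_conv_step(vertex_feats, adjacency, w):
    # One graph-convolution-like layer: average each vertex's neighbours
    # over the fixed face topology, then apply a shared linear map + ReLU.
    deg = adjacency.sum(1, keepdims=True)
    neighbour_avg = adjacency @ vertex_feats / np.maximum(deg, 1)
    return np.maximum((vertex_feats + neighbour_avg) @ w, 0.0)

def regress_face(template_vertices, adjacency, image_feat, w_gc, w_out):
    # Condition every template vertex on the same (third) image feature,
    # run one graph-conv step, and regress per-vertex offsets from the template.
    n = len(template_vertices)
    feats = np.concatenate([template_vertices, np.tile(image_feat, (n, 1))], axis=1)
    h = graph_conv_step(feats, adjacency, w_gc)
    offsets = h @ w_out
    return template_vertices + offsets

# Toy fixed topology: 4 vertices in a ring (hypothetical).
template = rng.standard_normal((4, 3))
adj = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
img_feat = rng.standard_normal(8)            # "third image feature" (toy size)
w_gc = rng.standard_normal((11, 16)) * 0.1   # 3 coords + 8 image dims -> 16
w_out = rng.standard_normal((16, 3)) * 0.1
face = regress_face(template, adj, img_feat, w_gc, w_out)
```

Because the output is the template plus offsets, every output vertex keeps its fixed semantics (nose tip, eye corner, and so on).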
  • the local 3D mesh model, the first 3D mesh model and the second 3D mesh model may be fused to obtain an initial 3D model.
  • as shown in the schematic diagram of FIG. 5, based on the reconstruction methods in the foregoing embodiments, geometric reconstruction of a local part is performed according to the human body image 51 of the target human body to obtain a local three-dimensional mesh model 52; the human body parameters are fitted and the first three-dimensional mesh model 53 is obtained by reconstruction; and reconstruction without a predefined human body template is also performed directly from the human body image 51 to obtain the second three-dimensional mesh model 54. It can be clearly seen from FIG. 5 that the local three-dimensional mesh model 52 makes the shape of the reconstructed face more refined and achieves a good face reconstruction effect; the first three-dimensional mesh model 53 reconstructs the torso and body shape well but lacks materials with detailed texture such as hair; while the second three-dimensional mesh model 54 carries the human hair and has better texture details.
  • the local three-dimensional mesh model 52 and the second 3D mesh model 54 can be fused first, and the fusion result can be fused with the first 3D mesh model 53 to obtain the initial 3D model 55 .
  • the first three-dimensional mesh model 53 and the second three-dimensional mesh model 54 can also be fused first, and then the fusion result can be fused with the local three-dimensional mesh model 52 to obtain the initial three-dimensional model 55 .
  • the upper body model corresponding to the upper body region of the target human body can be obtained from the second three-dimensional mesh model 54 , and the upper body model can be used to replace the corresponding part in the first three-dimensional mesh model 53 .
  • the corresponding parts in the upper body model are then replaced with the local three-dimensional mesh model 52 .
  • the transformation of the coordinate system can be performed first, and then the model fusion can be performed on the basis of the transformation of the coordinate system.
  • the human body image of the target human body can be input into a pre-trained keypoint detection model, and multiple keypoints of the face of the target human body in the image can be determined through the keypoint detection model. According to the coordinates of these keypoints in the face region of the human body image, the corresponding second model keypoints in the upper-body model of the target human body and the corresponding local model keypoints on the face 3D mesh model can be determined respectively.
  • the coordinate transformation between each second model keypoint on the upper-body model and the corresponding local model keypoint on the face 3D mesh model is then computed from their coordinates, and based on this transformation the face mesh is transformed into the coordinate system of the upper-body model for fusion.
  • the model obtained by the fusion of the above three models is the initial three-dimensional model 55 , and the texture of the initial three-dimensional model 55 is then reconstructed and supplemented to obtain the final three-dimensional human body model 56 .
  • the three-dimensional human body reconstruction method of this embodiment geometrically reconstructs the local parts of the target human body and fuses the resulting local three-dimensional mesh model with the first three-dimensional mesh model of the target human body obtained through parametric reconstruction.
  • the method also fuses in the second 3D mesh model of the target body reconstructed directly from the human body image, so that the final 3D mesh model of the target body is not only more robust in terms of body shape and more realistic in clothing texture,
  • but also clearer, finer and more accurate in its local details. For example, the hair, body shape and posture in the 3D mesh model of the target human body are reconstructed well, and the facial features are finer and more accurate.
  • a plurality of human body images from different angles may also be acquired to comprehensively perform the three-dimensional reconstruction of the target human body.
  • the three images may be acquired from different angles.
  • the three images can each be used as input to the global feature sub-network, obtaining one first image feature output by the global feature sub-network for each of the three images.
  • the three first image features are fused, and the fused image features are used as the input of the first fitting sub-network 322 for further processing.
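The per-view feature fusion could, for instance, be an element-wise mean of the three first image features; the text does not fix the fusion operator, so the mean here is an assumption:

```python
import numpy as np

def fuse_view_features(view_feats):
    # One simple fusion choice (an assumption; the text only says the
    # per-view first image features are "fused"): element-wise mean.
    return np.mean(np.stack(view_feats), axis=0)

rng = np.random.default_rng(3)
feats = [rng.standard_normal(128) for _ in range(3)]  # three camera angles
fused = fuse_view_features(feats)
```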
  • the neural network models involved can be trained separately.
  • the first deep neural network branch and the texture generation network may each perform their own training.
  • a three-dimensional human body model of the user is to be constructed based on two human body images of the user, and the two human body images may include a frontal human body image and a back human body image of the user.
  • parametric reconstruction can be performed based on the frontal body image and the back body image of the user.
  • the human body parameters of the parameterized human body template can be fitted based on the frontal human body image of the user to obtain a frontal reconstruction result.
  • similarly, the human body parameters are fitted based on the back human body image of the user to obtain a back reconstruction result.
  • the initial parametric model is obtained by fusing the front reconstruction results and the back reconstruction results.
  • the initial parameterized model may be geometrically deformed based on the depth information and body contour information of the target body region in the frontal body image to obtain a first three-dimensional mesh model.
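One simple way to fuse the frontal and back fitting results, assuming both fits produce the same parameter vector layout, is a per-parameter weighted average (the actual fusion rule is not specified in the disclosure):

```python
def fuse_parametric_fits(front_params, back_params, w_front=0.5):
    # Hypothetical fusion of the frontal and back fitting results:
    # a per-parameter weighted average of the two parameter vectors.
    w_back = 1.0 - w_front
    return [w_front * f + w_back * b for f, b in zip(front_params, back_params)]

front = [0.2, -0.1, 1.0]   # a few pose/shape parameters from the frontal fit (toy)
back = [0.4, -0.3, 1.2]    # the same parameters from the back fit (toy)
fused = fuse_parametric_fits(front, back)
```

A real system might instead weight each parameter by how well the corresponding body part is observed in each view; the equal weighting above is only the simplest choice.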
  • reconstruction without parameterized human body template can be performed directly based on the frontal human body image of the user.
  • the user's frontal body image 31 and the partial image 42 in the frontal body image can be reconstructed through the first deep neural network branch 32 and the second deep neural network branch 41 respectively, according to the processing flow shown in FIG. 4, to obtain the corresponding first human body model and second human body model. After the first human body model and the second human body model are fused and meshed, the second three-dimensional mesh model of the user is obtained.
  • the second deep neural network branch 41 may perform geometric reconstruction of the face based on the frontal human body image 31 of the user.
  • the 3D Mesh of the user's face can be determined based on the image features extracted from the frontal human body image of the user and the 3D topology template of the face.
  • the upper body model of the human body in the second three-dimensional mesh model can be obtained, for example, the model of the area above the human body shoulder, and the upper body model can be used to replace the corresponding part in the first three-dimensional mesh model.
  • the face part in the above-mentioned upper body model can be replaced with a face 3D Mesh, and finally an initial 3D model of the user is obtained.
  • the initial 3D model can be similar to the model 55 in FIG. 5 , and has a relatively complete shape of human body limbs, good texture such as hair and clothing, and a relatively finely reconstructed human face, which has a good 3D reconstruction effect.
  • texture mapping is performed according to the frontal and back images of the target body, and the parts that cannot be seen in either image are filled with texture by interpolation, obtaining a textured three-dimensional human body model.
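The interpolation-based fill for regions invisible in both images can be sketched in one dimension as follows; real implementations interpolate over the mesh surface, and the colour list here is a toy stand-in for per-vertex colours:

```python
def fill_missing_colors(colors):
    # Vertices invisible in both the frontal and back images have no
    # sampled colour (None); fill them by linear interpolation between
    # the nearest textured vertices (a 1-D stand-in for surface-space
    # interpolation on the mesh).
    filled = list(colors)
    known = [i for i, c in enumerate(filled) if c is not None]
    for i, c in enumerate(filled):
        if c is None:
            left = max((k for k in known if k < i), default=None)
            right = min((k for k in known if k > i), default=None)
            if left is None:
                filled[i] = filled[right]
            elif right is None:
                filled[i] = filled[left]
            else:
                t = (i - left) / (right - left)
                filled[i] = tuple((1 - t) * a + t * b
                                  for a, b in zip(filled[left], filled[right]))
    return filled

# Two visible vertices (red, blue) with two invisible vertices between them.
colors = [(1.0, 0.0, 0.0), None, None, (0.0, 0.0, 1.0)]
filled_colors = fill_missing_colors(colors)
```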
  • FIG. 6 illustrates a schematic structural diagram of a three-dimensional human body reconstruction apparatus.
  • the apparatus may include a parametric reconstruction module 61 , an image feature reconstruction module 62 , a fusion module 63 and a texture reconstruction module 64 .
  • the parametric reconstruction module 61 is configured to fit the human body parameters of the parametric human body template based on the human body image of the target human body to obtain a first three-dimensional mesh model.
  • the image feature reconstruction module 62 is configured to perform three-dimensional human body reconstruction based on the image features extracted from the human body image of the target human body, and determine a second three-dimensional mesh model.
  • the fusion module 63 is configured to fuse the first three-dimensional grid model and the second three-dimensional grid model to obtain an initial three-dimensional model, where the initial three-dimensional model represents the geometric shape of the target human body.
  • the texture reconstruction module 64 is configured to reconstruct the texture of the human body according to the initial three-dimensional model and the human body image, so as to obtain a three-dimensional human body model with texture corresponding to the target human body.
  • the parameterized reconstruction module 61 is specifically configured to: fit the human body parameters of the parameterized human body template based on the human body image of the target human body to obtain an initial parameterized model; extract the location of the target human body in the human body image. depth information of the region; and, identifying the contour information of the target human body on the human body image; based on the depth information and contour information, geometrically deform the initial parameterized model to obtain the first three-dimensional grid Model.
  • when the parametric reconstruction module 61 is used to fit the human body parameters of the parameterized human body template based on the human body image of the target human body, the method includes: fitting the human body parameters of the parameterized human body template based on the frontal human body image to obtain a frontal reconstruction result; fitting the human body parameters of the parameterized human body template based on the back human body image to obtain a back reconstruction result; and fusing the frontal reconstruction result and the back reconstruction result. The human body image of the target human body includes a frontal human body image and a back human body image of the target human body.
  • the texture reconstruction module 64 is specifically configured to: reconstruct the human body texture of the target human body according to the initial three-dimensional model, the frontal human body image and the back human body image of the target human body, and obtain the target human body The corresponding textured 3D human model.
  • when the fusion module 63 is used to fuse the first three-dimensional mesh model and the second three-dimensional mesh model to obtain the initial three-dimensional model, the method includes: obtaining, from the second three-dimensional mesh model, an upper-body model corresponding to the upper-body region of the target human body, and replacing the corresponding part of the first three-dimensional mesh model with the upper-body model to obtain the initial three-dimensional model.
  • the image feature reconstruction module 62 is specifically configured to: perform three-dimensional reconstruction on the frontal human body image of the target human body through the first deep neural network branch to obtain the first human body model; performing three-dimensional reconstruction on the partial image in the frontal human body image to obtain a second human body model; wherein, the partial image includes a partial area of the target human body; fuse the first human body model and the second human body model to obtain a fusion A human body model; performing grid processing on the fused human body model to obtain a second three-dimensional grid model of the target human body.
  • the apparatus may further include a local reconstruction module 65 for performing geometric reconstruction on the local parts of the target human body based on the frontal human body image of the target human body to obtain a local three-dimensional mesh Model.
  • when the fusion module 63 is used to fuse the first three-dimensional mesh model and the second three-dimensional mesh model of the target body to obtain the initial three-dimensional model, the method includes: fusing the local three-dimensional mesh model, the first three-dimensional mesh model and the second three-dimensional mesh model to obtain the initial three-dimensional model.
  • the above-mentioned apparatus may be used to execute any corresponding method described above, and for brevity, details are not repeated here.
  • An embodiment of the present disclosure further provides an electronic device, where the device includes a memory and a processor, the memory being used to store computer-readable instructions and the processor being used to invoke the computer instructions to implement the method of any embodiment of the present disclosure.
  • An embodiment of the present disclosure also provides a computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, implements the method of any embodiment of the present specification.
  • Embodiments of the present disclosure also provide a computer program product, including a computer program, which implements the method of any embodiment of the present specification when the computer program is executed by a processor.
  • one or more embodiments of the present disclosure may be provided as a method, system or computer program product. Accordingly, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • Embodiments of the subject matter and functional operations described in this disclosure can be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware including the structures disclosed in this disclosure and their structural equivalents, or in a combination of one or more of them.
  • Embodiments of the subject matter described in this disclosure may be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, data processing apparatus.
  • alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical or electromagnetic signal, which is generated to encode information for transmission to a suitable receiver apparatus for execution by data processing apparatus.
  • the computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of these.
  • the processes and logic flows described in this disclosure can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output.
  • the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
  • Computers suitable for the execution of a computer program include, for example, general and/or special purpose microprocessors, or any other type of central processing unit.
  • the central processing unit will receive instructions and data from read only memory and/or random access memory.
  • the basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
  • generally, a computer will also include, or be operatively coupled to, one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks or optical disks, to receive data from them, transmit data to them, or both.
  • the computer does not have to have such a device.
  • the computer may be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name a few.
  • Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
  • the processor and memory may be supplemented by or incorporated in special purpose logic circuitry.


Abstract

Embodiments of the present disclosure provide a three-dimensional human body reconstruction method, apparatus, device and storage medium. The method may include: fitting the human body parameters of a parameterized human body template based on a human body image of a target human body to obtain a first three-dimensional mesh model; performing three-dimensional human body reconstruction based on image features extracted from the human body image to determine a second three-dimensional mesh model; fusing the first three-dimensional mesh model and the second three-dimensional mesh model to obtain an initial three-dimensional model, where the initial three-dimensional model represents the geometric shape of the target human body; and reconstructing the human body texture according to the initial three-dimensional model and the human body image to obtain a textured three-dimensional human body model of the target human body.

Description

Three-dimensional human body reconstruction method, apparatus, device and storage medium
Cross-reference to related application
The present disclosure claims priority to Chinese Patent Application No. 202110349813.1, entitled "Three-dimensional human body reconstruction method, apparatus, device and storage medium" and filed on March 31, 2021, the entire disclosure of which is incorporated herein by reference.
Technical field
The present disclosure relates to image processing technology, and in particular to a three-dimensional human body reconstruction method, apparatus, device and storage medium.
Background
Three-dimensional human body reconstruction is an important problem in computer vision and computer graphics. Reconstructed digital human body models have important applications in many fields, such as body measurement, virtual fitting, virtual anchors, custom game character design, and virtual reality social networking. How to project a real-world human body into the virtual world to obtain a three-dimensional digital human body model is an important question. However, digital reconstruction of a three-dimensional human body is complex: the operator needs to scan the target continuously from multiple angles without blind spots, and the reconstruction quality still leaves room for improvement.
Summary
In view of this, embodiments of the present disclosure provide at least a three-dimensional human body reconstruction method, apparatus, device and storage medium.
In a first aspect, a three-dimensional human body reconstruction method is provided. The method includes: fitting the human body parameters of a parameterized human body template based on a human body image of a target human body to obtain a first three-dimensional mesh model; performing three-dimensional human body reconstruction based on image features extracted from the human body image to determine a second three-dimensional mesh model; fusing the first three-dimensional mesh model and the second three-dimensional mesh model to obtain an initial three-dimensional model, where the initial three-dimensional model represents the geometric shape of the target human body; and reconstructing the human body texture according to the initial three-dimensional model and the human body image to obtain a textured three-dimensional human body model corresponding to the target human body.
In one example, fitting the human body parameters of the parameterized human body template based on the human body image of the target human body to obtain the first three-dimensional mesh model includes: fitting the human body parameters of the parameterized human body template based on an RGBD-format human body image of the target human body to obtain an initial parameterized model; extracting depth information of the region where the target human body is located in the human body image; identifying contour information of the target human body in the human body image; and geometrically deforming the initial parameterized model based on the depth information and the contour information to obtain the first three-dimensional mesh model.
In one example, fitting the human body parameters of the parameterized human body template based on the human body image of the target human body includes: fitting the human body parameters of the parameterized human body template based on a frontal human body image of the target human body to obtain a frontal reconstruction result; fitting the human body parameters of the parameterized human body template based on a back human body image of the target human body to obtain a back reconstruction result; and fusing the frontal reconstruction result and the back reconstruction result.
In one example, reconstructing the human body texture according to the initial three-dimensional model and the human body image to obtain the textured three-dimensional human body model corresponding to the target human body includes: reconstructing the human body texture according to the initial three-dimensional model, the frontal human body image of the target human body and the back human body image of the target human body to obtain the textured three-dimensional human body model corresponding to the target human body.
In one example, reconstructing the human body texture according to the initial three-dimensional model, the frontal human body image of the target human body and the back human body image of the target human body includes: mapping the frontal human body image and the back human body image onto the initial three-dimensional model to obtain a three-dimensional human body model of the target human body filled with texture.
In one example, fusing the first three-dimensional mesh model and the second three-dimensional mesh model to obtain the initial three-dimensional model includes: obtaining, from the second three-dimensional mesh model, an upper-body model corresponding to an upper-body region, the upper-body region being a region that includes at least the head of the target human body; and replacing the corresponding part of the first three-dimensional mesh model with the upper-body model to obtain the initial three-dimensional model.
In one example, the method further includes: geometrically reconstructing a local part of the target human body based on the frontal human body image of the target human body to obtain a local three-dimensional mesh model. Fusing the first three-dimensional mesh model and the second three-dimensional mesh model to obtain the initial three-dimensional model includes: fusing the local three-dimensional mesh model, the first three-dimensional mesh model and the second three-dimensional mesh model to obtain the initial three-dimensional model.
In one example, fusing the local three-dimensional mesh model, the first three-dimensional mesh model and the second three-dimensional mesh model to obtain the initial three-dimensional model includes: obtaining, from the second three-dimensional mesh model, an upper-body model corresponding to an upper-body region, the upper-body region being a region that includes at least the head of the target human body; and obtaining the initial three-dimensional model by replacing the corresponding part of the upper-body model with the local three-dimensional mesh model and replacing the corresponding part of the first three-dimensional mesh model with the upper-body model.
In one example, geometrically reconstructing the local part of the target human body based on the frontal human body image of the target human body to obtain the local three-dimensional mesh model includes: performing feature extraction on the frontal human body image of the target human body to obtain a third image feature; and determining the local three-dimensional mesh model according to the third image feature and a three-dimensional topology template of the local part.
In one example, performing the three-dimensional human body reconstruction based on the image features extracted from the human body image of the target human body to determine the second three-dimensional mesh model includes: performing three-dimensional reconstruction on the frontal human body image of the target human body through a first deep neural network branch to obtain a first human body model; performing three-dimensional reconstruction on a partial image in the frontal human body image through a second deep neural network branch to obtain a second human body model, where the partial image includes a local region of the target human body; fusing the first human body model and the second human body model to obtain a fused human body model; and performing mesh processing on the fused human body model to obtain the second three-dimensional mesh model.
In a second aspect, a three-dimensional human body reconstruction apparatus is provided. The apparatus includes: a parametric reconstruction module, configured to fit the human body parameters of a parameterized human body template based on a human body image of a target human body to obtain a first three-dimensional mesh model of the target human body; an image feature reconstruction module, configured to perform three-dimensional human body reconstruction based on image features extracted from the human body image of the target human body to determine a second three-dimensional mesh model of the target human body; a fusion module, configured to fuse the first three-dimensional mesh model and the second three-dimensional mesh model of the target human body to obtain an initial three-dimensional model; and a texture reconstruction module, configured to reconstruct the human body texture of the target human body according to the initial three-dimensional model and the human body image to obtain a textured three-dimensional human body model corresponding to the target human body.
In one example, the parametric reconstruction module is specifically configured to: fit the human body parameters of the parameterized human body template based on the human body image of the target human body to obtain an initial parameterized model; extract depth information of the region where the target human body is located in the human body image; identify contour information of the target human body in the human body image; and geometrically deform the initial parameterized model based on the depth information and the contour information to obtain the first three-dimensional mesh model.
In one example, when fitting the human body parameters of the parameterized human body template based on the human body image of the target human body, the parametric reconstruction module is configured to: fit the human body parameters of the parameterized human body template based on a frontal human body image of the target human body to obtain a frontal reconstruction result; fit the human body parameters of the parameterized human body template based on a back human body image of the target human body to obtain a back reconstruction result; and fuse the frontal reconstruction result and the back reconstruction result.
In one example, the texture reconstruction module is specifically configured to: reconstruct the human body texture according to the initial three-dimensional model and the frontal and back human body images of the target human body to obtain the textured three-dimensional human body model corresponding to the target human body.
In one example, when fusing the first three-dimensional mesh model and the second three-dimensional mesh model of the target human body to obtain the initial three-dimensional model, the fusion module is configured to: obtain, from the second three-dimensional mesh model, an upper-body model corresponding to an upper-body region of the target human body, the upper-body region being a region that includes at least the head of the target human body; and replace the corresponding part of the first three-dimensional mesh model with the upper-body model to obtain the initial three-dimensional model.
In one example, the apparatus further includes a local reconstruction module, configured to geometrically reconstruct a local part of the target human body based on the frontal human body image of the target human body to obtain a local three-dimensional mesh model. When fusing the first three-dimensional mesh model and the second three-dimensional mesh model of the target human body to obtain the initial three-dimensional model, the fusion module is configured to: fuse the local three-dimensional mesh model, the first three-dimensional mesh model and the second three-dimensional mesh model to obtain the initial three-dimensional model.
In one example, the image feature reconstruction module is specifically configured to: perform three-dimensional reconstruction on the frontal human body image of the target human body through a first deep neural network branch to obtain a first human body model; perform three-dimensional reconstruction on a partial image in the frontal human body image through a second deep neural network branch to obtain a second human body model, where the partial image includes a local region of the target human body; fuse the first human body model and the second human body model to obtain a fused human body model; and perform mesh processing on the fused human body model to obtain the second three-dimensional mesh model of the target human body.
In a third aspect, an electronic device is provided. The device includes a memory and a processor, where the memory is configured to store computer-readable instructions and the processor is configured to invoke the computer instructions to implement the method of any embodiment of the present disclosure.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored. When the program is executed by a processor, the method of any embodiment of the present disclosure is implemented.
In a fifth aspect, a computer program product is provided, including a computer program. When the computer program is executed by a processor, the method of any embodiment of the present disclosure is implemented.
With the three-dimensional human body reconstruction method, apparatus, device and storage medium provided by the embodiments of the present disclosure, when performing three-dimensional reconstruction of the target human body, human body reconstruction based on a parameterized human body template is combined with human body reconstruction performed directly from the human body image without using a human body template. The reconstructed three-dimensional human body model therefore both guarantees the robustness of body shapes such as the torso and limbs and improves the realism and accuracy of the upper-body region. Moreover, the method can reconstruct from only a small number of human body images of the target human body, which reduces the user's cooperation cost and makes three-dimensional human body reconstruction simpler.
Brief description of the drawings
To more clearly illustrate the technical solutions in one or more embodiments of the present disclosure or in the related art, the drawings needed in the description of the embodiments or the related art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in one or more embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 shows a flowchart of three-dimensional human body reconstruction provided by at least one embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of parameterized human body reconstruction provided by at least one embodiment of the present disclosure;
FIG. 3 shows a schematic diagram of three-dimensional human body reconstruction provided by at least one embodiment of the present disclosure;
FIG. 4 shows a schematic diagram of another three-dimensional human body reconstruction provided by at least one embodiment of the present disclosure;
FIG. 5 shows a schematic diagram of model fusion provided by at least one embodiment of the present disclosure;
FIG. 6 shows a schematic diagram of a three-dimensional human body reconstruction apparatus provided by at least one embodiment of the present disclosure;
FIG. 7 shows a schematic diagram of a three-dimensional human body reconstruction apparatus provided by at least one embodiment of the present disclosure.
Detailed description
To enable those skilled in the art to better understand the technical solutions in one or more embodiments of the present disclosure, the technical solutions in one or more embodiments of the present disclosure will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. All other embodiments obtained by those of ordinary skill in the art based on one or more embodiments of the present disclosure without creative effort shall fall within the protection scope of the present disclosure.
Three-dimensional human body reconstruction has important applications in many fields, including but not limited to the following scenarios. For example, it can enhance the realism of virtual reality applications such as virtual fitting, virtual cloud meetings and virtual classrooms. As another example, a three-dimensional human body model obtained through reconstruction can be imported into game data to generate personalized characters. As yet another example, producing science-fiction films currently requires green screens, motion capture and other technologies, with expensive hardware and a time-consuming, complicated overall process; obtaining a virtual three-dimensional human body model through three-dimensional human body reconstruction can simplify the process and save resources.
Regardless of the application scenario, three-dimensional human body reconstruction is expected to produce a three-dimensional human body model with the best possible reconstruction quality. For example, in scenarios such as virtual cloud meetings or AR virtual interaction, the reconstructed three-dimensional human body model is required to be more realistic and immersive.
To solve the above problems, embodiments of the present disclosure provide a three-dimensional human body reconstruction method. Referring to FIG. 1, which shows a flowchart of three-dimensional human body reconstruction provided by at least one embodiment of the present disclosure, the method may include the following steps.
In step 100, the human body parameters of a parameterized human body template are fitted based on a human body image of the target human body to obtain a first three-dimensional mesh model of the target human body.
The target human body is the user on whom the three-dimensional reconstruction is based. For example, when three-dimensional reconstruction is performed for a user Xiao Zhang, Xiao Zhang may be called the target human body, and the reconstructed three-dimensional human body model is based on Xiao Zhang's body, with high similarity to Xiao Zhang's posture, appearance, clothing, hairstyle and so on.
The embodiments of the present disclosure impose no special requirements on how the human body image of the target human body is captured or on its format. In one exemplary manner, the human body image may be a frontal full-body photograph of the target human body. For example, when the image is captured, the user corresponding to the target human body may stand naturally with the legs shoulder-width apart and the arms raised slightly into an "A" pose. The human body image may be an RGB color image; such an image is inexpensive to obtain, since it does not require costly equipment such as a depth camera and can be captured by an ordinary camera. The human body image may also be an RGBD image, which can be used to further improve the reconstruction quality of the three-dimensional human body model.
In addition, the embodiments of the present disclosure do not limit the number of human body images used for reconstruction. For example, one or two human body images may be used to fit the human body parameters of the parameterized human body template.
The parameterized human body template is a human body topology predefined before model reconstruction. The template may carry initialized human body parameters that predefine the pose and body shape of the human body. For example, the human body parameters may include pose parameters and shape parameters, which can represent the motion pose, height and build, head-to-body ratio and other aspects of the human body. When performing three-dimensional human body reconstruction based on the parameterized human body template, the initial human body parameters in the template can be fitted, and the values of these parameters re-determined, to obtain a three-dimensional model of the human body.
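A parameterized human body template of the kind described above can be sketched as a small linear blend-shape model. An SMPL-like structure is assumed here for illustration only; the disclosure does not commit to a particular template, and all sizes are toy values:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy parameterized template: a fixed-topology vertex set plus a basis of
# body-shape blend shapes (hypothetical sizes; real templates have
# thousands of vertices and also pose parameters).
n_verts, n_shape = 10, 4
template_verts = rng.standard_normal((n_verts, 3))        # predefined topology
shape_basis = rng.standard_normal((n_shape, n_verts, 3))  # shape blend shapes

def body_from_params(shape_params):
    # Body geometry = template + linear combination of blend shapes.
    # "Fitting" then means searching for the parameter values that make
    # this geometry best match the human body image.
    return template_verts + np.tensordot(shape_params, shape_basis, axes=1)

verts = body_from_params(np.array([0.1, -0.2, 0.0, 0.3]))
```

With all parameters at zero the template itself is recovered, which is what makes the representation a parameterized template rather than a free-form mesh.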
The model reconstruction in this step performs parameterized human body reconstruction based on the human body image of the target human body, which may be called structured reconstruction. Parameterized human body reconstruction, i.e. geometric human body reconstruction based on a parameterized human body template, computes an optimal set of values for the template's human body parameters so that the pose, body shape and so on of the reconstructed body match those in the human body image as closely as possible. The specific reconstruction process is not detailed here and may follow any parameterized human body reconstruction pipeline familiar to those skilled in the art. After parameterized reconstruction, a fitted parameterized human body model is obtained, which may also be called a three-dimensional mesh model. As a three-dimensional mesh (Mesh) representing the geometry of the human body, it may include a number of vertices and faces. In this embodiment, the three-dimensional Mesh obtained through parameterized human body reconstruction may be called the first three-dimensional mesh model.
Referring to the example of FIG. 2, which takes parameterized reconstruction from two images of the target human body as an example: a frontal human body image 21 and a back human body image 22 of the target human body may be acquired, and parameterized human body reconstruction is performed based on each of them. The human body parameters of the parameterized human body template are fitted based on the frontal human body image 21 to obtain a frontal reconstruction result 23, and based on the back human body image 22 to obtain a back reconstruction result 24. The frontal reconstruction result 23 and the back reconstruction result 24 are then fused to obtain a fused reconstruction result 25. In one example, the fused reconstruction result 25 may be the first three-dimensional mesh model of the target human body.
In addition to the first three-dimensional mesh model obtained through parameterized reconstruction from the human body image of the target human body, the skeleton structure and skin weights of the target human body may also be output.
The first three-dimensional mesh model obtained through parameterized reconstruction in this step is relatively robust, with essentially no abnormal torso or limbs; that is, the torso, limbs and other body shapes in the reconstructed body are kept as consistent as possible with those in the human body image.
In step 102, three-dimensional human body reconstruction is performed based on image features extracted from the human body image of the target human body, and a second three-dimensional mesh model of the target human body is determined.
This step adopts a three-dimensional reconstruction approach different from that of step 100. Specifically, this reconstruction does not use a predefined human body topology but reconstructs directly from the human body image. It may also be called unstructured reconstruction, or reconstruction without a fixed topology.
Referring to FIG. 3, which illustrates a way of obtaining a three-dimensional mesh model from a single human body image: a single human body image 31 of the target human body may be input into a first deep neural network branch 32 for three-dimensional reconstruction. In one exemplary implementation, the first deep neural network branch 32 may include a global feature sub-network 321 and a first fitting sub-network 322.
The global feature sub-network 321 may perform feature extraction on the human body image 31 to obtain high-level image features of the image, which may be called first image features. For example, the global feature sub-network 321 may be an HourGlass convolutional network. The first image features are input into the first fitting sub-network 322, which predicts, based on the first image features, whether each voxel block in three-dimensional space belongs to the interior of the target human body. For example, the first fitting sub-network 322 may be a multi-layer perceptron. The first fitting sub-network 322 outputs a first human body model that already includes the three-dimensional voxel blocks located inside the target human body.
Next, the first human body model may be meshed; for example, the Marching Cubes algorithm may be applied to the first human body model in voxel space to obtain another three-dimensional mesh model of the target human body, which may be called the second three-dimensional mesh model of the target human body.
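The voxel-wise inside/outside prediction that precedes Marching Cubes can be sketched as follows. A sphere stands in for the learned implicit function, and the meshing step itself (e.g. `skimage.measure.marching_cubes`) is omitted:

```python
def occupancy_grid(predict_inside, resolution=16, bound=1.0):
    # Query the implicit-function branch at every voxel centre of a cube
    # [-bound, bound]^3, producing the inside/outside grid that Marching
    # Cubes would then turn into a mesh.
    step = 2 * bound / resolution
    grid = []
    for i in range(resolution):
        for j in range(resolution):
            for k in range(resolution):
                p = (-bound + (i + 0.5) * step,
                     -bound + (j + 0.5) * step,
                     -bound + (k + 0.5) * step)
                grid.append(predict_inside(p))
    return grid

def inside_sphere(p):
    # Stand-in for the learned network: a sphere of radius 0.5.
    return p[0] ** 2 + p[1] ** 2 + p[2] ** 2 < 0.25

grid = occupancy_grid(inside_sphere, resolution=8)
```

In the real pipeline the predictor is the fitting sub-network conditioned on the first image features, and the resulting boolean grid is handed to Marching Cubes to extract the surface.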
In addition, when the second three-dimensional mesh model of the target human body is obtained in this step, the skeleton structure of the target human body may also be obtained, and skinning weights may be determined from this skeleton structure and the second three-dimensional mesh model.
In step 104, the first three-dimensional mesh model and the second three-dimensional mesh model of the target human body are fused to obtain an initial three-dimensional model.
In this step, the first three-dimensional mesh model reconstructed in step 100 and the second three-dimensional mesh model reconstructed in step 102 may be fused; the fused model may be called the initial three-dimensional model, which is itself a three-dimensional mesh model.
For example, during fusion, an upper-body model corresponding to the upper-body region of the target human body may be obtained from the second three-dimensional mesh model. The upper-body region is a region that includes at least the head of the target human body. For example, the upper-body region may be the part of the body above the shoulders, in which case the upper-body model carries the hair shape, i.e., it is comparatively fine-grained.
The frontal human body image of the target human body may first be fed into a pre-trained keypoint detection model, which determines a plurality of keypoints in the upper-body region. After the keypoints are obtained, their coordinates may be used to determine, for each keypoint, the corresponding first model keypoint on the first three-dimensional mesh model of the target human body and the corresponding second model keypoint on the upper-body model. Then, from the coordinates of the first model keypoints, the coordinates of the second model keypoints, the first three-dimensional mesh model, the upper-body model, and the extrinsic camera parameters used when capturing the image of the target human body, a coordinate transformation is computed between the first model keypoints on the first three-dimensional mesh model and the corresponding second model keypoints on the upper-body model. Based on this transformation, the upper-body model replaces the corresponding part of the first three-dimensional mesh model.
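The keypoint-based coordinate transformation above is commonly solved as a least-squares similarity alignment between corresponding 3D keypoints, for which an Umeyama-style closed form exists. The sketch below is illustrative: the keypoint sets are synthetic, and `similarity_transform` is a hypothetical helper name, not an API of this method.

```python
import numpy as np

def similarity_transform(src, dst):
    """Least-squares s, R, t with dst ≈ s * R @ src + t (Umeyama-style)."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:          # guard against reflections
        S[2, 2] = -1
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / src_c.var(0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t

rng = np.random.default_rng(1)
kp_first = rng.normal(size=(8, 3))          # keypoints on the first mesh model
s0, ang = 1.7, 0.4                          # hidden ground-truth transform
R0 = np.array([[np.cos(ang), -np.sin(ang), 0],
               [np.sin(ang),  np.cos(ang), 0],
               [0, 0, 1]])
kp_upper = (s0 * (R0 @ kp_first.T)).T + np.array([0.1, -0.2, 0.3])

s, R, t = similarity_transform(kp_first, kp_upper)
aligned = (s * (R @ kp_first.T)).T + t
print(np.allclose(aligned, kp_upper, atol=1e-6))
```

Once s, R, t are estimated from the model keypoints, the same transform is applied to all vertices of one mesh to bring it into the other's coordinate frame before the replacement step.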
Taking the case where the upper-body region is the region above the shoulders as an example, the fused initial three-dimensional model has the following advantages. On the one hand, the part of the initial three-dimensional model below the shoulders comes from parametric human body reconstruction, which guarantees the robustness of the torso and limbs: the model contains no abnormally shaped torso or limbs, i.e., the body shape is well reconstructed, so the pose and build of the reconstructed body stay as consistent as possible with the human body image. On the other hand, the part above the shoulders is reconstructed without a fixed topology (i.e., no predefined human body topology is used), so it includes shapes such as hair and simulates the target human body more realistically. The resulting initial three-dimensional model thus captures the texture of details such as hair while preserving the shape robustness of the torso, limbs, and other body parts, achieving a good reconstruction result.
In step 106, human body texture reconstruction of the target human body is performed according to the initial three-dimensional model and the human body image, to obtain a textured three-dimensional human body model of the target human body.
For example, when the reconstruction uses both a frontal human body image and a back human body image of the target human body, the body texture may be reconstructed from those two images.
For example, the frontal human body image may represent the frontal texture of the target human body, and the back human body image may represent the back texture. The two images may be mapped onto the initial three-dimensional model of the target human body (the three-dimensional mesh representing the body geometry) to obtain a three-dimensional human body model filled with texture. For parts visible in neither image, i.e., the invisible regions of the body, interpolation may be used to fill in the texture, yielding a three-dimensional human body model of the target human body with color texture. This model may be a three-dimensional mesh model with a colored appearance.
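The interpolation-based fill for invisible regions can be sketched in one dimension: gaps in a per-vertex color sequence (NaN marks vertices seen in neither photo) are filled from the visible neighbours. The data and the choice of linear interpolation are illustrative assumptions; real systems interpolate over the texture atlas or the mesh surface.

```python
import numpy as np

# Toy per-vertex colour channel: NaN marks vertices visible in neither image.
colors = np.array([0.2, np.nan, np.nan, 0.8, np.nan, 0.5])

def fill_invisible(vals):
    """Fill NaNs by linear interpolation between visible neighbours
    (np.interp clamps at the ends), a simple stand-in for the
    interpolation-based texture completion described above."""
    idx = np.arange(len(vals))
    known = ~np.isnan(vals)
    return np.interp(idx, idx[known], vals[known])

filled = fill_invisible(colors)
print(filled)
```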
In the three-dimensional human body reconstruction method of this embodiment, reconstruction based on a parametric human body template is combined with reconstruction performed directly from the human body image without a template. The resulting three-dimensional human body model therefore preserves the robustness of body shapes such as the torso and limbs while improving the realism and accuracy of the upper-body region. Moreover, the method can reconstruct from only a few images of the target human body, which reduces the effort required of the user and makes three-dimensional human body reconstruction more convenient.
In another embodiment, after the first three-dimensional mesh model is reconstructed in step 100, geometric deformation may further be applied to this model so that it exhibits a plausible clothing appearance.
For example, the human body image used for parametric reconstruction may be an RGBD image, from which the depth information of the region occupied by the target human body and the silhouette of the target human body in the image can be extracted. Based on this depth and silhouette information, the model obtained by fitting the human body parameters of the parametric human body template is geometrically deformed to simulate clothing. In this case, the model obtained by fitting the parametric template may be called the initial parametric model; the geometric deformation is then applied to the initial parametric model, yielding the first three-dimensional mesh model.
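One step of such a geometric deformation can be sketched as pulling each vertex of the initial parametric model toward observed depth points. This is a toy under strong assumptions: vertex-to-depth correspondences are taken as known, and the silhouette and smoothness terms a real system would add are omitted.

```python
import numpy as np

rng = np.random.default_rng(6)

verts = rng.normal(size=(40, 3))                               # fitted parametric mesh
depth_pts = verts + rng.normal(scale=0.05, size=verts.shape)   # RGBD surface samples

# One toy deformation step: move each vertex part-way toward its matched
# depth point. Real systems iterate and regularize with silhouette and
# smoothness energies so clothing wrinkles deform the mesh plausibly.
step = 0.5
deformed = verts + step * (depth_pts - verts)

err_before = np.linalg.norm(verts - depth_pts, axis=1).mean()
err_after = np.linalg.norm(deformed - depth_pts, axis=1).mean()
print(err_after < err_before)
```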
Accordingly, when models are fused in step 104, the clothing-deformed first three-dimensional mesh model may be fused with the second three-dimensional mesh model. The resulting initial three-dimensional model then not only has a more realistic clothing appearance but is also more robust in body shape.
In yet another embodiment, the reconstruction that is not based on a parametric human body template can be further improved to raise the geometric accuracy of the reconstructed second three-dimensional mesh model of the target human body. Referring to FIG. 4, a second deep neural network branch 41 is added on top of the network structure of FIG. 3. The second deep neural network branch 41 may include a local feature sub-network 411 and a second fitting sub-network 412. A local image 42 may be extracted from the human body image 31 of the target human body, and the second deep neural network branch 41 performs three-dimensional reconstruction on this local image 42. The human body image 31 may be a frontal human body image, and the local image 42 may be an image containing a local region of the target human body, for example the region above the shoulders.
Specifically, the first deep neural network branch 32 reconstructs the first human body model, while the local image 42 is fed into the second deep neural network branch 41, where the local feature sub-network 411 extracts features from the local image 42 to obtain second image features. The second fitting sub-network 412 then produces a second human body model from the second image features together with intermediate features output by the first fitting sub-network 322. The intermediate features may be the outputs of part of the network structure inside the first fitting sub-network 322. For example, if the first fitting sub-network 322 contains a certain number of fully connected layers, the outputs of some of those layers may be fed into the second fitting sub-network 412 as the intermediate features.
For example, the second deep neural network branch 41 may have essentially the same structure as the first deep neural network branch 32. The global feature sub-network 321 of the first branch may contain four blocks, each with a number of feature-extraction layers such as convolutional and pooling layers, while the local feature sub-network 411 of the second branch may contain at least one such block. After the first and second human body models are obtained, they may be fused into a fused human body model, which is then meshed to obtain the second three-dimensional mesh model of the target human body.
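The two-branch structure with intermediate-feature injection can be sketched with toy MLPs. All dimensions (64-d global feature, 32-d local feature, 128-d intermediate feature) are assumptions for illustration; the point is only the wiring: the second branch consumes its own local feature concatenated with an intermediate activation of the first branch.

```python
import numpy as np

rng = np.random.default_rng(2)
relu = lambda x: np.maximum(x, 0)

# Hypothetical sizes: 64-d global feature, 32-d local feature.
W1, W2 = rng.normal(size=(64, 128)), rng.normal(size=(128, 1))      # branch 1
W3, W4 = rng.normal(size=(32 + 128, 64)), rng.normal(size=(64, 1))  # branch 2

def branch1(global_feat):
    hidden = relu(global_feat @ W1)     # intermediate feature, shared below
    return hidden, hidden @ W2          # occupancy logit from branch 1

def branch2(local_feat, intermediate):
    # The local branch fuses its own feature with branch 1's intermediate
    # feature, mirroring the structure described above.
    fused = np.concatenate([local_feat, intermediate], axis=-1)
    return relu(fused @ W3) @ W4

g = rng.normal(size=(5, 64))    # 5 query points, global image feature each
l = rng.normal(size=(5, 32))    # matching local (e.g. head-region) feature
inter, out1 = branch1(g)
out2 = branch2(l, inter)
print(out1.shape, out2.shape)
```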
As described above, reconstructing the local image through the second deep neural network branch 41 improves the reconstruction quality of the local region of the target human body. Besides the second three-dimensional mesh model, the skeleton structure of the target human body may also be obtained.
Furthermore, to make it easier to drive the model, skinning weights may be computed from the second three-dimensional mesh model and the skeleton structure. For example, the skeleton structure and the second three-dimensional mesh model may be fed into a deep learning network, which automatically produces the model's skinning weights.
For example, attribute features of the vertices of the second three-dimensional mesh model may first be generated from the second three-dimensional mesh model and the skeleton structure. These attribute features may be constructed from the spatial relationship between each vertex of the mesh and the skeleton. For one vertex, the attribute feature may include the following four components: 1) the vertex's position coordinates; 2) the position coordinates of the K skeletal joints nearest to the vertex; 3) the volumetric geodesic distance from the vertex to each of those K joints; 4) for each of the K joints, the angle between the vector from that joint to the vertex and the bone to which the joint belongs.
After the attribute features of all vertices of the second three-dimensional mesh model are obtained, they, together with the adjacency features between the vertices, may be fed into a spatial graph-convolution attention network within the deep learning network. Before that, a multilayer perceptron may convert these features into latent features. From the latent features, the spatial graph-convolution attention network predicts, for each vertex, the weight with which each of the K skeletal joints influences that vertex; a subsequent multilayer perceptron in the deep learning network normalizes these weights so that, for any given vertex, the influence weights of the K joints sum to 1. The final per-vertex weights of influence from the joints are the skinning weights of that vertex.
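Two pieces of the skinning pipeline above can be sketched directly: selecting the K nearest joints per vertex, and normalizing per-joint influence scores so that each vertex's K weights sum to 1. For simplicity the sketch uses Euclidean distances and a softmax over negated distances as a stand-in for the network's predicted scores; the method above uses volumetric geodesic distances and bone-angle features as well.

```python
import numpy as np

rng = np.random.default_rng(3)
joints = rng.normal(size=(12, 3))    # skeleton joint positions
verts = rng.normal(size=(50, 3))     # mesh vertices
K = 4

# Per-vertex attribute: K nearest joints (Euclidean here; the method above
# uses volumetric geodesic distances and bone angles in addition).
d = np.linalg.norm(verts[:, None] - joints[None], axis=-1)   # (50, 12)
nearest = np.argsort(d, axis=1)[:, :K]                       # (50, K) joint ids

# Raw influence scores (stand-in for the network's prediction), normalized
# so the K weights of each vertex sum to 1 (softmax): closer joint, larger weight.
scores = -np.take_along_axis(d, nearest, axis=1)
w = np.exp(scores - scores.max(axis=1, keepdims=True))
w /= w.sum(axis=1, keepdims=True)

print(np.allclose(w.sum(axis=1), 1.0))
```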
With the method of this embodiment, the skeleton structure can be derived from the human body image of the target human body, and the skinning weights are then computed automatically from that skeleton structure and the reconstructed second three-dimensional mesh model. Skinning weights can thus be generated automatically and quickly, making model driving more convenient.
Furthermore, although the second three-dimensional mesh model, reconstructed in the embodiments of the present disclosure directly from the human body image without a parametric human body template, performs well at simulating clothing and similar details, it may still be blurry in the details of certain local parts of the target human body. For example, such a local part may be the face, a region users typically care about most. Therefore, the embodiments of the present disclosure may additionally perform a separate geometric reconstruction of a local part of the target human body.
Specifically, geometric reconstruction of a local part of the target human body may be performed based on the frontal human body image of the target human body, yielding a local three-dimensional mesh model. For example, features are extracted from the frontal image to obtain third image features, and the local three-dimensional mesh model is determined from the third image features together with a three-dimensional topological template of the local part. Specifically, the third image features and the topological template of the local part may be fed into a graph convolutional network, which outputs the local three-dimensional mesh model.
Taking the case where the local part is the face as an example: the single human body image of the target human body may be the frontal image, and face reconstruction may use fine-grained reconstruction with a fixed topology. Because the semantic structure of human faces is consistent, a three-dimensional face with a fixed topology can serve as a template, called the three-dimensional topological template. The template contains multiple vertices, each carrying a facial semantic meaning: one vertex represents the tip of the nose, another a corner of the eye, and so on. During face reconstruction, a deep neural network can regress the positions of the vertices of this face template.
For example, the deep neural network may consist of a deep convolutional network and a graph convolutional network. The frontal human body image of the target human body is fed into the deep convolutional network to extract the third image features. The third image features and the three-dimensional face template are then fed into the graph convolutional network, which outputs a three-dimensional facial mesh as the local three-dimensional mesh model; this model closely matches the face of the target human body. Optionally, the input to the deep convolutional network may instead be an image region containing the face cropped from the human body image.
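A single graph-convolution layer over a fixed-topology template can be sketched as follows. The tiny 6-vertex "face template", its edge list, the 8-d image feature, and the row-normalized propagation rule are all illustrative assumptions; a real network stacks several such layers and regresses 3D positions (or offsets) for every template vertex.

```python
import numpy as np

rng = np.random.default_rng(4)

# Tiny fixed-topology "face template": 6 vertices, edges define adjacency.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (1, 4)]
A = np.eye(6)                                # adjacency with self-loops
for i, j in edges:
    A[i, j] = A[j, i] = 1
A_hat = A / A.sum(axis=1, keepdims=True)     # row-normalized propagation

# One graph-convolution layer: X' = relu(A_hat @ X @ W). Input per vertex is
# its template coordinates with a shared image feature appended.
X = np.concatenate([rng.normal(size=(6, 3)),                          # coords
                    np.tile(rng.normal(size=(1, 8)), (6, 1))], axis=1)  # image feat
W = rng.normal(size=(11, 3))
X_out = np.maximum(A_hat @ X @ W, 0)         # regressed per-vertex 3D output

print(X_out.shape)
```

Because the template topology is fixed, the output mesh keeps the template's semantic vertex assignment (nose tip, eye corner, and so on), which is exactly why the fixed-topology route yields clean facial structure.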
During model fusion, the local three-dimensional mesh model, the first three-dimensional mesh model, and the second three-dimensional mesh model may be fused into the initial three-dimensional model. Referring to the illustration of FIG. 5: using the reconstruction approaches of the preceding embodiments, geometric reconstruction of the local part is performed from the human body image 51 of the target human body, yielding a local three-dimensional mesh model 52; the human body parameters of the parametric human body template are fitted based on the human body image 51, yielding a first three-dimensional mesh model 53; and reconstruction without a predefined human body template is also performed directly from the image 51, yielding a second three-dimensional mesh model 54. As FIG. 5 clearly shows, the local three-dimensional mesh model 52 makes the reconstructed facial shape finer, achieving a very good facial reconstruction result; the first three-dimensional mesh model 53 reconstructs the torso and limbs well but lacks detail such as hair; and the second three-dimensional mesh model 54 carries the hair and has better textured detail.
Those skilled in the art may fuse the three models (the local three-dimensional mesh model 52, the first three-dimensional mesh model 53, and the second three-dimensional mesh model 54) in any familiar way; the embodiments of the present disclosure do not restrict how they are fused. For example, the local model 52 may first be fused with the second model 54, and the result then fused with the first model 53 to obtain the initial three-dimensional model 55. Alternatively, the first model 53 and the second model 54 may be fused first, and the result then fused with the local model 52 to obtain the initial three-dimensional model 55.
In one example, the upper-body model corresponding to the upper-body region of the target human body may be obtained from the second three-dimensional mesh model 54 and used to replace the corresponding part of the first three-dimensional mesh model 53; the local three-dimensional mesh model 52 then replaces the corresponding part of the upper-body model. Both replacements may first transform coordinate systems and then fuse the models on the basis of the transformed coordinates.
As an example, consider replacing the corresponding part of the upper-body model with the three-dimensional face mesh model. The human body image of the target human body may first be fed into a pre-trained keypoint detection model, which determines multiple keypoints of the target human body's face in the image. From the coordinates of these keypoints in the face region of the image, the corresponding second model keypoints on the upper-body model and the corresponding local model keypoints on the face mesh model are determined. Then, from the coordinates of the second model keypoints, the coordinates of the local model keypoints, and the extrinsic camera parameters used when the image of the target human body was captured, a coordinate transformation is computed between the second model keypoints on the upper-body model and the corresponding local model keypoints on the face mesh model. Based on this transformation, the face mesh model is transformed into the coordinate system of the upper-body model and fused with it.
Continuing with FIG. 5, fusing the above three models yields the initial three-dimensional model 55; texture reconstruction is then applied to the initial three-dimensional model 55 to obtain the final three-dimensional human body model 56.
In the three-dimensional human body reconstruction method of this embodiment, geometric reconstruction of a local part of the target human body is performed, and the resulting local three-dimensional mesh model is fused with the first three-dimensional mesh model obtained through parametric reconstruction and the second three-dimensional mesh model reconstructed directly from the human body image without a fixed topology. The final three-dimensional mesh model of the target human body is therefore not only more robust in body shape but also more realistic in clothing texture, with clearer, finer, and more accurate local details. For instance, hair, build, and posture are all reconstructed well, and the facial features are finer and more accurate.
In other embodiments, to further improve the reconstruction, multiple human body images from different angles may be acquired and used jointly for the three-dimensional reconstruction of the target human body. For example, suppose three images of the target human body are acquired from different angles. Each of the three images may be fed into the global feature sub-network, producing one first image feature per image. The three first image features are then fused, and the fused image feature is fed into the first fitting sub-network 322 for further processing.
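The multi-view feature fusion above can be sketched with mean-pooling, one plausible choice; the document does not specify the fusion operator, and max- or attention-pooling are common alternatives. The feature shape is an assumption.

```python
import numpy as np

rng = np.random.default_rng(5)

# Three per-view global features from three camera angles (shape assumed).
feats = [rng.normal(size=(1, 64)) for _ in range(3)]

# Simple fusion: element-wise mean of the per-view features before they are
# passed on to the fitting sub-network.
fused = np.mean(np.stack(feats, axis=0), axis=0)

print(fused.shape)
```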
As described above, acquiring multiple images of the target human body from different angles and using them jointly for three-dimensional human body reconstruction yields a finer three-dimensional human body model of the target human body.
It should also be noted that the neural network models involved in the various process steps of the three-dimensional human body reconstruction method described in any embodiment of the present disclosure may each be trained separately. For example, the first deep neural network branch and the texture generation network may each be trained on their own.
An example of a three-dimensional human body reconstruction pipeline is described below. Processing identical to that described in any of the preceding method embodiments is only summarized here; for details, refer to those embodiments.
In this example, suppose a three-dimensional human body model of a user is to be built from two human body images of that user: a frontal human body image and a back human body image.
First, parametric reconstruction may be performed based on the user's frontal and back human body images.
For example, following the processing flow illustrated in FIG. 2, the human body parameters of the parametric human body template are fitted based on the user's frontal image to obtain a frontal reconstruction result, and based on the back image to obtain a back reconstruction result. The frontal and back reconstruction results are then fused into an initial parametric model. Next, based on the depth information of the target human body region in the frontal image and the body silhouette information, the initial parametric model is geometrically deformed to obtain the first three-dimensional mesh model.
Second, reconstruction that does not rely on the parametric human body template may be performed directly from the user's frontal image.
For example, following the processing flow of FIG. 4, the first deep neural network branch 32 and the second deep neural network branch 41 respectively reconstruct the user's frontal human body image 31 and a local image 42 within it, yielding a first human body model and a second human body model. The two models are fused and then meshed, yielding the user's second three-dimensional mesh model.
In addition, a geometric reconstruction of the face may be performed based on the user's frontal human body image 31. For example, the user's three-dimensional face mesh may be determined from image features extracted from the frontal image together with the three-dimensional topological template of the face.
Next, the upper-body model, for example the model of the region above the shoulders, may be obtained from the second three-dimensional mesh model and used to replace the corresponding part of the first three-dimensional mesh model. The face part of that upper-body model may then be replaced with the three-dimensional face mesh, finally yielding the user's initial three-dimensional model. The initial three-dimensional model may resemble model 55 in FIG. 5: it has a fairly complete body shape, good detail such as hair and clothing, and a finely reconstructed face, achieving a very good three-dimensional reconstruction result.
Finally, on the basis of the initial three-dimensional model, texture mapping is performed from the frontal and back human body images of the target human body; parts visible in neither image are filled in by interpolation, yielding the textured three-dimensional human body model.
FIG. 6 illustrates the structure of a three-dimensional human body reconstruction apparatus. As shown in FIG. 6, the apparatus may include a parametric reconstruction module 61, an image feature reconstruction module 62, a fusion module 63, and a texture reconstruction module 64. The parametric reconstruction module 61 is configured to fit the human body parameters of a parametric human body template based on a human body image of a target human body, to obtain a first three-dimensional mesh model. The image feature reconstruction module 62 is configured to perform three-dimensional human body reconstruction based on image features extracted from the human body image of the target human body, to determine a second three-dimensional mesh model. The fusion module 63 is configured to fuse the first and second three-dimensional mesh models into an initial three-dimensional model representing the geometry of the target human body. The texture reconstruction module 64 is configured to perform human body texture reconstruction according to the initial three-dimensional model and the human body image, to obtain a textured three-dimensional human body model corresponding to the target human body.
In one example, the parametric reconstruction module 61 is specifically configured to: fit the human body parameters of the parametric human body template based on the human body image of the target human body, to obtain an initial parametric model; extract the depth information of the region occupied by the target human body in the image; identify the silhouette of the target human body in the image; and, based on the depth and silhouette information, geometrically deform the initial parametric model to obtain the first three-dimensional mesh model.
In one example, when fitting the human body parameters of the parametric human body template based on the human body image of the target human body, the parametric reconstruction module 61 is configured to: fit the parameters based on the frontal human body image to obtain a frontal reconstruction result; fit the parameters based on the back human body image to obtain a back reconstruction result; and fuse the frontal and back reconstruction results. Here, the human body images of the target human body include a frontal human body image and a back human body image.
In one example, the texture reconstruction module 64 is specifically configured to perform human body texture reconstruction of the target human body according to the initial three-dimensional model and the frontal and back human body images of the target human body, to obtain the textured three-dimensional human body model corresponding to the target human body.
In one example, when fusing the first and second three-dimensional mesh models of the target human body into the initial three-dimensional model, the fusion module 63 is configured to: obtain, from the second three-dimensional mesh model, an upper-body model corresponding to the upper-body region of the target human body, the upper-body region being a region including at least the head of the target human body; and replace the corresponding part of the first three-dimensional mesh model with the upper-body model to obtain the initial three-dimensional model.
In one example, the image feature reconstruction module 62 is specifically configured to: perform three-dimensional reconstruction on the frontal human body image of the target human body through a first deep neural network branch, to obtain a first human body model; perform three-dimensional reconstruction on a local image within the frontal image through a second deep neural network branch, to obtain a second human body model, the local image containing a local region of the target human body; fuse the first and second human body models into a fused human body model; and mesh the fused human body model to obtain the second three-dimensional mesh model of the target human body.
In one example, as shown in FIG. 7, the apparatus may further include a local reconstruction module 65 configured to geometrically reconstruct a local part of the target human body based on the frontal human body image of the target human body, to obtain a local three-dimensional mesh model.
When fusing the first and second three-dimensional mesh models of the target human body into the initial three-dimensional model, the fusion module 63 is configured to fuse the local three-dimensional mesh model, the first three-dimensional mesh model, and the second three-dimensional mesh model into the initial three-dimensional model.
In some embodiments, the above apparatus may be used to perform any of the corresponding methods described above; for brevity, details are not repeated here.
Embodiments of the present disclosure further provide an electronic device, including a memory and a processor, the memory being configured to store computer-readable instructions and the processor being configured to invoke the computer instructions to implement the method of any embodiment of this specification.
Embodiments of the present disclosure further provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of any embodiment of this specification.
Embodiments of the present disclosure further provide a computer program product including a computer program which, when executed by a processor, implements the method of any embodiment of this specification.
Those skilled in the art should understand that one or more embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, one or more embodiments of the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
As used in the embodiments of the present disclosure, "and/or" means at least one of the two; for example, "A and/or B" covers three options: A alone, B alone, and "A and B".
The embodiments in the present disclosure are described in a progressive manner; identical or similar parts of the embodiments may be understood by cross-reference, and each embodiment focuses on what distinguishes it from the others. In particular, the data processing device embodiments are described relatively briefly because they are essentially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
Specific embodiments of the present disclosure have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that of the embodiments and still achieve the desired results. Furthermore, the processes depicted in the drawings do not necessarily require the particular order shown, or sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
Embodiments of the subject matter and the functional operations described in the present disclosure may be implemented in digital electronic circuitry, in tangibly embodied computer software or firmware, in computer hardware including the structures disclosed herein and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in the present disclosure may be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, a data processing apparatus. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, generated to encode and transmit information to a suitable receiver apparatus for execution by a data processing apparatus. A computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
The processes and logic flows described in the present disclosure may be performed by one or more programmable computers executing one or more computer programs, performing the corresponding functions by operating on input data and generating output. The processes and logic flows may also be performed by special-purpose logic circuitry, such as an FPGA (field-programmable gate array) or an ASIC (application-specific integrated circuit), and an apparatus may also be implemented as special-purpose logic circuitry.
Computers suitable for executing a computer program include, for example, general-purpose and/or special-purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit receives instructions and data from a read-only memory and/or a random-access memory. The essential elements of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include one or more mass storage devices for storing data, such as magnetic disks, magneto-optical disks, or optical disks, or will be operatively coupled to such mass storage to receive data from it, transfer data to it, or both. However, a computer need not have such devices. Moreover, a computer may be embedded in another device, such as a mobile phone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name a few.
Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including, for example, semiconductor memory devices (such as EPROM, EEPROM, and flash memory devices), magnetic disks (such as internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory may be supplemented by, or incorporated in, special-purpose logic circuitry.
Although the present disclosure contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what may be claimed, but rather as descriptions of features of specific embodiments of a particular disclosure. Certain features described in the present disclosure in the context of separate embodiments may also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination may in some cases be removed from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.
Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system modules and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the appended claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing may be advantageous.
The above are merely preferred embodiments of one or more embodiments of the present disclosure and are not intended to limit them; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of one or more embodiments of the present disclosure shall fall within their scope of protection.

Claims (20)

  1. A three-dimensional human body reconstruction method, characterized in that the method comprises:
    fitting human body parameters of a parametric human body template based on a human body image of a target human body, to obtain a first three-dimensional mesh model;
    performing three-dimensional human body reconstruction based on image features extracted from the human body image, to determine a second three-dimensional mesh model;
    fusing the first three-dimensional mesh model and the second three-dimensional mesh model to obtain an initial three-dimensional model, the initial three-dimensional model representing a geometry of the target human body;
    performing human body texture reconstruction according to the initial three-dimensional model and the human body image, to obtain a textured three-dimensional human body model corresponding to the target human body.
  2. The method according to claim 1, characterized in that fitting the human body parameters of the parametric human body template based on the human body image of the target human body to obtain the first three-dimensional mesh model comprises:
    fitting the human body parameters of the parametric human body template based on an RGBD-format human body image of the target human body, to obtain an initial parametric model;
    extracting depth information of a region occupied by the target human body in the human body image;
    identifying silhouette information of the target human body in the human body image;
    geometrically deforming the initial parametric model based on the depth information and the silhouette information, to obtain the first three-dimensional mesh model.
  3. The method according to claim 1 or 2, characterized in that fitting the human body parameters of the parametric human body template based on the human body image of the target human body comprises:
    fitting the human body parameters of the parametric human body template based on a frontal human body image of the target human body, to obtain a frontal reconstruction result;
    fitting the human body parameters of the parametric human body template based on a back human body image of the target human body, to obtain a back reconstruction result;
    fusing the frontal reconstruction result and the back reconstruction result.
  4. The method according to any one of claims 1 to 3, characterized in that performing human body texture reconstruction according to the initial three-dimensional model and the human body image to obtain the textured three-dimensional human body model corresponding to the target human body comprises:
    performing human body texture reconstruction according to the initial three-dimensional model, a frontal human body image of the target human body, and a back human body image of the target human body, to obtain the textured three-dimensional human body model corresponding to the target human body.
  5. The method according to claim 4, characterized in that performing human body texture reconstruction according to the initial three-dimensional model, the frontal human body image of the target human body, and the back human body image of the target human body to obtain the textured three-dimensional human body model corresponding to the target human body comprises:
    mapping the frontal human body image and the back human body image onto the initial three-dimensional model, to obtain a three-dimensional human body model of the target human body filled with texture.
  6. The method according to any one of claims 1 to 5, characterized in that fusing the first three-dimensional mesh model and the second three-dimensional mesh model to obtain the initial three-dimensional model comprises:
    obtaining, from the second three-dimensional mesh model, an upper-body model corresponding to an upper-body region, the upper-body region being a region including at least a head of the target human body;
    replacing a corresponding part of the first three-dimensional mesh model with the upper-body model, to obtain the initial three-dimensional model.
  7. The method according to any one of claims 1 to 5, characterized in that the method further comprises:
    performing geometric reconstruction of a local part of the target human body based on a frontal human body image of the target human body, to obtain a local three-dimensional mesh model;
    wherein fusing the first three-dimensional mesh model and the second three-dimensional mesh model to obtain the initial three-dimensional model comprises:
    fusing the local three-dimensional mesh model, the first three-dimensional mesh model, and the second three-dimensional mesh model, to obtain the initial three-dimensional model.
  8. The method according to claim 7, characterized in that fusing the local three-dimensional mesh model, the first three-dimensional mesh model, and the second three-dimensional mesh model to obtain the initial three-dimensional model comprises:
    obtaining, from the second three-dimensional mesh model, an upper-body model corresponding to an upper-body region, the upper-body region being a region including at least a head of the target human body; and obtaining the initial three-dimensional model by replacing a corresponding part of the upper-body model with the local three-dimensional mesh model and replacing a corresponding part of the first three-dimensional mesh model with the upper-body model.
  9. The method according to claim 7 or 8, characterized in that performing geometric reconstruction of the local part of the target human body based on the frontal human body image of the target human body to obtain the local three-dimensional mesh model comprises:
    performing feature extraction on the frontal human body image of the target human body, to obtain third image features;
    determining the local three-dimensional mesh model according to the third image features and a three-dimensional topological template of the local part.
  10. The method according to any one of claims 1 to 9, characterized in that performing three-dimensional human body reconstruction based on the image features extracted from the human body image of the target human body to determine the second three-dimensional mesh model comprises:
    performing three-dimensional reconstruction on a frontal human body image of the target human body through a first deep neural network branch, to obtain a first human body model;
    performing three-dimensional reconstruction on a local image within the frontal human body image through a second deep neural network branch, to obtain a second human body model, wherein the local image includes a local region of the target human body;
    fusing the first human body model and the second human body model, to obtain a fused human body model;
    meshing the fused human body model, to obtain the second three-dimensional mesh model.
  11. A three-dimensional human body reconstruction apparatus, characterized in that the apparatus comprises:
    a parametric reconstruction module, configured to fit human body parameters of a parametric human body template based on a human body image of a target human body, to obtain a first three-dimensional mesh model of the target human body;
    an image feature reconstruction module, configured to perform three-dimensional human body reconstruction based on image features extracted from the human body image of the target human body, to determine a second three-dimensional mesh model of the target human body;
    a fusion module, configured to fuse the first three-dimensional mesh model and the second three-dimensional mesh model of the target human body, to obtain an initial three-dimensional model;
    a texture reconstruction module, configured to perform human body texture reconstruction of the target human body according to the initial three-dimensional model and the human body image, to obtain a textured three-dimensional human body model corresponding to the target human body.
  12. The apparatus according to claim 11, characterized in that the parametric reconstruction module is specifically configured to:
    fit the human body parameters of the parametric human body template based on the human body image of the target human body, to obtain an initial parametric model;
    extract depth information of a region occupied by the target human body in the human body image;
    identify silhouette information of the target human body in the human body image;
    geometrically deform the initial parametric model based on the depth information and the silhouette information, to obtain the first three-dimensional mesh model.
  13. The apparatus according to claim 11 or 12, characterized in that, when fitting the human body parameters of the parametric human body template based on the human body image of the target human body, the parametric reconstruction module is configured to:
    fit the human body parameters of the parametric human body template based on a frontal human body image of the target human body, to obtain a frontal reconstruction result;
    fit the human body parameters of the parametric human body template based on a back human body image of the target human body, to obtain a back reconstruction result;
    fuse the frontal reconstruction result and the back reconstruction result.
  14. The apparatus according to any one of claims 11 to 13, characterized in that the texture reconstruction module is specifically configured to:
    perform human body texture reconstruction according to the initial three-dimensional model and the frontal and back human body images of the target human body, to obtain a textured three-dimensional human body model corresponding to the target human body.
  15. The apparatus according to any one of claims 11 to 14, characterized in that, when fusing the first and second three-dimensional mesh models of the target human body to obtain the initial three-dimensional model, the fusion module is configured to:
    obtain, from the second three-dimensional mesh model, an upper-body model corresponding to an upper-body region of the target human body, the upper-body region being a region including at least a head of the target human body; and replace a corresponding part of the first three-dimensional mesh model with the upper-body model, to obtain the initial three-dimensional model.
  16. The apparatus according to any one of claims 11 to 14, characterized in that:
    the apparatus further comprises a local reconstruction module, configured to perform geometric reconstruction of a local part of the target human body based on a frontal human body image of the target human body, to obtain a local three-dimensional mesh model;
    when fusing the first and second three-dimensional mesh models of the target human body to obtain the initial three-dimensional model, the fusion module is configured to fuse the local three-dimensional mesh model, the first three-dimensional mesh model, and the second three-dimensional mesh model, to obtain the initial three-dimensional model.
  17. The apparatus according to any one of claims 11 to 16, characterized in that:
    the image feature reconstruction module is specifically configured to: perform three-dimensional reconstruction on a frontal human body image of the target human body through a first deep neural network branch, to obtain a first human body model; perform three-dimensional reconstruction on a local image within the frontal human body image through a second deep neural network branch, to obtain a second human body model, wherein the local image includes a local region of the target human body; fuse the first human body model and the second human body model, to obtain a fused human body model; and mesh the fused human body model, to obtain a second three-dimensional mesh model of the target human body.
  18. An electronic device, characterized by comprising a memory and a processor, the memory being configured to store computer-readable instructions, and the processor being configured to invoke the computer instructions to implement the method according to any one of claims 1 to 10.
  19. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 10.
  20. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1 to 10.
PCT/CN2021/115537 2021-03-31 2021-08-31 Three-dimensional human body reconstruction method, apparatus, device and storage medium WO2022205762A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110349813.1A CN112950769A (zh) 2021-03-31 2021-03-31 Three-dimensional human body reconstruction method, apparatus, device and storage medium
CN202110349813.1 2021-03-31

Publications (1)

Publication Number Publication Date
WO2022205762A1 true WO2022205762A1 (zh) 2022-10-06

Family

ID=76231618

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/115537 WO2022205762A1 (zh) 2021-03-31 2021-08-31 Three-dimensional human body reconstruction method, apparatus, device and storage medium

Country Status (2)

Country Link
CN (1) CN112950769A (zh)
WO (1) WO2022205762A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117523152A * 2024-01-04 2024-02-06 广州趣丸网络科技有限公司 Three-dimensional face reconstruction method and apparatus, computer device and storage medium

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112950769A (zh) * 2021-03-31 2021-06-11 深圳市慧鲤科技有限公司 Three-dimensional human body reconstruction method, apparatus, device and storage medium
CN113012282B (zh) * 2021-03-31 2023-05-19 深圳市慧鲤科技有限公司 Three-dimensional human body reconstruction method, apparatus, device and storage medium
WO2023024036A1 (zh) * 2021-08-26 2023-03-02 华为技术有限公司 Method and apparatus for reconstructing a three-dimensional model of a person
CN113989434A (zh) * 2021-10-27 2022-01-28 聚好看科技股份有限公司 Human body three-dimensional reconstruction method and device
CN115223023B (zh) * 2022-09-16 2022-12-20 杭州得闻天下数字文化科技有限公司 Human body contour estimation method and apparatus based on stereo vision and a deep neural network
CN116704097B (zh) * 2023-06-07 2024-03-26 好易购家庭购物有限公司 Digital human figure design method based on human body pose consistency and texture mapping

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084884A (zh) * 2019-04-28 2019-08-02 叠境数字科技(上海)有限公司 Method for reconstructing the facial region of a human body model
CN111508079A (zh) * 2020-04-22 2020-08-07 深圳追一科技有限公司 Virtual clothing try-on method and apparatus, terminal device and storage medium
CN111968169A (zh) * 2020-08-19 2020-11-20 北京拙河科技有限公司 Dynamic human body three-dimensional reconstruction method, apparatus, device and medium
US20200374420A1 (en) * 2019-05-22 2020-11-26 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium
CN112950769A (zh) * 2021-03-31 2021-06-11 深圳市慧鲤科技有限公司 Three-dimensional human body reconstruction method, apparatus, device and storage medium



Also Published As

Publication number Publication date
CN112950769A (zh) 2021-06-11


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21934401

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 22.01.2024)