CN112348957A - Three-dimensional portrait real-time reconstruction and rendering method based on multi-view depth camera

Three-dimensional portrait real-time reconstruction and rendering method based on multi-view depth camera

Info

Publication number
CN112348957A
CN112348957A
Authority
CN
China
Prior art keywords
dimensional
portrait
human body
cameras
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011225534.6A
Other languages
Chinese (zh)
Inventor
徐迪
王凯
毛文涛
孙立
张旭
李臻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Shadow Creator Information Technology Co Ltd
Original Assignee
Shanghai Shadow Creator Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shadow Creator Information Technology Co Ltd filed Critical Shanghai Shadow Creator Information Technology Co Ltd
Priority to CN202011225534.6A priority Critical patent/CN112348957A/en
Publication of CN112348957A publication Critical patent/CN112348957A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20 Finite element generation, e.g. wire-frame surface description, tesselation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a three-dimensional portrait real-time reconstruction and rendering method based on a multi-view depth camera, relating to the field of computer vision. The method comprises: calibrating a plurality of cameras to obtain the relative poses among the cameras; transforming the depth information output by the cameras into a unified three-dimensional coordinate system according to the relative poses to form a point cloud; judging whether each voxel is occupied by the point cloud and, if so, constructing a truncated signed distance function for each occupied voxel according to the relative poses and a pre-created portrait mask; performing a weighted average of the truncated signed distance functions corresponding to the voxels and applying a point cloud meshing algorithm to obtain a three-dimensional human body mesh; performing a weighted average of the color information to obtain a three-dimensional human body mesh carrying color information; and inputting the three-dimensional human body mesh into a pre-constructed adversarial neural network for rendering, so as to obtain the corresponding two-dimensional portrait. Real-time capture of the three-dimensional portrait is thereby realized, and the imaging quality is improved.

Description

Three-dimensional portrait real-time reconstruction and rendering method based on multi-view depth camera
Technical Field
The invention relates to the field of computer vision, in particular to a three-dimensional portrait real-time reconstruction and rendering method based on a multi-view depth camera.
Background
At present, most three-dimensional whole-body portrait volumetric capture systems adopt one of two types of schemes. The first type uses high-definition color cameras: it lacks depth information, relies on traditional feature-point matching with a huge amount of computation, and is therefore difficult to run in real time. The second type uses depth cameras, but fails to address the poor depth measurements that depth cameras produce on black objects and on objects with fine structures, resulting in lower quality.
Therefore, the prior art fails to effectively solve the problems of real-time capture and high-quality rendering of three-dimensional portraits.
Disclosure of Invention
In order to overcome the defects of the prior art, an embodiment of the invention provides a three-dimensional portrait real-time reconstruction and rendering method based on a multi-view depth camera, comprising the following steps:
calibrating a plurality of cameras to obtain the relative poses among the cameras, and constructing a unified three-dimensional coordinate system according to the relative poses;
dividing the three-dimensional coordinate system into a plurality of voxels, and transforming the depth information output by the plurality of cameras into the unified three-dimensional coordinate system to form a point cloud;
judging, for each voxel, whether it is occupied by the point cloud and, if so, constructing a truncated signed distance function (TSDF) for that voxel according to the relative poses and a pre-created portrait mask;
performing a weighted average of the truncated signed distance functions corresponding to the voxels, and obtaining a three-dimensional human body mesh by a point cloud meshing algorithm;
constructing UV maps according to the relative poses, and projecting the color information of each camera onto the three-dimensional human body mesh in the form of texture maps;
for each overlapping region of the UV maps, calculating weights from the viewing angles and distances between the three-dimensional points on the mesh and the cameras, and performing a weighted average of the color information with these weights to obtain a three-dimensional human body mesh carrying color information;
and inputting the three-dimensional human body mesh into a pre-constructed adversarial neural network for rendering, so as to obtain the corresponding two-dimensional portrait.
Preferably, calibrating the plurality of cameras to obtain the relative poses among the cameras includes:
placing a plurality of calibration checkerboards vertically within the visible range of the cameras, detecting the checkerboard corners of each board and refining them to sub-pixel accuracy, and computing the relative poses of the cameras from the corner information.
Preferably, the creation process of the portrait mask includes:
treating the pixels whose depth values fall within a preset range as the human body, and generating a soft-segmented first portrait mask.
Preferably, the creation process of the adversarial neural network includes:
constructing a first generative adversarial network, used for view synthesis, from a generator with a U-Net architecture and a discriminator with a DenseNet-121 architecture;
and constructing a second generative adversarial network, used for image enhancement, from a generator combining a U-Net architecture with super-resolution convolutional layers and a discriminator that is a 5-layer convolutional neural network.
Preferably, after generating the soft-segmented first portrait mask, the method further comprises:
optimizing the first portrait mask with an active contour model algorithm to obtain a second portrait mask.
The three-dimensional portrait real-time reconstruction and rendering method based on a multi-view depth camera provided by the embodiments of the invention has the following beneficial effects:
(1) a visual hull fusion scheme is adopted: a portrait mask and a visual hull are generated from the color and depth information, and regions that depth cameras capture poorly (such as black objects and thin structures) are completed, so that the portrait information becomes complete, real-time capture of the three-dimensional portrait is realized, and the imaging quality is improved;
(2) the three-dimensional portrait is rendered with generative adversarial networks, which reduces noise, fills in missing information, and raises the resolution, further improving the imaging quality.
Detailed Description
The present invention will be described in detail with reference to the following embodiments.
An embodiment of the invention provides a three-dimensional portrait real-time reconstruction and rendering method based on a multi-view depth camera, comprising the following steps.
S101, calibrating the plurality of cameras to obtain the relative poses among the cameras, and constructing a unified three-dimensional coordinate system according to the relative poses.
S102, dividing the three-dimensional coordinate system into a plurality of voxels, and transforming the depth information output by the plurality of cameras into the unified three-dimensional coordinate system to form a point cloud.
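As an illustration of S102, the following is a minimal Python sketch of back-projecting one camera's depth image into the unified coordinate system; the function name, array layout, and the camera-to-world pose convention are assumptions of this sketch, not specified by the patent.

import numpy as np

def depth_to_world_points(depth, K, T_world_cam):
    """Back-project a metric depth image into world coordinates.

    depth: (H, W) float array of metric depths (0 where invalid).
    K: (3, 3) camera intrinsic matrix.
    T_world_cam: (4, 4) camera-to-world pose from the calibration step.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth.ravel()
    valid = z > 0                                   # drop missing depth pixels
    x = (u.ravel() - K[0, 2]) * z / K[0, 0]         # pinhole back-projection
    y = (v.ravel() - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=0)[:, valid]
    return (T_world_cam @ pts_cam)[:3].T            # (N, 3) world-space points

Concatenating the outputs of this function over all calibrated cameras yields the fused point cloud used in the following steps.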
S103, judging, for each voxel, whether it is occupied by the point cloud and, if so, constructing a truncated signed distance function for that voxel according to the relative poses and a pre-created portrait mask.
The truncated signed distance function is defined as sdf(x) = depth(pic(x)) - cam(x), where pic(x) is the projection of the voxel center x onto the depth image, depth(pic(x)) is the measured depth between the camera and the nearest object surface point along the observation ray through x, and cam(x) is the distance between the voxel and the camera along the optical axis; sdf(x) is therefore also a distance along the optical axis.
The two-dimensional portrait masks are aggregated to generate a visual hull, from which the corresponding truncation distances are computed. (A visual hull is the three-dimensional shape obtained by intersecting, in space, the viewing cones of all known two-dimensional silhouettes of an object, here the portrait masks; it can be regarded as a reasonable approximation of the object's three-dimensional shape.)
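The sketch below samples the truncated signed distance function at one voxel center, combining the formula above with a visual-hull test against the portrait mask; the helper name, the truncation distance, and the normalization to [-1, 1] are assumptions of this sketch.

import numpy as np

def tsdf_at_voxel(x_world, depth, mask, K, T_cam_world, trunc=0.03):
    """TSDF sample for one voxel center (hypothetical helper).

    Returns None when the voxel projects outside the image or outside
    the portrait mask (i.e. outside the visual hull); otherwise the
    truncated value of sdf(x) = depth(pic(x)) - cam(x).
    """
    p = T_cam_world @ np.append(x_world, 1.0)        # voxel in camera frame
    cam_x = p[2]                                     # distance along optical axis
    if cam_x <= 0:
        return None                                  # behind the camera
    u = int(round(K[0, 0] * p[0] / p[2] + K[0, 2]))  # pic(x): projection
    v = int(round(K[1, 1] * p[1] / p[2] + K[1, 2]))
    H, W = depth.shape
    if not (0 <= u < W and 0 <= v < H) or not mask[v, u]:
        return None                                  # outside the visual hull
    sdf = depth[v, u] - cam_x                        # measured minus voxel depth
    return max(-1.0, min(1.0, sdf / trunc))          # truncate to [-1, 1]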
S104, performing a weighted average of the truncated signed distance functions corresponding to the voxels, and obtaining a three-dimensional human body mesh by a point cloud meshing algorithm.
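The patent does not name the meshing algorithm; marching cubes is the usual way to extract the zero level set of a fused TSDF volume, so a hedged sketch using scikit-image is given here. The tsdf_volume input, voxel_size, and volume_origin are assumptions standing in for the outputs of the previous steps.

import numpy as np
from skimage import measure

tsdf_volume = np.load("tsdf_volume.npy")     # hypothetical fused TSDF grid (X, Y, Z)
voxel_size, volume_origin = 0.01, np.zeros(3)
# Extract the zero-crossing surface of the TSDF as a triangle mesh.
verts, faces, normals, _ = measure.marching_cubes(tsdf_volume, level=0.0)
verts = verts * voxel_size + volume_origin   # voxel indices -> world coordinates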
S105, constructing UV maps according to the relative poses, and projecting the color information of each camera onto the three-dimensional human body mesh in the form of texture maps.
S106, for each overlapping region of the UV maps, calculating weights from the viewing angles and distances between the three-dimensional points on the mesh and the cameras, and performing a weighted average of the color information with these weights to obtain a three-dimensional human body mesh carrying color information.
As a specific embodiment, the three-dimensional human body mesh carrying color information is generated as follows:
for each region of the map where overlap occurs, the viewing directions v_l and v_r from the corresponding three-dimensional point p toward the two adjacent cameras are computed; their dot products with the unit normal vector n at p give the weights w_l and w_r; the color values of the two cameras at the point p are then averaged with the weights w_l and w_r, yielding the three-dimensional human body mesh with color information.
S107, inputting the three-dimensional human body mesh into the pre-constructed adversarial neural network for rendering, so as to obtain the corresponding two-dimensional portrait.
Optionally, calibrating the plurality of cameras to obtain the relative poses among the cameras includes:
placing a plurality of calibration checkerboards vertically within the visible range of the cameras, detecting the checkerboard corners of each board and refining them to sub-pixel accuracy, and computing the relative poses among the cameras from the corner information.
Optionally, the creation process of the portrait mask includes:
treating the pixels whose depth values fall within a preset range as the human body, and generating a soft-segmented first portrait mask.
Optionally, the creation process of the adversarial neural network includes:
constructing a first generative adversarial network, used for view synthesis, from a generator with a U-Net architecture and a discriminator with a DenseNet-121 architecture;
and constructing a second generative adversarial network, used for image enhancement, from a generator combining a U-Net architecture with super-resolution convolutional layers and a discriminator that is a 5-layer convolutional neural network.
As a specific embodiment of the invention, the inputs of the first generative adversarial network are the human body image captured by each camera, the corresponding skeleton map generated by the depth camera, and the skeleton map corresponding to the new view angle to be rendered; its output is the synthesized human body image at the new view angle. The inputs of the second generative adversarial network are the synthesized human body image obtained from the first network, the two-dimensional image of the three-dimensional human body mesh at the current view angle, the corresponding human body normal map, and a confidence map; its output is a high-quality two-dimensional rendering.
Optionally, after generating the soft-segmented first portrait mask, the method further comprises:
optimizing the first portrait mask with an active contour model algorithm to obtain a second portrait mask.
In summary, the method for real-time reconstruction and rendering of a three-dimensional portrait based on a multi-view depth camera provided by the embodiments of the invention proceeds as follows: a plurality of cameras are calibrated to obtain the relative poses among them, and a unified three-dimensional coordinate system is constructed according to the relative poses; the coordinate system is divided into voxels, and the depth information output by the cameras is transformed into the unified coordinate system to form a point cloud; each voxel is tested for occupancy by the point cloud and, if occupied, a truncated signed distance function is constructed for it according to the relative poses and the pre-created portrait mask; the truncated signed distance functions of the voxels are weight-averaged, and a point cloud meshing algorithm yields a three-dimensional human body mesh; UV maps are constructed according to the relative poses, and the color information of each camera is projected onto the mesh as texture maps; for each overlapping region of the UV maps, weights are computed from the viewing angles and distances between the mesh points and the cameras, and the color information is weight-averaged accordingly to obtain a mesh carrying color information; finally, the mesh is fed into the pre-constructed adversarial neural network for rendering, producing the corresponding two-dimensional portrait. Real-time capture of the three-dimensional portrait is thereby realized, and the imaging quality is improved.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (6)

1. A three-dimensional portrait real-time reconstruction and rendering method based on a multi-view depth camera, characterized by comprising the following steps:
calibrating a plurality of cameras to obtain the relative poses among the cameras, and constructing a unified three-dimensional coordinate system according to the relative poses;
dividing the three-dimensional coordinate system into a plurality of voxels, and transforming the depth information output by the plurality of cameras into the unified three-dimensional coordinate system to form a point cloud;
judging, for each voxel, whether it is occupied by the point cloud and, if so, constructing a truncated signed distance function for that voxel according to the relative poses and a pre-created portrait mask;
performing a weighted average of the truncated signed distance functions corresponding to the voxels, and obtaining a three-dimensional human body mesh by a point cloud meshing algorithm;
constructing UV maps according to the relative poses, and projecting the color information of each camera onto the three-dimensional human body mesh in the form of texture maps;
for each overlapping region of the UV maps, calculating weights from the viewing angles and distances between the three-dimensional points on the mesh and the cameras, and performing a weighted average of the color information with these weights to obtain a three-dimensional human body mesh carrying color information;
and inputting the three-dimensional human body mesh into a pre-constructed adversarial neural network for rendering, so as to obtain the corresponding two-dimensional portrait.
2. The three-dimensional portrait real-time reconstruction and rendering method based on a multi-view depth camera according to claim 1, characterized in that calibrating the plurality of cameras to obtain the relative poses among the cameras comprises:
placing a plurality of calibration checkerboards vertically within the visible range of the cameras, detecting the checkerboard corners of each board and refining them to sub-pixel accuracy, and computing the relative poses of the cameras from the corner information.
3. The three-dimensional portrait real-time reconstruction and rendering method based on a multi-view depth camera according to claim 1, characterized in that the creation process of the portrait mask comprises:
treating the pixels whose depth values fall within a preset range as the human body, and generating a soft-segmented first portrait mask.
4. The three-dimensional portrait real-time reconstruction and rendering method based on a multi-view depth camera according to claim 1, characterized in that the creation process of the adversarial neural network comprises:
constructing a first generative adversarial network, used for view synthesis, from a generator with a U-Net architecture and a discriminator with a DenseNet-121 architecture;
and constructing a second generative adversarial network, used for image enhancement, from a generator combining a U-Net architecture with super-resolution convolutional layers and a discriminator that is a 5-layer convolutional neural network.
5. The three-dimensional portrait real-time reconstruction and rendering method based on a multi-view depth camera according to claim 3, characterized in that after generating the soft-segmented first portrait mask, the method further comprises:
optimizing the first portrait mask with an active contour model algorithm to obtain a second portrait mask.
6. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method according to any one of claims 1 to 5 are implemented when the computer program is executed by the processor.
CN202011225534.6A 2020-11-05 2020-11-05 Three-dimensional portrait real-time reconstruction and rendering method based on multi-view depth camera Pending CN112348957A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011225534.6A CN112348957A (en) 2020-11-05 2020-11-05 Three-dimensional portrait real-time reconstruction and rendering method based on multi-view depth camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011225534.6A CN112348957A (en) 2020-11-05 2020-11-05 Three-dimensional portrait real-time reconstruction and rendering method based on multi-view depth camera

Publications (1)

Publication Number Publication Date
CN112348957A 2021-02-09

Family

ID=74429822

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011225534.6A Pending CN112348957A (en) 2020-11-05 2020-11-05 Three-dimensional portrait real-time reconstruction and rendering method based on multi-view depth camera

Country Status (1)

Country Link
CN (1) CN112348957A (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019196308A1 (en) * 2018-04-09 2019-10-17 平安科技(深圳)有限公司 Device and method for generating face recognition model, and computer-readable storage medium
CN109242954A (en) * 2018-08-16 2019-01-18 叠境数字科技(上海)有限公司 Multi-view angle three-dimensional human body reconstruction method based on template deformation
CN111815757A (en) * 2019-06-29 2020-10-23 浙江大学山东工业技术研究院 Three-dimensional reconstruction method for large component based on image sequence

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
席小霞; 宋文爱; 邱子璇; 史磊: "Research on a 3D Image Reconstruction System Based on RGB-D Values" (基于RGB-D值的三维图像重建系统研究), Journal of Test and Measurement Technology (测试技术学报), no. 05, 30 October 2015 (2015-10-30) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112991458A (en) * 2021-03-09 2021-06-18 武汉大学 Rapid three-dimensional modeling method and system based on voxels
CN112991458B (en) * 2021-03-09 2023-02-24 武汉大学 Rapid three-dimensional modeling method and system based on voxels
WO2024001961A1 (en) * 2022-06-29 2024-01-04 先临三维科技股份有限公司 Scanned image rendering method and apparatus, electronic device and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination