CN113223132A - Indoor scene virtual roaming method based on reflection decomposition - Google Patents

Indoor scene virtual roaming method based on reflection decomposition

Info

Publication number
CN113223132A
CN113223132A (application CN202110429676.2A)
Authority
CN
China
Prior art keywords: picture, reflection, pixel, depth, plane
Prior art date
Legal status
Granted
Application number
CN202110429676.2A
Other languages
Chinese (zh)
Other versions
CN113223132B (en)
Inventor
许威威
许佳敏
吴秀超
朱紫涵
鲍虎军
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202110429676.2A priority Critical patent/CN113223132B/en
Publication of CN113223132A publication Critical patent/CN113223132A/en
Application granted granted Critical
Publication of CN113223132B publication Critical patent/CN113223132B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T15/005 General purpose rendering architectures (3D [Three Dimensional] image rendering)
    • G06N3/045 Combinations of networks (neural network architectures)
    • G06T17/20 Finite element generation, e.g. wire-frame surface description, tesselation (3D modelling)
    • G06T3/4046 Scaling of whole images or parts thereof using neural networks
    • G06T3/4053 Scaling of whole images or parts thereof based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06T2200/08 Indexing scheme involving all processing steps from image acquisition to 3D model generation


Abstract

The invention discloses an indoor scene virtual roaming method based on reflection decomposition. First, a rough global triangular mesh model obtained by three-dimensional reconstruction is projected to form an initial depth map for each picture; the depth edges are aligned to the color edges, and the aligned depth map is converted into a simplified triangular mesh. Planes are then detected in the global triangular mesh model; if a plane is a reflection plane, a double-layer expression is constructed over the reflection area of every picture in which the plane is visible, so that the reflection effect on the object surface can be rendered correctly. Finally, given a virtual view, the virtual-view picture is drawn from the neighborhood pictures and their triangular meshes, and the reflection area is drawn from the foreground/background pictures and the foreground/background triangular meshes. The method supports virtual roaming with a large degree of freedom in large indoor scenes containing reflection effects while keeping storage requirements small. It offers good rendering quality, a large roaming freedom, the ability to draw effects such as partial reflections and highlights, and robust results.

Description

Indoor scene virtual roaming method based on reflection decomposition
Technical Field
The invention relates to the technical field of picture-based rendering and virtual viewpoint synthesis, in particular to a method for performing indoor scene virtual roaming by combining a picture-based rendering technology with reflection decomposition.
Background
The purpose of indoor scene virtual roaming is to build a system that, given the intrinsic and extrinsic parameters of a virtual camera, outputs a rendered picture from the virtual viewpoint. Existing mature virtual roaming applications are mainly based on a series of panoramic pictures, performing purely rotational roaming centered on each panorama; most systems handle movement between panoramas by simple interpolation, which produces large visual errors. For virtual roaming with large degrees of freedom, many methods can perform object-level observation or viewpoint-movement observation of part of a scene, including explicitly acquiring the light field around the target object with a light-field camera, see Gortler, Steven J., et al., "The Lumigraph," Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, 1996, or using photographs from ordinary cameras to express and interpolate the scene with a neural network, see Mildenhall, Ben, et al., "NeRF: Representing scenes as neural radiance fields for view synthesis," Proceedings of the European Conference on Computer Vision, 2020. For larger indoor scenes, the latest methods can render relatively free viewpoints, but the rendering quality is not good enough, see Riegler, Gernot, and Vladlen Koltun, "Free View Synthesis," Proceedings of the European Conference on Computer Vision, 2020. In particular, for the various types of reflection present in large indoor scenes (floors, tables, mirrors, etc.), there is still no system that handles indoor roaming with such complex materials well.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an indoor scene virtual roaming method based on reflection decomposition, which enables virtual roaming with a large degree of freedom in large indoor scenes with reflection effects while keeping the storage requirement small.
In order to achieve the purpose, the invention adopts the following technical scheme: an indoor scene virtual roaming method based on reflection decomposition comprises the following steps:
s1: shooting a picture of a scene which is sufficiently covered in a target indoor scene, and performing three-dimensional reconstruction on the indoor scene based on the shot picture to obtain a rough global triangular mesh model of the indoor scene and the inside and outside parameters of the camera;
s2: for each picture, projecting the global triangular mesh model into a corresponding depth map, aligning the depth edge to a color edge, converting the aligned depth map into a triangular mesh, and carrying out mesh simplification on the triangular mesh;
s3: detecting a plane in the global triangular mesh model, detecting whether the plane is a reflection plane or not by utilizing the color consistency between adjacent images, and if so, constructing a double-layer expression on a reflection area for each picture which can see the reflection plane for correctly rendering the reflection effect of the surface of an object;
the double-layer expression comprises a foreground-background double-layer triangular mesh and two decomposed images of a foreground and a background, the foreground triangular mesh is used for expressing the surface geometry of an object, the background triangular mesh is used for expressing the mirror image of the scene geometry on a reflecting plane, the foreground image is used for expressing the surface texture of the object after the reflection component is removed, and the background image is used for expressing the reflection component of the scene on the surface of the object;
s4: and giving a virtual visual angle, drawing the virtual visual angle picture by using the neighborhood picture and the triangular mesh, and drawing the reflection area by using the foreground background picture and the foreground background triangular mesh.
Further, in S2, aligning the depth edge of the depth map to the color edge of the original picture, and acquiring the aligned depth map, specifically:
firstly, a normal map corresponding to the depth map is calculated; then, for each pixel i in the depth map, its depth value d_i is converted into a three-dimensional point v_i in the camera's local coordinate system using the camera intrinsics, and the planar distance between adjacent pixels i, j is computed as dt_ij = max(|(v_i − v_j)·n_i|, |(v_i − v_j)·n_j|), where n_i, n_j are the normal vectors at points i and j; if dt_ij is greater than λ·max(1, min(d_i, d_j)), the pixel is recorded as a depth edge pixel, where λ is an edge detection threshold;
for each picture, after all depth edge pixels are obtained, calculating the local two-dimensional gradient of the depth edge by utilizing Sobel convolution, and then traversing one pixel by one pixel along the direction of the edge two-dimensional gradient and the opposite direction thereof by taking each depth edge pixel as a starting point until one of two sides traverses to a color edge pixel; after traversing to the color edge pixel, deleting the depth values of all pixels of the intermediate path from the starting point pixel to the color edge pixel; and defining the pixel of the deleted depth value as a non-aligned pixel, defining the pixel of the non-deleted depth value as an aligned pixel, and carrying out interpolation filling by using the peripheral non-deleted depth value for each deleted depth value.
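For illustration, the edge test above can be sketched as follows. This is a minimal NumPy sketch and not the patented implementation: the normals are assumed to be computed elsewhere from the depth map, only right/down neighbours are checked, and the default λ = 0.01 is the value quoted later in the embodiment.

```python
import numpy as np

def unproject(depth, K):
    """Back-project a depth map (H, W) to camera-space points (H, W, 3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    return np.stack([x, y, depth], axis=-1)

def depth_edge_mask(depth, normals, K, lam=0.01):
    """Mark pixels whose plane distance to a 4-neighbour exceeds lam*max(1, min depth)."""
    pts = unproject(depth, K)
    edge = np.zeros(depth.shape, dtype=bool)
    for dy, dx in [(0, 1), (1, 0)]:                 # right and down neighbours
        h, w = depth.shape
        p_i, p_j = pts[:h - dy, :w - dx], pts[dy:, dx:]
        n_i, n_j = normals[:h - dy, :w - dx], normals[dy:, dx:]
        diff = p_i - p_j
        dt = np.maximum(np.abs((diff * n_i).sum(-1)), np.abs((diff * n_j).sum(-1)))
        d_i, d_j = depth[:h - dy, :w - dx], depth[dy:, dx:]
        hit = dt > lam * np.maximum(1.0, np.minimum(d_i, d_j))
        edge[:h - dy, :w - dx] |= hit
        edge[dy:, dx:] |= hit
    return edge
```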
Further, for each deleted depth value, interpolation filling is performed using the surrounding undeleted depth values, specifically: for each unaligned pixel p_i to be interpolated, its geodesic distance d_g(p_i, p_j) to all other aligned pixels is computed, the m nearest aligned pixels are found by geodesic distance, and the interpolated depth value is computed as

d_i = Σ_{j∈N(i)} w_g(i, j)·d_{i→j} / Σ_{j∈N(i)} w_g(i, j)

where N(i) denotes the set of nearest aligned neighbor pixels of p_i, w_g(i, j) = exp(−d_g(p_i, p_j)), and d_{i→j} denotes the depth obtained by projecting pixel p_i onto the local plane of p_j, whose plane equation is computed from v_j and n_j.
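A small sketch of the geodesic-weighted filling may help. It assumes a pinhole camera with intrinsics K; the geodesic nearest-neighbour search itself is not shown, and its result is assumed to be supplied as a list of neighbour tuples.

```python
import numpy as np

def plane_depth(pixel_uv, K, v_j, n_j):
    """Depth at which the ray through pixel_uv meets the local plane (v_j, n_j)."""
    u, v = pixel_uv
    ray = np.array([(u - K[0, 2]) / K[0, 0], (v - K[1, 2]) / K[1, 1], 1.0])
    denom = float(n_j @ ray)
    if abs(denom) < 1e-8:
        return None
    return float(n_j @ v_j) / denom          # z of the intersection, since ray_z == 1

def interpolate_depth(pixel_uv, K, neighbours):
    """Geodesic-weighted interpolation over the m nearest aligned neighbours.

    `neighbours` is a list of (geodesic_distance, v_j, n_j) tuples, assumed to be
    produced elsewhere by a geodesic nearest-neighbour search.
    """
    num, den = 0.0, 0.0
    for d_g, v_j, n_j in neighbours:
        d_proj = plane_depth(pixel_uv, K, v_j, n_j)
        if d_proj is None:
            continue
        w = np.exp(-d_g)                     # w_g(i, j) = exp(-d_g(p_i, p_j))
        num += w * d_proj
        den += w
    return num / den if den > 0 else None
```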
Further, in S3, detecting a plane and a reflection plane in the global triangular mesh model specifically includes:
detecting planes in the global triangular mesh model, keeping the planes whose area is larger than an area threshold, projecting each such plane onto the pictures in which it is visible, and denoting the set of pictures in which the plane is visible as V_P;

for each picture I_k in V_P, its K-nearest-neighbor picture set N_k is computed; the K nearest neighbors are determined by the overlap rate of the vertices of the global triangular mesh model after mirroring about the plane;

using N_k, a matching cost volume is constructed, and whether the plane has a sufficient reflection component in picture I_k is judged as follows: for each pixel, after mirroring the global triangular mesh model according to the plane equation, the cost corresponding to the mirrored depth value is looked up in the matching cost volume and tested for being a local minimum point; if the number of pixels at which the cost is a local minimum is greater than a pixel-count threshold, the plane is considered to have a reflection component on that picture; and if the number of visible pictures in which a plane has a reflection component is greater than a picture-count threshold, the plane is considered a reflection plane.
Further, in S3, for each reflection plane, its two-dimensional reflection area β_k on each visible picture is calculated, specifically: the reflection plane is projected onto the visible picture to obtain a projection depth map; a dilation operation is applied to the projection depth map; the dilated projection depth map is compared with the aligned depth map to obtain an accurate two-dimensional reflection area; each pixel with a depth value in the projection depth map is screened using the three-dimensional point distance and the normal angle, and the screened pixel area is taken as the reflection area β_k of the reflection plane on the picture.
Further, in S3, for each picture that can see the reflection plane, a double-layer expression is constructed on the reflection area, specifically:
the projection depth map is taken as the initial foreground depth map; the camera intrinsic and extrinsic parameters of the picture are mirrored about the plane equation to form a virtual camera, and the initial background depth map is rendered in that virtual camera using the global triangular mesh model; the initial foreground and background depth maps are then converted into two simplified triangular meshes M_k^f and M_k^b;

the two decomposed layers, the foreground picture F_k and the background picture B_k, are computed with an iterative optimization algorithm that also further optimizes M_k^f and M_k^b; before optimization, all related original pictures are inverse-gamma-corrected in advance for the subsequent decomposition;

the goal of the optimization is to minimize an energy function of the form

E = Σ_u [ E_d(u) + λ_s·E_s(u) + λ_p·E_p(u) ]

where the optimization variables include a rigid-body transformation (R_k, t_k) of the reflection-layer (background) triangular mesh, initialized to the identity matrix and 0 respectively, and the meshes M_k^f and M_k^b, for which only the three-dimensional vertex positions are optimized without changing the topology; E_d, E_s, E_p are respectively a data term, a smoothing term and a prior term, λ_s and λ_p are the weights of the corresponding terms, and u ranges over the pixels of the reflection area β_k; the data term measures how well the sum of the foreground picture F_k and the warped background picture B_k reproduces the inverse-gamma-corrected input pictures across the neighbor views; the smoothing term uses a Laplacian matrix H; the warping function ω^{-1} returns the two-dimensional coordinates obtained by projecting a point u of image I_k′ into image I_k according to the depth values and the camera intrinsic and extrinsic parameters, the depth map being obtained by projecting M_k^b; V denotes a vertex of M_k^b;

to minimize the energy function, an alternating optimization scheme is used: in each round, the meshes and the rigid transformation are first fixed and the pictures F_k and B_k are optimized, an initial value being computed first and the optimization performed with a nonlinear conjugate gradient method; then F_k and B_k are fixed and M_k^f, M_k^b and (R_k, t_k) are optimized, again with the conjugate gradient method; one round consists of one such alternation, and two rounds of optimization are carried out in total; after the first round, the consistency constraint of the foreground pictures across multiple views is used to denoise F_k: from the F_k and B_k obtained in the first round, denoised images F̃_k and B̃_k are computed using the multi-view consistency of the foreground, and F̃_k and B̃_k are used in place of F_k and B_k to continue the second round of optimization; furthermore, a prior term with weight λ_g is added to the total energy equation in the second round to constrain the second round of optimization;

after the two rounds of optimization, M_k^b is transformed by (R_k, t_k) to obtain the final two-layer simplified triangular meshes M_k^f and M_k^b, which, together with the decomposed pictures F_k and B_k, are used for correctly rendering the reflection effect of the object surface.
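Since the per-term formulas are given in the original only as figures, the following sketch shows only the alternating two-round schedule described above. The three callables (`solve_pictures`, `solve_geometry`, `denoise_by_consistency`) are hypothetical stand-ins supplied by the caller for the nonlinear conjugate-gradient solves and the multi-view consistency filter, which are not reproduced here.

```python
def decompose_reflection_layer(pictures, fg_mesh, bg_mesh,
                               solve_pictures, solve_geometry,
                               denoise_by_consistency, rounds=2):
    """Alternating two-round schedule for the reflection decomposition.

    Expected callable signatures (assumed for this sketch):
      solve_pictures(pictures, fg_mesh, bg_mesh, prior)  -> (F, B)
      solve_geometry(pictures, F, B, fg_mesh, bg_mesh)   -> (fg_mesh, bg_mesh)
      denoise_by_consistency(F, B, fg_mesh, bg_mesh)     -> denoised (F, B) prior
    """
    F, B, prior = None, None, None
    for r in range(rounds):
        # Step 1: fix the geometry, solve the decomposed foreground/background pictures.
        F, B = solve_pictures(pictures, fg_mesh, bg_mesh, prior)
        # Step 2: fix the pictures, refine mesh vertices and the rigid transform (R, t).
        fg_mesh, bg_mesh = solve_geometry(pictures, F, B, fg_mesh, bg_mesh)
        if r == 0:
            # After round 1: denoise with multi-view consistency; the result replaces
            # F/B for round 2 and enters the extra prior term weighted by lambda_g.
            prior = denoise_by_consistency(F, B, fg_mesh, bg_mesh)
    return F, B, fg_mesh, bg_mesh
```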
Further, in S4, a neighborhood picture set is computed according to the intrinsic and extrinsic parameters of the virtual camera: the local coordinate system of the current virtual camera is divided into 8 octants by the coordinate-axis planes, and within each octant a series of neighborhood pictures is further selected; using the angle between the picture optical-center direction and the virtual-camera optical-center direction, and the distance ‖t_k − t_n‖ between the picture optical center t_k and the virtual-camera optical center t_n, each octant is subdivided into several regions; then, in each region, the one picture with the smallest similarity d_k is added to the neighborhood picture set, where d_k combines the optical-center angle with the optical-center distance weighted by a distance weight λ;
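The similarity used for picking one picture per region is given in the original as a figure; the sketch below assumes the plausible form d_k = ∠(o_k, o_n) + λ·‖t_k − t_n‖ built from the quantities named in the text, with the angle taken in degrees and λ = 0.1 as in the embodiment. Both the exact form and the angle unit are assumptions, not quotations.

```python
import numpy as np

def view_similarity(pic_dir, pic_center, cam_dir, cam_center, lam=0.1):
    """Assumed similarity d_k between a stored picture and the virtual camera:
    optical-center direction angle (degrees) plus lam times the center distance."""
    cosang = np.clip(np.dot(pic_dir, cam_dir) /
                     (np.linalg.norm(pic_dir) * np.linalg.norm(cam_dir)), -1.0, 1.0)
    angle = np.degrees(np.arccos(cosang))
    return angle + lam * np.linalg.norm(np.asarray(pic_center) - np.asarray(cam_center))
```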
after the neighborhood picture set is obtained, drawing each picture in the neighborhood picture set to a virtual viewpoint according to the corresponding simplified triangular mesh, specifically:
a) a robust depth map is computed: for each pixel in the patch-rendering pass, the rendering cost c(t_k, t_n, x) of every candidate point is calculated:
c(t_k,t_n,x)=∠(t_k−x,t_n−x)*π/180+max(0,1−‖t_n−x‖/‖t_k−x‖)
where t_k and t_n are the three-dimensional optical-center coordinates of the picture and of the virtual camera, and x is the three-dimensional point corresponding to the pixel; each pixel receives a series of rendered triangular patches, and the points here are the intersections of the pixel ray with those patches; if the rendering cost of a point is larger than the minimum rendering cost over all points in the pixel plus a range threshold λ, that point does not participate in the depth-map computation; the depths of all points that do participate are compared and the minimum is taken as the depth value of the pixel;
b) the depth map of the virtual camera is computed, and each picture is added to its triangular mesh as a texture map for drawing; for each pixel of the virtual camera picture, the colors of the points near the depth value are mixed with weights w_k to obtain the final rendered color.
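The rendering cost itself is stated explicitly above; a small sketch of that cost and of the per-pixel robust depth selection might look as follows (the range threshold λ = 0.17 is the value quoted later in the embodiment).

```python
import numpy as np

def render_cost(t_k, t_n, x):
    """c(t_k, t_n, x): angle (radians) between the rays picture->point and virtual
    camera->point, plus a penalty when the virtual camera is farther from the point."""
    a = np.asarray(t_k, dtype=float) - np.asarray(x, dtype=float)
    b = np.asarray(t_n, dtype=float) - np.asarray(x, dtype=float)
    cosang = np.clip(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)), -1.0, 1.0)
    return np.arccos(cosang) + max(0.0, 1.0 - np.linalg.norm(b) / np.linalg.norm(a))

def robust_depth(candidates, t_k, t_n, lam=0.17):
    """Pick the pixel depth among candidate surface points (depth, x) whose cost is
    within lam of the per-pixel minimum cost."""
    costs = [render_cost(t_k, t_n, x) for _, x in candidates]
    c_min = min(costs)
    return min(d for (d, _), c in zip(candidates, costs) if c <= c_min + lam)
```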
Further, in S4, the reflection areas β_k of the neighborhood pictures are also drawn to the current virtual viewpoint to obtain the reflection area β_n of the current virtual viewpoint; for pixels inside the reflection area, the two foreground/background pictures and the simplified two-layer triangular meshes are used for drawing, and the depth-map computation and color mixing are carried out separately for the two layers; because F_k and B_k were obtained by decomposition after inverse gamma correction, the two blended layer pictures are added in the rendering stage and a single gamma correction is then applied to obtain the correct picture with the reflection effect.
Further, in S4, in order to reduce the storage size, all pictures are downsampled to 1/n of the original resolution for storage, n ≥ 1, and the virtual window is set to the original size during rendering.
Further, a super-resolution neural network is trained to compensate for the loss of sharpness caused by storing downsampled pictures and to reduce possible drawing errors, specifically:
after the depth picture and the color picture have been rendered for each new virtual view, a deep neural network is used to reduce rendering errors and improve sharpness; the network takes the color picture and depth picture of the current frame plus the color picture and depth picture of the previous frame as input; first, a three-layer convolutional network extracts features from the current-frame and previous-frame depth/color pictures separately; the previous-frame features are then warp-mapped to the current frame, the initial correspondence being computed from the depth maps; an alignment module further fits a local two-dimensional offset to better align the features of the two frames; the aligned features of the two frames are concatenated and fed into a super-resolution module implemented as a U-Net convolutional neural network, which outputs a high-definition picture of the current frame.
The invention has the beneficial effects that:
1. a complete pipeline is constructed that can process a large amount of captured data and achieves virtual viewpoint roaming with a large degree of freedom for large-scale indoor scenes;
2. reflective surfaces in the indoor scene and the reflection areas in the pictures are detected, and a double-layer expression is constructed over the reflection areas, so that reflection effects can be rendered better during indoor scene roaming and the rendering realism is greatly improved;
3. by appending a dedicated super-resolution neural network, the method reduces rendering errors and lowers the picture resolution required to support roaming in a single scene, thereby reducing storage and memory consumption.
Drawings
Fig. 1 is a flowchart of an indoor scene virtual roaming method based on reflection decomposition according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a global triangular mesh model according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a two-layer expression construction result of a reflection region according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a rendering result of a virtual viewpoint with reflection according to an embodiment of the present invention;
FIG. 5 is a comparison graph of the results of whether to use a super-resolution neural network according to an embodiment of the present invention;
fig. 6 is a diagram of a super-resolution neural network structure according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following drawings and specific embodiments, it being understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, an embodiment of the present invention provides a method for indoor scene virtual roaming based on reflection decomposition, where the method includes the following steps:
(1) Pictures that sufficiently cover the scene are captured in the target indoor scene, and the indoor scene is three-dimensionally reconstructed from the captured pictures to obtain a rough global triangular mesh model of the indoor scene and the camera intrinsic and extrinsic parameters, as shown in Fig. 2.
Specifically, three-dimensional reconstruction software such as COLMAP or RealityCapture can be used to obtain the camera intrinsic and extrinsic parameters and the global triangular mesh model.
(2) And for each picture, projecting the global triangular mesh model into a corresponding depth map, aligning the depth edge to the color edge, converting the aligned depth map into a triangular mesh, and carrying out mesh simplification on the triangular mesh.
Specifically, since the global triangular mesh model includes some errors, the depth edge of the projected depth map is aligned to the color edge of the original picture, and the aligned depth map is obtained, which specifically includes the following steps:
firstly, a normal map corresponding to the depth map is calculated; then, for each pixel i in the depth map, its depth value d_i is converted into a three-dimensional point v_i in the camera's local coordinate system using the camera intrinsics, and the planar distance between adjacent pixels i, j is computed as dt_ij = max(|(v_i − v_j)·n_i|, |(v_i − v_j)·n_j|), where n_i, n_j are the normal vectors at points i and j; if dt_ij is greater than λ·max(1, min(d_i, d_j)), the pixel is regarded as a depth edge pixel, where λ is an edge detection threshold (λ = 0.01 in this embodiment).
For each picture, after all depth edge pixels have been obtained, the local two-dimensional gradient of the depth edge is calculated using a Sobel convolution; then, starting from each depth edge pixel, the image is traversed pixel by pixel along the direction of the edge's two-dimensional gradient and along its opposite direction until one of the two sides reaches a color edge pixel, the color edge pixels being obtained with the Canny edge extraction algorithm; once a color edge pixel is reached, the depth values of all pixels on the intermediate path from the starting pixel to the color edge pixel are deleted; pixels whose depth value has been deleted are defined as unaligned pixels and pixels whose depth value has not been deleted are defined as aligned pixels, and each deleted depth value is filled by interpolation from the surrounding undeleted depth values; specifically, for each unaligned pixel p_i to be interpolated, its geodesic distance d_g(p_i, p_j) to all other aligned pixels is computed, see Revaud, Jerome, et al., "EpicFlow: Edge-preserving interpolation of correspondences for optical flow," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, the m nearest aligned pixels (m = 4 in this embodiment) are found by geodesic distance, and the interpolated depth value is computed as

d_i = Σ_{j∈N(i)} w_g(i, j)·d_{i→j} / Σ_{j∈N(i)} w_g(i, j)

where N(i) denotes the set of nearest aligned neighbor pixels of p_i, w_g(i, j) = exp(−d_g(p_i, p_j)), and d_{i→j} denotes the depth obtained by projecting pixel p_i onto the local plane of p_j, whose plane equation is computed from v_j and n_j.
Specifically, after the depth map has been aligned, the aligned depth map is converted into a triangular mesh: the depth values are converted into three-dimensional coordinates, all horizontal and vertical edges plus one diagonal edge per pixel quad are connected, and an edge is disconnected whenever it crosses a depth edge from the previous step, which yields the triangular mesh.
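A simplified sketch of this depth-to-mesh conversion follows; for brevity it drops whole triangles that touch a depth-edge pixel rather than disconnecting individual edges, and the subsequent mesh simplification is not shown.

```python
import numpy as np

def depth_to_mesh(depth, K, edge_mask):
    """Triangulate an aligned depth map: two triangles per 2x2 pixel block, dropped
    when any used corner is invalid or lies on a depth edge. Returns (verts, faces)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    verts = np.stack([(u - K[0, 2]) * depth / K[0, 0],
                      (v - K[1, 2]) * depth / K[1, 1],
                      depth], axis=-1).reshape(-1, 3)
    idx = np.arange(h * w).reshape(h, w)
    valid = np.isfinite(depth) & ~edge_mask
    faces = []
    for y in range(h - 1):
        for x in range(w - 1):
            a, b, c, d = idx[y, x], idx[y, x + 1], idx[y + 1, x], idx[y + 1, x + 1]
            if valid[y, x] and valid[y, x + 1] and valid[y + 1, x]:
                faces.append((a, b, c))          # upper-left triangle
            if valid[y, x + 1] and valid[y + 1, x] and valid[y + 1, x + 1]:
                faces.append((b, d, c))          # lower-right triangle
    return verts, np.array(faces, dtype=np.int64)
```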
Specifically, a mesh simplification algorithm is invoked to simplify the generated triangular mesh, see Garland, Michael, and Paul S. Heckbert, "Surface simplification using quadric error metrics," Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques, 1997.
(3) Detecting a plane in the global triangular mesh model, detecting whether the plane is a reflection plane or not by utilizing the color consistency between adjacent images, and if so, constructing a double-layer expression on a reflection area for each picture which can see the reflection plane for correctly rendering the reflection effect of the surface of the object. Fig. 3 is a schematic diagram of a reflection region double-layer expression construction result provided in the embodiment of the present invention.
The double-layer expression comprises a foreground-background double-layer triangular mesh and two decomposed images of a foreground and a background, the foreground triangular mesh is used for expressing the surface geometry of an object, the background triangular mesh is used for expressing the mirror image of the scene geometry on a reflecting plane, the foreground image is used for expressing the surface texture of the object after reflection components are removed, and the background image is used for expressing the reflection components of the scene on the surface of the object.
Specifically, first, planes in the global triangular mesh model are detected, and the planes whose area is larger than an area threshold (0.09 m² in this embodiment) are kept; each such plane is projected onto the pictures in which it is visible, and the set of pictures in which the plane is visible is denoted V_P. For each picture I_k in V_P, its K-nearest-neighbor picture set N_k is computed (K = 6 in this embodiment); the K nearest neighbors are obtained by ranking the overlap rate of the vertices of the global triangular mesh model after mirroring about the plane, and N_k includes picture I_k itself, whose overlap rate is necessarily the highest. Then, using N_k, a matching cost volume is constructed, see Sinha, Sudipta N., et al., "Image-based rendering for scenes with reflections," ACM Transactions on Graphics (TOG) 31.4 (2012): 1-10, and whether the plane has a sufficient reflection component in picture I_k is judged as follows: for each pixel, after mirroring the global triangular mesh model according to the plane equation, the cost corresponding to the mirrored depth value is looked up in the matching cost volume and tested for being a local minimum point; if the number of pixels at which the cost is a local minimum is greater than a pixel-count threshold (50 in this embodiment), the plane is considered to have a reflection component on that picture; and if the number of visible pictures in which a plane has a reflection component is greater than a picture-count threshold (5 in this embodiment), the plane is considered a reflection plane.
Specifically, for each reflection plane, its two-dimensional reflection area β_k on each visible picture is calculated: the reflection plane (with its three-dimensional boundary) is projected onto the visible picture to obtain a projection depth map, a dilation operation is applied to the projection depth map (a 9x9 window may be used), and the dilated projection depth map is compared with the depth map aligned in the previous step to obtain an accurate two-dimensional reflection area; each pixel with a depth value in the projection depth map is screened using the three-dimensional point distance and the normal angle (pixels with a point distance below 0.03 meters and a normal angle below 60 degrees are kept), and the screened pixel area is taken as the reflection area β_k of the reflection plane on the picture. Meanwhile, the plane equation is used to obtain the initial two-layer depth maps: the projection depth map is taken as the initial foreground depth map; the camera intrinsic and extrinsic parameters of the picture are mirrored about the plane equation to form a virtual camera, and the initial background depth map is rendered in that virtual camera using the global triangular mesh model, noting that the near clipping plane of this rendering must be set to the reflection plane; the initial foreground and background depth maps are then converted, following the method of step (2), into two simplified triangular meshes M_k^f and M_k^b. Next, the two decomposed layers, the foreground picture F_k and the background picture B_k, are computed with an iterative optimization algorithm that also further optimizes M_k^f and M_k^b; before optimization, all related original pictures are inverse-gamma-corrected in advance for the subsequent decomposition.
The goal of the optimization is to minimize an energy function of the form

E = Σ_u [ E_d(u) + λ_s·E_s(u) + λ_p·E_p(u) ]

where the optimization variables include a rigid-body transformation (R_k, t_k) of the reflection-layer (background) triangular mesh, initialized to the identity matrix and 0 respectively, and the meshes M_k^f and M_k^b, for which only the three-dimensional vertex positions are optimized without changing the topology; E_d, E_s, E_p are respectively a data term, a smoothing term and a prior term, λ_s and λ_p are the weights of the corresponding terms (0.04 and 0.01 respectively), and u ranges over the pixels of the reflection area β_k. The data term measures how well the sum of the foreground picture F_k and the warped background picture B_k reproduces the inverse-gamma-corrected input pictures across the neighbor views; the smoothing term uses a Laplacian matrix H; the warping function ω^{-1} returns the two-dimensional coordinates obtained by projecting a point u of image I_k′ into image I_k according to the depth values and the camera intrinsic and extrinsic parameters, the depth map being obtained by projecting M_k^b; V denotes a vertex of M_k^b.
To minimize the above energy function, an alternating optimization scheme is used: in each round, the meshes and the rigid transformation (R_k, t_k) are first fixed and the pictures F_k and B_k are optimized; an initial value is first computed for F_k and B_k, and the optimization uses a nonlinear conjugate gradient method with 30 iterations; then F_k and B_k are fixed and M_k^f, M_k^b and (R_k, t_k) are optimized, again with the conjugate gradient method and 30 iterations. One round consists of one such alternation, and two rounds of optimization are carried out in total. After the first round, the consistency constraint of the foreground pictures (surface colors) across multiple views is used to denoise F_k: from the F_k and B_k obtained in the first round, denoised images F̃_k and B̃_k are computed using the multi-view consistency of the foreground, and F̃_k and B̃_k are used in place of F_k and B_k to continue the second round of optimization; furthermore, a prior term is added to the total energy equation in the second round, its weight λ_g being equal to 0.05, to constrain the second round of optimization.

After the two rounds of optimization, M_k^b is transformed by (R_k, t_k) to obtain the final two-layer simplified triangular meshes M_k^f and M_k^b, which, together with the decomposed pictures F_k and B_k, are used for correctly rendering the reflection effect of the object surface.
(4) And giving a virtual visual angle, drawing the virtual visual angle picture by using the neighborhood picture and the triangular mesh, and drawing the reflection area by using the foreground background picture and the foreground background triangular mesh. Fig. 4 is a schematic diagram of a rendering result of a virtual viewpoint with reflection according to an embodiment of the present invention.
Specifically, the goal of the online rendering process is, given the intrinsic and extrinsic parameters of a virtual camera, to output the virtual picture corresponding to that camera. Specifically: a neighborhood picture set is computed according to the intrinsic and extrinsic parameters of the virtual camera; the local coordinate system of the current virtual camera is divided into 8 octants by the coordinate-axis planes, and within each octant a series of neighborhood pictures is further selected; using the angle θ_k between the picture optical-center direction and the virtual-camera optical-center direction, and the distance ‖t_k − t_n‖ between the picture optical center t_k and the virtual-camera optical center t_n, each octant is subdivided into several regions; preferably, 9 regions are used, given by the combinations of θ_k in the intervals [0°, 10°), [10°, 20°), [20°, ∞) with ‖t_k − t_n‖ in the intervals [0, 0.6), [0.6, 1.2), [1.2, 1.8); then, in each region, the one picture with the smallest similarity d_k is added to the neighborhood picture set, where d_k combines the optical-center angle with the optical-center distance weighted by the distance weight λ, which is equal to 0.1.
After the neighborhood picture set is obtained, drawing each picture in the neighborhood picture set to a virtual viewpoint according to the corresponding simplified triangular mesh, specifically:
a) a robust depth map is computed: for each pixel in the patch-rendering pass, the rendering cost c(t_k, t_n, x) of every candidate point is calculated:
c(t_k,t_n,x)=∠(t_k−x,t_n−x)*π/180+max(0,1−‖t_n−x‖/‖t_k−x‖)
where t_k and t_n are the three-dimensional optical-center coordinates of the picture and of the virtual camera, and x is the three-dimensional point corresponding to the pixel; each pixel receives a series of rendered triangular patches, and the "points" here are the intersections of the pixel ray with those patches; if the rendering cost of a point is too large, i.e. larger than the minimum rendering cost over all points in the pixel plus a range threshold λ (λ = 0.17 in this embodiment), that point does not participate in the depth-map computation; the depths of all points that do participate are compared and the minimum is taken as the depth value of the pixel.
b) the depth map of the virtual camera is computed, and each picture is added to its triangular mesh as a texture map for drawing; for each pixel of the virtual camera picture, the colors of the points near the depth value (within 3 cm) are mixed with weights w_k (w_k = exp(−d_k/0.033)) to obtain the final rendered color.
Specifically, the reflection areas β_k of the neighborhood pictures are also drawn to the current virtual viewpoint to obtain the reflection area β_n of the current virtual viewpoint; for pixels inside the reflection area, the two foreground/background pictures and the simplified two-layer triangular meshes are used for drawing, and the depth-map computation and color mixing described above are carried out separately for the two layers; because the two layer pictures F_k and B_k were obtained by decomposition after inverse gamma correction, the two blended layer pictures are added in the rendering stage and a single gamma correction is then applied to obtain the correct picture with the reflection effect.
Specifically, in the rendering step, in order to reduce storage, all pictures are downsampled to 1/n of the original resolution for storage (n ≥ 1; n = 4 in this embodiment), and the virtual window is set to the original size during rendering, so that the rendered virtual viewpoint picture keeps its resolution but becomes blurry; the sharpness is then restored with the super-resolution neural network of the next step.
(5) A super-resolution neural network is trained to compensate for the loss of sharpness caused by storing downsampled pictures and to reduce possible drawing errors; fig. 5 compares results with and without the super-resolution neural network, and fig. 6 shows the structure of the super-resolution neural network provided in the embodiment of the present invention.
Specifically, after the depth picture and the color picture have been rendered for each new virtual view, a deep neural network is used to reduce rendering errors and improve sharpness. The network takes the color picture and depth picture of the current frame plus the color picture and depth picture of the previous frame as input; using both frames adds more effective information and improves temporal stability. First, a three-layer convolutional network extracts features from the current-frame and previous-frame depth/color pictures separately; the previous-frame features are then warp-mapped to the current frame, the initial correspondence being computed from the depth maps; because the depth maps are not perfectly accurate, an alignment module (a convolutional neural network of three convolutional layers) further fits a local two-dimensional offset to better align the features of the two frames; the aligned features of the two frames are concatenated and fed into a super-resolution module (implemented as a U-Net convolutional neural network), which outputs a high-definition picture of the current frame.
In one embodiment, a computer device is provided, which includes a memory and a processor, the memory stores computer readable instructions, and when executed by the processor, the processor executes the steps in the indoor scene virtual roaming method based on reflection decomposition in the embodiments.
In one embodiment, a storage medium storing computer readable instructions is provided, and the computer readable instructions, when executed by one or more processors, cause the one or more processors to perform the steps of the reflection decomposition-based indoor scene virtual roaming method in the embodiments. The storage medium may be a nonvolatile storage medium.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the one or more embodiments of the present disclosure, and is not intended to limit the scope of the one or more embodiments of the present disclosure, and any modifications, equivalent substitutions, improvements, etc. made within the spirit and principle of the one or more embodiments of the present disclosure should be included in the scope of the one or more embodiments of the present disclosure.

Claims (10)

1. An indoor scene virtual roaming method based on reflection decomposition is characterized by comprising the following steps:
s1: shooting a picture of a scene which is sufficiently covered in a target indoor scene, and performing three-dimensional reconstruction on the indoor scene based on the shot picture to obtain a rough global triangular mesh model of the indoor scene and the inside and outside parameters of the camera;
s2: for each picture, projecting the global triangular mesh model into a corresponding depth map, aligning the depth edge to a color edge, converting the aligned depth map into a triangular mesh, and carrying out mesh simplification on the triangular mesh;
s3: detecting a plane in the global triangular mesh model, detecting whether the plane is a reflection plane or not by utilizing the color consistency between adjacent images, and if so, constructing a double-layer expression on a reflection area for each picture which can see the reflection plane for correctly rendering the reflection effect of the surface of an object;
the double-layer expression comprises a foreground-background double-layer triangular mesh and two decomposed images of a foreground and a background, the foreground triangular mesh is used for expressing the surface geometry of an object, the background triangular mesh is used for expressing the mirror image of the scene geometry on a reflecting plane, the foreground image is used for expressing the surface texture of the object after the reflection component is removed, and the background image is used for expressing the reflection component of the scene on the surface of the object;
s4: and giving a virtual visual angle, drawing the virtual visual angle picture by using the neighborhood picture and the triangular mesh, and drawing the reflection area by using the foreground background picture and the foreground background triangular mesh.
2. The method according to claim 1, wherein in S2, aligning the depth edge of the depth map to the color edge of the original picture to obtain an aligned depth map, specifically:
firstly, a normal map corresponding to the depth map is calculated; then, for each pixel i in the depth map, its depth value d_i is converted into a three-dimensional point v_i in the camera's local coordinate system using the camera intrinsics, and the planar distance between adjacent pixels i, j is computed as dt_ij = max(|(v_i − v_j)·n_i|, |(v_i − v_j)·n_j|), where n_i, n_j are the normal vectors at points i and j; if dt_ij is greater than λ·max(1, min(d_i, d_j)), the pixel is recorded as a depth edge pixel, where λ is an edge detection threshold;
for each picture, after all depth edge pixels are obtained, calculating the local two-dimensional gradient of the depth edge by utilizing Sobel convolution, and then traversing one pixel by one pixel along the direction of the edge two-dimensional gradient and the opposite direction thereof by taking each depth edge pixel as a starting point until one of two sides traverses to a color edge pixel; after traversing to the color edge pixel, deleting the depth values of all pixels of the intermediate path from the starting point pixel to the color edge pixel; and defining the pixel of the deleted depth value as a non-aligned pixel, defining the pixel of the non-deleted depth value as an aligned pixel, and carrying out interpolation filling by using the peripheral non-deleted depth value for each deleted depth value.
3. The method as claimed in claim 2, wherein for each deleted depth value, interpolation filling is performed using the surrounding undeleted depth values, specifically: for each unaligned pixel p_i to be interpolated, its geodesic distance d_g(p_i, p_j) to all other aligned pixels is computed, the m nearest aligned pixels are found by geodesic distance, and the interpolated depth value is computed as

d_i = Σ_{j∈N(i)} w_g(i, j)·d_{i→j} / Σ_{j∈N(i)} w_g(i, j)

where N(i) denotes the set of nearest aligned neighbor pixels of p_i, w_g(i, j) = exp(−d_g(p_i, p_j)), and d_{i→j} denotes the depth obtained by projecting pixel p_i onto the local plane of p_j, whose plane equation is computed from v_j and n_j.
4. The method of claim 1, wherein in S3, the detecting the plane and the reflection plane in the global triangular mesh model specifically includes:
detecting planes in the global triangular mesh model, keeping the planes whose area is larger than an area threshold, projecting each such plane onto the pictures in which it is visible, and denoting the set of pictures in which the plane is visible as V_P;

for each picture I_k in V_P, its K-nearest-neighbor picture set N_k is computed; the K nearest neighbors are determined by the overlap rate of the vertices of the global triangular mesh model after mirroring about the plane;

using N_k, a matching cost volume is constructed, and whether the plane has a sufficient reflection component in picture I_k is judged as follows: for each pixel, after mirroring the global triangular mesh model according to the plane equation, the cost corresponding to the mirrored depth value is looked up in the matching cost volume and tested for being a local minimum point; if the number of pixels at which the cost is a local minimum is greater than a pixel-count threshold, the plane is considered to have a reflection component on that picture; and if the number of visible pictures in which a plane has a reflection component is greater than a picture-count threshold, the plane is considered a reflection plane.
5. The method of claim 1, wherein in the step S3, for each reflection plane, its two-dimensional reflection area β_k on each visible picture is calculated, specifically: the reflection plane is projected onto the visible picture to obtain a projection depth map; a dilation operation is applied to the projection depth map; the dilated projection depth map is compared with the aligned depth map to obtain an accurate two-dimensional reflection area; each pixel with a depth value in the projection depth map is screened using the three-dimensional point distance and the normal angle, and the screened pixel area is taken as the reflection area β_k of the reflection plane on the picture.
6. The method according to claim 5, wherein in step S3, a double-layer representation is constructed on the reflection area for each picture that can see the reflection plane, specifically:
the projection depth map is taken as the initial foreground depth map; the camera intrinsic and extrinsic parameters of the picture are mirrored about the plane equation to form a virtual camera, and the initial background depth map is rendered in that virtual camera using the global triangular mesh model; the initial foreground and background depth maps are then converted into two simplified triangular meshes M_k^f and M_k^b;

the two decomposed layers, the foreground picture F_k and the background picture B_k, are computed with an iterative optimization algorithm that also further optimizes M_k^f and M_k^b; before optimization, all related original pictures are inverse-gamma-corrected in advance for the subsequent decomposition;

the goal of the optimization is to minimize an energy function of the form

E = Σ_u [ E_d(u) + λ_s·E_s(u) + λ_p·E_p(u) ]

where the optimization variables include a rigid-body transformation (R_k, t_k) of the reflection-layer triangular mesh, initialized to the identity matrix and 0 respectively, and the meshes M_k^f and M_k^b, for which only the three-dimensional vertex positions are optimized without changing the topology; E_d, E_s, E_p are respectively a data term, a smoothing term and a prior term, λ_s and λ_p are the weights of the corresponding terms, and u ranges over the pixels of the reflection area β_k; the data term measures how well the sum of the foreground picture F_k and the warped background picture B_k reproduces the inverse-gamma-corrected input pictures across the neighbor views; the smoothing term uses a Laplacian matrix H; the warping function ω^{-1} returns the two-dimensional coordinates obtained by projecting a point u of image I_k′ into image I_k according to the depth values and the camera intrinsic and extrinsic parameters, the depth map being obtained by projecting M_k^b; V denotes a vertex of M_k^b;

to minimize the energy function, an alternating optimization scheme is used: in each round, the meshes and the rigid transformation are first fixed and the pictures F_k and B_k are optimized, an initial value being computed first and the optimization performed with a nonlinear conjugate gradient method; then F_k and B_k are fixed and M_k^f, M_k^b and (R_k, t_k) are optimized, again with the conjugate gradient method; one round consists of one such alternation, and two rounds of optimization are carried out in total; after the first round, the consistency constraint of the foreground pictures across multiple views is used to denoise F_k: from the F_k and B_k obtained in the first round, denoised images F̃_k and B̃_k are computed using the multi-view consistency of the foreground, and F̃_k and B̃_k are used in place of F_k and B_k to continue the second round of optimization; furthermore, a prior term with weight λ_g is added to the total energy equation in the second round to constrain the second round of optimization;

after the two rounds of optimization, M_k^b is transformed by (R_k, t_k) to obtain the final two-layer simplified triangular meshes M_k^f and M_k^b, which, together with the decomposed pictures F_k and B_k, are used for correctly rendering the reflection effect of the object surface.
7. The method of claim 1, wherein in step S4 a neighborhood picture set is computed from the intrinsic and extrinsic parameters of the virtual camera: the local coordinate system of the current virtual camera is divided into 8 quadrants by the coordinate-axis planes, and a series of neighborhood pictures is selected within each quadrant. Using the angle between the optical-center direction of a picture and the optical-center direction of the virtual camera, together with the distance ||tk − tn|| between the picture optical center tk and the virtual camera optical center tn, each quadrant is further divided into several regions; then, in each region, the single picture with the smallest similarity dk (defined by the formula shown as an image in the claim) is added to the neighborhood picture set, where λ is the distance-proportion weight.
After the neighborhood picture set is obtained, each picture in the set is rendered to the virtual viewpoint according to its corresponding simplified triangular mesh, specifically:
a) a robust depth map is computed: for each pixel of the fragment shader, the rendering cost c(tk, tn, x) is calculated (sketched below),
c(tk, tn, x) = ∠(tk − x, tn − x)·π/180 + max(0, 1 − ||tn − x||/||tk − x||)
where tk and tn are the three-dimensional optical-center coordinates of the picture and of the virtual camera, and x is the three-dimensional point corresponding to the pixel. Each pixel receives a series of rendered triangular patches, and the points are the intersections of the ray determined by the pixel with these patches; if the rendering cost of a point exceeds the minimum rendering cost over all points of that pixel plus a range threshold λ, the point does not participate in the depth-map computation. The depths of all participating points are then compared and the minimum is taken as the depth value of the pixel;
b) the depth map of the virtual camera having been computed, each picture is attached to its triangular mesh as a texture map for rendering; for each pixel of the virtual camera image, the colors of the points near the depth value are blended with preset weights wk to obtain the final rendered color.
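For illustration only: the per-pixel rendering cost of step a) is spelled out in the claim and is implemented literally below, while the similarity dk used for neighborhood selection is given only as a formula image, so the form used here (view angle plus a λ-weighted optical-center distance) is an assumption consistent with the surrounding text; all function names are illustrative.

import numpy as np

def rendering_cost(t_k, t_n, x):
    # c(t_k, t_n, x) = angle(t_k - x, t_n - x) * pi/180
    #                + max(0, 1 - ||t_n - x|| / ||t_k - x||)
    v_k, v_n = t_k - x, t_n - x
    cos_a = np.dot(v_k, v_n) / (np.linalg.norm(v_k) * np.linalg.norm(v_n))
    angle_deg = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    penalty = max(0.0, 1.0 - np.linalg.norm(v_n) / np.linalg.norm(v_k))
    return angle_deg * np.pi / 180.0 + penalty

def robust_pixel_depth(depths, costs, lam_range):
    # Discard points whose cost exceeds the per-pixel minimum cost plus the
    # range threshold, then take the smallest depth among the surviving points.
    depths, costs = np.asarray(depths), np.asarray(costs)
    keep = costs <= costs.min() + lam_range
    return depths[keep].min()

def neighborhood_similarity(dir_k, dir_n, t_k, t_n, lam):
    # Assumed form of d_k: angle between the picture and virtual-camera
    # optical-center directions plus lam times the optical-center distance.
    cos_a = np.dot(dir_k, dir_n) / (np.linalg.norm(dir_k) * np.linalg.norm(dir_n))
    return np.arccos(np.clip(cos_a, -1.0, 1.0)) + lam * np.linalg.norm(t_k - t_n)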
8. The method of claim 1, wherein in step S4 the reflection region βk of each neighborhood picture, obtained from the reflection decomposition, is also rendered to the current virtual viewpoint to obtain the reflection region βn of the current virtual viewpoint. For pixels inside the reflection region, rendering uses both the foreground and background layer pictures together with the two-layer simplified triangular meshes, and depth-map computation and color blending are performed separately for each of the two layers. Because the two layer pictures are obtained by decomposition performed after inverse gamma correction, in the rendering stage the two blended layer pictures are added together and a single gamma correction is applied to obtain a correct picture with the reflection effect.
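For illustration only: because the two layers are decomposed after an inverse gamma correction, the rendering stage adds the blended foreground and background in linear space and applies gamma once; the exponent 2.2 below is an assumption, as the claim only mentions gamma correction.

import numpy as np

def compose_reflection(foreground_linear, background_linear, gamma=2.2):
    # Both rendered layer images are assumed to be in linear space (the
    # decomposition was performed after inverse gamma correction).  Add them,
    # then apply a single gamma correction to obtain the picture with reflections.
    linear = np.clip(foreground_linear + background_linear, 0.0, 1.0)
    return linear ** (1.0 / gamma)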
9. The method of claim 1, wherein in step S4, to reduce storage, all pictures are down-sampled to 1/n of their original resolution for storage, n ≥ 1, and the virtual window is set to the original size during rendering.
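For illustration only, a minimal sketch of this storage policy, assuming OpenCV for resampling (the claim does not specify the filter); names are illustrative.

import cv2

def store_downsampled(picture, n):
    # Down-sample a picture to 1/n of its resolution for storage (n >= 1); at
    # render time the virtual window keeps the original size and the network of
    # claim 10 compensates for the lost sharpness.
    h, w = picture.shape[:2]
    return cv2.resize(picture, (max(1, w // n), max(1, h // n)), interpolation=cv2.INTER_AREA)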
10. The indoor scene virtual roaming method based on reflection decomposition according to any one of claims 1-9, wherein a super-resolution neural network is trained to compensate for the loss of sharpness caused by down-sampling of the stored pictures while reducing possible rendering errors, specifically:
after the depth picture and the color picture are rendered at each new virtual viewing angle, a deep neural network is used to reduce rendering errors and improve sharpness. The network takes as input the color and depth pictures of the current frame together with the color and depth pictures of the previous frame. First, a three-layer convolutional network extracts features from the current-frame and previous-frame depth and color pictures separately; the previous-frame features are then warped to the current frame, with the initial correspondence computed from the depth map, and an alignment module fits a local two-dimensional offset to further align the features of the two frames; the aligned features of the two frames are concatenated and fed into a super-resolution module implemented with a U-Net convolutional neural network, which outputs the high-definition picture of the current frame.
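For illustration only: a compact PyTorch-style sketch of the network described in this claim, with three-layer per-frame encoders, a depth-based warp of the previous-frame features, a learned local 2-D offset for finer alignment, and a small U-Net-style fusion producing the super-resolved current frame; all layer widths, kernel sizes and the upscaling factor are assumptions, only the overall structure follows the claim.

import torch
import torch.nn as nn
import torch.nn.functional as F

def three_layer_encoder(in_ch, feat=32):
    # Three-layer convolutional feature extractor (channel widths assumed).
    return nn.Sequential(
        nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True))

def warp(features, flow):
    # Bilinear warp of previous-frame features by a dense 2-D flow (N, 2, H, W), in pixels.
    n, _, h, w = features.shape
    ys, xs = torch.meshgrid(torch.arange(h, device=flow.device),
                            torch.arange(w, device=flow.device), indexing="ij")
    base = torch.stack((xs, ys)).float().unsqueeze(0)            # (1, 2, H, W)
    coords = base + flow
    grid = torch.stack((2 * coords[:, 0] / (w - 1) - 1,          # normalise to [-1, 1]
                        2 * coords[:, 1] / (h - 1) - 1), dim=-1)
    return F.grid_sample(features, grid, align_corners=True)

class RefinementSR(nn.Module):
    # Current/previous-frame encoders, depth-derived warp, residual offset
    # alignment, and a small U-Net stand-in that outputs the refined frame.
    def __init__(self, feat=32, scale=2):
        super().__init__()
        self.enc_cur = three_layer_encoder(4, feat)    # RGB + depth, current frame
        self.enc_prev = three_layer_encoder(4, feat)   # RGB + depth, previous frame
        self.offset = nn.Conv2d(2 * feat, 2, 3, padding=1)
        self.down = nn.Sequential(nn.Conv2d(2 * feat, 2 * feat, 3, 2, 1), nn.ReLU(inplace=True))
        self.up = nn.Sequential(nn.ConvTranspose2d(2 * feat, feat, 4, 2, 1), nn.ReLU(inplace=True))
        self.to_rgb = nn.Sequential(nn.Conv2d(feat, 3 * scale * scale, 3, padding=1),
                                    nn.PixelShuffle(scale))
        self.scale = scale

    def forward(self, cur_rgbd, prev_rgbd, depth_flow):
        f_cur = self.enc_cur(cur_rgbd)
        f_prev = warp(self.enc_prev(prev_rgbd), depth_flow)                  # depth-map correspondence
        f_prev = warp(f_prev, self.offset(torch.cat((f_cur, f_prev), 1)))    # alignment module
        fused = torch.cat((f_cur, f_prev), 1)
        decoded = self.up(self.down(fused))                                   # U-Net stand-in
        base = F.interpolate(cur_rgbd[:, :3], scale_factor=self.scale,
                             mode="bilinear", align_corners=False)
        return base + self.to_rgb(decoded)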
CN202110429676.2A 2021-04-21 2021-04-21 Indoor scene virtual roaming method based on reflection decomposition Active CN113223132B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110429676.2A CN113223132B (en) 2021-04-21 2021-04-21 Indoor scene virtual roaming method based on reflection decomposition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110429676.2A CN113223132B (en) 2021-04-21 2021-04-21 Indoor scene virtual roaming method based on reflection decomposition

Publications (2)

Publication Number Publication Date
CN113223132A true CN113223132A (en) 2021-08-06
CN113223132B CN113223132B (en) 2022-05-17

Family

ID=77088240

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110429676.2A Active CN113223132B (en) 2021-04-21 2021-04-21 Indoor scene virtual roaming method based on reflection decomposition

Country Status (1)

Country Link
CN (1) CN113223132B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102592275A (en) * 2011-12-16 2012-07-18 天津大学 Virtual viewpoint rendering method
US10325402B1 (en) * 2015-07-17 2019-06-18 A9.Com, Inc. View-dependent texture blending in 3-D rendering
US20190051051A1 (en) * 2016-04-14 2019-02-14 The Research Foundation For The State University Of New York System and Method for Generating a Progressive Representation Associated with Surjectively Mapped Virtual and Physical Reality Image Data
CN106952328A (en) * 2016-12-28 2017-07-14 北京大学 The method for drafting and system of a kind of Large-scale Macro virtual scene
CN107845134A (en) * 2017-11-10 2018-03-27 浙江大学 A kind of three-dimensional rebuilding method of the single body based on color depth camera

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WEIWEI XU et al.: "Survey of 3D modeling using depth cameras", Virtual Reality & Intelligent Hardware *
HUA Wei et al.: "Real-time roaming algorithm for virtual scenes with global specular reflection" (包含整体镜面反射的虚拟场景实时漫游算法), Journal of Software (软件学报) *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972617A (en) * 2022-06-22 2022-08-30 北京大学 Scene illumination and reflection modeling method based on conductive rendering
CN114972617B (en) * 2022-06-22 2023-04-07 北京大学 Scene illumination and reflection modeling method based on conductive rendering
CN116761017A (en) * 2023-08-18 2023-09-15 湖南马栏山视频先进技术研究院有限公司 High availability method and system for video real-time rendering
CN116761017B (en) * 2023-08-18 2023-10-17 湖南马栏山视频先进技术研究院有限公司 High availability method and system for video real-time rendering
CN117272758A (en) * 2023-11-20 2023-12-22 埃洛克航空科技(北京)有限公司 Depth estimation method, device, computer equipment and medium based on triangular grid
CN117272758B (en) * 2023-11-20 2024-03-15 埃洛克航空科技(北京)有限公司 Depth estimation method, device, computer equipment and medium based on triangular grid

Also Published As

Publication number Publication date
CN113223132B (en) 2022-05-17

Similar Documents

Publication Publication Date Title
CN113223132B (en) Indoor scene virtual roaming method based on reflection decomposition
WO2022222077A1 (en) Indoor scene virtual roaming method based on reflection decomposition
US11727587B2 (en) Method and system for scene image modification
Wang et al. Neuris: Neural reconstruction of indoor scenes using normal priors
US6476803B1 (en) Object modeling system and process employing noise elimination and robust surface extraction techniques
KR101195942B1 (en) Camera calibration method and 3D object reconstruction method using the same
Li et al. Detail-preserving and content-aware variational multi-view stereo reconstruction
JP2007257287A (en) Image registration method
Hejazifar et al. Fast and robust seam estimation to seamless image stitching
CN111553841B (en) Real-time video splicing method based on optimal suture line updating
Ma et al. An operational superresolution approach for multi-temporal and multi-angle remotely sensed imagery
JP2000268179A (en) Three-dimensional shape information obtaining method and device, two-dimensional picture obtaining method and device and record medium
CN113781621A (en) Three-dimensional reconstruction processing method, device, equipment and storage medium
Alsadik Guided close range photogrammetry for 3D modelling of cultural heritage sites
Rothermel et al. Photometric multi-view mesh refinement for high-resolution satellite images
CN112862683A (en) Adjacent image splicing method based on elastic registration and grid optimization
Wan et al. Drone image stitching using local mesh-based bundle adjustment and shape-preserving transform
Hu et al. IMGTR: Image-triangle based multi-view 3D reconstruction for urban scenes
CN113706431A (en) Model optimization method and related device, electronic equipment and storage medium
CN112132971A (en) Three-dimensional human body modeling method, device, electronic equipment and storage medium
CN116805356A (en) Building model construction method, building model construction equipment and computer readable storage medium
JP2002520969A (en) Automated 3D scene scanning from motion images
Rüther et al. Laser Scanning in heritage documentation
Bielski et al. Order independent image compositing
Ling et al. Large-scale and efficient texture mapping algorithm via loopy belief propagation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant