CN109903372B - Depth map super-resolution completion method and high-quality three-dimensional reconstruction method and system

Info

Publication number: CN109903372B (application number CN201910079993.9A)
Authority: CN (China)
Prior art keywords: depth, map, image, depth image, resolution
Legal status: Active (granted)
Inventors: Jianwei Li (李建伟), Wei Gao (高伟), Yihong Wu (吴毅红)
Assignee: Institute of Automation, Chinese Academy of Sciences
Priority/filing date: 2019-01-28
Published as CN109903372A on 2019-06-18; granted as CN109903372B on 2021-03-23
Other languages: Chinese (zh)

Landscapes

  • Image Processing (AREA)
  • Image Analysis (AREA)
Abstract

The invention relates to a depth map super-resolution completion method and a high-quality three-dimensional reconstruction method and system. The method comprises the following steps: learning from an original low-resolution (LR) depth image to be completed through the depth super-resolution and completion network SRC-Net to obtain a high-resolution (HR) depth image; removing outliers in the HR depth image based on gradient-sensitive detection to obtain a processed HR depth image; learning from the HR color image through SRC-Net to determine a normal map and a boundary map; measuring the blurriness of the HR color image to obtain blurriness information; and optimizing the HR depth image according to the normal map, the boundary map and the blurriness information to obtain a completed HR depth image. The method is built on a depth super-resolution and completion network, a gradient-sensitive outlier detection and rejection algorithm, and a blurriness- and boundary-constrained adaptive depth image optimization algorithm. It performs super-resolution and completion on the original LR depth image to obtain a completed HR depth image, which reduces the difficulty of indoor scene three-dimensional reconstruction and improves reconstruction accuracy.

Description

Depth map super-resolution completion method and high-quality three-dimensional reconstruction method and system
Technical Field
The invention relates to image super-resolution and completion technology in the field of image processing and to three-dimensional reconstruction technology in the field of computer vision, and in particular to a depth map super-resolution completion method and a high-quality three-dimensional reconstruction method and system.
Background
High-precision three-dimensional reconstruction of indoor scenes is one of the most challenging research topics in computer vision, involving theories and technologies from multiple fields, including computer vision, computer graphics, pattern recognition, and optimization.
Three-dimensional reconstruction technology aims at obtaining the depth information of a scene or object, and can be divided into two categories, passive measurement and active measurement, according to how the depth information is acquired. Passive measurement generally uses light reflected from the surrounding environment, such as natural light: a camera acquires images, and the three-dimensional spatial information of the object is recovered by a specific algorithm, i.e., vision-based three-dimensional reconstruction. Active measurement emits a light source or energy source, such as a laser, sound wave, or electromagnetic wave, toward the target object and obtains the depth information of the object by receiving the returned signal. Active measurement methods include Time of Flight (TOF), Structured Light, and triangulation. In recent years, the appearance of consumer-grade RGB-D cameras has greatly advanced indoor scene three-dimensional reconstruction. An RGB-D camera is a new type of visual sensor combining active and passive measurement: it captures a two-dimensional color image and also actively transmits a signal to the target object to obtain depth information. Common consumer-grade RGB-D cameras include TOF cameras based on the time-of-flight method, and devices based on the structured-light method such as the Microsoft Kinect, ASUS Xtion, and Intel RealSense. The KinectFusion algorithm proposed by Newcombe et al. uses the Kinect to obtain the depth of each point in an image, estimates the camera pose through the Iterative Closest Point (ICP) algorithm, and performs volumetric data fusion by iteratively updating a truncated signed distance function (TSDF), obtaining a dense three-dimensional model.
Three-dimensional reconstruction of indoor scenes based on consumer-grade RGB-D cameras generally faces the following problems: (1) the depth images acquired by an RGB-D camera have low resolution and large noise, which causes camera pose estimation errors and makes surface details difficult to preserve; (2) transparent or highly reflective objects in the indoor scene leave holes and missing regions in the acquired depth images; (3) the depth range of an RGB-D camera is limited, while the corresponding color image provides complete, high-resolution scene information. These problems keep three-dimensional reconstruction applications based on consumer-grade RGB-D cameras relatively limited.
Disclosure of Invention
In order to solve the problems in the prior art, namely to improve the accuracy of indoor scene reconstruction and reduce its difficulty, the invention provides a depth map super-resolution completion method and a high-quality three-dimensional reconstruction method and system.
In order to solve the technical problems, the invention provides the following scheme:
a depth map super-resolution completion method for three-dimensional reconstruction, the method comprising:
learning from an original low-resolution (LR) depth image to be completed through the depth super-resolution and completion network SRC-Net to obtain a high-resolution (HR) depth image;
removing outliers in the HR depth image based on gradient-sensitive detection to obtain a processed HR depth image;
learning from the HR color image through SRC-Net to determine a normal map and a boundary map;
measuring the blurriness of the HR color image to obtain blurriness information;
and optimizing the HR depth image according to the normal map, the boundary map and the blurriness information to obtain a completed HR depth image.
Optionally, removing outliers in the HR depth image based on gradient-sensitive detection to obtain a processed HR depth image specifically includes:
calculating a gradient map G_i using the Sobel operator:
g_i(u) = Sobel(u);
where g_i(u) is the gradient value corresponding to pixel u;
computing a mask image M_i according to the gradient sensitivity:
m_i(u) = 0, if g_i(u) ≥ g_h;
m_i(u) = 1, if g_i(u) < g_h;
where m_i(u) is the mask value corresponding to pixel u and g_h is a set gradient threshold;
applying an erosion operation to the high-resolution depth map D_i using the mask image M_i, and removing the outliers to obtain the processed HR depth image.
Optionally, measuring the blurriness of the HR color image to obtain blurriness information specifically includes:
filtering the HR color image in the horizontal direction and the vertical direction with an averaging filter to obtain a re-blurred image;
calculating the neighboring-pixel differences of the original HR color image and of the re-blurred image in the horizontal and vertical directions, respectively, to obtain horizontal differences and vertical differences;
determining a difference map between the original HR color image and the re-blurred image from the horizontal and vertical differences;
summing and normalizing the difference map to obtain a processed map;
calculating the blurriness measure Blur of the processed map:
Blur = max(R_H, R_V)
where R_H is the normalized horizontal difference value and R_V is the normalized vertical difference value.
Optionally, optimizing the HR depth image according to the normal map, the boundary map and the blurriness information to obtain a completed HR depth map specifically includes:
constructing an objective function according to the normal map, the boundary map and the blurriness information;
and optimizing the HR depth image according to the objective function to obtain the completed HR depth image.
Optionally, optimizing the HR depth image according to the objective function to obtain the completed HR depth image specifically includes:
determining an optimization function from the objective function; the objective function comprises a first optimization term, a second optimization term and a third optimization term, and the optimization function is the weighted sum of the three terms:

E = λ_D E_D + λ_S E_S + λ_N E_N B_n B_b

E_D = Σ_p ||D(p) - D_o(p)||²

E_S = Σ_{p,q} ||D(p) - D(q)||²

E_N = Σ_{p,q} ||⟨v(p,q), N(p)⟩||²

where the first optimization term E_D represents the distance between the estimated depth D(p) and the observed depth D_o(p) at pixel p; the third optimization term E_N represents the consistency between the estimated depth and the predicted surface normal N(p); the second optimization term E_S encourages neighboring pixels to take the same depth value, where v(p,q) denotes the tangent vector between pixel p and its neighboring pixel q; B_n ∈ [0,1] weights the normal term according to the predicted probability that a pixel lies on an occlusion boundary B(p); B_b ∈ [0,1] weights the normal term according to the blurriness of the color image; and λ_D, λ_S and λ_N are preset reference coefficients;
and optimizing according to the optimization function to obtain the completed HR depth map.
Optionally, λ_D is 1000, λ_S is 1, and λ_N is 0.001.
In order to solve the technical problems, the invention provides the following scheme:
a depth map super resolution completion system for three-dimensional reconstruction, the system comprising:
the super-resolution processing unit is used for learning from an original LR depth image to be supplemented through SRC-Net to obtain an HR depth image;
the outlier removing unit is used for removing outliers in the HR depth image based on gradient sensitivity detection to obtain a processed HR depth image;
the information extraction unit is used for learning from the HR color image through SRC-Net and determining a normal map and a boundary map;
the fuzzy measurement unit is used for measuring the fuzzy degree of the HR color image to obtain fuzzy degree information;
and the optimization unit is used for optimizing the HR depth image according to the normal map, the boundary map and the ambiguity information to obtain a complete HR depth map.
In order to solve the technical problems, the invention provides the following scheme:
an indoor scene three-dimensional reconstruction method comprises the following steps:
calculating, for each pixel in the completed HR depth map, a three-dimensional point and a normal vector in the corresponding camera coordinate system;
estimating the pose of the current frame camera with the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors;
performing volumetric data fusion through iterative updates of a truncated signed distance function (TSDF) model according to the camera trajectory information to obtain fused data;
and performing surface estimation according to the fused data and the pose of the current frame camera to obtain an indoor scene three-dimensional model.
Optionally, the three-dimensional point v_i(u) and the normal vector n_i(u) in the corresponding camera coordinate system are calculated according to the following formulas:
v_i(u) = z_i(u) K^{-1} [u, 1]^T;
n_i(u) = (v_i(u+1, v) - v_i(u, v)) × (v_i(u, v+1) - v_i(u, v));
where K is the camera intrinsic matrix obtained by calibration and z_i(u) is the depth value corresponding to pixel u.
In order to solve the technical problems, the invention provides the following scheme:
an indoor scene three-dimensional reconstruction system, comprising:
a preprocessing unit for calculating, for each pixel in the completed HR depth map, a three-dimensional point and a normal vector in the corresponding camera coordinate system;
an estimation unit for estimating the pose of the current frame camera with the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors;
a fusion unit for performing volumetric data fusion through iterative updates of a truncated signed distance function (TSDF) model according to the camera trajectory information to obtain fused data;
and a modeling unit for performing surface estimation according to the fused data and the pose of the current frame camera to obtain an indoor scene three-dimensional model.
According to the embodiment of the invention, the invention discloses the following technical effects:
the method is based on the depth super-resolution and completion network, the gradient sensitivity outer point detection and elimination algorithm and the ambiguity and boundary constraint depth image adaptive optimization algorithm, and can complete the original low-resolution LR depth image acquired by the RGB-D camera, so that a completed HR depth map can be obtained, the difficulty of indoor scene three-dimensional reconstruction is reduced, and the reconstruction accuracy is improved.
Drawings
FIG. 1 is a flow chart of a depth map super resolution completion method for three-dimensional reconstruction according to the present invention;
FIG. 2 is a schematic block diagram of a depth map super-resolution completion system for three-dimensional reconstruction according to the present invention;
FIG. 3 is a flow chart of a method for three-dimensional reconstruction of an indoor scene according to the present invention;
FIG. 4 is a schematic block diagram of an indoor scene three-dimensional reconstruction system according to the present invention.
Description of the symbols:
the system comprises a super-resolution processing unit-1, an outlier rejection unit-2, an information extraction unit-3, a fuzzy measurement unit-4, an optimization unit-5, a preprocessing unit-6, an estimation unit-7, a fusion unit-8 and a modeling unit-9.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.
The invention provides a depth map super-resolution completion method for three-dimensional reconstruction based on a depth super-resolution and completion network, a gradient-sensitive outlier detection and rejection algorithm, and a blurriness- and boundary-constrained adaptive depth image optimization algorithm. It can complete the original low-resolution LR depth image acquired by an RGB-D camera to obtain a completed HR depth map, which helps reduce the difficulty of indoor scene three-dimensional reconstruction and improve reconstruction accuracy.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in fig. 1, the depth map super-resolution completion method for three-dimensional reconstruction of the present invention includes:
Step 100: learning from an original LR depth image to be completed through SRC-Net to obtain an HR depth image;
Step 200: removing outliers in the HR depth image based on gradient-sensitive detection to obtain a processed HR depth image;
Step 300: learning from the HR color image through SRC-Net to determine a normal map and a boundary map;
Step 400: measuring the blurriness of the HR color image to obtain blurriness information;
Step 500: optimizing the HR depth image according to the normal map, the boundary map and the blurriness information to obtain a completed HR depth image.
In step 100, the depth super-resolution and completion network (SRC-Net) functionally comprises the following two parts:
(1) Depth super-resolution is realized by a Laplacian pyramid network with shared weights. It takes the low-resolution depth image as input and predicts the high-frequency residual information of the super-resolved depth image at each pyramid level. Each pyramid level has a side output, so supervision can be introduced to train the network better. To enlarge the receptive field of the high-frequency feature maps, a weight-shared recursive network is adopted at each pyramid level, increasing the effective network depth. With recursive weight sharing and local cross-channel connections, the parameters of every convolution within a block are shared, and the receptive field of the recursive network is:
RF = (K_size - 1)(N - 1) + K_size    (1);
where K_size is the size of the convolution kernel, N is the number of convolutions, and RF is the resulting receptive field. For example, N = 5 shared 3×3 convolutions give RF = (3 - 1)(5 - 1) + 3 = 11.
The cascade structure comprises two branches: a feature extraction branch and an image reconstruction branch. At each stage s, the input image passes through an upsampling layer with scale 2; the upsampled image is then added to the residual image predicted by the feature extraction branch of the current stage, and the resulting HR (high-resolution) image is fed into the next stage (s + 1).
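As an illustration of this cascade, the following is a minimal PyTorch sketch of the weight-shared recursive block and one 2x pyramid stage. The channel count, the recursion depth N, the transposed-convolution upsampler and the bilinear resize of the residual are illustrative assumptions of the sketch, not details specified by the patent.

import torch
import torch.nn as nn

class RecursiveBlock(nn.Module):
    """One convolution whose weights are reused N times (recursive, weight-shared)."""
    def __init__(self, channels=64, n_recursions=5):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)  # shared kernel
        self.n = n_recursions

    def forward(self, x):
        for _ in range(self.n):            # RF grows as (3-1)(N-1)+3, cf. equation (1)
            x = torch.relu(self.conv(x))
        return x

class LapSRStage(nn.Module):
    """One pyramid level: 2x upsampling plus a predicted high-frequency residual."""
    def __init__(self, channels=64):
        super().__init__()
        self.embed = nn.Conv2d(1, channels, 3, padding=1)        # feature extraction branch
        self.recurse = RecursiveBlock(channels)
        self.to_residual = nn.Conv2d(channels, 1, 3, padding=1)  # side output (supervised)
        self.upsample = nn.ConvTranspose2d(1, 1, 4, stride=2, padding=1)  # reconstruction branch

    def forward(self, lr_depth):
        up = self.upsample(lr_depth)
        res = self.to_residual(self.recurse(self.embed(lr_depth)))
        res = nn.functional.interpolate(res, size=up.shape[-2:], mode="bilinear",
                                        align_corners=False)
        return up + res                    # HR image fed to stage s+1

# 8x super-resolution as three cascaded 2x stages
net = nn.Sequential(LapSRStage(), LapSRStage(), LapSRStage())
hr = net(torch.rand(1, 1, 60, 80))         # toy LR depth map
print(hr.shape)                            # torch.Size([1, 1, 480, 640])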
(2) Depth completion mainly guides the optimization and completion of the HR depth image with the HR color image. The invention predicts the surface normals and the occlusion boundaries from the HR color image separately, based on two VGG-16 networks with symmetric encoders and decoders, and then uses the learned normal and boundary information for depth optimization. The normal information is used to estimate the corresponding depth values; the occlusion boundaries provide information about depth discontinuities and help preserve boundary sharpness. A condensed sketch of one such prediction network follows.
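The sketch below abridges the VGG-16 encoder-decoder idea: a symmetric encoder and decoder mapping the HR color image to a 3-channel normal map or a 1-channel boundary map. The channel widths and the number of stages are assumptions, not the patent's exact layout.

import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True))

class EncoderDecoder(nn.Module):
    """Symmetric encoder-decoder in the spirit of a VGG-16 backbone (abridged)."""
    def __init__(self, out_channels):                  # 3 for normals, 1 for boundaries
        super().__init__()
        self.encoder = nn.Sequential(
            conv_block(3, 64), nn.MaxPool2d(2),
            conv_block(64, 128), nn.MaxPool2d(2),
            conv_block(128, 256))
        self.decoder = nn.Sequential(
            conv_block(256, 128), nn.Upsample(scale_factor=2),
            conv_block(128, 64), nn.Upsample(scale_factor=2),
            nn.Conv2d(64, out_channels, 3, padding=1))

    def forward(self, rgb):
        return self.decoder(self.encoder(rgb))

normal_net = EncoderDecoder(out_channels=3)    # predicts the surface normal map
boundary_net = EncoderDecoder(out_channels=1)  # predicts the occlusion boundary map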
The training process of the depth super-resolution and completion network (SRC-Net) of the invention is as follows:
(1) Depth super-resolution: the goal of training the Laplacian pyramid network is to learn a mapping function that generates an HR image from an LR (low-resolution) image approximating the high-resolution ground truth. We select 795 depth images from the NYU-v2 dataset of real scenes and 372 depth images from three virtual-room sequences (kt0, kt1 and kt3) of the ICL-NUIM dataset of synthetic scenes as training data. To obtain LR depth images, the HR images are scaled down by downsampling, with downsampling factors set to 2, 4 and 8. For data augmentation, we apply random scaling (scaling factor in [0.5, 1]), random rotation (90, 180, 270 degrees), and horizontal and vertical flipping to the training data, as sketched below.
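A small NumPy/SciPy sketch of this augmentation, assuming nearest-neighbour interpolation so that resampling does not invent depth values:

import numpy as np
from scipy.ndimage import zoom

def augment_depth(depth: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    out = zoom(depth, rng.uniform(0.5, 1.0), order=0)  # random scaling in [0.5, 1]
    out = np.rot90(out, k=rng.integers(1, 4))          # rotation by 90/180/270 degrees
    if rng.random() < 0.5:
        out = np.fliplr(out)                           # horizontal flip
    if rng.random() < 0.5:
        out = np.flipud(out)                           # vertical flip
    return out

aug = augment_depth(np.ones((480, 640), np.float32), np.random.default_rng(0))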
(2) Depth completion: the goal of training the two VGG-16 networks is to learn mapping functions that generate the surface normal map and the occlusion boundary map from the HR color image, approximating the high-resolution ground-truth normals and boundaries. We perform network training using 54,755 RGB-D images from the SUNCG-RGBD dataset of synthetic scenes and 59,743 rendered, completed RGB-D images from the ScanNet dataset of real scenes.
Further, for different learning tasks, the invention employs the following two loss functions:
(1) The loss function for learning the HR depth image uses the Charbonnier penalty; since the network is cascaded, the loss is applied to the output of every stage. The loss function is defined as follows:

Loss = (1/M) Σ_{m=1}^{M} Σ_{l=1}^{L} ρ(D_l^{(m)} - D_l^{*(m)}),  with ρ(x) = sqrt(x² + ε²)    (2);

where D denotes the predicted depth, D* the ground-truth depth, M the number of samples per training batch, L the number of pyramid levels, and ε = 1e-3.
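A minimal PyTorch sketch of this multi-stage Charbonnier loss, assuming the per-stage predictions and ground truths are given as lists of tensors:

import torch

def charbonnier_loss(preds, targets, eps=1e-3):
    """preds/targets: one (B, 1, H, W) tensor per pyramid level (side outputs)."""
    loss = 0.0
    for d, d_star in zip(preds, targets):              # loss at every cascade stage
        loss = loss + torch.sqrt((d - d_star) ** 2 + eps ** 2).mean()
    return loss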
(2) The loss functions for learning the normal map and the occlusion boundary map from HR color images are defined as follows:

L_B = -(1/n) Σ_i [ B_i* log(B_i) + (1 - B_i*) log(1 - B_i) ]    (3);

L_N = (1/n) Σ_i ||N_i - N_i*||²    (4);

where B denotes the predicted boundary and B* the ground-truth boundary; N denotes the predicted normal and N* the ground-truth normal; and n is the number of valid pixels.
In order to effectively detect and remove erroneous outliers in the HR depth image, the HR depth image is normalized, its gradient is computed, outliers are identified by detecting abrupt gradient changes, and the outliers are then removed.
In step 200, removing outliers in the HR depth image based on gradient-sensitive detection to obtain a processed HR depth image specifically includes:
Step 201: calculating a gradient map G_i using the Sobel operator:
g_i(u) = Sobel(u)    (5);
where g_i(u) is the gradient value corresponding to pixel u.
Step 202: computing a mask image M_i according to the gradient sensitivity:
m_i(u) = 0, if g_i(u) ≥ g_h;
m_i(u) = 1, if g_i(u) < g_h    (6);
where m_i(u) is the mask value corresponding to pixel u and g_h is a set gradient threshold.
Step 203: applying an erosion operation to the high-resolution depth map D_i using the mask image M_i, and removing the outliers to obtain the processed HR depth image.
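A hedged NumPy/OpenCV sketch of steps 201 to 203; the gradient threshold g_h and the 3x3 erosion kernel are illustrative assumptions:

import cv2
import numpy as np

def remove_outliers(depth_hr: np.ndarray, g_h: float = 0.1) -> np.ndarray:
    d = depth_hr.astype(np.float32)
    d = (d - d.min()) / max(float(d.max() - d.min()), 1e-6)  # normalize to [0, 1]
    gx = cv2.Sobel(d, cv2.CV_32F, 1, 0)                      # gradient map G_i, eq. (5)
    gy = cv2.Sobel(d, cv2.CV_32F, 0, 1)
    g = np.sqrt(gx * gx + gy * gy)
    mask = (g < g_h).astype(np.uint8)                        # mask m_i(u), eq. (6)
    mask = cv2.erode(mask, np.ones((3, 3), np.uint8))        # erosion around outliers
    return depth_hr * mask                                   # zero out rejected pixels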
Since the color images inevitably contain motion blur when an indoor scene is scanned with a consumer-grade RGB-D camera, directly optimizing the depth with normal and boundary information obtained from blurred color images would degrade the optimization. To guarantee the quality of the depth image, a no-reference blurriness measurement is performed on the color image, and the blurriness information is used to constrain the subsequent depth image completion optimization.
Specifically, in step 400, measuring the blurriness of the HR color image to obtain blurriness information includes:
Step 401: filtering the HR color image in the horizontal and vertical directions with an averaging filter to obtain a re-blurred image.
Step 402: calculating the neighboring-pixel differences of the original HR color image and of the re-blurred image in the horizontal and vertical directions, respectively, to obtain horizontal differences and vertical differences.
Step 403: determining a difference map between the original HR color image and the re-blurred image from the horizontal and vertical differences.
Step 404: summing and normalizing the difference map to obtain a processed map.
Step 405: calculating the blurriness measure Blur of the processed map:
Blur = max(R_H, R_V)    (7);
where R_H is the normalized horizontal difference value and R_V is the normalized vertical difference value.
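The following is a sketch of this no-reference re-blur measure, in the spirit of the blur metric of Crete et al.; the 9-tap averaging filter size is an assumption. The returned value lies in [0, 1], larger meaning blurrier:

import numpy as np
from scipy.ndimage import uniform_filter1d

def blurriness(gray: np.ndarray) -> float:
    f = gray.astype(np.float32)
    b_v = uniform_filter1d(f, size=9, axis=0)      # re-blur vertically (step 401)
    b_h = uniform_filter1d(f, size=9, axis=1)      # re-blur horizontally
    d_fv = np.abs(np.diff(f, axis=0)); d_fh = np.abs(np.diff(f, axis=1))      # step 402
    d_bv = np.abs(np.diff(b_v, axis=0)); d_bh = np.abs(np.diff(b_h, axis=1))
    v_v = np.maximum(0.0, d_fv - d_bv).sum()       # variation destroyed by re-blurring
    v_h = np.maximum(0.0, d_fh - d_bh).sum()       # (steps 403-404)
    r_v = (d_fv.sum() - v_v) / max(d_fv.sum(), 1e-6)
    r_h = (d_fh.sum() - v_h) / max(d_fh.sum(), 1e-6)
    return float(max(r_h, r_v))                    # Blur = max(R_H, R_V), eq. (7)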
Further, in step 500, optimizing the HR depth image according to the normal map, the boundary map and the blurriness information to obtain the completed HR depth map specifically includes:
Step 501: constructing an objective function according to the normal map, the boundary map and the blurriness information. In this embodiment, three optimization terms are obtained in total, namely the first optimization term E_D, the second optimization term E_S and the third optimization term E_N.
Step 502: optimizing the HR depth image according to the objective function to obtain the completed HR depth image.
Preferably, optimizing the HR depth image according to the objective function specifically includes:
Step 5021: determining an optimization function E from the objective function; the objective function comprises the first, second and third optimization terms, and the optimization function is their weighted sum.
Specifically, as in the following formula (8):

E = λ_D E_D + λ_S E_S + λ_N E_N B_n B_b

E_D = Σ_p ||D(p) - D_o(p)||²

E_S = Σ_{p,q} ||D(p) - D(q)||²

E_N = Σ_{p,q} ||⟨v(p,q), N(p)⟩||²    (8);

where the first optimization term E_D represents the distance between the estimated depth D(p) and the observed depth D_o(p) at pixel p; the third optimization term E_N represents the consistency between the estimated depth and the predicted surface normal N(p); the second optimization term E_S encourages neighboring pixels to take the same depth value, where v(p,q) denotes the tangent vector between pixel p and its neighboring pixel q; B_n ∈ [0,1] weights the normal term according to the predicted probability that a pixel lies on an occlusion boundary B(p); B_b ∈ [0,1] weights the normal term according to the blurriness of the color image; and λ_D, λ_S and λ_N are preset reference coefficients.
Step 5022: optimizing according to the optimization function to obtain the completed HR depth map.
In this embodiment, λ_D is 1000, λ_S is 1, and λ_N is 0.001.
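As a rough illustration only, formula (8) can be minimized as a sparse linear least-squares problem over the per-pixel depths. The sketch below uses 4-connected neighborhoods and this embodiment's λ values; the way the normal term is linearized from the predicted normal's components is a simplifying assumption of the sketch, not the patent's exact formulation.

import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import lsqr

def optimize_depth(d_obs, normals, b_n, b_b, lam=(1000.0, 1.0, 0.001)):
    """d_obs: observed depth (0 = hole); normals: HxWx3 normal map;
    b_n: HxW boundary weights in [0,1]; b_b: scalar blurriness weight in [0,1]."""
    lam_d, lam_s, lam_n = lam
    h, w = d_obs.shape
    idx = lambda y, x: y * w + x
    A, b, r = lil_matrix((3 * h * w, h * w)), [], 0
    for y in range(h):
        for x in range(w):
            p = idx(y, x)
            if d_obs[y, x] > 0:                       # E_D: stay near observations
                A[r, p] = np.sqrt(lam_d); b.append(np.sqrt(lam_d) * d_obs[y, x]); r += 1
            if x + 1 < w:                             # E_S: neighboring depths agree
                A[r, p], A[r, idx(y, x + 1)] = np.sqrt(lam_s), -np.sqrt(lam_s)
                b.append(0.0); r += 1
            if y + 1 < h:                             # E_N: normal consistency,
                wgt = np.sqrt(lam_n * b_n[y, x] * b_b)  # down-weighted by B_n and B_b
                nz = normals[y, x, 2] if abs(normals[y, x, 2]) > 1e-3 else 1e-3
                A[r, p], A[r, idx(y + 1, x)] = wgt, -wgt
                b.append(wgt * normals[y, x, 1] / nz); r += 1
    d = lsqr(A.tocsr()[:r], np.asarray(b))[0]         # written for clarity, not speed
    return d.reshape(h, w)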
In addition, the invention also provides a depth map super-resolution completion system for three-dimensional reconstruction. As shown in fig. 2, the system comprises a super-resolution processing unit 1, an outlier rejection unit 2, an information extraction unit 3, a blurriness measurement unit 4, and an optimization unit 5.
Specifically, the super-resolution processing unit 1 is configured to learn from an original LR depth image to be completed through SRC-Net to obtain an HR depth image.
The outlier rejection unit 2 is configured to remove outliers in the HR depth image based on gradient-sensitive detection to obtain a processed HR depth image.
The information extraction unit 3 is configured to learn from the HR color image through SRC-Net and determine a normal map and a boundary map.
The blurriness measurement unit 4 is configured to measure the blurriness of the HR color image to obtain blurriness information.
The optimization unit 5 is configured to optimize the HR depth image according to the normal map, the boundary map and the blurriness information to obtain a completed HR depth map.
Further, the invention also provides an indoor scene three-dimensional reconstruction method. As shown in fig. 3, the indoor scene three-dimensional reconstruction method of the present invention includes:
Step 600: calculating, for each pixel in the completed HR depth map, a three-dimensional point and a normal vector in the corresponding camera coordinate system.
Step 700: estimating the pose of the current frame camera with the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors.
Step 800: performing volumetric data fusion through TSDF model iteration according to the camera trajectory information to obtain fused data.
Step 900: performing surface estimation according to the fused data and the pose of the current frame camera to obtain an indoor scene three-dimensional model.
The original LR depth image acquired by a consumer-grade RGB-D camera is processed through steps 100 to 500 to obtain a completed HR depth image. Then, for each pixel u in the HR depth map, the three-dimensional point v_i(u) and normal vector n_i(u) in the corresponding camera coordinate system are calculated as follows:

v_i(u) = z_i(u) K^{-1} [u, 1]^T

n_i(u) = (v_i(u+1, v) - v_i(u, v)) × (v_i(u, v+1) - v_i(u, v))    (9);

where K is the camera intrinsic matrix obtained by calibration and z_i(u) is the depth value corresponding to pixel u.
In step 700, the current depth map is registered, via the ICP (Iterative Closest Point) algorithm, against the depth map generated by ray casting the three-dimensional model from the previous frame's viewpoint, yielding the pose of the current frame camera.
The transformation matrix T_{g,i} of the current frame camera pose relative to the global coordinate system is computed by minimizing the point-to-plane distance error E(T_{g,i}):

E(T_{g,i}) = Σ_u ( (T_{g,i} ṽ_i(u) - v̂_{i-1}(û))^T n̂_{i-1}(û) )²    (10);

where û is the projected pixel of pixel u, ṽ_i(u) is the homogeneous-coordinate form of the three-dimensional point v_i(u), and v̂_{i-1}(û) and n̂_{i-1}(û) are the three-dimensional point and normal vector predicted from the previous frame.
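A minimal sketch of the point-to-plane error of equation (10), assuming the correspondences (current points, previous-frame points and normals) have already been established by projective data association:

import numpy as np

def point_to_plane_error(T, pts_cur, pts_prev, normals_prev):
    """T: 4x4 camera-to-global transform; pts_*: (N, 3); normals_prev: (N, 3)."""
    hom = np.c_[pts_cur, np.ones(len(pts_cur))]          # homogeneous form of v_i(u)
    p = (hom @ T.T)[:, :3]                               # T_{g,i} applied to each point
    return float(np.sum(((p - pts_prev) * normals_prev).sum(axis=1) ** 2))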
In step 800, based on the estimation result of the camera pose, the HR depth image of each frame is fused by using a TSDF (Truncated Signed Distance Function) model.
The three-dimensional space is represented by a voxel grid of resolution m, i.e., the space is divided into m³ voxels, and each voxel v stores two values: the truncated signed distance function f_i(v) and its weight w_i(v). The truncated signed distance function f_i(v) is defined as follows:

f_i(v) = [K^{-1} z_i(u) [u^T, 1]^T]_z - [v_i]_z    (11);

where v_i denotes the voxel position in the camera coordinate system of frame i. f_i(v) is the distance from the voxel to the object model surface, its sign indicates whether the voxel lies on the occluded side or the visible side of the surface, and zero crossings are points on the surface. In this embodiment, the weights are averaged and fixed to 1.
The iterative formula for TSDF volumetric data fusion is as follows:

F_i(v) = (W_{i-1}(v) F_{i-1}(v) + w_i(v) f_i(v)) / (W_{i-1}(v) + w_i(v)),
W_i(v) = W_{i-1}(v) + w_i(v)    (12).
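A small sketch of this running weighted average, with the per-frame weight w_i(v) fixed to 1 as in this embodiment; the grid layout and the validity mask are assumptions of the sketch:

import numpy as np

def fuse_tsdf(F, W, f_new, valid):
    """F, W: global TSDF and weight grids; f_new: this frame's TSDF samples;
    valid: boolean mask of voxels observed in this frame."""
    F[valid] = (W[valid] * F[valid] + f_new[valid]) / (W[valid] + 1.0)  # eq. (12)
    W[valid] += 1.0                                                     # w_i(v) = 1
    return F, W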
and performing light projection on the volume data obtained by fusion under the posture of the current frame camera to obtain surface point cloud, registering the estimated surface and the depth map acquired in real time in a camera tracking part, and finally extracting through a MarchingCube algorithm to obtain a three-dimensional model.
The invention provides a deep-learning-based depth image super-resolution and completion method, comprising the depth image super-resolution and completion network (SRC-Net), the gradient-sensitive outlier rejection algorithm, and the blurriness- and boundary-constrained adaptive depth image optimization algorithm, applied in an offline indoor scene three-dimensional reconstruction system. Super-resolution and completion results on standard datasets show that the method can effectively turn an original low-resolution depth image into a high-resolution, completed depth image. Three-dimensional reconstruction results on indoor scene data from standard datasets show that the indoor scene three-dimensional reconstruction system obtains complete and accurate high-quality indoor scene models, with good robustness and extensibility.
Preferably, the invention also provides an indoor scene three-dimensional reconstruction system. As shown in fig. 4, the indoor scene three-dimensional reconstruction system of the present invention includes a preprocessing unit 6, an estimating unit 7, a fusing unit 8, and a modeling unit 9.
Specifically, the preprocessing unit 6 is configured to calculate, for each pixel in the completed HR depth map, a three-dimensional point and a normal vector in the corresponding camera coordinate system.
The estimation unit 7 is configured to estimate the pose of the current frame camera with the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors.
The fusion unit 8 is configured to perform volumetric data fusion through truncated signed distance function (TSDF) model iteration according to the camera trajectory information to obtain fused data.
The modeling unit 9 is configured to perform surface estimation according to the fused data and the pose of the current frame camera to obtain an indoor scene three-dimensional model.
Compared with the prior art, the depth map super-resolution completion system for three-dimensional reconstruction and the indoor scene three-dimensional reconstruction method and system have the same beneficial effects as the method described above, which are not repeated here.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (10)

1. A depth map super-resolution completion method for three-dimensional reconstruction, the method comprising:
learning from an original low-resolution (LR) depth image to be completed through the depth super-resolution and completion network SRC-Net to obtain a high-resolution (HR) depth image;
removing outliers in the HR depth image based on gradient-sensitive detection to obtain a processed HR depth image;
learning from the HR color image through SRC-Net to determine a normal map and a boundary map;
measuring the blurriness of the HR color image to obtain blurriness information;
and optimizing the HR depth image according to the normal map, the boundary map and the blurriness information to obtain a completed HR depth image.
2. The depth map super-resolution completion method for three-dimensional reconstruction according to claim 1, wherein removing outliers in the HR depth image based on gradient-sensitive detection to obtain the processed HR depth image specifically comprises:
calculating a gradient map G_i using the Sobel operator:
g_i(u) = Sobel(u);
where g_i(u) is the gradient value corresponding to pixel u;
computing a mask image M_i according to the gradient sensitivity:
m_i(u) = 0, if g_i(u) ≥ g_h;
m_i(u) = 1, if g_i(u) < g_h;
where m_i(u) is the mask value corresponding to pixel u and g_h is a set gradient threshold;
and applying an erosion operation to the high-resolution depth map D_i using the mask image M_i and removing the outliers to obtain the processed HR depth image.
3. The depth map super-resolution completion method for three-dimensional reconstruction according to claim 1, wherein measuring the blurriness of the HR color image to obtain blurriness information specifically comprises:
filtering the HR color image in the horizontal direction and the vertical direction with an averaging filter to obtain a re-blurred image;
calculating the neighboring-pixel differences of the original HR color image and of the re-blurred image in the horizontal and vertical directions, respectively, to obtain horizontal differences and vertical differences;
determining a difference map between the original HR color image and the re-blurred image from the horizontal and vertical differences;
summing and normalizing the difference map to obtain a processed map;
and calculating the blurriness measure Blur of the processed map:
Blur = max(R_H, R_V)
where R_H is the normalized horizontal difference value and R_V is the normalized vertical difference value.
4. The depth map super-resolution completion method for three-dimensional reconstruction according to claim 1, wherein optimizing the HR depth image according to the normal map, the boundary map and the blurriness information to obtain the completed HR depth map specifically comprises:
constructing an objective function according to the normal map, the boundary map and the blurriness information;
and optimizing the HR depth image according to the objective function to obtain the completed HR depth image.
5. The depth map super-resolution completion method for three-dimensional reconstruction according to claim 4, wherein optimizing the HR depth image according to the objective function to obtain the completed HR depth map specifically comprises:
determining an optimization function from the objective function; the objective function comprises a first optimization term, a second optimization term and a third optimization term, and the optimization function is the weighted sum of the three terms:

E = λ_D E_D + λ_S E_S + λ_N E_N B_n B_b

E_D = Σ_p ||D(p) - D_o(p)||²

E_S = Σ_{p,q} ||D(p) - D(q)||²

E_N = Σ_{p,q} ||⟨v(p,q), N(p)⟩||²

where the first optimization term E_D represents the distance between the estimated depth D(p) and the observed depth D_o(p) at pixel p; the third optimization term E_N represents the consistency between the estimated depth and the predicted surface normal N(p); the second optimization term E_S encourages neighboring pixels to take the same depth value, where v(p,q) denotes the tangent vector between pixel p and its neighboring pixel q; B_n ∈ [0,1] weights the normal term according to the predicted probability that a pixel lies on an occlusion boundary B(p); B_b ∈ [0,1] weights the normal term according to the blurriness of the color image; and λ_D, λ_S and λ_N are preset reference coefficients;
and optimizing according to the optimization function to obtain the completed HR depth map.
6. The depth map super-resolution completion method for three-dimensional reconstruction according to claim 5, wherein λ_D is 1000, λ_S is 1, and λ_N is 0.001.
7. A depth map super-resolution completion system for three-dimensional reconstruction, the system comprising:
a super-resolution processing unit for learning from an original LR depth image to be completed through SRC-Net to obtain an HR depth image;
an outlier rejection unit for removing outliers in the HR depth image based on gradient-sensitive detection to obtain a processed HR depth image;
an information extraction unit for learning from the HR color image through SRC-Net and determining a normal map and a boundary map;
a blurriness measurement unit for measuring the blurriness of the HR color image to obtain blurriness information;
and an optimization unit for optimizing the HR depth image according to the normal map, the boundary map and the blurriness information to obtain a completed HR depth map.
8. An indoor scene three-dimensional reconstruction method is characterized by comprising the following steps:
obtaining a completed HR depth map by the depth map super-resolution completion method for three-dimensional reconstruction of any one of claims 1-6;
calculating, for each pixel in the completed HR depth map, a three-dimensional point and a normal vector in the corresponding camera coordinate system;
estimating the pose of the current frame camera with the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors;
performing volumetric data fusion through iterative updates of a truncated signed distance function (TSDF) model according to the camera trajectory information to obtain fused data;
and performing surface estimation according to the fused data and the pose of the current frame camera to obtain an indoor scene three-dimensional model.
9. The method of claim 8, wherein the three-dimensional point v_i(u) and normal vector n_i(u) in the corresponding camera coordinate system are calculated according to the following formulas:

v_i(u) = z_i(u) K^{-1} [u, 1]^T

n_i(u) = (v_i(u+1, v) - v_i(u, v)) × (v_i(u, v+1) - v_i(u, v));

where K is the camera intrinsic matrix obtained by calibration and z_i(u) is the depth value corresponding to pixel u.
10. An indoor scene three-dimensional reconstruction system, characterized in that the indoor scene three-dimensional reconstruction system comprises:
a preprocessing unit configured to calculate three-dimensional points and normal vectors in the corresponding camera coordinate system for each pixel in a completed HR depth map obtained by the depth map super-resolution completion method for three-dimensional reconstruction according to any one of claims 1 to 6;
an estimation unit configured to estimate the pose of the current frame camera with the Iterative Closest Point (ICP) algorithm according to the three-dimensional points and normal vectors;
a fusion unit configured to perform volumetric data fusion through truncated signed distance function (TSDF) model iteration according to the camera trajectory information to obtain fused data;
and a modeling unit configured to perform surface estimation according to the fused data and the pose of the current frame camera to obtain an indoor scene three-dimensional model.




Legal Events

  • PB01: Publication
  • SE01: Entry into force of request for substantive examination
  • GR01: Patent grant