CN117078851A - Single-view three-dimensional point cloud reconstruction method - Google Patents
- Publication number
- CN117078851A (application number CN202311030758.5A)
- Authority
- CN
- China
- Prior art keywords
- depth
- point cloud
- features
- dimensional point
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention discloses a single-view three-dimensional point cloud reconstruction method, which comprises the following steps: constructing a depth map acquisition module with the same network structure as a high-quality monocular depth estimation network, which performs depth estimation on the input color image I to obtain the corresponding depth map I_D; constructing a depth image feature learning module, which takes the depth map estimated from the input color image I as input and mines depth information for assisting the three-dimensional point cloud reconstruction process; designing a color-depth information fusion module, which takes the M levels of image features {R_m}_{m=1}^{M} extracted by the color image feature learning module and the M levels of depth features {D_m}_{m=1}^{M} extracted by the depth feature learning module as input, and fuses depth features and image features at each level with a shallow-to-deep multi-stage fusion strategy; and constructing a three-dimensional point cloud reconstruction module, which takes the learned M-th-level RGB-D fusion feature F̂_M as input and realizes the reconstruction of the three-dimensional point cloud by mapping the feature to 3D space.
Description
Technical Field
The invention relates to the field of deep learning and three-dimensional point cloud reconstruction, in particular to a single-view three-dimensional point cloud reconstruction method.
Background
With the rapid development of three-dimensional vision, generative artificial intelligence, and related fields, three-dimensional point cloud reconstruction technology has been widely applied in scenarios such as autonomous driving, virtual reality, and robot navigation. As an important branch of three-dimensional point cloud reconstruction research, single-view three-dimensional point cloud reconstruction aims to infer the three-dimensional geometry and structure of an object from a single two-dimensional view, and is receiving more and more attention from researchers.
In recent years, owing to the strong feature learning capability of deep neural networks, single-view three-dimensional point cloud reconstruction based on deep learning has become the mainstream research direction in this field. Fan et al. proposed PSGN (point set generation network), the first network to reconstruct a three-dimensional point cloud with a deep learning method; it reconstructs with an encoder-decoder that introduces an hourglass structure, has a high degree of flexibility, combines global and local information well, and excels at reconstructing objects with complex structures. Mandikal et al. proposed a reconstruction network combining image encoding and point cloud encoding, which first trains a point cloud autoencoder to learn a latent representation space of the point cloud, then maps the two-dimensional image into that latent space with an image encoder, and finally achieves single-view point cloud reconstruction through the image encoder and the point cloud decoder. Mandikal et al. also proposed a location-aware segmentation loss to better constrain the network to generate the three-dimensional point cloud. Further, Jiang et al. proposed a geometric adversarial loss that regularizes the reconstructed point cloud as a whole by maintaining geometric consistency between the predicted and real point clouds across different viewpoints.
The above methods have made certain research progress in the field of single-view three-dimensional point cloud reconstruction. However, a single view lacks an expression of object depth, so a three-dimensional point cloud reconstructed from a single view alone is often insufficiently accurate. Therefore, how to acquire depth information to assist the reconstruction process and further improve the reconstruction quality of the three-dimensional point cloud is of significant research interest.
Disclosure of Invention
In order to alleviate the problem that a single view expresses object depth information insufficiently, the invention provides a single-view three-dimensional point cloud reconstruction method, which takes a depth map estimated from a single color image as an aid to supplement the necessary depth information in the reconstruction process, thereby realizing higher-quality three-dimensional point cloud reconstruction, as described in detail below:
a single view three-dimensional point cloud reconstruction method, the method comprising:
constructing a color image feature learning module, which takes a single color image I as input and mines the geometric features and semantic features of the input view;
constructing a depth map acquisition module with the same network structure as a high-quality monocular depth estimation network, which performs depth estimation on the input color image I to obtain the corresponding depth map I_D;
Constructing a depth image feature learning module, taking a depth image estimated from an input color image I as input, and mining depth information for assisting a three-dimensional point cloud reconstruction process;
designing a color-depth information fusion module, which takes the M levels of image features {R_m}_{m=1}^{M} extracted by the color image feature learning module and the M levels of depth features {D_m}_{m=1}^{M} extracted by the depth feature learning module as input, and fuses depth features and image features at each level with a shallow-to-deep multi-stage fusion strategy;

constructing a three-dimensional point cloud reconstruction module, which takes the learned M-th-level RGB-D fusion feature F̂_M as input and realizes the reconstruction of the three-dimensional point cloud by mapping the feature to 3D space.
The depth map acquisition module is as follows:
mapping from three-dimensional space to depth space by using the real three-dimensional point cloud data and camera parameters in the ShapeNet dataset, and converting the three-dimensional point cloud into a depth map so as to generate depth map pseudo-labels;

the input color images and the corresponding depth map pseudo-labels are input in pairs into the depth map acquisition module for supervised fine-tuning training;

the input color image I is sent into the module for depth estimation to obtain the depth map I_D corresponding to I.
Wherein, the color-depth information fusion module is:
R_m, D_m, and the RGB-D feature F̂_{m-1} obtained by feature fusion at the (m-1)-th level are sent into a concatenation layer to be concatenated along the channel dimension, and the concatenated features are adaptively fused through a convolution layer to obtain the preliminarily fused m-th-level RGB-D feature F_m:

F_m = Conv([R_m, D_m, F̂_{m-1}])

wherein [·] represents the concatenation operation and Conv(·) represents a convolution operation with a 3×3 kernel;

F_m is sent into an attention unit for feature selection, the attention unit consisting of a global pooling layer, a fully connected layer, and a Sigmoid layer; the global pooling layer is employed to aggregate the global spatial information of F_m into a feature descriptor:

z_m = (1 / (h_m · w_m)) Σ_{i=1}^{h_m} Σ_{j=1}^{w_m} F_m(i, j)

wherein z_m represents the aggregated feature, and h_m, w_m, and c_m respectively denote the height, width, and number of channels of the feature F_m;

based on z_m, the fully connected layer is adopted to capture the interdependence among feature channels and obtain the attention map of F_m, and the channel values of the obtained attention map are normalized to (0, 1) by the Sigmoid layer; the attention map A_m finally learned by the attention unit is expressed as:

A_m = Sigmoid(FC(z_m))

wherein FC(·) represents the mapping of the fully connected layer and Sigmoid(·) represents the normalization realized by the Sigmoid function;

based on the obtained attention map A_m, the channel dimension of F_m is weighted, and the attention-enhanced RGB-D fusion feature F'_m is output:

F'_m = f_scale(F_m, A_m)

wherein f_scale(·) denotes the channel-wise multiplication between the attention map and the original feature;

an adaptive block composed of several convolution layers is adopted for adaptive learning, finally obtaining the m-th-level RGB-D fusion feature F̂_m:

F̂_m = Conv_n(F'_m)

the color-depth information fusion module performs M levels of feature fusion from shallow to deep and finally takes the M-th-level RGB-D fusion feature F̂_M as output; this feature simultaneously contains image geometric information, semantic information, and depth information.
The technical scheme provided by the invention has the beneficial effects that:
1. The invention provides a single-view three-dimensional point cloud reconstruction method, which uses a depth map estimated from the input color image as an aid to obtain the necessary depth information in the reconstruction process, effectively alleviating the problem that a single view expresses object depth information insufficiently and improving the quality of single-view three-dimensional point cloud reconstruction;

2. The invention constructs a depth map acquisition module, which generates depth map pseudo-labels by mapping real three-dimensional point cloud data, so as to constrain the fine-tuning training of the depth estimation process, thereby estimating a more accurate depth map from the input color image and further mining more effective depth information;

3. The invention designs a color-depth information fusion module, which adopts a shallow-to-deep multi-stage fusion strategy to fully aggregate the color information learned from the input view and the depth information mined from the depth map, thereby capturing RGB-D fusion features more beneficial to the reconstruction process.
Drawings
Fig. 1 is a flowchart of a single view three-dimensional point cloud reconstruction method.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in further detail below.
The following describes a specific embodiment of a single-view three-dimensional point cloud reconstruction method according to the present invention by way of example.
1. Building color image feature learning module
First, a color image feature learning module is constructed. It takes a single color image I as input and aims to mine the geometric features and semantic features of the input view. Specifically, the module extracts image features at M levels in total, {R_m}_{m=1}^{M}, where R_m represents the learned image feature of the m-th level; M is set to 5.

The feature extraction process is formulated as:

{R_m}_{m=1}^{M} = E_C(I)

wherein E_C(·) represents the color image feature learning module, which consists of the M convolution blocks and corresponding pooling layers of the VGG16 network.
The VGG16 network is well known to those skilled in the art and is not described in detail in the embodiments of the present invention.
2. Building depth map acquisition module
The depth map acquisition module is constructed with the same network structure as DenseDepth (a high-quality monocular depth estimation network) and obtains the depth map I_D corresponding to the color image I by performing depth estimation on the input color image I. Specifically, the constructed depth map acquisition module is first pre-trained on the autonomous driving scene dataset KITTI and then fine-tuned on the three-dimensional model dataset ShapeNet to achieve more accurate depth estimation. However, the fine-tuning process requires real depth maps as supervision signals, while the ShapeNet dataset does not provide ground-truth depth map labels corresponding to the color images. To solve this problem, the real three-dimensional point cloud data and camera parameters in the ShapeNet dataset are used to perform a mapping from three-dimensional space to depth space, converting the three-dimensional point cloud into a depth map so as to generate depth map pseudo-labels. Then, the input color images and the corresponding depth map pseudo-labels are fed in pairs into the depth map acquisition module for supervised fine-tuning. After the fine-tuning of the depth map acquisition module is completed, the input color image I is sent into the module for depth estimation to obtain the depth map I_D corresponding to I, which is then used for mining depth information.
The DenseDepth network, the KITTI dataset, and the ShapeNet dataset are all known to those skilled in the art and are not described in detail in the embodiments of the present invention.
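The pseudo-label generation described above — mapping the ground-truth point cloud into depth space with the camera parameters — amounts to a perspective projection with z-buffering. A minimal sketch, in which the intrinsics K, the extrinsics (R, t), and the image size are illustrative assumptions rather than the patent's actual values:

```python
import numpy as np

def points_to_depth_map(points, K, R, t, height, width):
    """points: (N, 3) world coords -> (height, width) depth map, 0 = no hit."""
    cam = points @ R.T + t                 # world -> camera coordinates
    z = cam[:, 2]
    valid = z > 1e-6                       # keep points in front of the camera
    cam, z = cam[valid], z[valid]
    uv = cam @ K.T                         # perspective projection
    u = np.round(uv[:, 0] / z).astype(int)
    v = np.round(uv[:, 1] / z).astype(int)
    depth = np.zeros((height, width), dtype=np.float64)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    for ui, vi, zi in zip(u[inside], v[inside], z[inside]):
        if depth[vi, ui] == 0 or zi < depth[vi, ui]:
            depth[vi, ui] = zi             # z-buffer: nearest surface wins
    return depth

# Toy example: two points on the optical axis; the nearer one should win.
K = np.array([[100.0, 0, 32], [0, 100.0, 32], [0, 0, 1]])
depth = points_to_depth_map(np.array([[0.0, 0, 2.0], [0.0, 0, 1.0]]),
                            K, np.eye(3), np.zeros(3), 64, 64)
print(depth[32, 32])  # 1.0 (the nearer point)
```

Real pseudo-label code would also densify or hole-fill the sparse projected map; the patent does not specify that step, so it is omitted here.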
3. Building depth image feature learning module
The depth image feature learning module takes the depth map estimated from the input color image I as input to mine depth information for assisting the three-dimensional point cloud reconstruction process. Specifically, the constructed depth image feature learning module first encodes the estimated single-channel depth map I_D into a three-channel representation and feeds it into a depth feature extractor to mine the depth features of I_D. The three-channel representation comprises: the horizontal disparity, the height above ground, and the angle between the local surface normal at each pixel and the inferred gravity direction.
The feature extractor likewise learns depth features at M levels in total, {D_m}_{m=1}^{M}, where D_m represents the learned depth feature of the m-th level; M is set to 5. The feature extraction process is formulated as:

{D_m}_{m=1}^{M} = E_D(I_D)

wherein E_D(·) represents the depth feature extractor, a VGG16 network retaining 5 convolution blocks and the corresponding pooling layers.
4. Design color-depth information fusion module
In order to more effectively utilize the color image features and the depth features to reconstruct a higher-quality three-dimensional point cloud, a color-depth information fusion module is designed. Specifically, taking the M levels of image features {R_m}_{m=1}^{M} extracted by the color image feature learning module and the M levels of depth features {D_m}_{m=1}^{M} extracted by the depth feature learning module as input, the color-depth information fusion module adopts a shallow-to-deep multi-stage fusion strategy and fuses the depth features and image features at each level.
Taking the m-th-level feature fusion process as an example, R_m, D_m, and the RGB-D feature F̂_{m-1} obtained by feature fusion at the (m-1)-th level are first sent into a concatenation layer to be concatenated along the channel dimension, and the concatenated features are adaptively fused through a convolution layer to obtain the preliminarily fused m-th-level RGB-D feature F_m:

F_m = Conv([R_m, D_m, F̂_{m-1}])

wherein [·] represents the concatenation operation and Conv(·) represents a convolution operation with a 3×3 kernel. In particular, when m = 1, no previous fusion result exists, and F_1 = Conv([R_1, D_1]).
In order to make the fusion process focus on information useful for reconstruction and discard redundant information useless for it, F_m is sent into an attention unit for feature selection. Specifically, the attention unit consists of a global pooling layer, a fully connected layer, and a Sigmoid layer. First, the global pooling layer is employed to aggregate the global spatial information of F_m into a feature descriptor:

z_m = (1 / (h_m · w_m)) Σ_{i=1}^{h_m} Σ_{j=1}^{w_m} F_m(i, j)

wherein z_m represents the aggregated feature, and h_m, w_m, and c_m respectively denote the height, width, and number of channels of the feature F_m.
Second, based on z_m, the fully connected layer is adopted to capture the interdependence among feature channels and obtain the attention map of F_m. Finally, the channel values of the resulting attention map are normalized to (0, 1) by the Sigmoid layer. The attention map A_m finally learned by the attention unit is expressed as:

A_m = Sigmoid(FC(z_m))

wherein FC(·) represents the mapping of the fully connected layer and Sigmoid(·) represents the normalization realized by the Sigmoid function.
Thereafter, based on the obtained attention map A_m, the channel dimension of F_m is weighted, and the attention-enhanced RGB-D fusion feature F'_m is output. This process is formulated as:

F'_m = f_scale(F_m, A_m)

wherein f_scale(·) denotes the channel-wise multiplication between the attention map and the original feature.
After obtaining F'_m, an adaptive block composed of several convolution layers is adopted for adaptive learning, finally yielding the m-th-level RGB-D fusion feature F̂_m:

F̂_m = Conv_n(F'_m)

wherein Conv_n(·) represents the convolution operation of n stacked convolution layers. Specifically, for the 1st-level to the M-th-level fusion, the number of convolution layers n in the adaptive block is set to (2, 3, 3, 3, 1), respectively.
The color-depth information fusion module performs M levels of feature fusion from shallow to deep and finally takes the M-th-level RGB-D fusion feature F̂_M as output; this feature simultaneously contains image geometric information, semantic information, and depth information.
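One fusion level as described above (concatenation, 3×3 convolution, channel attention, adaptive block) can be sketched as follows. The channel widths and the bare convolutions in the adaptive block are simplifying assumptions; the patent does not specify layer widths.

```python
import torch
import torch.nn as nn

class FusionLevel(nn.Module):
    """One level: F_m = Conv([R_m, D_m, F̂_{m-1}]); A_m = Sigmoid(FC(pool(F_m)));
    F'_m = A_m ⊗ F_m; F̂_m = Conv_n(F'_m)."""
    def __init__(self, channels: int, n_adaptive: int):
        super().__init__()
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=3, padding=1)
        self.attention = nn.Sequential(          # global pooling + FC + Sigmoid
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, channels),
            nn.Sigmoid(),
        )
        self.adaptive = nn.Sequential(*[
            nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            for _ in range(n_adaptive)
        ])

    def forward(self, r_m, d_m, fused_prev):
        f_m = self.fuse(torch.cat([r_m, d_m, fused_prev], dim=1))  # F_m
        a_m = self.attention(f_m)                                  # A_m in (0, 1)
        f_att = f_m * a_m[:, :, None, None]                        # channel re-weighting
        return self.adaptive(f_att)                                # F̂_m

level = FusionLevel(channels=64, n_adaptive=2)
out = level(torch.randn(1, 64, 56, 56), torch.randn(1, 64, 56, 56),
            torch.randn(1, 64, 56, 56))
print(out.shape)  # torch.Size([1, 64, 56, 56])
```

Stacking five such levels with n = (2, 3, 3, 3, 1) and feeding each level's output (suitably resized) as the next level's fused_prev mirrors the shallow-to-deep strategy.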
5. Building three-dimensional point cloud reconstruction module
Finally, the three-dimensional point cloud reconstruction module is constructed. Taking the learned M-th-level RGB-D fusion feature F̂_M as input, the module realizes the reconstruction of the three-dimensional point cloud by mapping the feature to 3D space. The reconstruction process is formulated as:

P = REC(F̂_M)

wherein P ∈ R^{N×3} represents the reconstructed three-dimensional point cloud, N denotes the number of spatial points contained in the reconstructed point cloud and is set to 1024, and REC(·) represents the three-dimensional point cloud reconstruction module, which adopts the same structure as the predictor in the classical single-view point cloud reconstruction network PSGN.
The PSGN network is known to those skilled in the art and is not described in detail in the embodiments of the present invention.
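A hedged sketch of the mapping REC(·): flatten the final fused feature and regress the N = 1024 spatial points with fully connected layers, in the spirit of PSGN's point-set predictor. The hidden-layer size and input feature shape are assumptions.

```python
import torch
import torch.nn as nn

class PointCloudPredictor(nn.Module):
    """Maps the fused feature F̂_M to a point set P ∈ R^{N×3}."""
    def __init__(self, feat_dim: int, num_points: int = 1024):
        super().__init__()
        self.num_points = num_points
        self.mlp = nn.Sequential(
            nn.Flatten(),
            nn.Linear(feat_dim, 2048), nn.ReLU(inplace=True),
            nn.Linear(2048, num_points * 3),
        )

    def forward(self, fused_feature: torch.Tensor) -> torch.Tensor:
        pts = self.mlp(fused_feature)
        return pts.view(-1, self.num_points, 3)   # one N×3 cloud per sample

predictor = PointCloudPredictor(feat_dim=512 * 7 * 7)
cloud = predictor(torch.randn(2, 512, 7, 7))
print(cloud.shape)  # torch.Size([2, 1024, 3])
```

PSGN's actual predictor also has a convolutional branch; this fully connected variant only illustrates the feature-to-point-set mapping the patent describes.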
In order to optimize the reconstruction process of the three-dimensional point cloud, the single-view point cloud reconstruction method provided by the embodiment of the invention adopts the Chamfer Distance (CD) loss to constrain the distance between the reconstructed point cloud and the real point cloud. The CD loss is expressed as:

L_CD = Σ_{p_i ∈ P_g} min_{p_j ∈ P_t} ‖p_i − p_j‖₂² + Σ_{p_j ∈ P_t} min_{p_i ∈ P_g} ‖p_j − p_i‖₂²

wherein p_i represents any spatial point in the reconstructed point cloud P_g, and p_j represents any spatial point in the real point cloud P_t. The smaller the value of the CD loss, the smaller the distance between the reconstructed point cloud and the real point cloud, indicating a higher accuracy of the reconstruction process.
Those skilled in the art will appreciate that the drawings are schematic representations of only one preferred embodiment, and that the above-described embodiment numbers are merely for illustration purposes and do not represent advantages or disadvantages of the embodiments.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.
Claims (3)
1. A single view three-dimensional point cloud reconstruction method, the method comprising:
constructing a color image feature learning module, which takes a single color image I as input and mines the geometric features and semantic features of the input view;
constructing a depth map acquisition module with the same network structure as a high-quality monocular depth estimation network, which performs depth estimation on the input color image I to obtain the corresponding depth map I_D;
Constructing a depth image feature learning module, taking a depth image estimated from an input color image I as input, and mining depth information for assisting a three-dimensional point cloud reconstruction process;
designing a color-depth information fusion module, which takes the M levels of image features {R_m}_{m=1}^{M} extracted by the color image feature learning module and the M levels of depth features {D_m}_{m=1}^{M} extracted by the depth feature learning module as input, and fuses depth features and image features at each level with a shallow-to-deep multi-stage fusion strategy; and

constructing a three-dimensional point cloud reconstruction module, which takes the learned M-th-level RGB-D fusion feature F̂_M as input and realizes the reconstruction of the three-dimensional point cloud by mapping the feature to 3D space.
2. The single-view three-dimensional point cloud reconstruction method according to claim 1, wherein the depth map acquisition module is:
mapping from three-dimensional space to depth space by using the real three-dimensional point cloud data and camera parameters in the ShapeNet dataset, and converting the three-dimensional point cloud into a depth map so as to generate depth map pseudo-labels;

inputting the color images and the corresponding depth map pseudo-labels in pairs into the depth map acquisition module for supervised fine-tuning training; and

sending the input color image I into the module for depth estimation to obtain the depth map I_D corresponding to I.
3. The method for reconstructing a single-view three-dimensional point cloud according to claim 1, wherein the color-depth information fusion module is:
R_m, D_m, and the RGB-D feature F̂_{m-1} obtained by feature fusion at the (m-1)-th level are sent into a concatenation layer to be concatenated along the channel dimension, and the concatenated features are adaptively fused through a convolution layer to obtain the preliminarily fused m-th-level RGB-D feature F_m:

F_m = Conv([R_m, D_m, F̂_{m-1}])

wherein [·] represents the concatenation operation and Conv(·) represents a convolution operation with a 3×3 kernel;

F_m is sent into an attention unit for feature selection, the attention unit consisting of a global pooling layer, a fully connected layer, and a Sigmoid layer; the global pooling layer is employed to aggregate the global spatial information of F_m into a feature descriptor:

z_m = (1 / (h_m · w_m)) Σ_{i=1}^{h_m} Σ_{j=1}^{w_m} F_m(i, j)

wherein z_m represents the aggregated feature, and h_m, w_m, and c_m respectively denote the height, width, and number of channels of the feature F_m;

based on z_m, the fully connected layer is adopted to capture the interdependence among feature channels and obtain the attention map of F_m, and the channel values of the obtained attention map are normalized to (0, 1) by the Sigmoid layer; the attention map A_m finally learned by the attention unit is expressed as:

A_m = Sigmoid(FC(z_m))

wherein FC(·) represents the mapping of the fully connected layer and Sigmoid(·) represents the normalization realized by the Sigmoid function;

based on the obtained attention map A_m, the channel dimension of F_m is weighted, and the attention-enhanced RGB-D fusion feature F'_m is output:

F'_m = f_scale(F_m, A_m)

wherein f_scale(·) denotes the channel-wise multiplication between the attention map and the original feature;

an adaptive block composed of several convolution layers is adopted for adaptive learning, finally obtaining the m-th-level RGB-D fusion feature F̂_m;

the color-depth information fusion module performs M levels of feature fusion from shallow to deep and finally takes the M-th-level RGB-D fusion feature F̂_M as output; this feature simultaneously contains image geometric information, semantic information, and depth information.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311030758.5A | 2023-08-15 | 2023-08-15 | Single-view three-dimensional point cloud reconstruction method |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN117078851A | 2023-11-17 |
Family
ID=88707319
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202311030758.5A | Single-view three-dimensional point cloud reconstruction method | 2023-08-15 | 2023-08-15 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN117078851A (en) |

- 2023-08-15: Application CN202311030758.5A filed in CN; published as CN117078851A, status Pending
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |