CN112085776A - Method for unsupervised monocular image scene depth estimation by the direct method

Method for unsupervised monocular image scene depth estimation by the direct method

Info

Publication number
CN112085776A
Authority
CN
China
Prior art keywords
image
convolution
depth estimation
deconvolution
calculating
Prior art date
Legal status
Granted
Application number
CN202010754803.1A
Other languages
Chinese (zh)
Other versions
CN112085776B (en)
Inventor
张治国
孙业昊
孙浩然
王海霞
卢晓
盛春阳
李玉霞
Current Assignee
Shandong University of Science and Technology
Original Assignee
Shandong University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Shandong University of Science and Technology
Priority to CN202010754803.1A
Publication of CN112085776A
Application granted
Publication of CN112085776B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/50: Depth or shape recovery
    • G06T 7/55: Depth or shape recovery from multiple images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/70: Determining position or orientation of objects or cameras

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a direct method unsupervised monocular image scene depth estimation method, which belongs to the fields of computer vision and depth estimation and comprises the following steps: constructing a neural network; calculating an image reprojection error; calculating an image mask and updating the camera pose. The method overcomes drawbacks of monocular image depth estimation such as high requirements on the environment, susceptibility to interference from low-texture regions, and poor camera pose estimation accuracy. By combining the traditional monocular depth estimation problem with a visual odometer, it not only significantly improves depth estimation accuracy but also assists the positioning and navigation of a moving vehicle. The method has high accuracy, strong flexibility, and a wide application range; it can be used for environment perception, collision avoidance, and positioning and navigation of devices such as autonomous vehicles and mobile robots, and supports a variety of application scenarios.

Description

Method for unsupervised monocular image scene depth estimation by the direct method
Technical Field
The invention belongs to the fields of computer vision, neural networks, and depth estimation, and particularly relates to a direct method unsupervised monocular image scene depth estimation method.
Background
Driven by related technologies such as neural networks and sensors, depth estimation is developing rapidly and is widely applied in fields such as intelligent robots, pedestrian recognition, face unlocking, VR applications, and autonomous driving. The primary task of depth estimation is to estimate the distance from objects in front of the camera to the camera from a single color image captured by the camera.
There are two main ways to obtain the three-dimensional information corresponding to a real scene: one is to use a sensor capable of sensing three-dimensional depth information to acquire the depth information of the scene directly, and the other is to recover the three-dimensional depth information from a two-dimensional image of the scene. Structured light and time-of-flight (ToF) ranging are currently the most common sensor-based depth estimation techniques. A structured-light system consists of a special projector and a camera; the camera acquires the three-dimensional information of the scene by capturing how the specific light pattern emitted by the projector changes after it is projected onto the scene. Structured light uses a special sensor to acquire high-precision scene three-dimensional information and is now widely applied in fields such as face unlocking and secure payment, but its technical principle limits it to short-range ranging and small scenes, so it is not suitable for depth estimation in road scenes. ToF is another commonly used depth acquisition technique, which obtains depth information from the round-trip flight time of a signal between a transmitter and a receiver. The depth cameras commonly used in mobile phones and lidar ranging devices acquire high-precision depth maps of a scene with ToF, and the technique is widely applied in AR, motion-sensing games, and autonomous driving. However, the lidars and sensors required by ToF are expensive, and roof-mounted lidars for autonomous driving are too large and therefore limited, so recovering three-dimensional depth information from two-dimensional images of the scene has gradually become mainstream.
Image-based depth estimation can be divided into supervised and unsupervised learning. Supervised methods depend on a depth map corresponding to each training picture. The depth map is usually acquired by a radar sensor, so the training datasets of neural-network-based supervised methods are usually very small and expensive to collect, which greatly limits the transferability and adaptability of supervised methods.
Depending on the number of cameras needed to recover depth, unsupervised methods can generally be divided into multi-view, binocular, and monocular depth estimation. Traditional depth estimation methods are mainly based on feature-point matching and geometric constraints derived from environment assumptions; binocular and multi-view methods require accurate camera extrinsics, and errors caused by changes in the extrinsics cannot be eliminated. Monocular image depth estimation requires only the camera intrinsics and no feature-matching process, so the algorithm is simpler and has a wider application range.
Disclosure of Invention
Aiming at the technical problems in the prior art, the invention provides a direct method unsupervised monocular image depth estimation method, which not only significantly improves depth estimation accuracy but also assists the positioning and navigation of a moving vehicle; the method is reasonably designed, overcomes the shortcomings of the prior art, and achieves good results.
In order to achieve the purpose, the invention adopts the following technical scheme:
a direct method unsupervised monocular image scene depth estimation method comprises the following steps:
step 1: constructing a depth estimation neural network, taking a monocular continuous image as an input image, and outputting a depth estimation image by using the depth estimation neural network;
step 2: calculating an initial camera pose, calculating a re-projection image by using the previous frame of the input image, the depth estimation image and the camera pose, calculating the re-projection error between the re-projection image and the current frame image, and updating the parameters of the depth estimation neural network by back propagation to obtain a new depth estimation image;
and step 3: calculating an image mask by using the re-projected image and the input image, updating the camera pose estimate between the two adjacent frames, and iterating steps 2 and 3.
Preferably, in step 1, the method specifically comprises the following steps:
step 1.1: the constructed depth estimation neural network is in a full convolution U shape, and the convolution part uses a convolution network in a Res-Net18 network structure as a main structure network;
step 1.2: the deconvolution part mainly consists of a stack of several deconvolution layers and ReLU activation layers, and each deconvolution layer is connected with the last convolution layer of the convolution block of the same scale in the convolution part to form the final deconvolution layer;
step 1.3: monocular continuous images are used as the training data set, and sample-augmentation and generalization operations such as image flipping, gamma transformation and color-channel changes are carried out before the images are input into the depth estimation neural network.
Preferably, in step 2, the method specifically comprises the following steps:
step 2.1: pre-calibrating intrinsic parameters of the camera or obtaining the intrinsic parameters of the camera from the data set;
step 2.2: calculating the initial camera pose by using the current frame image and the previous frame image and adopting a direct method;
step 2.3: calculating a re-projection image by using the previous frame of the input image, the depth estimation image and the camera pose;
step 2.4: calculating a reprojection error between the current frame image and the reprojection image;
step 2.5: and updating parameters of the depth estimation neural network by utilizing the depth estimation image and the image reprojection error output by the depth estimation neural network through back propagation to obtain a new depth estimation image.
Preferably, in step 3, the method specifically comprises the following steps:
step 3.1: calculating a similarity error according to the reprojected image and the current frame image to obtain an image mask;
step 3.2: calculating a Jacobian matrix between the previous frame image and the obtained camera pose;
step 3.3: multiplying the image mask by the Jacobian matrix to obtain an improved Jacobian matrix;
step 3.4: updating the camera pose estimate between the two adjacent frames by using the improved Jacobian matrix;
step 3.5: iterating steps 2 and 3.
Preferably, the convolution part uses the convolutional network of the Res-Net18 structure as its backbone, and the network consists of 5 convolution blocks and 5 deconvolution blocks. The 1st convolution block contains 1 convolution group with 1 convolution layer, whose input is a 3-channel color image and whose output has 64 channels; the 2nd, 3rd, 4th and 5th convolution blocks each contain 2 convolution groups with 2 convolution layers per group, and the numbers of channels in the 2nd, 3rd, 4th and 5th convolution blocks are 64, 128, 256 and 512, respectively. The number of deconvolution groups in each deconvolution block is twice the number of convolution groups in the convolution block of the corresponding scale, and the numbers of channels in the convolution groups of the deconvolution blocks are 512, 256, 128, 64 and 1, respectively.
Preferably, adjacent convolution blocks are connected by a max-pooling operation, which also scales the latter of the two adjacent convolution blocks to one half the size of the former; the latter of two adjacent deconvolution blocks is twice the size of the former. The convolution layers within each convolution group and the deconvolution layers within each deconvolution group are connected by 3 × 3 convolutions. In addition, the content of the second convolution block is copied into the fourth deconvolution block, and the content of the fifth convolution block is copied into the third deconvolution block.
The invention has the following beneficial technical effects:
In the method, a monocular camera acquires a two-dimensional image of the environment, and the designed fully convolutional neural network then computes the corresponding three-dimensional depth map. During network training, the invention uses the designed image reprojection error and image mask, which increases training efficiency and depth estimation accuracy. The image mask used in the method can effectively remove interference factors such as low-texture areas and moving vehicles in a road environment, greatly improves the accuracy of monocular depth estimation, and at the same time reduces both the requirements on the environment and the training cost. In addition, the method can acquire the depth map in front of the camera in real time; it can be used for the navigation and autonomous driving of ground mobile robots and automobiles, and is also suitable for unmanned aerial vehicles flying in the air.
The method not only overcomes the shortcomings of traditional depth estimation methods, such as strong dependence on radar sensors, high environmental requirements, and inflexibility, but also effectively alleviates the hole problem caused by low-texture areas in monocular depth estimation, and it is well suited to the positioning and navigation of unmanned aerial vehicles. It has high accuracy, strong flexibility, and a wide application range; it can be used for the navigation and obstacle avoidance of intelligent mobile devices such as mobile robots and unmanned aerial vehicles in indoor environments, broadening its application scenarios.
Drawings
FIG. 1 is a schematic diagram of an unsupervised monocular depth estimation module based on a full convolution neural network according to the present invention;
FIG. 2 is a schematic diagram of a training process for a full convolution neural network of the present invention;
Detailed Description
The invention is described in further detail below with reference to the following figures and detailed description:
As shown in FIG. 1, a direct method unsupervised monocular image scene depth estimation method includes:
step 1: the constructed depth estimation neural network is in a full convolution U shape and comprises a convolution part and a deconvolution part. The depth estimation neural network takes monocular continuous images as input images, and the depth estimation neural network is used for outputting depth estimation images.
Step 1 comprises the following substeps:
step 1.1: the convolution part uses the convolutional network of the Res-Net18 structure as its backbone. The convolution part is composed of several convolution blocks; adjacent convolution blocks are connected by a max-pooling operation, each convolution block contains several convolution groups, and each convolution group contains several convolution layers. The max-pooling operation also scales the latter of two adjacent convolution blocks to one half the size of the former. As shown in FIG. 2, in this embodiment the number of convolution blocks is 5, and each convolution block contains different convolution groups. The 1st convolution block contains 1 convolution group with 1 convolution layer, whose input is a 3-channel color image and whose output has 64 channels; the 2nd, 3rd, 4th and 5th convolution blocks each contain 2 convolution groups with 2 convolution layers per group, and the numbers of channels in the 2nd, 3rd, 4th and 5th convolution blocks are 64, 128, 256 and 512, respectively. A 7 × 7 convolution kernel is used in the 1st convolution block, 3 × 3 convolution kernels are used in the 2nd, 3rd, 4th and 5th convolution blocks, and a Rectified Linear Unit (ReLU) is used as the activation function between the convolution layers of each convolution block.
Step 1.2: the deconvolution part is composed of a plurality of deconvolution blocks. The deconvolution block with the same scale in the deconvolution part consists of two parts, namely a basic block with the scale in the convolution part being the same as that in the current deconvolution part and a deconvolution block with the current scale. The next deconvolution block has a dimension 2 times the dimension of the last deconvolution block. Each deconvolution block contains several deconvolution groups. As shown in fig. 2, the number of deconvolution blocks in the present invention is 5, the number of deconvolution groups in a deconvolution block is 2 times the number of convolution groups in a corresponding scale convolution block, the number of convolution layers included in each deconvolution group is the same as the number of convolution layers in the corresponding scale convolution group, and the number of channels included in each convolution group in each deconvolution block is 512, 256, 128, 64, and 1, respectively.
Step 1.3: and constructing a depth estimation neural network by utilizing the convolution part and the deconvolution part, inputting the constructed neural network for depth estimation by taking a monocular continuous image as a training data set, and performing sample expansion and generalization operations such as image inversion, gamma conversion, color channel change and the like before inputting the depth estimation neural network.
Step 2: calculating a re-projection image by using a previous frame image, a depth estimation image and a camera pose of an input image, calculating a re-projection error of the re-projection image and a current frame image to obtain a re-projection error of the image, and updating parameters of a depth estimation neural network by using back propagation to obtain a new depth estimation image.
Step 2 comprises the following substeps:
step 2.1: the intrinsic parameters of the camera are calibrated in advance or obtained from the data set; in this embodiment the intrinsics are calibrated offline with a checkerboard using Zhang Zhengyou's camera calibration method.
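A minimal offline checkerboard calibration in the spirit of step 2.1 can be written with OpenCV, which implements Zhang Zhengyou's method; the board dimensions, square size and image paths below are assumptions for illustration.

import glob
import cv2
import numpy as np

pattern = (9, 6)                 # inner corners per row/column (assumed)
square = 0.025                   # checkerboard square edge length in meters (assumed)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

obj_points, img_points, gray = [], [], None
for path in glob.glob("calib/*.png"):            # assumed image location
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K is the 3x3 intrinsic matrix; dist holds the distortion coefficients.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("reprojection RMS:", rms)
print("intrinsics K:\n", K)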
Step 2.2: and calculating the initial camera pose by using the current frame image and the previous frame image and adopting a direct method.
Step 2.3: a re-projection image will be calculated using the last frame image of the input image and the depth estimation image and the camera pose;
step 2.4: the re-projection error between the current frame image and the re-projection image is calculated; this re-projection error serves as the loss function of the depth estimation neural network.
The loss function of the depth estimation neural network adopts the error between the re-projected image and the original image and is defined as:

L = Σ_i | I_i − Î_i |

where I_i and Î_i respectively denote the pixel values of the input image and the re-projected image, and i is the index of the pixel.
Step 2.5: and updating parameters of the depth estimation network through back propagation by using the depth estimation image and the image reprojection error output by the depth estimation neural network to obtain a new depth estimation image.
step 3: the image mask is calculated by using the re-projected image and the input image, the camera pose estimate between the two adjacent frames is updated, and steps 2 and 3 are iterated.
The step 3 comprises the following substeps:
step 3.1: a similarity error is calculated from the re-projected image and the current frame image to obtain the image mask.
Specifically, the image mask is a binary image. Where the re-projection error is smaller than the difference between the two adjacent frames, the mask is set to 1; otherwise it is set to 0. In the method of the invention, the mask is 0 in low-texture areas and moving-vehicle areas and 1 in all other areas.
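One possible form of this binary mask is sketched below: the mask is 1 where the re-projection error is smaller than the difference between the two adjacent frames, and 0 elsewhere. The per-pixel absolute difference used as the similarity error is an assumption; any other similarity error would follow the same pattern.

import torch

def compute_mask(reproj_img, prev_img, cur_img):
    """All inputs: (B, 3, H, W). Returns a (B, 1, H, W) mask of 0s and 1s."""
    reproj_err = (reproj_img - cur_img).abs().mean(dim=1, keepdim=True)  # re-projection error
    frame_diff = (prev_img - cur_img).abs().mean(dim=1, keepdim=True)    # inter-frame difference
    return (reproj_err < frame_diff).float()   # 1 where warping explains the pixel, else 0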
Step 3.3: and calculating a Jacobian matrix between the image of the last frame and the obtained camera pose.
Step 3.4: the image mask is multiplied by the Jacobian matrix to obtain an improved Jacobian matrix.
Step 3.5: and updating the camera pose estimation between the front frame image and the rear frame image by using the improved Jacobian matrix.
Step 3.6: and repeating the iteration steps 2 and 3.
The number of iterations is set to 60,000; the model saved at the 50,000th iteration achieves the best estimation results, and the total training time is 40 hours.
The above is a complete implementation process of the present embodiment.
It is to be understood that the above description is not intended to limit the present invention, and the present invention is not limited to the above examples, and those skilled in the art may make modifications, alterations, additions or substitutions within the spirit and scope of the present invention.

Claims (6)

1. A direct method unsupervised monocular image scene depth estimation method, characterized by comprising the following steps:
step 1: constructing a depth estimation neural network, taking a monocular continuous image as an input image, and outputting a depth estimation image by using the depth estimation neural network;
step 2: calculating an initial camera pose, calculating a re-projection image by using the previous frame of the input image, the depth estimation image and the camera pose, calculating the re-projection error between the re-projection image and the current frame image, and updating the parameters of the depth estimation network by back propagation to obtain a new depth estimation image;
and step 3: calculating an image mask by using the re-projected image and the input image, updating the camera pose estimate between the two adjacent frames, and iterating steps 2 and 3.
2. The direct method unsupervised monocular image scene depth estimation method of claim 1, characterized by: in step 1, the method specifically comprises the following steps:
step 1.1: the constructed depth estimation neural network is in a full convolution U shape, and the convolution part uses a convolution network in a Res-Net18 network structure as a main structure network;
step 1.2: the deconvolution part mainly consists of a stack of several deconvolution layers and ReLU activation layers, and each deconvolution layer is connected with the last convolution layer of the convolution block of the same scale in the convolution part to form the final deconvolution layer;
step 1.3: monocular continuous images are used as the training data set, and sample-augmentation and generalization operations such as image flipping, gamma transformation and color-channel changes are carried out before the images are input into the depth estimation neural network.
3. The direct method unsupervised monocular image scene depth estimation method of claim 1, characterized by: in the step 2, the method specifically comprises the following steps:
step 2.1: pre-calibrating intrinsic parameters of the camera or obtaining the intrinsic parameters of the camera from the data set;
step 2.2: calculating the initial camera pose by using the current frame image and the previous frame image and adopting a direct method;
step 2.3: calculating a re-projection image by using the previous frame of the input image, the depth estimation image and the camera pose;
step 2.4: calculating a reprojection error between the current frame image and the reprojection image;
step 2.5: and updating parameters of the depth estimation neural network by utilizing the depth estimation image and the image reprojection error output by the depth estimation neural network through back propagation to obtain a new depth estimation image.
4. The direct method unsupervised monocular image scene depth estimation method of claim 1, characterized by: in step 3, the method specifically comprises the following steps:
step 3.1: calculating a similarity error according to the reprojected image and the current frame image to obtain an image mask;
step 3.2: calculating a Jacobian matrix between the previous frame image and the obtained camera pose;
step 3.3: multiplying the image mask by the Jacobian matrix to obtain an improved Jacobian matrix;
step 3.4: updating the camera pose estimate between the two adjacent frames by using the improved Jacobian matrix;
step 3.5: and repeating the iteration steps 2 and 3.
5. The direct method unsupervised monocular image scene depth estimation method of claim 2, characterized by: the convolution part uses the convolutional network of the Res-Net18 structure as its backbone, and the network consists of 5 convolution blocks and 5 deconvolution blocks; the 1st convolution block contains 1 convolution group with 1 convolution layer, whose input is a 3-channel color image and whose output has 64 channels; the 2nd, 3rd, 4th and 5th convolution blocks each contain 2 convolution groups with 2 convolution layers per group, and the numbers of channels in the 2nd, 3rd, 4th and 5th convolution blocks are 64, 128, 256 and 512, respectively; the number of deconvolution groups in each deconvolution block is twice the number of convolution groups in the convolution block of the corresponding scale, and the numbers of channels in the convolution groups of the deconvolution blocks are 512, 256, 128, 64 and 1, respectively.
6. The direct method unsupervised monocular image scene depth estimation method of claim 2, characterized by: adjacent convolution blocks are connected by a max-pooling operation, which also scales the latter of the two adjacent convolution blocks to one half the size of the former; the latter of two adjacent deconvolution blocks is twice the size of the former; the convolution layers within each convolution group and the deconvolution layers within each deconvolution group are connected by 3 × 3 convolutions; in addition, the content of the second convolution block is copied into the fourth deconvolution block, and the content of the fifth convolution block is copied into the third deconvolution block.
CN202010754803.1A 2020-07-31 2020-07-31 Direct method unsupervised monocular image scene depth estimation method Active CN112085776B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010754803.1A CN112085776B (en) 2020-07-31 2020-07-31 Direct method unsupervised monocular image scene depth estimation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010754803.1A CN112085776B (en) 2020-07-31 2020-07-31 Direct method unsupervised monocular image scene depth estimation method

Publications (2)

Publication Number Publication Date
CN112085776A 2020-12-15
CN112085776B 2022-07-19

Family

ID=73735816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010754803.1A Active CN112085776B (en) 2020-07-31 2020-07-31 Direct method unsupervised monocular image scene depth estimation method

Country Status (1)

Country Link
CN (1) CN112085776B (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109472830A (en) * 2018-09-28 2019-03-15 中山大学 A kind of monocular visual positioning method based on unsupervised learning
CN111402310A (en) * 2020-02-29 2020-07-10 同济大学 Monocular image depth estimation method and system based on depth estimation network
CN111462231A (en) * 2020-03-11 2020-07-28 华南理工大学 Positioning method based on RGBD sensor and IMU sensor

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
RUI WANG ET AL.: "Recurrent Neural Network for (Un-)supervised Learning of Monocular Video Visual Odometry and Depth", 《CVPR》 *
朱奇光 et al.: "Hybrid semi-dense visual odometry algorithm for mobile robots", 《Chinese Journal of Scientific Instrument》 *
马成齐 et al.: "Occlusion-resistant monocular depth estimation algorithm", 《Computer Engineering and Applications》 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112927279A (en) * 2021-02-24 2021-06-08 中国科学院微电子研究所 Image depth information generation method, device and storage medium

Also Published As

Publication number Publication date
CN112085776B (en) 2022-07-19

Similar Documents

Publication Publication Date Title
CN110163930B (en) Lane line generation method, device, equipment, system and readable storage medium
CN105300403B (en) A kind of vehicle mileage calculating method based on binocular vision
CN113269837B (en) Positioning navigation method suitable for complex three-dimensional environment
CN111862235B (en) Binocular camera self-calibration method and system
CN103093479A (en) Target positioning method based on binocular vision
CN111862234B (en) Binocular camera self-calibration method and system
CN111862673A (en) Parking lot vehicle self-positioning and map construction method based on top view
KR20200071293A (en) Localization method and apparatus based on 3d colored map
CN105551020A (en) Method and device for detecting dimensions of target object
CN110992424B (en) Positioning method and system based on binocular vision
CN114693787B (en) Parking garage map building and positioning method, system and vehicle
CN112669354A (en) Multi-camera motion state estimation method based on vehicle incomplete constraint
CN113239072B (en) Terminal equipment positioning method and related equipment thereof
CN116222577B (en) Closed loop detection method, training method, system, electronic equipment and storage medium
CN115272596A (en) Multi-sensor fusion SLAM method oriented to monotonous texture-free large scene
CN116051758A (en) Height information-containing landform map construction method for outdoor robot
CN112085776B (en) Direct method unsupervised monocular image scene depth estimation method
CN113624223B (en) Indoor parking lot map construction method and device
CN114529585A (en) Mobile equipment autonomous positioning method based on depth vision and inertial measurement
CN117058474B (en) Depth estimation method and system based on multi-sensor fusion
CN110864670B (en) Method and system for acquiring position of target obstacle
CN114648639B (en) Target vehicle detection method, system and device
CN115830116A (en) Robust visual odometer method
Pfeiffer et al. Ground truth evaluation of the Stixel representation using laser scanners
CN113625271B (en) Simultaneous positioning and mapping method based on millimeter wave radar and binocular camera

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant