CN109377530A - A kind of binocular depth estimation method based on deep neural network - Google Patents

A kind of binocular depth estimation method based on deep neural network Download PDF

Info

Publication number
CN109377530A
Authority
CN
China
Prior art keywords
image
network
depth
layer
binocular
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811453789.0A
Other languages
Chinese (zh)
Other versions
CN109377530B (en
Inventor
侯永宏
吕晓冬
许贤哲
陈艳芳
赵健
Current Assignee
Hebei Kaitong Information Technology Service Co ltd
Zhejiang Qiqiao Lianyun Biosensor Technology Co ltd
Original Assignee
Tianjin University
Priority date
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201811453789.0A priority Critical patent/CN109377530B/en
Publication of CN109377530A publication Critical patent/CN109377530A/en
Application granted granted Critical
Publication of CN109377530B publication Critical patent/CN109377530B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to a binocular depth estimation method based on a deep neural network. The steps are as follows: 1) preprocess the input left and right viewpoint images to enhance the data; 2) construct a multi-scale network model for binocular depth estimation, comprising multiple convolution layers, activation layers, residual connections, multi-scale pooling connections, and linear upsampling layers; 3) design a loss function and minimize it over the course of training to obtain the optimal network weights; 4) input the image to be processed into the network model to obtain the corresponding depth map, and repeat the above steps until the network converges or the maximum number of training iterations is reached. The invention adopts the idea of unsupervised learning and uses only the left and right viewpoint images captured by a binocular camera as network input. The adaptive design of the network treats the camera's intrinsic and extrinsic parameters as separate model parameters, so the method can be applied to multiple camera systems without modifying the network.

Description

Binocular depth estimation method based on a deep neural network
Technical Field
The invention belongs to the field of multimedia image processing, relates to computer vision and deep learning technology, and discloses a binocular depth estimation method based on a deep neural network.
Background Art
Depth estimation has long been a popular research direction in computer vision: the three-dimensional data provided by a depth map supplies the information required by applications such as three-dimensional reconstruction, augmented reality (AR), and intelligent navigation. Meanwhile, the positional relations expressed by a depth map are very important in many image tasks and can further simplify image processing algorithms. Currently, common depth estimation approaches fall mainly into two categories: monocular depth estimation and binocular depth estimation.
The monocular depth estimation method uses only one camera. In traditional algorithms the camera captures continuous image frames, and projective transformation through an inter-frame motion model is used to estimate image depth. Deep-learning-based monocular depth estimation trains a deep neural network on a dataset with ground-truth depth information and uses the learned network to regress depth. Such algorithms need simple, low-cost equipment and can handle dynamic scenes; but because scale information is missing, the depth estimates are usually not accurate enough, and performance often degrades severely in unknown scenes. The binocular estimation method uses two calibrated cameras viewing the same object from two different perspectives: the same spatial point is located in both views, the disparity between the corresponding pixels is computed, and the disparity is then converted to depth through triangulation. Traditional binocular estimation uses stereo matching algorithms, which are computationally expensive and perform poorly on low-texture scenes. Deep-learning-based binocular depth estimation mostly adopts supervised learning; thanks to the strong learning capacity of neural networks, it greatly improves both accuracy and speed over traditional methods.
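The triangulation step mentioned above reduces, for a rectified stereo pair, to depth = f·B/disparity. A minimal sketch; the focal length and baseline values below are hypothetical, not taken from the patent:

```python
import numpy as np

def disparity_to_depth(disparity, focal_length_px, baseline_m):
    """Convert a disparity map (pixels) to a depth map (meters) via the
    rectified-stereo triangulation relation: depth = f * B / disparity."""
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full_like(disparity, np.inf)
    valid = disparity > 0                  # zero disparity -> point at infinity
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

# Hypothetical rig: focal length 720 px, baseline 0.54 m.
d = np.array([[36.0, 72.0], [0.0, 9.0]])
depth = disparity_to_depth(d, 720.0, 0.54)
# a 36 px disparity corresponds to 720 * 0.54 / 36 = 10.8 m
```

Note the inverse relation: small disparities (distant points) give large depths, which is why depth accuracy degrades with distance.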
However, supervised learning usually depends heavily on ground-truth values, which may suffer from error and noise, sparse depth information, hard-to-calibrate hardware, and similar problems, so the estimated depth is not accurate enough. Unsupervised learning has long been regarded as the direction in which artificial intelligence can truly and effectively learn by itself in the real world, and in recent years image depth estimation based on unsupervised learning has therefore become a research hotspot.
Disclosure of Invention
The invention aims to provide a binocular depth estimation method based on a deep neural network. It adopts the idea of unsupervised learning, uses only the left and right viewpoint images acquired by a binocular camera as network input, and does not need the depth information of the input images in advance as training labels. Meanwhile, the adaptive design of the network treats the camera's intrinsic and extrinsic parameters as independent model parameters, so the method suits multiple camera systems without modifying the network. In addition, the neural network is essentially unaffected by illumination, noise, and the like, and is highly robust.
The technical scheme for realizing the purpose of the invention is as follows:
A binocular depth estimation method based on a deep neural network comprises the following steps:
1) perform image preprocessing, such as cropping and transformation, on the input left and right viewpoint images for data enhancement; the preprocessing includes mild affine deformation, random horizontal rotation, random scale jitter, and random changes in contrast, brightness, saturation, and sharpness, which further increases the number of samples, facilitates the training and optimization of the network parameters, and enhances the generalization ability of the network;
2) construct a multi-scale network model for binocular depth estimation, comprising multiple convolution layers, activation layers, residual connections, multi-scale pooling connections, and linear upsampling layers.
(a) The network uses three residual network structures to perform multi-scale convolution on the input, and each residual module comprises two convolution layers and an identity mapping. Except for the first convolution layer, whose kernel is 7 × 7, all kernels are 3 × 3 in size.
(b) The second, sixth, and fourteenth layers in the network are multi-scale pooling modules. Average pooling is applied to the outputs of the second and sixth layers, with a stride of 4 and kernel size 4 × 4 and a stride of 2 and kernel size 2 × 2 respectively, and the results are convolved 1 × 1 together with the output of the fourteenth layer.
(c) The left and right views are processed by a front-end network; after the multi-scale pooling module, their feature information is associated through a feature correlation operation, which computes the feature correlation between the two views:
c(x1, x2) = ∑_{o∈[−k,k]×[−k,k]} ⟨f_l(x1 + o), f_r(x2 + o)⟩
where c is the correlation between the image block centered at x1 in the left-image features and the image block centered at x2 in the right-image features, f_l denotes the left-image features, f_r the right-image features, and the image block size is k × k.
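As an illustration, this correlation can be sketched in NumPy for a single pair of positions; the feature maps, their shapes, and the patch radius k below are hypothetical:

```python
import numpy as np

def patch_correlation(fl, fr, x1, x2, k):
    """Correlation c(x1, x2): sum over offsets o in [-k, k] x [-k, k] of the
    inner product <fl(x1 + o), fr(x2 + o)> between left and right feature
    vectors, i.e. a patch-wise similarity between the two views."""
    total = 0.0
    for oy in range(-k, k + 1):
        for ox in range(-k, k + 1):
            a = fl[x1[0] + oy, x1[1] + ox]   # left feature vector at x1 + o
            b = fr[x2[0] + oy, x2[1] + ox]   # right feature vector at x2 + o
            total += float(np.dot(a, b))
    return total

rng = np.random.default_rng(1)
fl = rng.random((16, 16, 8))   # H x W x C left feature map (shapes assumed)
fr = rng.random((16, 16, 8))   # H x W x C right feature map
c = patch_correlation(fl, fr, (8, 8), (8, 5), k=1)
```

In the full network this is evaluated for one left position against many right positions (typically along the epipolar line), producing the matching-cost volume used later for depth regression.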
(d) The network then recovers the original resolution of the image from the correlation features, obtaining depth maps at different scales through deconvolution, upsampling, and similar operations. In the linear upsampling operation, bilinear interpolation is applied to the output of the previous layer to generate the image, skip connections to earlier upsampling layers are made through residual learning, and the image is finally restored to its original size.
3) Set initialization parameters for the designed network model, and design a loss function that is minimized over the course of training so as to obtain the optimal network weights.
The pixel values of the left and right views input to the network are denoted I_l and I_r respectively. When the network obtains the predicted depth map D̂_l of the left image, the inverse intrinsic matrix K⁻¹ transforms I_r from the image coordinate system into the camera coordinate system; the extrinsic matrix T then transforms it into the camera coordinate system of the left image, and the intrinsic matrix K transforms it back into the image coordinate system of the left image, yielding a transition image. The specific formula is as follows:

p_r ~ K · T · D̂_l(p_l) · K⁻¹ · p_l
where p_r is the corresponding image pixel. The projective transformation makes the pixel coordinates in the transition image continuous values, so the pixel value at each coordinate is determined using 4-neighborhood interpolation, finally obtaining the target image.
where w is proportional to the spatial distance between the target point and each adjacent point, and ∑_{a,b} w_ab = 1.
A reconstruction loss function is constructed using the Huber loss function.
4) Input the image to be processed into the network model to obtain the corresponding depth map, and repeat the above steps until the network converges or the maximum number of training iterations is reached.
The invention provides a deep neural network based on unsupervised learning that trains a network model on left and right images without ground-truth depth information to obtain a monocular depth map. It exploits the multiple viewpoints of a binocular camera and uses a representation-learning method with multi-layer representations, i.e., a convolutional neural network, to map a binocular-image input to a monocular depth-map output. The network model obtains receptive fields at different scales through multiple downsampling operations, extracts features of the input image with a residual structure, and uses a multi-scale pooling module to strengthen the local texture details of the image, improving the accuracy and robustness of the network model. The upsampling layers use bilinear interpolation, and a residual structure is reused to learn information across multiple upsampling layers, reducing the information lost while restoring the image size and further ensuring the accuracy of the depth estimation.
The invention has the advantages and beneficial effects that:
1. The binocular depth estimation method based on a deep neural network follows an unsupervised learning approach, and the strong learning capacity of the deep convolutional network ensures the accuracy of the predicted depth values.
2. The invention uses residual connections repeatedly for feature extraction and completes multi-scale information fusion through skip connections during upsampling, which reduces to some extent the information loss of conventional convolution during transmission, preserves the integrity of the information, and greatly improves the network convergence speed.
3. Images at different scales are obtained through repeated downsampling, and different receptive fields are obtained through the multi-scale pooling module to strengthen local texture details.
4. The feature correlation operation in the network correlates the features of the left and right views, is not easily affected by noise, and improves the robustness of the network model.
5. The input images of the network carry no ground-truth depth information; the network computes a target image from the predicted depth map, the camera parameters, and the original input, and constructs a loss function from the difference between the target image and the original input to optimize the network parameters, so the whole network can be trained in an unsupervised fashion.
6. The camera parameter information is set outside the network as part of the network parameters, so the model suits various camera systems with different configurations and has strong adaptive capability.
Drawings
Fig. 1 is a diagram of a neural network model for binocular depth estimation.
Detailed Description
The present invention will be described in further detail with reference to the following embodiments, which are illustrative only and not limiting, and the scope of the present invention is not limited thereby.
1) Perform image preprocessing, such as cropping and transformation, on the input left and right viewpoint images for data enhancement.
The invention uses the images from the left and right viewing angles captured by a binocular camera as network input and can output a monocular depth map in either the left or the right camera coordinate system. For convenience of description, the output monocular depth maps mentioned herein are all depth maps of the left image. The input requires RGB images from the left and right viewpoints, so the artificially synthesized SceneFlow dataset and part of the real-environment KITTI2015 dataset are used as training data. The large SceneFlow dataset contains 39000 binocular images at 960 × 540 resolution together with corresponding depth maps, and this large amount of training data guarantees the learning capacity of the convolutional neural network. However, SceneFlow consists of artificially synthesized images and therefore differs somewhat from real images captured in the real world. To enhance the model's performance in everyday scenes, this example fine-tunes the model on the KITTI2015 dataset to adapt it to real scenes. The KITTI2015 dataset contains 200 binocular images and corresponding sparse depth maps. Because the method is unsupervised, the ground-truth depth data in the SceneFlow and KITTI2015 datasets are not used. The high resolution of the images in the datasets slows network training, so the images are randomly cropped to 320 × 180 to improve training speed.
In addition, image preprocessing is applied to the images in the dataset, including slight affine deformation, random horizontal rotation, random scale jitter, and random changes in contrast, brightness, saturation, and sharpness, which further increases the number of samples, facilitates the training and optimization of the network parameters, and enhances the generalization ability of the network.
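The augmentation pipeline above might be sketched as follows; the crop size follows the 320 × 180 random crop described earlier, while the jitter ranges are illustrative assumptions. Note that the crop and jitter are shared between the two views so that their pixel correspondence (and hence the disparity) is preserved:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_pair(left, right, crop_hw=(180, 320)):
    """Apply the same random crop and brightness/contrast jitter to a stereo
    pair; identical parameters for both views keep them geometrically and
    photometrically consistent."""
    h, w = left.shape[:2]
    ch, cw = crop_hw
    y = rng.integers(0, h - ch + 1)        # shared random crop origin
    x = rng.integers(0, w - cw + 1)
    left, right = left[y:y + ch, x:x + cw], right[y:y + ch, x:x + cw]
    contrast = rng.uniform(0.8, 1.2)       # shared photometric jitter (assumed ranges)
    brightness = rng.uniform(-0.1, 0.1)
    jitter = lambda img: np.clip(img * contrast + brightness, 0.0, 1.0)
    return jitter(left), jitter(right)

# SceneFlow-sized pair (960 x 540), values in [0, 1]
l = rng.random((540, 960, 3))
r = rng.random((540, 960, 3))
la, ra = augment_pair(l, r)
print(la.shape, ra.shape)   # (180, 320, 3) (180, 320, 3)
```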
2) Construct a multi-scale network model for binocular depth estimation, comprising multiple convolution layers, activation layers, residual connections, multi-scale pooling connections, and linear upsampling layers.
a) To reduce the number of network model parameters on a large scale, so that the model converges more easily and has stronger feature expression capability, the network uses 3 residual modules for feature extraction from the input image. Except for the first layer, all remaining convolution layers use a small 3 × 3 kernel to better retain edge information. A batch normalization operation follows each convolution layer to keep the data distribution stable, and a ReLU activation function is used after each convolution layer in the model to prevent vanishing gradients during network training. The output of each residual module is downsampled; a multi-scale pooling module performs average pooling of different sizes on the residual-module inputs and reduces dimensionality through 1 × 1 convolution layers, so that different feature information is perceived at different scales while the training parameters of the network are greatly reduced.
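A shape-level NumPy sketch of the multi-scale average pooling and 1 × 1 dimensionality reduction described above; the layer output shapes, channel counts, and the random 1 × 1 weights are illustrative assumptions:

```python
import numpy as np

def avg_pool2d(x, k):
    """Non-overlapping average pooling with kernel size and stride k over an
    H x W x C feature map."""
    h, w, c = x.shape
    return x[:h - h % k, :w - w % k].reshape(h // k, k, w // k, k, c).mean(axis=(1, 3))

def conv1x1(x, weight):
    """A 1x1 convolution is a per-pixel linear map over channels (C_in -> C_out)."""
    return x @ weight

rng = np.random.default_rng(2)
f2 = rng.random((64, 64, 16))    # output of an early layer (shapes hypothetical)
f6 = rng.random((32, 32, 32))    # output of a middle layer
f14 = rng.random((16, 16, 64))   # output of the deepest layer

# Pool the earlier, larger maps down to the deepest resolution (strides 4 and 2,
# matching the step sizes given in the text), then fuse and reduce with 1x1 conv.
p2 = avg_pool2d(f2, 4)                           # -> 16 x 16 x 16
p6 = avg_pool2d(f6, 2)                           # -> 16 x 16 x 32
fused = np.concatenate([p2, p6, f14], axis=-1)   # -> 16 x 16 x 112
reduced = conv1x1(fused, rng.random((112, 64)))  # dimensionality reduction
print(reduced.shape)                             # (16, 16, 64)
```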
b) After the input image passes through the three residual modules and the multi-scale pooling module with dimensionality reduction, a feature map at one eighth of the original resolution is obtained. The left and right branches share network weights, and the feature correlation of the two maps is computed in the correlation operation with the following formula:

c(x1, x2) = ∑_{o∈[−k,k]×[−k,k]} ⟨f_l(x1 + o), f_r(x2 + o)⟩
in the formula, x1The feature block of the left image as the center can perform correlation operation with all the feature blocks of the right image, and one point in the left image is calculated in a traversal modeTo the matching features of all points in the right figure. The matrix can be regarded as matching cost of the feature blocks at different depths, and then depth regression is selected to be regarded as a classification problem. In the deep regression, firstly, the softmax function is utilizedj is 1, …, K, and K matching costs at the depth are converted into a probability distribution of the depth and then passedWeighted summation mode obtains more stable depth estimationWhereinIndicating the depth of the predicted pixel, DmaxRepresenting the maximum disparity to be estimated, d being the respective depth values corresponding to the depth probability distribution, CdThe matching cost is expressed, and the final output is the weighted sum of all possible depths of the pixel point and the possibility of the depth.
c) The matching cost at the small scale is bilinearly interpolated, the upsampled result is added at the next larger scale, and residual connections provide skip connections across the multiple upsampling layers. Residual learning in the upsampling process makes full use of the multi-scale information, so the network further refines the depth estimate on top of the estimate from the previous scale while also becoming easier to train.
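The coarse-to-fine refinement in this step can be sketched as follows; the 2x factor, map sizes, and contents are illustrative, and the "skip" term stands in for the same-resolution features carried over by the residual connection:

```python
import numpy as np

def bilinear_upsample2x(x):
    """2x bilinear upsampling of an H x W map: linear interpolation along
    rows, then along columns (align-corners convention)."""
    h, w = x.shape
    rows = np.linspace(0, h - 1, 2 * h)
    cols = np.linspace(0, w - 1, 2 * w)
    # interpolate each column along the row axis, then each row along columns
    tmp = np.array([np.interp(rows, np.arange(h), x[:, j]) for j in range(w)]).T
    out = np.array([np.interp(cols, np.arange(w), tmp[i]) for i in range(2 * h)])
    return out

rng = np.random.default_rng(3)
coarse = rng.random((4, 4))        # coarse-scale matching cost / depth estimate
fine_skip = rng.random((8, 8))     # same-resolution term from an earlier layer
refined = bilinear_upsample2x(coarse) + fine_skip   # residual skip connection
print(refined.shape)               # (8, 8)
```

The addition means the finer scale only has to learn a correction to the upsampled coarse estimate, which is the "refine on the basis of the previous scale" behaviour the text describes.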
3) Set initialization parameters for the designed network model, and design a loss function that is minimized over the course of training so as to obtain the optimal network weights.
One key point of the invention is how to realize unsupervised learning: the network needs a reasonable loss function for training, optimization, and parameter adjustment. Assuming the prediction target is the left-image depth map, the network predicts the left depth map. To obtain the target image, the right input image I_r in the image coordinate system is first transformed with the inverse intrinsic matrix K⁻¹ into the camera coordinate system of the right image; following the stereo matching principle, the predicted left depth map and the extrinsic matrix T are used to perform the corresponding projective transformation, giving an image in the left camera coordinate system; the intrinsic matrix K then performs the coordinate transformation again, yielding a transition image in the left image coordinate system. From the binocular camera projection conversion formula:
where p_r is the corresponding image pixel. Due to the nature of projective transformation, the coordinates in the transition image become continuous values, so linear interpolation over the 4 pixel values neighboring each coordinate is used. The 4 neighboring pixels are the upper-left, lower-left, upper-right, and lower-right ones, and the interpolation formula is as follows:
where the left-hand side is the corresponding pixel value of the target image, w is proportional to the spatial distance between the target point and each adjacent point, and ∑_{a,b} w_ab = 1.
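The projection transformation and 4-neighborhood interpolation of this step can be sketched end to end. The intrinsics, baseline, depth, and image content below are all hypothetical, and a rectified fronto-parallel rig is assumed so the warp reduces to a horizontal shift of f·B/Z pixels:

```python
import numpy as np

def warp_right_to_left(right, depth_left, K, T):
    """Reconstruct the left view by sampling the right image: for each left
    pixel p, back-project with the depth and K^-1, move into the right camera
    frame with the extrinsics T, re-project with K, and bilinearly sample the
    4 neighboring right-image pixels at the resulting continuous coordinate."""
    h, w = right.shape
    K_inv = np.linalg.inv(K)
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            p = np.array([x, y, 1.0]) * depth_left[y, x]  # depth-scaled pixel
            cam = K_inv @ p                               # left camera coords
            cam_r = T[:3, :3] @ cam + T[:3, 3]            # right camera coords
            proj = K @ cam_r
            u, v = proj[0] / proj[2], proj[1] / proj[2]   # continuous coords
            x0, y0 = int(np.floor(u)), int(np.floor(v))
            if 0 <= x0 < w - 1 and 0 <= y0 < h - 1:
                a, b = u - x0, v - y0                     # 4-neighborhood weights
                out[y, x] = ((1 - a) * (1 - b) * right[y0, x0]
                             + a * (1 - b) * right[y0, x0 + 1]
                             + (1 - a) * b * right[y0 + 1, x0]
                             + a * b * right[y0 + 1, x0 + 1])
    return out

# Hypothetical rig: focal length 10 px, baseline 1 m, constant depth 2 m,
# so every pixel shifts by f*B/Z = 5 px between the views.
f, B = 10.0, 1.0
K = np.array([[f, 0, 0], [0, f, 0], [0, 0, 1.0]])
T = np.eye(4)
T[0, 3] = -B                    # right camera offset along the x axis
h, w = 4, 16
left = np.tile(np.arange(w, dtype=np.float64), (h, 1))
disp = int(f * B / 2.0)
right = np.zeros((h, w))
right[:, :w - disp] = left[:, disp:]    # right view = left shifted by disparity
recon = warp_right_to_left(right, np.full((h, w), 2.0), K, T)
# interior pixels of the reconstruction match the original left view
```

The difference between this reconstruction and the true left image is exactly the signal the reconstruction loss below penalizes, which is how training proceeds without ground-truth depth.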
Therefore, the reconstruction loss function is given by:

L_rec = (1/N) ∑ ρ(x),  with ρ(x) = x²/2 for |x| ≤ c and ρ(x) = c(|x| − c/2) for |x| > c

In the formula, x represents the difference between corresponding pixel points of the target image and the input image, N is the number of pixels in the image, and c is an empirically set threshold, set to 1 in this embodiment.
The Huber loss function switches behavior at the value c: within the range c, the quadratic term gives better gradients for small residuals, while beyond c the linear term handles large residuals better, so the two losses are effectively balanced.
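A sketch of the Huber-based reconstruction loss as described, with the threshold c = 1 used in this embodiment:

```python
import numpy as np

def huber_loss(x, c=1.0):
    """Elementwise Huber penalty: quadratic for |x| <= c, linear beyond it,
    with matching value and slope at the transition point |x| = c."""
    x = np.abs(np.asarray(x, dtype=np.float64))
    return np.where(x <= c, 0.5 * x ** 2, c * (x - 0.5 * c))

def reconstruction_loss(target, pred, c=1.0):
    """Mean Huber penalty over per-pixel differences x = target - pred."""
    return float(np.mean(huber_loss(target - pred, c)))

# Small residuals are penalized quadratically, large ones linearly:
# huber(0.5) = 0.5 * 0.25 = 0.125, huber(3.0) = 1 * (3 - 0.5) = 2.5
vals = huber_loss(np.array([0.5, 3.0]))
```

The linear branch caps the gradient magnitude at c, so a few badly reconstructed pixels (occlusions, specularities) cannot dominate the training signal the way they would under a pure L2 loss.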
The input images of the network carry no ground-truth depth information; instead, the original input image is estimated from the predicted depth map and the camera parameter matrices and used as the network label to optimize the training parameters, realizing unsupervised learning of the network. Meanwhile, the camera parameters can be modified externally during the optimization of network training, so the model suits multiple camera systems and is adaptive.
4) And inputting the image to be processed into a network model to obtain a corresponding depth map, and continuously repeating the steps until the network converges or the training times are reached.
In the example, pre-training is performed on the large synthesized SceneFlow dataset, followed by fine-tuning on the KITTI2015 dataset, so that the network achieves high accuracy in everyday real scenes and the method has good universality.
The above description is only for the preferred embodiments of the present invention, but the protection scope of the present invention is not limited thereto, and any person skilled in the art can substitute or change the technical solution of the present invention and the inventive concept within the scope of the present invention, which is disclosed by the present invention, and the equivalent or change thereof belongs to the protection scope of the present invention.

Claims (5)

1. A binocular depth estimation method based on a deep neural network comprises the following steps:
1) preprocessing the input left and right viewpoint images to enhance the data;
2) constructing a multi-scale network model of binocular depth estimation, wherein the model comprises a plurality of convolution layers, an activation layer, a residual connection, a multi-scale pooling connection and a linear up-sampling layer;
3) setting initialization parameters according to a designed multi-scale network model, and designing a loss function to obtain a minimization result in a continuous training process so as to obtain an optimal network weight;
4) and inputting the image to be processed into a network model to obtain a corresponding depth map, and continuously repeating the steps until the network converges or the training times are reached.
2. The binocular depth estimation method based on the deep neural network of claim 1, wherein: the multi-scale network model adopts three residual network structures to perform multi-scale convolution on the input; each residual module comprises two convolution layers and an identity mapping; the second, sixth, and fourteenth layers of the network are multi-scale pooling modules; average pooling is performed on the outputs of the second and sixth layers, which are then convolved 1 × 1 together with the output of the fourteenth layer.
3. The binocular depth estimation method based on the deep neural network of claim 2, wherein: the left and right views are processed by a front-end network; after the multi-scale pooling module, their feature information is associated through a feature correlation operation, which computes the feature correlation between the two views:
c(x1, x2) = ∑_{o∈[−k,k]×[−k,k]} ⟨f_l(x1 + o), f_r(x2 + o)⟩
where c is the correlation between the image block centered at x1 in the left-image features and the image block centered at x2 in the right-image features, f_l denotes the left-image features, f_r the right-image features, and the image block size is k × k.
4. The binocular depth estimation method based on the deep neural network of claim 3, wherein: the network restores the original resolution of the image according to the correlation features, obtains depth maps at different scales through deconvolution and upsampling, generates the image by bilinear interpolation from the output of the previous layer in the linear upsampling operation, makes skip connections with earlier upsampling layers through residual learning, and finally restores the image to its original size.
5. The binocular depth estimation method based on the deep neural network of claim 1, wherein step 3) is specifically as follows: the pixel values of the left and right views input to the network are denoted I_l and I_r respectively; when the network obtains the predicted depth map of the left image, the inverse intrinsic matrix K⁻¹ transforms I_r from the image coordinate system into the camera coordinate system, the extrinsic matrix T transforms it into the camera coordinate system of the left image, and the intrinsic matrix K transforms it back into the image coordinate system of the left image, thereby obtaining a transition image. The specific formula is as follows:
where p_r is the corresponding image pixel; the pixel coordinates in the transition image become continuous values through the projective transformation, so the pixel value of each coordinate is determined using 4-neighborhood interpolation, finally obtaining the target image.
In the formula, the left-hand side is the corresponding pixel value of the target image, a and b are the coordinate values of each adjacent point, w_ab is the weight of the pixel value at the corresponding coordinate, proportional to the spatial distance between the target point and the adjacent point, and ∑_{a,b} w_ab = 1;
A reconstruction loss function is constructed using the Huber loss function:
In the above formula, x represents the difference between corresponding pixel points of the target image and the input image, N is the number of pixels in the image, and c is an empirically set threshold.
CN201811453789.0A 2018-11-30 2018-11-30 Binocular depth estimation method based on depth neural network Active CN109377530B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811453789.0A CN109377530B (en) 2018-11-30 2018-11-30 Binocular depth estimation method based on depth neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811453789.0A CN109377530B (en) 2018-11-30 2018-11-30 Binocular depth estimation method based on depth neural network

Publications (2)

Publication Number Publication Date
CN109377530A true CN109377530A (en) 2019-02-22
CN109377530B CN109377530B (en) 2021-07-27

Family

ID=65376554

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811453789.0A Active CN109377530B (en) 2018-11-30 2018-11-30 Binocular depth estimation method based on depth neural network

Country Status (1)

Country Link
CN (1) CN109377530B (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102523464A (en) * 2011-12-12 2012-06-27 上海大学 Depth image estimation method for binocular stereo video
US20180231871A1 (en) * 2016-06-27 2018-08-16 Zhejiang Gongshang University Depth estimation method for monocular image based on multi-scale CNN and continuous CRF
CN107204010A (en) * 2017-04-28 2017-09-26 中国科学院计算技术研究所 Monocular image depth estimation method and system
CN107578436A (en) * 2017-08-02 2018-01-12 南京邮电大学 Monocular image depth estimation method based on a fully convolutional neural network (FCN)
CN107767413A (en) * 2017-09-20 2018-03-06 华南理工大学 Image depth estimation method based on convolutional neural networks
CN108389226A (en) * 2018-02-12 2018-08-10 北京工业大学 Unsupervised depth prediction method based on convolutional neural networks and binocular disparity

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CLEMENT GODARD et al.: "Unsupervised Monocular Depth Estimation with Left-Right Consistency", 2017 IEEE Conference on Computer Vision and Pattern Recognition *

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948689A (en) * 2019-03-13 2019-06-28 北京达佳互联信息技术有限公司 Video generation method and apparatus, electronic device, and storage medium
CN110009674A (en) * 2019-04-01 2019-07-12 厦门大学 Real-time monocular image depth-of-field calculation method based on unsupervised deep learning
CN110009674B (en) * 2019-04-01 2021-04-13 厦门大学 Real-time monocular image depth-of-field calculation method based on unsupervised deep learning
CN110490919A (en) * 2019-07-05 2019-11-22 天津大学 Monocular vision depth estimation method based on a deep neural network
CN110490919B (en) * 2019-07-05 2023-04-18 天津大学 Monocular vision depth estimation method based on a deep neural network
CN110298791A (en) * 2019-07-08 2019-10-01 西安邮电大学 Super-resolution reconstruction method and device for license plate images
CN110298791B (en) * 2019-07-08 2022-10-28 西安邮电大学 Super-resolution reconstruction method and device for license plate images
CN110322499A (en) * 2019-07-09 2019-10-11 浙江科技学院 Monocular image depth estimation method based on multilayer features
CN110322499B (en) * 2019-07-09 2021-04-09 浙江科技学院 Monocular image depth estimation method based on multilayer features
CN110414674A (en) * 2019-07-31 2019-11-05 浙江科技学院 Monocular depth estimation method based on a residual network and local refinement
CN110414674B (en) * 2019-07-31 2021-09-10 浙江科技学院 Monocular depth estimation method based on a residual network and local refinement
CN111062900A (en) * 2019-11-21 2020-04-24 西北工业大学 Binocular disparity map enhancement method based on confidence fusion
CN111179330A (en) * 2019-12-27 2020-05-19 福建(泉州)哈工大工程技术研究院 Binocular vision scene depth estimation method based on convolutional neural network
CN113076966B (en) * 2020-01-06 2023-06-13 字节跳动有限公司 Image processing method and device, training method of neural network and storage medium
CN113076966A (en) * 2020-01-06 2021-07-06 字节跳动有限公司 Image processing method and device, neural network training method and storage medium
CN113496521A (en) * 2020-04-08 2021-10-12 复旦大学 Method and device for generating depth image and camera external parameters by using multiple color pictures
CN111753961A (en) * 2020-06-26 2020-10-09 北京百度网讯科技有限公司 Model training method and device, and prediction method and device
CN111753961B (en) * 2020-06-26 2023-07-28 北京百度网讯科技有限公司 Model training method and device, prediction method and device
CN112288788A (en) * 2020-10-12 2021-01-29 南京邮电大学 Monocular image depth estimation method
CN112288788B (en) * 2020-10-12 2023-04-28 南京邮电大学 Monocular image depth estimation method
CN112543317A (en) * 2020-12-03 2021-03-23 东南大学 Method for converting high-resolution monocular 2D video into binocular 3D video
WO2022127533A1 (en) * 2020-12-18 2022-06-23 安翰科技(武汉)股份有限公司 Capsule endoscope image three-dimensional reconstruction method, electronic device, and readable storage medium
CN112652058B (en) * 2020-12-31 2024-05-31 广州华多网络科技有限公司 Face image replay method and device, computer equipment and storage medium
CN112652058A (en) * 2020-12-31 2021-04-13 广州华多网络科技有限公司 Face image replay method and device, computer equipment and storage medium
CN112767467B (en) * 2021-01-25 2022-11-11 郑健青 Dual-image depth estimation method based on self-supervised deep learning
CN112767467A (en) * 2021-01-25 2021-05-07 郑健青 Dual-image depth estimation method based on self-supervised deep learning
CN112785636A (en) * 2021-02-18 2021-05-11 上海理工大学 Multi-scale enhanced monocular depth estimation method
CN112785636B (en) * 2021-02-18 2023-04-28 上海理工大学 Multi-scale enhanced monocular depth estimation method
CN112837361A (en) * 2021-03-05 2021-05-25 浙江商汤科技开发有限公司 Depth estimation method and device, electronic equipment and storage medium
CN113239958A (en) * 2021-04-09 2021-08-10 Oppo广东移动通信有限公司 Image depth estimation method and device, electronic equipment and storage medium
CN114693897A (en) * 2021-04-28 2022-07-01 上海联影智能医疗科技有限公司 Unsupervised inter-layer super-resolution for medical images
CN113762358B (en) * 2021-08-18 2024-05-14 江苏大学 Semi-supervised learning three-dimensional reconstruction method based on relative depth training
CN113762358A (en) * 2021-08-18 2021-12-07 江苏大学 Semi-supervised learning three-dimensional reconstruction method based on relative depth training
CN113706599A (en) * 2021-10-29 2021-11-26 纽劢科技(上海)有限公司 Binocular depth estimation method based on pseudo label fusion
CN114170286B (en) * 2021-11-04 2023-04-28 西安理工大学 Monocular depth estimation method based on unsupervised deep learning
CN114170286A (en) * 2021-11-04 2022-03-11 西安理工大学 Monocular depth estimation method based on unsupervised deep learning
CN114782911B (en) * 2022-06-20 2022-09-16 小米汽车科技有限公司 Image processing method, device, equipment, medium, chip and vehicle
CN114782911A (en) * 2022-06-20 2022-07-22 小米汽车科技有限公司 Image processing method, device, equipment, medium, chip and vehicle
CN115966102A (en) * 2022-12-30 2023-04-14 中国科学院长春光学精密机械与物理研究所 Early warning braking method based on deep learning
CN117788843A (en) * 2024-02-27 2024-03-29 青岛超瑞纳米新材料科技有限公司 Carbon nanotube image processing method based on neural network algorithm
CN117788843B (en) * 2024-02-27 2024-04-30 青岛超瑞纳米新材料科技有限公司 Carbon nanotube image processing method based on neural network algorithm

Also Published As

Publication number Publication date
CN109377530B (en) 2021-07-27

Similar Documents

Publication Publication Date Title
CN109377530B (en) Binocular depth estimation method based on a deep neural network
CN111325794B (en) Visual simultaneous localization and mapping method based on a deep convolutional autoencoder
CN111259945B (en) Binocular parallax estimation method introducing attention map
CN113160375B (en) Three-dimensional reconstruction and camera pose estimation method based on multi-task learning algorithm
CN112634341B (en) Method for constructing depth estimation model of multi-vision task cooperation
de Queiroz Mendes et al. On deep learning techniques to boost monocular depth estimation for autonomous navigation
WO2018000752A1 (en) Monocular image depth estimation method based on multi-scale cnn and continuous crf
CN111508013B (en) Stereo matching method
CN113283525B (en) Image matching method based on deep learning
CN111783582A (en) Unsupervised monocular depth estimation algorithm based on deep learning
CN110378838A (en) Multi-view image generation method, device, storage medium and electronic equipment
CN113870422B (en) Point cloud reconstruction method, device, equipment and medium
CN116229461A (en) Indoor scene image real-time semantic segmentation method based on multi-scale refinement
CN112560865B (en) Semantic segmentation method for point cloud under outdoor large scene
CN113792641A (en) High-resolution lightweight human body posture estimation method combined with multispectral attention mechanism
CN113096239B (en) Three-dimensional point cloud reconstruction method based on deep learning
Wang et al. Depth estimation of video sequences with perceptual losses
CN114677479A (en) Natural landscape multi-view three-dimensional reconstruction method based on deep learning
CN115330935A (en) Three-dimensional reconstruction method and system based on deep learning
CN116258757A (en) Monocular image depth estimation method based on multi-scale cross attention
CN116612468A (en) Three-dimensional target detection method based on multi-mode fusion and depth attention mechanism
CN117218246A (en) Training method and device for image generation model, electronic equipment and storage medium
Wang et al. Depth estimation of supervised monocular images based on semantic segmentation
CN116703996A (en) Monocular three-dimensional target detection algorithm based on instance-level self-adaptive depth estimation
Li et al. Monocular 3-D Object Detection Based on Depth-Guided Local Convolution for Smart Payment in D2D Systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220714

Address after: 313009 floor 3, No. 777, Chengzhong Avenue, Lianshi Town, Nanxun District, Huzhou City, Zhejiang Province

Patentee after: Zhejiang Qiqiao Lianyun biosensor technology Co.,Ltd.

Address before: 073000 West 200m northbound at the intersection of Dingzhou commercial street and Xingding Road, Baoding City, Hebei Province (No. 1910, 19th floor, building 3, jueshishan community)

Patentee before: Hebei Kaitong Information Technology Service Co.,Ltd.

Effective date of registration: 20220714

Address after: 073000 West 200m northbound at the intersection of Dingzhou commercial street and Xingding Road, Baoding City, Hebei Province (No. 1910, 19th floor, building 3, jueshishan community)

Patentee after: Hebei Kaitong Information Technology Service Co.,Ltd.

Address before: 300072 Tianjin City, Nankai District Wei Jin Road No. 92

Patentee before: Tianjin University