CN108389226A - Unsupervised depth prediction method based on convolutional neural network and binocular parallax - Google Patents

Unsupervised depth prediction method based on convolutional neural network and binocular parallax

Info

Publication number
CN108389226A
CN108389226A
Authority
CN
China
Prior art keywords
image
convolutional layer
camera
neural networks
pixels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810144465.2A
Other languages
Chinese (zh)
Inventor
刘波
杨青相
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201810144465.2A
Publication of CN108389226A
Legal status: Pending (current)

Classifications

    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL (common stem of all classifications below)
    • G06T7/00 Image analysis › G06T7/50 Depth or shape recovery › G06T7/55 Depth or shape recovery from multiple images
    • G06T7/00 Image analysis › G06T7/97 Determining parameters from multiple pictures
    • G06T2207/00 Indexing scheme for image analysis or image enhancement › G06T2207/10 Image acquisition modality › G06T2207/10024 Color image
    • G06T2207/00 Indexing scheme › G06T2207/20 Special algorithmic details › G06T2207/20081 Training; Learning
    • G06T2207/00 Indexing scheme › G06T2207/20 Special algorithmic details › G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/00 Indexing scheme › G06T2207/20 Special algorithmic details › G06T2207/20228 Disparity calculation for image-based rendering

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention discloses an unsupervised depth prediction method based on a convolutional neural network and binocular parallax, comprising the following steps. First, a convolutional neural network is used to fit a nonlinear function that converts two RGB images into a corresponding depth image. Then, using the depth information, the pixel coordinates of the left image are transformed to obtain the corresponding pixel locations in the right image. After the pixel locations in the right image are obtained, the pixel coordinates of the right image and the corresponding pixel values are obtained by bilinear interpolation. Finally, the prediction loss is computed from the sampled pixel values and the corresponding pixel values of the left image. By training in this manner, which requires no ground-truth depth information, the corresponding depth image can be obtained; the method thus predicts depth images without any ground-truth depth supervision.

Description

Unsupervised depth prediction method based on convolutional neural network and binocular parallax
Technical field
The invention belongs to the technical field of deep learning, and more particularly relates to an unsupervised depth prediction method based on a convolutional neural network and binocular parallax, applicable to autonomous driving and distance estimation.
Background art
Humans can easily infer their own motion and the three-dimensional structure of a scene within a very short time. For example, when walking down a street we readily notice obstacles and react quickly to avoid them. For a computer, however, completing the above tasks is extremely complex, and its ability to reconstruct real-world scenes falls far short of human ability, especially when handling occlusions and regions lacking texture.
Why can humans do better than computers at these tasks? A reasonable hypothesis is that our understanding of scene structure has developed through our cognition of the world, including moving around in it and observing it extensively. From millions of such observations we have learned rules about the world: roads are flat, buildings are vertical, and cars rest on the road surface. As soon as we observe a new scene, we judge it using these rules. In this work we simulate this process by training a model on RGB images captured by a pair of left and right cameras, so as to account for camera motion and scene structure.
In recent years deep learning has found wide application, especially after convolutional neural networks (CNNs) achieved immense success in the image domain. Researchers have realized that CNNs achieve good results on images because they can capture complex and implicit relationships. Moreover, owing to the existence of very large manually labeled datasets such as ImageNet, supervised deep learning methods have successfully solved a great many problems.
However, an obvious disadvantage of convolutional neural networks today is that they require large amounts of manually labeled data for training. On the one hand, a manually labeled dataset as huge as ImageNet consumes enormous manpower and material resources; on the other hand, errors easily occur during labeling. In particular, acquiring depth information for outdoor scenes generally requires expensive hardware and painstakingly careful collection. Although datasets such as KITTI have been acquired with advanced 3D sensors and multiple calibrated cameras, the reliable depth they capture is still limited in range, and the acquisition cost is high.
Current methods that train a CNN for depth prediction in a supervised manner all use datasets such as NYUv2 and KITTI, training on RGB images paired with their corresponding depth maps. What these supervised methods learn, however, does not generalize beyond their direct application domain. The reason is that if a trained single-view depth estimation model is applied to another scene, RGB images and their corresponding depth images from that scene are required, and the network must be retrained.
Summary of the invention
To solve the above problems, the present invention proposes an unsupervised depth prediction method based on a convolutional neural network and binocular parallax, which trains the convolutional neural network without requiring any ground-truth depth information.
To achieve the above object, the present invention adopts the following technical scheme:
An unsupervised depth prediction method based on a convolutional neural network and binocular parallax, comprising the following steps:
Step 1: fit a nonlinear function with a convolutional neural network, converting the two RGB images acquired by the left and right cameras into a corresponding depth image;
Step 2: using the depth information, transform the pixel coordinates of the left image to obtain the corresponding pixel locations in the right image;
Step 3: after obtaining the pixel locations in the right image, obtain the pixel coordinates of the right image and the corresponding pixel values by bilinear interpolation;
Step 4: compute the prediction loss from the sampled pixel values and the corresponding pixel values of the left image.
With this training scheme, which requires no ground-truth depth information, the present invention obtains the corresponding depth image; the method thus predicts depth images without any ground-truth depth supervision.
Description of the drawings
Fig. 1 is a flowchart of the method;
Fig. 2 is a diagram of the convolutional neural network structure used by the present invention;
Figs. 3a, 3b and 3c show the test results, where Fig. 3a is the left image, Fig. 3b is the right image, and Fig. 3c is the depth map.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and embodiments.
The present invention provides an unsupervised depth prediction method based on a convolutional neural network and binocular parallax that trains the convolutional neural network without any ground-truth depth information. For training the convolutional neural network, the pairs of RGB images acquired by the left and right color cameras of the KITTI dataset are used as the training set; such data are much easier to obtain than calibrated RGB images paired with their corresponding ground-truth depth images.
The present invention uses a CNN to simulate a complex nonlinear transformation that, guided by the parallax between the left and right images, converts the two RGB images into the corresponding depth map.
The symbols used in the method of the invention are described as follows:
$I_L, I_R$: left and right images
$K_L, K_R$: intrinsic parameter matrices of the left and right cameras
$T$: extrinsic matrix between the left and right cameras
$p_L, p_R$: pixel coordinates in the left and right images
$I_D$: depth map predicted by the CNN
$I_w$: image newly generated by bilinear interpolation
$q_0, q_1, q_2, q_3$: the four components of the quaternion representing the rotation matrix
$X_L, Y_L, Z_L$: three-dimensional coordinates in the left camera coordinate system
$X_R, Y_R, Z_R$: three-dimensional coordinates in the right camera coordinate system
The flowchart of the present invention is shown in Fig. 1 and comprises four steps:
Step 1: Convert the two RGB images into a corresponding depth image.
The present invention uses the convolutional neural network shown in Fig. 2. The first five layers are very similar to the first five layers of the AlexNet network; we replace the fully connected layers of AlexNet with a fully convolutional layer, and finally we use five deconvolution layers for upsampling.
The network layers are, in order: left convolutional layer 1, left convolutional layer 2, right convolutional layer 1, right convolutional layer 2, channel concatenation layer, convolutional layer 3, convolutional layer 4, convolutional layer 5, fully convolutional layer, deconvolution layer 1, deconvolution layer 2, deconvolution layer 3, deconvolution layer 4, deconvolution layer 5.
We use left convolutional layers 1 and 2 to extract features from the left image; similarly, we use right convolutional layers 1 and 2 to extract features from the right image. The features extracted from the left and right images are then concatenated along the channel dimension. To improve the fitting capability of our neural network, the result produced after channel concatenation is passed in turn through convolutional layer 3, convolutional layer 4, convolutional layer 5 and the fully convolutional layer. Finally, for upsampling, the result produced by convolutional layer 5 is passed in turn through deconvolution layers 1, 2, 3, 4 and 5.
We use the convolutional neural network described above to fit a nonlinear function
$D(I_L, I_R) = I_D$
with which the two RGB images $I_L$ and $I_R$ are converted into the corresponding depth image $I_D$.
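As a minimal sketch (assuming a network module named depth_net with the structure above; the names and the PyTorch-style framing are illustrative, the embodiment itself uses caffe), the fitted mapping is simply:

    import torch

    def predict_depth(depth_net: torch.nn.Module,
                      I_L: torch.Tensor,   # (N, 3, H, W) left RGB image
                      I_R: torch.Tensor    # (N, 3, H, W) right RGB image
                      ) -> torch.Tensor:   # (N, 1, H, W) predicted depth I_D
        # Evaluate the fitted nonlinear function D(I_L, I_R) = I_D.
        return depth_net(I_L, I_R)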
Step 2: Compute the projected positions using the depth information.
First convert the pixel coordinates of the left image to camera coordinates of the left camera, then transform the camera coordinates of the left camera to camera coordinates of the right camera, and finally project the camera coordinates of the right camera to pixel locations in the right image. In homogeneous coordinates the whole process can be expressed as:
$p_R \sim K_R \, T \, \big( I_D(p_L) \, K_L^{-1} \, p_L \big)$
Step 2.1: Invert the left-image pixel coordinates to the left camera coordinate system.
The pixel coordinates of the left image are converted to camera coordinates of the left camera, formulated as:
$X_L = I_D(p_L)\,(u_L - u_{L0})/k_{Lx}$
$Y_L = I_D(p_L)\,(v_L - v_{L0})/k_{Ly}$
$Z_L = I_D(p_L)$
where $u_L, v_L$ are the horizontal and vertical pixel coordinates in the image, and $u_{L0}, v_{L0}, k_{Lx}, k_{Ly}$ are the intrinsic parameters of the camera, with
$K_L = \begin{pmatrix} k_{Lx} & 0 & u_{L0} \\ 0 & k_{Ly} & v_{L0} \\ 0 & 0 & 1 \end{pmatrix}$
where $k_{Lx}$ and $k_{Ly}$ are determined by the focal lengths $f_x$ and $f_y$ of the camera.
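Step 2.1 can be sketched in a few lines of Python/NumPy (an illustrative sketch under the notation above, not the patent's caffe implementation):

    import numpy as np

    def backproject_left(depth, uL0, vL0, kLx, kLy):
        # Lift every left-image pixel (u_L, v_L) with predicted depth
        # I_D(p_L) into the left camera coordinate system.
        h, w = depth.shape
        v, u = np.mgrid[0:h, 0:w].astype(np.float64)
        Z = depth                    # Z_L = I_D(p_L)
        X = Z * (u - uL0) / kLx      # X_L = I_D(p_L)(u_L - u_L0)/k_Lx
        Y = Z * (v - vL0) / kLy      # Y_L = I_D(p_L)(v_L - v_L0)/k_Ly
        return np.stack([X, Y, Z])   # shape (3, H, W)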
Step 2.2: Transform the left camera coordinate system to the right camera coordinate system.
The coordinate system transformation is performed with a rotation-translation matrix, formulated as:
$\begin{pmatrix} X_R \\ Y_R \\ Z_R \end{pmatrix} = R \begin{pmatrix} X_L \\ Y_L \\ Z_L \end{pmatrix} + \begin{pmatrix} t_x \\ t_y \\ t_z \end{pmatrix}$
where the rotation matrix $R$ can be expressed in terms of the quaternion $(q_0, q_1, q_2, q_3)$ as:
$R = \begin{pmatrix} 1-2(q_2^2+q_3^2) & 2(q_1 q_2 - q_0 q_3) & 2(q_1 q_3 + q_0 q_2) \\ 2(q_1 q_2 + q_0 q_3) & 1-2(q_1^2+q_3^2) & 2(q_2 q_3 - q_0 q_1) \\ 2(q_1 q_3 - q_0 q_2) & 2(q_2 q_3 + q_0 q_1) & 1-2(q_1^2+q_2^2) \end{pmatrix}$
and the quaternion components must satisfy the constraint:
$q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1$
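The quaternion-to-rotation formula and the rigid transformation of step 2.2 translate directly into code (a sketch, assuming the unit-norm constraint above holds):

    import numpy as np

    def quat_to_rotation(q0, q1, q2, q3):
        # Standard rotation matrix of a unit quaternion (q0, q1, q2, q3).
        return np.array([
            [1 - 2*(q2*q2 + q3*q3), 2*(q1*q2 - q0*q3),     2*(q1*q3 + q0*q2)],
            [2*(q1*q2 + q0*q3),     1 - 2*(q1*q1 + q3*q3), 2*(q2*q3 - q0*q1)],
            [2*(q1*q3 - q0*q2),     2*(q2*q3 + q0*q1),     1 - 2*(q1*q1 + q2*q2)]])

    def left_to_right(points_L, q, t):
        # Apply (X_R, Y_R, Z_R)^T = R (X_L, Y_L, Z_L)^T + t to a (3, H, W) grid.
        R = quat_to_rotation(*q)
        flat = points_L.reshape(3, -1)
        moved = R @ flat + np.asarray(t, dtype=np.float64).reshape(3, 1)
        return moved.reshape(points_L.shape)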
Step 2.3: Project the right camera coordinate system to right pixel locations.
The right camera coordinates are projected to right-image pixel locations, formulated as:
$u_R = k_{Rx} X_R / Z_R + u_{R0}$
$v_R = k_{Ry} Y_R / Z_R + v_{R0}$
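Step 2.3 is the pinhole projection; continuing the sketches above:

    def project_right(points_R, uR0, vR0, kRx, kRy):
        # Project right-camera coordinates to (continuous) right-image pixels.
        X, Y, Z = points_R
        uR = kRx * X / Z + uR0   # u_R = k_Rx * X_R / Z_R + u_R0
        vR = kRy * Y / Z + vR0   # v_R = k_Ry * Y_R / Z_R + v_R0
        return uR, vR            # sampled by bilinear interpolation in step 3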
Step 3: Convert the computed positions to pixel coordinates using bilinear interpolation.
The right pixel locations $(u_R, v_R)$ projected by the right camera in step 2 are continuous values. We write $\hat{p}_R$ for the right pixel location obtained after projection, formulated as $\hat{p}_R = (u_R, v_R)$. To obtain a better pixel filling effect, we use the method of bilinear interpolation, interpolating from the pixel values of the four neighbouring pixels (top-left, top-right, bottom-left and bottom-right). This can be expressed as:
$I_w(p_L) = \sum_{i \in \{t,b\},\, j \in \{l,r\}} w^{ij} \, I_R(\hat{p}_R^{\,ij})$
where $\hat{p}_R^{\,tl}, \hat{p}_R^{\,tr}, \hat{p}_R^{\,bl}, \hat{p}_R^{\,br}$ denote the pixel coordinates of the four neighbours (top-left, top-right, bottom-left and bottom-right) of $\hat{p}_R$; each weight $w^{ij}$ can be obtained from the spatial linear distance between $\hat{p}_R$ and $\hat{p}_R^{\,ij}$, and the weights satisfy the equality constraint $\sum_{i,j} w^{ij} = 1$.
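A sketch of this bilinear sampling (assuming a color right image of shape (H, W, 3); out-of-range neighbours are simply clipped here, a choice the patent does not specify):

    import numpy as np

    def bilinear_sample(image_R, uR, vR):
        h, w = image_R.shape[:2]
        u0 = np.clip(np.floor(uR).astype(int), 0, w - 2)  # left neighbours
        v0 = np.clip(np.floor(vR).astype(int), 0, h - 2)  # top neighbours
        du, dv = uR - u0, vR - v0
        w_tl = (1 - du) * (1 - dv)   # top-left weight
        w_tr = du * (1 - dv)         # top-right weight
        w_bl = (1 - du) * dv         # bottom-left weight
        w_br = du * dv               # bottom-right weight; the four sum to 1
        return (w_tl[..., None] * image_R[v0, u0] +
                w_tr[..., None] * image_R[v0, u0 + 1] +
                w_bl[..., None] * image_R[v0 + 1, u0] +
                w_br[..., None] * image_R[v0 + 1, u0 + 1])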
Step 4: Calculate the prediction loss.
The reconstruction loss computation used by the present invention follows the absolute-error (L1) loss, expressed as:
$L_{rec} = \sum_{p} \left| I_w(p) - I_L(p) \right|$
Since the gradient of this loss function derives mainly from the intensity differences of the four surrounding neighbours, two problems arise: if the predicted position lies in a weakly textured region there is almost no gradient, while if the predicted position is far from the true position the pixel difference, and hence the gradient, becomes excessively large. To make the resulting depth map smoother, we constrain the gradient of the depth image $\nabla I_D$ with a simple L2 regularization method, expressed as:
$L_{smooth} = \left\| \nabla I_D \right\|^2$
We express the final loss function as:
$L = \frac{1}{n} \sum \left( L_{rec} + \lambda L_{smooth} \right)$
where $n$ is the number of images, and $\lambda$ is a hyperparameter that, as the coefficient of the regularization term, adjusts the strength of the regularization.
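The complete loss of step 4 can then be sketched as follows (NumPy notation, following the formulas above):

    import numpy as np

    def prediction_loss(I_w, I_L, I_D, lam=0.05):
        recon = np.abs(I_w - I_L).sum()    # sum over p of |I_w(p) - I_L(p)|
        g_y, g_x = np.gradient(I_D)        # spatial gradient of the depth map
        smooth = (g_x**2 + g_y**2).sum()   # ||grad I_D||^2 (L2 regularizer)
        return recon + lam * smooth        # per-image loss; averaged over n images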
Embodiment 1:
The present invention uses NVIDIA GPUs as the computing platform and the caffe deep learning framework as the CNN framework. The specific implementation steps are as follows:
Step 1: Dataset preparation.
We train our neural network with the public KITTI dataset, which was acquired over several outdoor scenes with a pair of grayscale cameras, a pair of color cameras and a lidar mounted on a moving vehicle. We use the scene data belonging to the city, residential and road categories acquired by the same vehicle on the same day, November 26, taking the data acquired by the left and right color cameras as our training data. The original RGB images acquired by the left and right cameras are downsampled to a resolution of 160x608 as the input to our neural network.
The training set consists of 13855 left-right image pairs; we use 500 samples carrying ground-truth depth information as the test set to evaluate our results.
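The preprocessing amounts to resizing each image pair (a sketch using OpenCV; the file paths are illustrative):

    import cv2

    def load_pair(left_path, right_path, size=(608, 160)):  # (width, height)
        # Downsample a KITTI left/right color image pair to 160x608.
        I_L = cv2.resize(cv2.imread(left_path), size, interpolation=cv2.INTER_AREA)
        I_R = cv2.resize(cv2.imread(right_path), size, interpolation=cv2.INTER_AREA)
        return I_L, I_R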
Step 2: Build the convolutional neural network.
We use the network structure shown in Fig. 2, whose layers are, in order: left convolutional layer 1, left convolutional layer 2, right convolutional layer 1, right convolutional layer 2, channel concatenation layer, convolutional layer 3, convolutional layer 4, convolutional layer 5, fully convolutional layer, deconvolution layer 1, deconvolution layer 2, deconvolution layer 3, deconvolution layer 4, deconvolution layer 5.
We use left convolutional layers 1 and 2 to extract features from the left image; similarly, we use right convolutional layers 1 and 2 to extract features from the right image. The features extracted from the left and right images are then concatenated along the channel dimension. To improve the fitting capability of our neural network, the result produced after channel concatenation is passed in turn through convolutional layer 3, convolutional layer 4, convolutional layer 5 and the fully convolutional layer. Finally, for upsampling, the result produced by convolutional layer 5 is passed in turn through deconvolution layers 1 to 5.
The kernel sizes of left convolutional layers 1 and 2 are 11x11 and 5x5 respectively, and their numbers of output feature maps are 96 and 256 respectively; the corresponding right convolutional layers 1 and 2 are identical. The channel concatenation layer uses the concat layer provided by the caffe framework. Convolutional layers 3, 4 and 5 all use 3x3 kernels, with 384, 384 and 256 output feature maps respectively. The fully convolutional layer uses 1024 kernels of size 1x1. The five deconvolution layers all use 4x4 kernels and output one feature map.
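A PyTorch-style sketch of this architecture is given below (the embodiment uses caffe; kernel sizes and feature-map counts follow the text, while strides, paddings and activations are not specified in the patent and are chosen here only for illustration):

    import torch
    import torch.nn as nn

    class SiameseDepthNet(nn.Module):
        def __init__(self):
            super().__init__()
            def branch():   # left/right convolutional layers 1 and 2
                return nn.Sequential(
                    nn.Conv2d(3, 96, 11, stride=4, padding=5), nn.ReLU(inplace=True),
                    nn.Conv2d(96, 256, 5, stride=2, padding=2), nn.ReLU(inplace=True))
            self.left_branch, self.right_branch = branch(), branch()
            self.trunk = nn.Sequential(          # convolutional layers 3-5
                nn.Conv2d(512, 384, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(384, 384, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(256, 1024, 1))         # fully convolutional layer: 1024 1x1 kernels
            self.upsample = nn.Sequential(*[     # deconvolution layers 1-5: 4x4 kernels,
                nn.ConvTranspose2d(1024 if i == 0 else 1, 1, 4, stride=2, padding=1)
                for i in range(5)])              # one output feature map each, per the text

        def forward(self, I_L, I_R):
            # Channel concatenation of the two branch outputs (256 + 256 = 512).
            feats = torch.cat([self.left_branch(I_L), self.right_branch(I_R)], dim=1)
            return self.upsample(self.trunk(feats))   # predicted depth map I_D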
Step 3: Initialize the intrinsic and extrinsic parameters of the left and right cameras.
When training our neural network to recover depth information, we need to initialize the intrinsic and extrinsic parameters of the left and right cameras reasonably in order to solve for good depth information. The initialization procedure is as follows:
Camera intrinsic parameter initialization: we initialize $u_{L0}, v_{L0}$ to half of the input image size after downsampling, i.e. 304 and 80 respectively. Likewise, to simplify the solution, we initialize the corresponding right-image parameters $u_{R0}, v_{R0}$ to 304 and 80. We initialize the corresponding $k_{Lx}, k_{Ly}$ to 950 and 950 respectively, and similarly initialize $k_{Rx}, k_{Ry}$ to the same values.
Camera extrinsic parameter initialization: because the motion between the two images of the dataset we use is mainly a horizontal translation, when initializing the corresponding extrinsic matrix we initialize the rotation matrix to the identity matrix, and when initializing the translation matrix we initialize only the horizontal component of the motion, setting the components in the other directions to 0. Since the rotation matrix must satisfy complex constraints, we represent the identity matrix with a quaternion; and since the quaternion must satisfy the equality constraint $q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1$, we initialize it as $q_0 = 1$, $q_1 = 0$, $q_2 = 0$, $q_3 = 0$. We initialize the parameters $t_x, t_y, t_z$ of the translation matrix to 50, 0 and 0 respectively.
We train our neural network with the camera intrinsic and extrinsic parameters initialized as above.
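Collected in one place, the initial values above form a small parameter set (a sketch; the dictionary layout is ours):

    init_params = {
        "uL0": 304.0, "vL0": 80.0, "kLx": 950.0, "kLy": 950.0,  # left intrinsics
        "uR0": 304.0, "vR0": 80.0, "kRx": 950.0, "kRy": 950.0,  # right intrinsics
        "q": (1.0, 0.0, 0.0, 0.0),   # quaternion of the identity rotation
        "t": (50.0, 0.0, 0.0),       # horizontal translation only
    }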
Step 4: Training of the neural network and setting of the network parameters.
When training the convolutional neural network, we read in 7 image pairs as one batch each time. We optimize the network using SGD with a momentum of 0.9 and a weight decay of 0.0005. We subtract the corresponding means (104, 117, 123) from the three RGB channels and then divide by 255, so that the pixel values of the left and right images are distributed in the interval [-0.5, 0.5]. In the loss function $L$ we set the hyperparameter $\lambda$ to 0.05.
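These optimization settings can be sketched as follows (a PyTorch-style stand-in for caffe's SGD solver; the learning rate is not stated in the patent and is illustrative):

    import torch

    MEAN = torch.tensor([104.0, 117.0, 123.0]).view(1, 3, 1, 1)

    def normalize(batch):                 # pixel values roughly in [-0.5, 0.5]
        return (batch - MEAN) / 255.0

    net = SiameseDepthNet()
    optimizer = torch.optim.SGD(net.parameters(), lr=1e-4,
                                momentum=0.9, weight_decay=0.0005)
    # Per iteration: read a batch of 7 image pairs, normalize, predict depth,
    # warp the right image, and minimize the loss with lambda = 0.05.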
To save training time, we start our training from part of the parameter values of the model trained for 40000 iterations by Ravi Garg et al.
The specific implementation steps having been described, the effect is shown in Figs. 3a, 3b and 3c. The test results of the present invention are given below. The experimental environment is: GPU: TITAN, CUDA version 7.5. Testing is carried out on the KITTI dataset, and the method is compared with several supervised monocular prediction methods. We assess our results with the following evaluation metrics:
Comparison of results: (results table not reproduced here) where c and f denote the coarse network and the fine network of the Eigen method, respectively.
Our unsupervised depth prediction method based on convolutional neural networks and binocular vision does not differ greatly in accuracy from the supervised methods, and it has considerable room for development and research significance.

Claims (5)

1. An unsupervised depth prediction method based on a convolutional neural network and binocular parallax, characterized by comprising the following steps:
step 1: fitting a nonlinear function with a convolutional neural network, converting the two RGB images acquired by the left and right cameras into a corresponding depth image;
step 2: using the depth information, transforming the pixel coordinates of the left image to obtain the corresponding pixel locations in the right image;
step 3: after obtaining the pixel locations in the right image, obtaining the pixel coordinates of the right image and the corresponding pixel values by bilinear interpolation;
step 4: computing the prediction loss from the sampled pixel values and the corresponding pixel values of the left image.
2. The unsupervised depth prediction method based on a convolutional neural network and binocular parallax according to claim 1, characterized in that step 1 is specifically:
the convolutional neural network comprises, in order: left convolutional layer 1, left convolutional layer 2, right convolutional layer 1, right convolutional layer 2, a channel concatenation layer, convolutional layer 3, convolutional layer 4, convolutional layer 5, a fully convolutional layer, deconvolution layer 1, deconvolution layer 2, deconvolution layer 3, deconvolution layer 4 and deconvolution layer 5; wherein left convolutional layers 1 and 2 are used to extract features from the left image, and likewise right convolutional layers 1 and 2 are used to extract features from the right image; the features extracted from the left and right images are then concatenated along the channel dimension; the result produced after channel concatenation is passed in turn through convolutional layer 3, convolutional layer 4, convolutional layer 5 and the fully convolutional layer; finally, for upsampling, the result produced by convolutional layer 5 is passed in turn through deconvolution layers 1, 2, 3, 4 and 5;
the convolutional neural network described above is used to fit a nonlinear function
$D(I_L, I_R) = I_D$
with which the two RGB images $I_L$ and $I_R$ are converted into the corresponding depth image $I_D$.
3. The unsupervised depth prediction method based on a convolutional neural network and binocular parallax according to claim 1, characterized in that in step 2, the pixel coordinates of the left image are first converted to camera coordinates of the left camera, the camera coordinates of the left camera are then transformed to camera coordinates of the right camera, and the camera coordinates of the right camera are finally projected to pixel locations in the right image; in homogeneous coordinates the whole process can be expressed as:
$p_R \sim K_R \, T \, \big( I_D(p_L) \, K_L^{-1} \, p_L \big)$
and specifically includes:
step 2.1: inverting the left-image pixel coordinates to the left camera coordinate system;
the pixel coordinates of the left image are converted to camera coordinates of the left camera, formulated as:
$X_L = I_D(p_L)\,(u_L - u_{L0})/k_{Lx}$
$Y_L = I_D(p_L)\,(v_L - v_{L0})/k_{Ly}$
$Z_L = I_D(p_L)$
where $u_L, v_L$ are the horizontal and vertical pixel coordinates in the image, and $u_{L0}, v_{L0}, k_{Lx}, k_{Ly}$ are the intrinsic parameters of the camera, with
$K_L = \begin{pmatrix} k_{Lx} & 0 & u_{L0} \\ 0 & k_{Ly} & v_{L0} \\ 0 & 0 & 1 \end{pmatrix}$
where $k_{Lx}$ and $k_{Ly}$ are determined by the focal lengths $f_x$ and $f_y$ of the camera;
step 2.2: transforming the left camera coordinate system to the right camera coordinate system;
the coordinate system transformation is performed with a rotation-translation matrix, formulated as:
$(X_R, Y_R, Z_R)^T = R\,(X_L, Y_L, Z_L)^T + (t_x, t_y, t_z)^T$
where the rotation matrix $R$ can be expressed in terms of the quaternion $(q_0, q_1, q_2, q_3)$ as:
$R = \begin{pmatrix} 1-2(q_2^2+q_3^2) & 2(q_1 q_2 - q_0 q_3) & 2(q_1 q_3 + q_0 q_2) \\ 2(q_1 q_2 + q_0 q_3) & 1-2(q_1^2+q_3^2) & 2(q_2 q_3 - q_0 q_1) \\ 2(q_1 q_3 - q_0 q_2) & 2(q_2 q_3 + q_0 q_1) & 1-2(q_1^2+q_2^2) \end{pmatrix}$
and the quaternion components must satisfy the constraint $q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1$;
step 2.3: projecting the right camera coordinate system to right pixel locations;
the right camera coordinates are projected to right-image pixel locations, formulated as:
$u_R = k_{Rx} X_R / Z_R + u_{R0}$
$v_R = k_{Ry} Y_R / Z_R + v_{R0}$
4. The unsupervised depth prediction method based on a convolutional neural network and binocular parallax according to claim 1, characterized in that step 3 is specifically:
the right pixel locations $(u_R, v_R)$ projected in step 2 are continuous values; $\hat{p}_R$ is used to denote the right pixel location obtained after projection, formulated as $\hat{p}_R = (u_R, v_R)$; using the method of bilinear interpolation, the pixel values of the four neighbouring pixels (top-left, top-right, bottom-left and bottom-right) are interpolated, which can be expressed as:
$I_w(p_L) = \sum_{i \in \{t,b\},\, j \in \{l,r\}} w^{ij} \, I_R(\hat{p}_R^{\,ij})$
where $\hat{p}_R^{\,tl}, \hat{p}_R^{\,tr}, \hat{p}_R^{\,bl}, \hat{p}_R^{\,br}$ denote the pixel coordinates of the four neighbours (top-left, top-right, bottom-left and bottom-right) of $\hat{p}_R$; each weight $w^{ij}$ can be obtained from the spatial linear distance between $\hat{p}_R$ and $\hat{p}_R^{\,ij}$, and the weights satisfy the equality constraint $\sum_{i,j} w^{ij} = 1$.
5. The unsupervised depth prediction method based on a convolutional neural network and binocular parallax according to claim 1, characterized in that step 4 is specifically:
the reconstruction loss computation used follows the absolute-error (L1) loss, expressed as:
$L_{rec} = \sum_{p} \left| I_w(p) - I_L(p) \right|$
the gradient of the loss function comes from the intensity differences of the four surrounding neighbours; to make the resulting depth map smoother, the gradient of the depth image $\nabla I_D$ is constrained with a simple L2 regularization method, expressed as:
$L_{smooth} = \left\| \nabla I_D \right\|^2$
the final loss function is expressed as:
$L = \frac{1}{n} \sum \left( L_{rec} + \lambda L_{smooth} \right)$
where $n$ is the number of images, and $\lambda$ is a hyperparameter that, as the coefficient of the regularization term, adjusts the strength of the regularization.
CN201810144465.2A 2018-02-12 2018-02-12 Unsupervised depth prediction method based on convolutional neural network and binocular parallax Pending CN108389226A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810144465.2A CN108389226A (en) 2018-02-12 2018-02-12 Unsupervised depth prediction method based on convolutional neural network and binocular parallax


Publications (1)

Publication Number Publication Date
CN108389226A true CN108389226A (en) 2018-08-10

Family

ID=63068766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810144465.2A Pending CN108389226A (en) 2018-02-12 2018-02-12 Unsupervised depth prediction method based on convolutional neural network and binocular parallax

Country Status (1)

Country Link
CN (1) CN108389226A (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101577004A * 2009-06-25 2009-11-11 青岛海信数字多媒体技术国家重点实验室有限公司 Epipolar line rectification method, device and system
CN105869167A * 2016-03-30 2016-08-17 天津大学 High-resolution depth map acquisition method based on active and passive fusion
CN105975915A * 2016-04-28 2016-09-28 大连理工大学 Front vehicle parameter identification method based on a multi-task convolutional neural network
CN106204731A * 2016-07-18 2016-12-07 华南理工大学 Multi-view three-dimensional reconstruction method based on a binocular stereo vision system
KR20180012638A * 2016-07-27 2018-02-06 한국전자통신연구원 Method and apparatus for detecting object in vision recognition with aggregate channel features
CN106612427A * 2016-12-29 2017-05-03 浙江工商大学 Method for generating a spatio-temporally consistent depth map sequence based on a convolutional neural network
CN106934765A * 2017-03-14 2017-07-07 长沙全度影像科技有限公司 Panoramic image fusion method based on a deep convolutional neural network and depth information

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11503267B2 2017-12-21 2022-11-15 Sony Interactive Entertainment Inc. Image processing device, content processing device, content processing system, and image processing method
EP3731528A4 * 2017-12-21 2021-08-11 Sony Interactive Entertainment Inc. Image processing device, content processing device, content processing system, and image processing method
CN109299656A * 2018-08-13 2019-02-01 浙江零跑科技有限公司 Method for determining scene visual depth for a vehicle-mounted vision system
US10832432B2 2018-08-30 2020-11-10 Samsung Electronics Co., Ltd Method for training convolutional neural network to reconstruct an image and system for depth map generation from an image
WO2020046066A1 * 2018-08-30 2020-03-05 Samsung Electronics Co., Ltd. Method for training convolutional neural network to reconstruct an image and system for depth map generation from an image
US11410323B2 2018-08-30 2022-08-09 Samsung Electronics., Ltd Method for training convolutional neural network to reconstruct an image and system for depth map generation from an image
CN112639878A * 2018-09-05 2021-04-09 谷歌有限责任公司 Unsupervised depth prediction neural networks
CN109615674A * 2018-11-28 2019-04-12 浙江大学 Dynamic dual-tracer PET reconstruction method based on a 3D CNN with a hybrid loss function
CN109615674B * 2018-11-28 2020-09-18 浙江大学 Dynamic dual-tracer PET reconstruction method based on a 3D CNN with a hybrid loss function
CN109377530B * 2018-11-30 2021-07-27 天津大学 Binocular depth estimation method based on a deep neural network
CN109377530A * 2018-11-30 2019-02-22 天津大学 Binocular depth estimation method based on a deep neural network
CN109584340B * 2018-12-11 2023-04-18 苏州中科广视文化科技有限公司 Novel view synthesis method based on a deep convolutional neural network
CN109584340A * 2018-12-11 2019-04-05 苏州中科广视文化科技有限公司 Novel view synthesis method based on a deep convolutional neural network
CN109801323A * 2018-12-14 2019-05-24 中国科学院深圳先进技术研究院 Pyramid binocular depth estimation model with self-improvement ability
CN110009691A * 2019-03-28 2019-07-12 北京清微智能科技有限公司 Disparity image generation method and system based on binocular stereo vision matching
CN110175603A * 2019-04-01 2019-08-27 佛山缔乐视觉科技有限公司 Engraved character recognition method, system and storage medium
CN111862321A * 2019-04-30 2020-10-30 北京四维图新科技股份有限公司 Method, device and system for acquiring a disparity map, and storage medium
CN111862321B * 2019-04-30 2024-05-03 北京四维图新科技股份有限公司 Method, device and system for acquiring a disparity map, and storage medium
CN110414393A * 2019-07-15 2019-11-05 福州瑞芯微电子股份有限公司 Natural interaction method and terminal based on deep learning
CN110702015A * 2019-09-26 2020-01-17 中国南方电网有限责任公司超高压输电公司曲靖局 Method and device for measuring the icing thickness of a power transmission line
CN111462208A * 2020-04-05 2020-07-28 北京工业大学 Unsupervised depth prediction method based on binocular parallax and epipolar constraints
US11841921B2 2020-06-26 2023-12-12 Beijing Baidu Netcom Science And Technology Co., Ltd. Model training method and apparatus, and prediction method and apparatus


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20180810)