CN108389226A - An unsupervised depth prediction method based on convolutional neural networks and binocular disparity - Google Patents
An unsupervised depth prediction method based on convolutional neural networks and binocular disparity
- Publication number
- CN108389226A CN108389226A CN201810144465.2A CN201810144465A CN108389226A CN 108389226 A CN108389226 A CN 108389226A CN 201810144465 A CN201810144465 A CN 201810144465A CN 108389226 A CN108389226 A CN 108389226A
- Authority
- CN
- China
- Prior art keywords
- image
- convolutional layer
- camera
- neural networks
- pixels
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T7/55 — Depth or shape recovery from multiple images (G06T7/00 Image analysis; G06T7/50 Depth or shape recovery)
- G06T7/97 — Determining parameters from multiple pictures (G06T7/00 Image analysis)
- G06T2207/10024 — Color image (G06T2207/10 Image acquisition modality)
- G06T2207/20081 — Training; Learning (G06T2207/20 Special algorithmic details)
- G06T2207/20084 — Artificial neural networks [ANN] (G06T2207/20 Special algorithmic details)
- G06T2207/20228 — Disparity calculation for image-based rendering (G06T2207/20 Special algorithmic details)
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The present invention discloses an unsupervised depth prediction method based on convolutional neural networks and binocular disparity, comprising the following steps. First, a convolutional neural network is used to fit a nonlinear function that converts two RGB images into the corresponding depth image. Then, using the depth information, the pixel coordinates of the left image are transformed to obtain the corresponding pixel locations in the right image. After the right-image pixel locations are obtained, the right-image pixel coordinates and corresponding pixel values are computed by bilinear interpolation. Finally, the prediction loss is computed from the sampled pixel values and the corresponding pixel values of the left image. With this training scheme, which requires no ground-truth depth information, the corresponding depth image can be predicted.
Description
Technical field
The invention belongs to the field of deep learning and in particular relates to an unsupervised depth prediction method based on convolutional neural networks and binocular disparity, applicable to autonomous driving and distance estimation.
Background technology
Humans can very easily infer their own motion and the three-dimensional structure of a scene in a short period of time. For example, when walking down the street we readily spot obstacles and react quickly to avoid them. For a computer, however, completing these tasks is extremely complex: computers fall far short of humans at reconstructing real-world scenes, especially when handling occlusions and regions lacking texture.
Why can humans do better than computers at these tasks? A reasonable hypothesis is that we have developed an understanding of scene structure through our knowledge of the world, built up by moving about and observing extensively. From millions of such observations we have learned regularities of the world: roads are flat, buildings are vertical, cars sit on the road. When we observe a new scene, we judge it using these rules. In this work we simulate this ability by training a model: the network is trained on RGB image pairs captured by a left-right camera pair to account for camera motion and scene structure.
In recent years deep learning has been widely adopted, especially after convolutional neural networks (CNNs) achieved great success in the image domain. Researchers recognized that CNNs perform well on image tasks because they can capture complex, implicit relationships. Moreover, thanks to very large hand-labeled datasets such as ImageNet, supervised deep learning methods have successfully solved a great many problems.
However, an obvious drawback of convolutional neural networks today is that they require large amounts of hand-labeled data for training. On the one hand, hand-labeling a dataset as large as ImageNet consumes enormous manpower and material resources; on the other hand, the labeling process is error-prone. Acquiring depth information for outdoor scenes in particular generally requires expensive hardware and painstakingly careful acquisition. Although datasets such as KITTI are collected with advanced 3D sensors and multiple calibrated cameras, the reliable depth they acquire is still limited in range, and the acquisition cost is high.
Current methods that train CNNs for depth prediction in a supervised manner all use datasets such as NYUv2 and KITTI, training on RGB images paired with their corresponding depth maps. But these supervised methods do not generalize beyond their direct application domain. The reason is that if a trained single-view depth estimation model is applied to another scene, RGB images paired with depth images from that other scene are needed, and the network must be retrained.
Summary of the invention
To solve the above problems, the present invention proposes an unsupervised depth prediction method based on convolutional neural networks and binocular disparity that trains the convolutional neural network without requiring any ground-truth depth information.
To achieve the above object, the present invention adopts the following technical scheme that:
An unsupervised depth prediction method based on convolutional neural networks and binocular disparity, comprising the following steps:
Step 1: fit a nonlinear function with a convolutional neural network that converts the two RGB images acquired by the left and right cameras into the corresponding depth image;
Step 2: using the depth information, transform the pixel coordinates of the left image to obtain the corresponding pixel locations in the right image;
Step 3: after obtaining the right-image pixel locations, compute the right-image pixel coordinates and corresponding pixel values by bilinear interpolation;
Step 4: compute the prediction loss from the sampled pixel values and the corresponding pixel values of the left image.
With this training scheme, which requires no ground-truth depth information, the present invention can predict the corresponding depth image.
Description of the drawings
Fig. 1: method flow chart;
Fig. 2: structure of the convolutional neural network used by the present invention;
Fig. 3a, Fig. 3b, Fig. 3c: test results, where Fig. 3a is the left image, Fig. 3b is the right image, and Fig. 3c is the predicted depth map.
Detailed description of embodiments
The invention is described in further detail below with reference to the drawings and embodiments.
The present invention provides an unsupervised depth prediction method based on convolutional neural networks and binocular disparity that trains the convolutional neural network without requiring any ground-truth depth information. To train the network, pairs of RGB images captured by the left and right color cameras of the KITTI dataset are used as the training set; such data are much easier to obtain than calibrated RGB images paired with their corresponding ground-truth depth images.
The present invention uses a CNN to model a complex nonlinear transformation which, guided by the disparity between the left and right images, converts the two RGB images into the corresponding depth map.
The symbols used in the method are described as follows:
I_L, I_R | left and right images
K_L, K_R | intrinsic matrices of the left and right cameras
T | extrinsic matrix between the left and right cameras
p_L, p_R | pixel coordinates in the left and right images
I_D | depth map predicted by the CNN
I_w | image newly generated by bilinear interpolation
q0, q1, q2, q3 | the four quaternion components representing the rotation matrix
X_L, Y_L, Z_L | 3-D coordinates in the left camera coordinate system
X_R, Y_R, Z_R | 3-D coordinates in the right camera coordinate system
The flow chart of the present invention is shown in Fig. 1 and comprises four steps:
Step 1: convert the two RGB images into the corresponding depth image.
The present invention uses the convolutional neural network shown in Fig. 2. The first five layers are very similar to the first five layers of the AlexNet network; we replace the fully connected layers of AlexNet with a fully convolutional layer, and finally use five deconvolution layers for upsampling.
The network layers are, in order: left convolutional layer 1, left convolutional layer 2, right convolutional layer 1, right convolutional layer 2, channel concatenation layer, convolutional layer 3, convolutional layer 4, convolutional layer 5, fully convolutional layer, deconvolution layer 1, deconvolution layer 2, deconvolution layer 3, deconvolution layer 4, deconvolution layer 5.
We extract features from the left image with left convolutional layer 1 and left convolutional layer 2; likewise, we extract features from the right image with right convolutional layer 1 and right convolutional layer 2. The features extracted from the left and right images are then concatenated along the channel dimension. To improve the fitting capability of the network, the concatenated result is passed in turn through convolutional layer 3, convolutional layer 4, convolutional layer 5 and the fully convolutional layer. Finally, for upsampling, the resulting feature maps are passed in turn through deconvolution layer 1, deconvolution layer 2, deconvolution layer 3, deconvolution layer 4 and deconvolution layer 5.
We use the convolutional neural network described above to fit a nonlinear function:
D(I_L, I_R) = I_D
That is, the network converts the two RGB images I_L and I_R into the corresponding depth image I_D.
Step 2: compute the projected position using the depth information.
First the pixel coordinates of the left image are converted to camera coordinates of the left camera; these are then transformed to camera coordinates of the right camera; finally the right-camera coordinates are projected to pixel locations in the right image. The whole process can be expressed as:
p_R ~ K_R ( R · I_D(p_L) · K_L⁻¹ · p_L + t )
where p_L and p_R are homogeneous pixel coordinates and ~ denotes equality up to the projective scale.
Step 2.1: invert the left-image pixel coordinates into the left camera coordinate system.
The pixel coordinates of the left image are converted to camera coordinates of the left camera, formulated as:
X_L = I_D(p_L) · (u_L − u_L0) / k_Lx
Y_L = I_D(p_L) · (v_L − v_L0) / k_Ly
Z_L = I_D(p_L)
where u_L, v_L are the pixel coordinates in the image, and u_L0, v_L0, k_Lx, k_Ly are the intrinsic parameters of the camera (the principal point and the focal lengths f_x, f_y expressed in pixels).
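As an illustration, Step 2.1 can be sketched in Python as follows. This sketch and its function name are ours, not part of the patent; the intrinsics in the example mirror the initialization values used later in the embodiment (principal point 304, 80 and focal length 950):

```python
def backproject_left(u_l, v_l, depth, u0, v0, kx, ky):
    """Invert a left-image pixel (u_l, v_l) with predicted depth I_D(p_L)
    into 3-D coordinates in the left camera frame (Step 2.1)."""
    x_l = depth * (u_l - u0) / kx
    y_l = depth * (v_l - v0) / ky
    z_l = depth  # Z_L is the predicted depth itself
    return x_l, y_l, z_l

# Example with illustrative intrinsics (u0=304, v0=80, kx=ky=950):
X, Y, Z = backproject_left(400.0, 100.0, 10.0, 304.0, 80.0, 950.0, 950.0)
```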
Step 2.2: transform the left camera coordinate system to the right camera coordinate system.
The coordinate transformation is performed with a rotation-translation matrix, formulated as:
[X_R, Y_R, Z_R]ᵀ = R · [X_L, Y_L, Z_L]ᵀ + t
where the rotation matrix R can be expressed with the four quaternion components:
R = [[1 − 2(q2² + q3²), 2(q1·q2 − q0·q3), 2(q1·q3 + q0·q2)],
     [2(q1·q2 + q0·q3), 1 − 2(q1² + q3²), 2(q2·q3 − q0·q1)],
     [2(q1·q3 − q0·q2), 2(q2·q3 + q0·q1), 1 − 2(q1² + q2²)]]
and the quaternion components must satisfy the constraint:
q0² + q1² + q2² + q3² = 1
Step 2.3: project the right camera coordinates to right-image pixel locations.
The camera coordinates of the right camera are projected to pixel locations in the right image, formulated as:
u_R = k_Rx · X_R / Z_R + u_R0
v_R = k_Ry · Y_R / Z_R + v_R0
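Steps 2.2 and 2.3 can be sketched together as follows, using the standard unit-quaternion parameterization of the rotation matrix; the function names and the example parameter values are illustrative only:

```python
def quat_to_rot(q0, q1, q2, q3):
    """Standard rotation matrix from a unit quaternion (q0^2+q1^2+q2^2+q3^2 = 1)."""
    return [
        [1 - 2*(q2*q2 + q3*q3), 2*(q1*q2 - q0*q3),     2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3),     1 - 2*(q1*q1 + q3*q3), 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2),     2*(q2*q3 + q0*q1),     1 - 2*(q1*q1 + q2*q2)],
    ]

def left_to_right_pixel(p_left, quat, t, u0, v0, kx, ky):
    """Transform a 3-D point in the left camera frame to the right camera frame
    (Step 2.2) and project it to right-image pixel coordinates (Step 2.3)."""
    R = quat_to_rot(*quat)
    x, y, z = [sum(R[i][j] * p_left[j] for j in range(3)) + t[i] for i in range(3)]
    u_r = kx * x / z + u0
    v_r = ky * y / z + v0
    return u_r, v_r

# Identity rotation q = (1, 0, 0, 0) and a purely horizontal baseline t = (50, 0, 0):
u_r, v_r = left_to_right_pixel((1.0, 0.2, 10.0), (1, 0, 0, 0),
                               (50.0, 0.0, 0.0), 304.0, 80.0, 950.0, 950.0)
```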
Step 3: convert the computed positions to pixel coordinates using bilinear interpolation.
The right-image pixel locations (u_R, v_R) projected in Step 2 are continuous values. We denote the projected right-image pixel location by p'_R = (u_R, v_R). To obtain a better pixel filling effect, we use bilinear interpolation, interpolating from the pixel values of the four neighbouring pixels (top-left, top-right, bottom-left and bottom-right). This can be expressed by the formula:
I_w(p'_R) = Σ_(i,j) w_ij · I_R(p_ij),  (i, j) ∈ {top, bottom} × {left, right}
where p_ij denote the pixel coordinates of the four neighbours of p'_R (top-left, top-right, bottom-left and bottom-right); each weight w_ij is computed from the linear spatial distance between p'_R and p_ij, and the weights satisfy the equality constraint Σ_(i,j) w_ij = 1.
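A minimal sketch of the bilinear sampling in Step 3, assuming the image is a plain nested list indexed as img[row][column]; the function name is ours:

```python
import math

def bilinear_sample(img, u, v):
    """Sample image `img` at continuous position (u, v): interpolate the four
    neighbouring pixels (Step 3) with bilinear weights, which sum to 1."""
    u0, v0 = int(math.floor(u)), int(math.floor(v))
    du, dv = u - u0, v - v0
    # (i, j) offsets: top-left, top-right, bottom-left, bottom-right
    w = {(0, 0): (1 - du) * (1 - dv), (1, 0): du * (1 - dv),
         (0, 1): (1 - du) * dv,       (1, 1): du * dv}
    return sum(wt * img[v0 + j][u0 + i] for (i, j), wt in w.items())

# 2x2 image; sampling at the centre averages all four pixels:
img = [[0.0, 1.0],
       [2.0, 3.0]]
val = bilinear_sample(img, 0.5, 0.5)
```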
Step 4: compute the prediction loss.
The reconstruction loss used by the present invention follows the absolute-error loss, expressed as:
L_recon = Σ_(p_L) | I_w(p_L) − I_L(p_L) |
Since the gradient of this loss function comes mainly from the intensity differences of the four surrounding neighbour pixels, two problems can arise: if the projected position falls in a weakly textured region there is almost no gradient, and if the projected position is far from the true position the pixel difference, and hence the gradient, becomes excessively large. To make the resulting depth map smoother, we use a simple L2 regularization to constrain the gradient ∇I_D of the depth image:
L_smooth = Σ ‖∇I_D‖²
We express the final loss function as:
L = (1/n) Σ ( L_recon + λ · L_smooth )
where n is the number of images, and λ is a hyperparameter that, as the regularization coefficient, adjusts the strength of the regularization.
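The combined loss of Step 4 can be sketched as follows; this is our illustration, using plain nested lists for the images and forward differences to approximate the depth-map gradient:

```python
def prediction_loss(I_left, I_warp, depth, lam=0.05):
    """Absolute-error reconstruction loss plus an L2 penalty on the
    depth-map gradient, weighted by the hyperparameter lambda (Step 4)."""
    H, W = len(I_left), len(I_left[0])
    # reconstruction term: per-pixel absolute difference
    recon = sum(abs(I_warp[y][x] - I_left[y][x]) for y in range(H) for x in range(W))
    # smoothness term: squared forward differences of the depth map
    smooth = sum((depth[y][x + 1] - depth[y][x]) ** 2 for y in range(H) for x in range(W - 1))
    smooth += sum((depth[y + 1][x] - depth[y][x]) ** 2 for y in range(H - 1) for x in range(W))
    return recon + lam * smooth

loss = prediction_loss([[0.0, 0.0]], [[0.1, 0.3]], [[1.0, 3.0]], lam=0.05)
```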
Embodiment 1:
The present invention uses an NVIDIA GPU as the computing platform and the Caffe deep learning framework as the CNN framework. The specific implementation steps are as follows:
Step 1: dataset preparation.
We train our neural network on the public KITTI dataset, which was acquired over several outdoor scenes with a pair of grayscale cameras, a pair of color cameras and a lidar mounted on a moving vehicle. We use the city, residential and road scene data acquired by the same vehicle on the same day, November 26, taking the images captured by the left and right color cameras as our training data. The raw RGB images from the left and right cameras are downsampled to a resolution of 160x608 as the input to our neural network.
The training set consists of 13855 left-right image pairs; we use 500 samples with ground-truth depth information as the test set to evaluate our results.
Step 2: build the convolutional neural network.
We use the network structure shown in Fig. 2, in which the layers are, in order: left convolutional layer 1, left convolutional layer 2, right convolutional layer 1, right convolutional layer 2, channel concatenation layer, convolutional layer 3, convolutional layer 4, convolutional layer 5, fully convolutional layer, deconvolution layer 1, deconvolution layer 2, deconvolution layer 3, deconvolution layer 4, deconvolution layer 5.
We extract features from the left image with left convolutional layers 1 and 2; likewise, we extract features from the right image with right convolutional layers 1 and 2. The features extracted from the left and right images are then concatenated along the channel dimension. To improve the fitting capability of the network, the concatenated result is passed in turn through convolutional layer 3, convolutional layer 4, convolutional layer 5 and the fully convolutional layer. Finally, for upsampling, the result generated by convolutional layer 5 is passed in turn through deconvolution layers 1 to 5.
The kernel sizes of left convolutional layer 1 and left convolutional layer 2 are 11x11 and 5x5, and their numbers of output feature maps are 96 and 256, respectively; the corresponding right convolutional layers 1 and 2 are identical. The channel concatenation layer is the concat layer provided by the Caffe framework. Convolutional layers 3, 4 and 5 all use 3x3 kernels and output 384, 384 and 256 feature maps, respectively. The fully convolutional layer uses 1024 kernels of size 1x1. The five deconvolution layers all use 4x4 kernels and each outputs a single feature map.
Step 3: initialize the intrinsic and extrinsic parameters of the left and right cameras.
When training our neural network to recover depth information, the intrinsic and extrinsic parameters of the left and right cameras must be initialized reasonably in order to converge to good depth. The initialization procedure is as follows:
Camera intrinsic initialization: we initialize u_L0, v_L0 to half the input image size after downsampling, i.e. 304 and 80 respectively. For convenience of solving, the corresponding right-image parameters u_R0, v_R0 are likewise initialized to 304 and 80. We initialize k_Lx, k_Ly to 950 each, and similarly initialize k_Rx, k_Ry to the same values.
Camera extrinsic initialization: because the motion between the two cameras of our dataset is mainly a horizontal translation, when initializing the extrinsic matrix we set the rotation matrix to the identity and, in the translation vector, initialize only the horizontal component; the components in the other directions are initialized to 0. Since the rotation matrix must satisfy complicated constraints, we represent the identity rotation with a quaternion; as the quaternion must satisfy the equality constraint q0² + q1² + q2² + q3² = 1, we initialize it to q0 = 1, q1 = 0, q2 = 0, q3 = 0. We initialize the translation parameters t_x, t_y, t_z to 50, 0 and 0, respectively.
We train our neural network with the camera intrinsic and extrinsic parameters initialized as above.
Step 4: training of the neural network and setting of the network parameters.
When training the convolutional neural network, we read in 7 image pairs as a batch each time. We optimize the network with SGD using a momentum of 0.9 and a weight decay of 0.0005. We subtract the corresponding per-channel RGB means (104, 117, 123) and then divide by 255 so that the left and right image pixel values are distributed in the interval [−0.5, 0.5]. In the loss function we set the hyperparameter λ to 0.05.
To save training time, we initialize part of the weights from the model trained for 40000 iterations by Ravi Garg et al. and start our training from there.
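The per-pixel input normalization described above (subtract the per-channel mean, then divide by 255) can be sketched as follows; the helper name is ours:

```python
def preprocess(rgb_pixel, means=(104.0, 117.0, 123.0)):
    """Normalize one RGB pixel: subtract the per-channel mean and divide
    by 255 so values fall roughly in [-0.5, 0.5]."""
    return tuple((c - m) / 255.0 for c, m in zip(rgb_pixel, means))

p = preprocess((255.0, 0.0, 123.0))
```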
The specific implementation steps are now complete; the resulting effect is shown in Fig. 3a, Fig. 3b and Fig. 3c. The test results of the present invention are given below. The experimental environment is a TITAN GPU with CUDA version 7.5; testing uses the KITTI dataset, and the results are compared with several supervised monocular prediction methods. We evaluate our results with the following metrics:
Comparison of results:
where c and f denote the coarse network and the fine network of Eigen's method, respectively.
Our unsupervised depth prediction method based on convolutional neural networks and binocular vision does not differ greatly in accuracy from the supervised methods, and has considerable room for further development as well as research significance.
Claims (5)
1. An unsupervised depth prediction method based on convolutional neural networks and binocular disparity, characterized by comprising the following steps:
Step 1: fit a nonlinear function with a convolutional neural network that converts the two RGB images acquired by the left and right cameras into the corresponding depth image;
Step 2: using the depth information, transform the pixel coordinates of the left image to obtain the corresponding pixel locations in the right image;
Step 3: after obtaining the right-image pixel locations, compute the right-image pixel coordinates and corresponding pixel values by bilinear interpolation;
Step 4: compute the prediction loss from the sampled pixel values and the corresponding pixel values of the left image.
2. The unsupervised depth prediction method based on convolutional neural networks and binocular disparity according to claim 1, characterized in that Step 1 is specifically:
the convolutional neural network comprises, in order: left convolutional layer 1, left convolutional layer 2, right convolutional layer 1, right convolutional layer 2, channel concatenation layer, convolutional layer 3, convolutional layer 4, convolutional layer 5, fully convolutional layer, deconvolution layer 1, deconvolution layer 2, deconvolution layer 3, deconvolution layer 4, deconvolution layer 5; features are extracted from the left image with left convolutional layers 1 and 2, and likewise from the right image with right convolutional layers 1 and 2; the features extracted from the left and right images are then concatenated along the channel dimension; the concatenated result is passed in turn through convolutional layer 3, convolutional layer 4, convolutional layer 5 and the fully convolutional layer; finally, for upsampling, the result generated by convolutional layer 5 is passed in turn through deconvolution layers 1 to 5;
the convolutional neural network described above fits a nonlinear function:
D(I_L, I_R) = I_D
that is, this convolutional neural network converts the two RGB images I_L and I_R into the corresponding depth image I_D.
3. The unsupervised depth prediction method based on convolutional neural networks and binocular disparity according to claim 1, characterized in that, in Step 2, the pixel coordinates of the left image are first converted to camera coordinates of the left camera, the camera coordinates of the left camera are then transformed to camera coordinates of the right camera, and the camera coordinates of the right camera are finally projected to pixel locations in the right image; the whole process can be expressed as:
p_R ~ K_R ( R · I_D(p_L) · K_L⁻¹ · p_L + t )
It specifically includes:
Step 2.1: invert the left-image pixel coordinates into the left camera coordinate system.
The pixel coordinates of the left image are converted to camera coordinates of the left camera, formulated as:
X_L = I_D(p_L) · (u_L − u_L0) / k_Lx
Y_L = I_D(p_L) · (v_L − v_L0) / k_Ly
Z_L = I_D(p_L)
where u_L, v_L are the pixel coordinates in the image, and u_L0, v_L0, k_Lx, k_Ly are the intrinsic parameters of the camera (the principal point and the focal lengths f_x, f_y expressed in pixels);
Step 2.2: transform the left camera coordinate system to the right camera coordinate system.
The coordinate transformation is performed with a rotation-translation matrix, formulated as:
[X_R, Y_R, Z_R]ᵀ = R · [X_L, Y_L, Z_L]ᵀ + t
where the rotation matrix R can be expressed with the four quaternion components:
R = [[1 − 2(q2² + q3²), 2(q1·q2 − q0·q3), 2(q1·q3 + q0·q2)],
     [2(q1·q2 + q0·q3), 1 − 2(q1² + q3²), 2(q2·q3 − q0·q1)],
     [2(q1·q3 − q0·q2), 2(q2·q3 + q0·q1), 1 − 2(q1² + q2²)]]
and the quaternion components must satisfy the constraint q0² + q1² + q2² + q3² = 1;
Step 2.3: project the right camera coordinates to right-image pixel locations.
The camera coordinates of the right camera are projected to pixel locations in the right image, formulated as:
u_R = k_Rx · X_R / Z_R + u_R0
v_R = k_Ry · Y_R / Z_R + v_R0
4. The unsupervised depth prediction method based on convolutional neural networks and binocular disparity according to claim 1, characterized in that Step 3 is specifically:
the right-image pixel locations (u_R, v_R) projected in Step 2 are continuous values; the projected right-image pixel location is denoted p'_R = (u_R, v_R); using bilinear interpolation, the pixel values of the four neighbouring pixels (top-left, top-right, bottom-left and bottom-right) are interpolated, which can be expressed by the formula:
I_w(p'_R) = Σ_(i,j) w_ij · I_R(p_ij),  (i, j) ∈ {top, bottom} × {left, right}
where p_ij denote the pixel coordinates of the four neighbours of p'_R (top-left, top-right, bottom-left and bottom-right); each weight w_ij is computed from the linear spatial distance between p'_R and p_ij, and the weights satisfy the equality constraint Σ_(i,j) w_ij = 1.
5. The unsupervised depth prediction method based on convolutional neural networks and binocular disparity according to claim 1, characterized in that Step 4 is specifically:
the reconstruction loss follows the absolute-error loss, expressed as:
L_recon = Σ_(p_L) | I_w(p_L) − I_L(p_L) |
the gradient of this loss function comes from the intensity differences of the four surrounding neighbour pixels; to make the resulting depth map smoother, a simple L2 regularization is used to constrain the gradient ∇I_D of the depth image:
L_smooth = Σ ‖∇I_D‖²
the final loss function is expressed as:
L = (1/n) Σ ( L_recon + λ · L_smooth )
where n is the number of images, and λ is a hyperparameter that, as the regularization coefficient, adjusts the strength of the regularization.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810144465.2A CN108389226A (en) | 2018-02-12 | 2018-02-12 | An unsupervised depth prediction method based on convolutional neural networks and binocular disparity
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810144465.2A CN108389226A (en) | 2018-02-12 | 2018-02-12 | An unsupervised depth prediction method based on convolutional neural networks and binocular disparity
Publications (1)
Publication Number | Publication Date |
---|---|
CN108389226A true CN108389226A (en) | 2018-08-10 |
Family
ID=63068766
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810144465.2A Pending CN108389226A (en) | 2018-02-12 | 2018-02-12 | A kind of unsupervised depth prediction approach based on convolutional neural networks and binocular parallax |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108389226A (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109299656A (en) * | 2018-08-13 | 2019-02-01 | 浙江零跑科技有限公司 | A kind of deeply determining method of vehicle-mounted vision system scene visual |
CN109377530A (en) * | 2018-11-30 | 2019-02-22 | 天津大学 | A kind of binocular depth estimation method based on deep neural network |
CN109584340A (en) * | 2018-12-11 | 2019-04-05 | 苏州中科广视文化科技有限公司 | New Century Planned Textbook synthetic method based on depth convolutional neural networks |
CN109615674A (en) * | 2018-11-28 | 2019-04-12 | 浙江大学 | The double tracer PET method for reconstructing of dynamic based on losses by mixture function 3D CNN |
CN109801323A (en) * | 2018-12-14 | 2019-05-24 | 中国科学院深圳先进技术研究院 | Pyramid binocular depth with self-promotion ability estimates model |
CN110009691A (en) * | 2019-03-28 | 2019-07-12 | 北京清微智能科技有限公司 | Based on the matched anaglyph generation method of binocular stereo vision and system |
CN110175603A (en) * | 2019-04-01 | 2019-08-27 | 佛山缔乐视觉科技有限公司 | A kind of engraving character recognition methods, system and storage medium |
CN110414393A (en) * | 2019-07-15 | 2019-11-05 | 福州瑞芯微电子股份有限公司 | A kind of natural interactive method and terminal based on deep learning |
CN110702015A (en) * | 2019-09-26 | 2020-01-17 | 中国南方电网有限责任公司超高压输电公司曲靖局 | Method and device for measuring icing thickness of power transmission line |
WO2020046066A1 (en) * | 2018-08-30 | 2020-03-05 | Samsung Electronics Co., Ltd. | Method for training convolutional neural network to reconstruct an image and system for depth map generation from an image |
CN111462208A (en) * | 2020-04-05 | 2020-07-28 | 北京工业大学 | Non-supervision depth prediction method based on binocular parallax and epipolar line constraint |
CN111862321A (en) * | 2019-04-30 | 2020-10-30 | 北京四维图新科技股份有限公司 | Method, device and system for acquiring disparity map and storage medium |
CN112639878A (en) * | 2018-09-05 | 2021-04-09 | 谷歌有限责任公司 | Unsupervised depth prediction neural network |
EP3731528A4 (en) * | 2017-12-21 | 2021-08-11 | Sony Interactive Entertainment Inc. | Image processing device, content processing device, content processing system, and image processing method |
US11841921B2 (en) | 2020-06-26 | 2023-12-12 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Model training method and apparatus, and prediction method and apparatus |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101577004A (en) * | 2009-06-25 | 2009-11-11 | 青岛海信数字多媒体技术国家重点实验室有限公司 | Rectification method for polar lines, appliance and system thereof |
CN105869167A (en) * | 2016-03-30 | 2016-08-17 | 天津大学 | High-resolution depth map acquisition method based on active and passive fusion |
CN105975915A (en) * | 2016-04-28 | 2016-09-28 | 大连理工大学 | Front vehicle parameter identification method based on multitask convolution nerve network |
CN106204731A (en) * | 2016-07-18 | 2016-12-07 | 华南理工大学 | A kind of multi-view angle three-dimensional method for reconstructing based on Binocular Stereo Vision System |
CN106612427A (en) * | 2016-12-29 | 2017-05-03 | 浙江工商大学 | Method for generating spatial-temporal consistency depth map sequence based on convolution neural network |
CN106934765A (en) * | 2017-03-14 | 2017-07-07 | 长沙全度影像科技有限公司 | Panoramic picture fusion method based on depth convolutional neural networks Yu depth information |
KR20180012638A (en) * | 2016-07-27 | 2018-02-06 | 한국전자통신연구원 | Method and apparatus for detecting object in vision recognition with aggregate channel features |
- 2018-02-12: CN application CN201810144465.2A filed (CN108389226A); status: active, pending
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11503267B2 (en) | 2017-12-21 | 2022-11-15 | Sony Interactive Entertainment Inc. | Image processing device, content processing device, content processing system, and image processing method |
EP3731528A4 (en) * | 2017-12-21 | 2021-08-11 | Sony Interactive Entertainment Inc. | Image processing device, content processing device, content processing system, and image processing method |
CN109299656A (en) * | 2018-08-13 | 2019-02-01 | 浙江零跑科技有限公司 | Scene depth determination method for a vehicle-mounted vision system |
US10832432B2 (en) | 2018-08-30 | 2020-11-10 | Samsung Electronics Co., Ltd | Method for training convolutional neural network to reconstruct an image and system for depth map generation from an image |
WO2020046066A1 (en) * | 2018-08-30 | 2020-03-05 | Samsung Electronics Co., Ltd. | Method for training convolutional neural network to reconstruct an image and system for depth map generation from an image |
US11410323B2 (en) | 2018-08-30 | 2022-08-09 | Samsung Electronics., Ltd | Method for training convolutional neural network to reconstruct an image and system for depth map generation from an image |
CN112639878A (en) * | 2018-09-05 | 2021-04-09 | 谷歌有限责任公司 | Unsupervised depth prediction neural network |
CN109615674A (en) * | 2018-11-28 | 2019-04-12 | 浙江大学 | Dynamic dual-tracer PET reconstruction method based on mixed loss function 3D CNN |
CN109615674B (en) * | 2018-11-28 | 2020-09-18 | 浙江大学 | Dynamic dual-tracer PET reconstruction method based on mixed loss function 3D CNN |
CN109377530B (en) * | 2018-11-30 | 2021-07-27 | 天津大学 | Binocular depth estimation method based on a deep neural network |
CN109377530A (en) * | 2018-11-30 | 2019-02-22 | 天津大学 | Binocular depth estimation method based on a deep neural network |
CN109584340B (en) * | 2018-12-11 | 2023-04-18 | 苏州中科广视文化科技有限公司 | Novel view synthesis method based on a deep convolutional neural network |
CN109584340A (en) * | 2018-12-11 | 2019-04-05 | 苏州中科广视文化科技有限公司 | Novel view synthesis method based on a deep convolutional neural network |
CN109801323A (en) * | 2018-12-14 | 2019-05-24 | 中国科学院深圳先进技术研究院 | Pyramid binocular depth estimation model with self-improvement capability |
CN110009691A (en) * | 2019-03-28 | 2019-07-12 | 北京清微智能科技有限公司 | Disparity map generation method and system based on binocular stereo matching |
CN110175603A (en) * | 2019-04-01 | 2019-08-27 | 佛山缔乐视觉科技有限公司 | Engraved character recognition method, system and storage medium |
CN111862321A (en) * | 2019-04-30 | 2020-10-30 | 北京四维图新科技股份有限公司 | Method, device and system for acquiring disparity map and storage medium |
CN111862321B (en) * | 2019-04-30 | 2024-05-03 | 北京四维图新科技股份有限公司 | Disparity map acquisition method, device, system and storage medium |
CN110414393A (en) * | 2019-07-15 | 2019-11-05 | 福州瑞芯微电子股份有限公司 | Natural interaction method and terminal based on deep learning |
CN110702015A (en) * | 2019-09-26 | 2020-01-17 | 中国南方电网有限责任公司超高压输电公司曲靖局 | Method and device for measuring icing thickness of power transmission line |
CN111462208A (en) * | 2020-04-05 | 2020-07-28 | 北京工业大学 | Unsupervised depth prediction method based on binocular disparity and epipolar constraint |
US11841921B2 (en) | 2020-06-26 | 2023-12-12 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Model training method and apparatus, and prediction method and apparatus |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108389226A (en) | Unsupervised depth prediction method based on convolutional neural networks and binocular disparity | |
Amirkolaee et al. | Height estimation from single aerial images using a deep convolutional encoder-decoder network | |
CN111462329B (en) | Three-dimensional reconstruction method of unmanned aerial vehicle aerial image based on deep learning | |
Mitrokhin et al. | EV-IMO: Motion segmentation dataset and learning pipeline for event cameras | |
CN106803267B (en) | Kinect-based indoor scene three-dimensional reconstruction method | |
CN104778671B (en) | Image super-resolution method based on SAE and sparse representation | |
CN105069746B (en) | Real-time video face replacement method and system based on local affine invariance and color transfer technology | |
CN108416840A (en) | Dense three-dimensional scene reconstruction method based on a monocular camera | |
Kropatsch et al. | Digital image analysis: selected techniques and applications | |
CN106981080A (en) | Night-time unmanned vehicle scene depth estimation method based on infrared images and radar data | |
CN110310227A (en) | Image super-resolution reconstruction method based on high- and low-frequency information decomposition | |
CN106780592A (en) | Kinect depth reconstruction algorithm based on camera motion and image shading | |
CN104835130A (en) | Multi-exposure image fusion method | |
CN102142153A (en) | Image-based three-dimensional model reconstruction method | |
CN104869387A (en) | Method for acquiring maximum disparity of binocular images based on optical flow | |
CN110910437B (en) | Depth prediction method for complex indoor scene | |
CN116258658B (en) | Swin Transformer-based image fusion method | |
CN113313828B (en) | Three-dimensional reconstruction method and system based on single-picture intrinsic image decomposition | |
CN111462208A (en) | Unsupervised depth prediction method based on binocular disparity and epipolar constraint | |
CN110097634A (en) | Adaptive multi-scale three-dimensional ghost imaging method | |
CN114677479A (en) | Natural landscape multi-view three-dimensional reconstruction method based on deep learning | |
CN113516693A (en) | Rapid and universal image registration method | |
CN112686830B (en) | Super-resolution method of single depth map based on image decomposition | |
CN112116646B (en) | Depth estimation method for light field images based on a deep convolutional neural network | |
CN117315169A (en) | Live-action three-dimensional model reconstruction method and system based on deep learning multi-view dense matching |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 2018-08-10 |