CN106658023B - End-to-end visual odometer and method based on deep learning - Google Patents
End-to-end visual odometer and method based on deep learning
- Publication number
- CN106658023B CN106658023B CN201611191845.9A CN201611191845A CN106658023B CN 106658023 B CN106658023 B CN 106658023B CN 201611191845 A CN201611191845 A CN 201611191845A CN 106658023 B CN106658023 B CN 106658023B
- Authority
- CN
- China
- Prior art keywords
- network
- optical flow
- inter-frame
- training
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C22/00—Measuring distance traversed on the ground by vehicles, persons, animals or other moving solid bodies, e.g. using odometers, using pedometers
Abstract
The invention discloses an end-to-end visual odometer and method based on deep learning, comprising a cascaded optical flow network and an inter-frame estimation network. The optical flow network takes adjacent frames from an image sequence in a data set as input, uses the optical flow endpoint error between the output optical flow vectors and the ground-truth data as its loss function, and after training outputs the generated optical flow. The inter-frame estimation network takes the optical flow images as input, constructs its loss function from the distance between the output six-degree-of-freedom pose vector and the ground-truth data, and is trained iteratively to perform inter-frame estimation. In the invention, the optical flow network module and the inter-frame estimation network module are trained separately on different input-output data, and the two are finally cascaded into an end-to-end visual odometer module, which is then further trained in depth to optimize its parameters. This hierarchical training method greatly reduces training time and improves training efficiency.
Description
Technical field
The present invention relates to an end-to-end visual odometer and method based on deep learning.
Background art
Visual odometry is a method by which a robot estimates its displacement using visual sensors; it is a basic technology for higher-level tasks such as robot localization, mapping, obstacle avoidance, and path planning.
Traditional visual odometry mainly relies on the spatial geometric relationship between inter-frame visual features to estimate the robot's inter-frame pose, and is therefore also called inter-frame estimation. Features fall into two classes, sparse and dense, corresponding respectively to local and global image information. Traditional features must be chosen or computed manually, so the resulting image representation carries a degree of subjectivity and limitation; moreover, the accuracy of feature matching is severely limited under illumination change, motion blur, weak texture, and similar conditions, which degrades estimation accuracy.
Summary of the invention
To solve the above problems, the present invention proposes an end-to-end visual odometer and method based on deep learning. The invention uses an end-to-end deep neural network for inter-frame estimation, producing output directly from the original images. Compared with conventional methods, this technique requires no manual feature extraction or optical flow images, no construction of feature descriptors, no inter-frame feature matching, and no complex geometric operations.
To achieve the goals above, the present invention adopts the following technical scheme:
An end-to-end visual odometer based on deep learning comprises a cascaded optical flow network and an inter-frame estimation network. The optical flow network takes adjacent frames from an image sequence in a data set, uses the optical flow endpoint error between the output optical flow vectors and the ground-truth data as its loss function, and after training outputs the generated optical flow images. The inter-frame estimation network takes the optical flow images as input, constructs its loss function from the distance between the output six-degree-of-freedom pose vector and the ground-truth data, and is trained iteratively to perform inter-frame estimation.
The optical flow network and the inter-frame estimation network are trained with a hierarchical training method.
The optical flow network is a convolutional neural network.
The optical flow network takes consecutive adjacent images as input and uses the optical flow endpoint error between the output optical flow vectors and the ground-truth data as its loss function, training the network to generate optical flow images from the input consecutive frames.
The inter-frame estimation network takes optical flow images as input. Its training is divided into global training on the whole optical flow image and local training on multiple optical flow sub-images; the output features of the two are finally combined and fed into a fully connected layer, completing the optical-flow-based inter-frame estimation network.
The inter-frame estimation network is trained on the KITTI data set.
The inter-frame estimation network is trained on synthetic data.
An end-to-end visual odometry estimation method based on deep learning: according to adjacent frames of an image sequence in a data set, the optical flow endpoint error between the output optical flow vectors and the ground-truth data is chosen as the loss function; after network training, optical flow images are generated. According to the optical flow images, a loss function is constructed from the distance between the six-degree-of-freedom output pose vector and the ground-truth data, and the network is trained iteratively to perform inter-frame estimation.
The optical flow network module and the inter-frame estimation network module are trained separately on different input-output data; the two are finally cascaded and further trained in depth to optimize the parameters.
The beneficial effects of the invention are as follows:
(1) Compared with conventional methods, the invention requires no manual selection or computation of features, eliminating the error-prone feature matching process and the need for complex geometric operations, and is therefore intuitive and simple;
(2) The hierarchical deep neural network training method proposed by the invention allows the optical flow network and the inter-frame estimation network to be trained in parallel, improving training speed;
(3) The application of the optical flow network in the invention speeds up optical flow computation, improving the real-time performance of the algorithm;
(4) The optical flow network module and the inter-frame estimation network module are trained separately on different input-output data; the two are finally cascaded into an end-to-end visual odometer module and further trained in depth to optimize the parameters. This hierarchical training method greatly reduces training time and improves training efficiency.
Detailed description of the invention
Fig. 1 is a schematic diagram of the system structure of the invention;
Fig. 2 is a schematic diagram of the optical flow network of the invention, based on convolutional neural networks;
Fig. 3 is a schematic diagram of the inter-frame estimation network of the invention.
Specific embodiment:
The invention will be further described below with reference to the accompanying drawings and embodiments.
An end-to-end deep neural network technique for inter-frame estimation produces output directly from the original images and forms a modular visual odometer. Compared with conventional methods, this technique requires no manual feature extraction or optical flow images, no construction of feature descriptors, no inter-frame feature matching, and no complex geometric operations.
As shown in Fig. 1, the odometer of the invention comprises two submodules: the optical flow network module and the inter-frame estimation network module. The two modules use a hierarchical training method: each module is trained separately on its own input-output data, and the two are finally cascaded into an end-to-end visual odometer module, which is further trained in depth to optimize its parameters. The ability of this hierarchical method to greatly reduce training time and improve training efficiency is one of the advantages of deep neural networks. The specific steps are as follows:
Building the optical flow network: the optical flow network can be composed of a convolutional neural network (CNN) and trained on real or synthetic data. With consecutive adjacent images as input, the optical flow endpoint error (EPE) between the output optical flow vectors and the ground-truth data is chosen as the loss function, realizing network training from input consecutive frames to generated optical flow.
As shown in Fig. 2, the i-th frame image and the (i+1)-th frame image are each fed into a CNN, which outputs a feature representation for each; the feature representations of the two frames are combined and fed into a deeper CNN; upconvolutional layers then restore the resolution lost by the pooling operations of the CNN, and a dense per-pixel global optical flow image is output.
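The resolution pipeline of this encoder-decoder can be illustrated with simple bookkeeping. This is a sketch under the assumption that each contracting stage halves the resolution and each upconvolution doubles it; the stage count is illustrative, not taken from the patent:

```python
def flow_net_resolutions(h, w, n_stages=4):
    """Track spatial resolution through a FlowNet-style network.

    Each contracting stage (stride-2 convolution / pooling) halves the
    resolution; each upconvolution stage doubles it back, so the output
    optical flow image is dense: one (u, v) vector per input pixel.
    Assumes h and w are divisible by 2**n_stages.
    """
    res = [(h, w)]
    for _ in range(n_stages):        # contracting part: resolution shrinks
        h, w = h // 2, w // 2
        res.append((h, w))
    for _ in range(n_stages):        # expanding part: upconvolutions restore it
        h, w = h * 2, w * 2
        res.append((h, w))
    return res
```

The output resolution equals the input resolution, which is what makes the flow image "dense, pixel-by-pixel".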
Building the inter-frame estimation network: this network takes optical flow images as input, constructs its loss function from the distance between the output six-degree-of-freedom pose vector and the ground-truth data, and is trained iteratively. Fig. 3 illustrates the process of training the network on local optical flow sub-images and global optical flow images separately and combining them to complete the optical-flow-based inter-frame estimation. The KITTI data set or synthetic data may be selected to train the network; the input optical flow is computed by a traditional optical flow algorithm.
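The pose loss can be read as a plain vector distance. The sketch below assumes L2 distance over a length-6 vector (3 translation + 3 rotation components), since the patent specifies only "the distance between" the output pose vector and the ground truth:

```python
import numpy as np

def pose_loss(pose_pred, pose_gt):
    """Distance between predicted and ground-truth 6-DoF pose vectors.

    Each pose is a length-6 vector: (tx, ty, tz, rx, ry, rz).
    Plain Euclidean (L2) distance is assumed here; the patent leaves
    the exact distance measure unspecified.
    """
    pose_pred = np.asarray(pose_pred, dtype=float)
    pose_gt = np.asarray(pose_gt, dtype=float)
    return float(np.linalg.norm(pose_pred - pose_gt))
```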
In building the inter-frame estimation module, the global optical flow image is first divided into multiple local optical flow sub-images; the global image and the local sub-images are then each fed into a CNN, yielding local and global optical flow feature representations. The local and global feature representations are combined and fed into a fully connected layer, producing the inter-frame estimate expressed as a six-degree-of-freedom pose vector.
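The first step, dividing the global flow image into local sub-images, can be sketched as a regular grid split. The 2x2 grid is an assumption for illustration; the patent does not fix the number of sub-images:

```python
import numpy as np

def split_flow_image(flow, rows=2, cols=2):
    """Divide a global optical flow image into rows*cols local sub-images.

    flow: array of shape (H, W, 2); H must be divisible by rows and
    W by cols. The 2x2 grid default is illustrative only.
    """
    h, w, _ = flow.shape
    sh, sw = h // rows, w // cols
    return [flow[r * sh:(r + 1) * sh, c * sw:(c + 1) * sw]
            for r in range(rows) for c in range(cols)]
```

Each sub-image keeps the (u, v) channel layout of the global image, so the same CNN input format serves both branches.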
The training process is divided into three phases: first, the local optical flow sub-images serve as input and the inter-frame estimate as output to train the network; second, the global optical flow image serves as input and the inter-frame estimate as output to train the network; finally, the local sub-images and the global image together serve as input, with the inter-frame estimate as output, to train the network further.
Realizing the end-to-end visual odometer: the trained optical flow network and the optical-flow-based inter-frame estimation network are cascaded. Adjacent frames of an image sequence in a data set serve as the input of the whole network, the loss function is constructed from the distance between the six-degree-of-freedom output vector and the ground-truth data, and iterative training optimizes the parameters, realizing a fast, accurate, and robust end-to-end visual odometer.
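The cascade itself amounts to function composition. The sketch below assumes `flow_net` and `pose_net` are already-trained callables (hypothetical names, not from the patent):

```python
def make_visual_odometer(flow_net, pose_net):
    """Cascade a trained optical flow network and an inter-frame
    estimation network into one end-to-end module: a pair of adjacent
    frames in, a six-degree-of-freedom pose vector out."""
    def odometer(frame_i, frame_j):
        flow = flow_net(frame_i, frame_j)  # dense optical flow image
        return pose_net(flow)              # 6-DoF inter-frame pose
    return odometer
```

After cascading, the whole composite can be fine-tuned end to end against the pose loss, which is the "further in-depth training" the patent describes.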
Although the specific embodiments of the invention have been described above with reference to the accompanying drawings, they do not limit the protection scope of the invention. Those skilled in the art should understand that, based on the technical solutions of the invention, various modifications or changes that can be made without creative labor still fall within the protection scope of the invention.
Claims (7)
1. An end-to-end visual odometer based on deep learning, characterized by comprising a cascaded optical flow network and an inter-frame estimation network; the optical flow network takes adjacent frames from an image sequence in a data set as input, uses the optical flow endpoint error between the output optical flow vectors and the ground-truth data as its loss function, and after training outputs the generated optical flow images; the inter-frame estimation network takes the optical flow images as input, constructs its loss function from the distance between the output six-degree-of-freedom pose vector and the ground-truth data, and is trained iteratively to perform inter-frame estimation;
after the optical flow network and the inter-frame estimation network are cascaded, further in-depth training optimizes the parameters.
2. The end-to-end visual odometer based on deep learning according to claim 1, characterized in that the optical flow network and the inter-frame estimation network are trained with a hierarchical training method.
3. The end-to-end visual odometer based on deep learning according to claim 1, characterized in that the optical flow network is a convolutional neural network.
4. The end-to-end visual odometer based on deep learning according to claim 1, characterized in that the inter-frame estimation network takes the optical flow image as input; training is divided into global training on the whole optical flow image and local training on multiple optical flow sub-images; the output features of the two are finally combined and fed into a fully connected layer, completing the optical-flow-based inter-frame estimation network.
5. The end-to-end visual odometer based on deep learning according to claim 1, characterized in that the inter-frame estimation network is trained on the KITTI data set.
6. The end-to-end visual odometer based on deep learning according to claim 1, characterized in that the inter-frame estimation network is trained on synthetic data.
7. An end-to-end visual odometry estimation method based on deep learning, characterized in that: according to adjacent frames of an image sequence in a data set, with the consecutive adjacent images as input, the optical flow endpoint error between the output optical flow vectors and the ground-truth data is used as the loss function; after network training, optical flow images are generated; according to the optical flow images, a loss function is constructed from the distance between the six-degree-of-freedom output pose vector and the ground-truth data, and the network is trained iteratively to perform inter-frame estimation;
the optical flow network module and the inter-frame estimation network module are trained separately on different input-output data; the two are finally cascaded and further trained in depth to optimize the parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611191845.9A CN106658023B (en) | 2016-12-21 | 2016-12-21 | End-to-end visual odometer and method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611191845.9A CN106658023B (en) | 2016-12-21 | 2016-12-21 | End-to-end visual odometer and method based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106658023A CN106658023A (en) | 2017-05-10 |
CN106658023B true CN106658023B (en) | 2019-12-03 |
Family
ID=58833548
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611191845.9A Active CN106658023B (en) | 2016-12-21 | 2016-12-21 | End-to-end visual odometer and method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106658023B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11856181B2 (en) | 2017-09-28 | 2023-12-26 | Lg Electronics Inc. | Method and device for transmitting or receiving 6DoF video using stitching and re-projection related metadata |
Families Citing this family (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107289967B (en) * | 2017-08-17 | 2023-06-09 | 珠海一微半导体股份有限公司 | Separable optical odometer and mobile robot |
CN107527358B (en) * | 2017-08-23 | 2020-05-12 | 北京图森智途科技有限公司 | Dense optical flow estimation method and device |
CN109785376B (en) * | 2017-11-15 | 2023-02-28 | 富士通株式会社 | Training method of depth estimation device, depth estimation device and storage medium |
CN107909602A (en) * | 2017-12-08 | 2018-04-13 | 长沙全度影像科技有限公司 | A kind of moving boundaries method of estimation based on deep learning |
CN108122249A (en) * | 2017-12-20 | 2018-06-05 | 长沙全度影像科技有限公司 | A kind of light stream method of estimation based on GAN network depth learning models |
CN109978924A (en) * | 2017-12-27 | 2019-07-05 | 长沙学院 | A kind of visual odometry method and system based on monocular |
CN108303094A (en) * | 2018-01-31 | 2018-07-20 | 深圳市拓灵者科技有限公司 | The Position Fixing Navigation System and its positioning navigation method of array are merged based on multiple vision sensor |
CN108648216B (en) * | 2018-04-19 | 2020-10-09 | 长沙学院 | Visual odometer implementation method and system based on optical flow and deep learning |
CN108881952B (en) * | 2018-07-02 | 2021-09-14 | 上海商汤智能科技有限公司 | Video generation method and device, electronic equipment and storage medium |
CN109272493A (en) * | 2018-08-28 | 2019-01-25 | 中国人民解放军火箭军工程大学 | A kind of monocular vision odometer method based on recursive convolution neural network |
CN109656134A (en) * | 2018-12-07 | 2019-04-19 | 电子科技大学 | A kind of end-to-end decision-making technique of intelligent vehicle based on space-time joint recurrent neural network |
CN109708658B (en) * | 2019-01-14 | 2020-11-24 | 浙江大学 | Visual odometer method based on convolutional neural network |
CN111627051B (en) | 2019-02-27 | 2023-12-15 | 中强光电股份有限公司 | Electronic device and method for estimating optical flow |
CN110335337B (en) * | 2019-04-28 | 2021-11-05 | 厦门大学 | Method for generating visual odometer of antagonistic network based on end-to-end semi-supervision |
CN110111366B (en) * | 2019-05-06 | 2021-04-30 | 北京理工大学 | End-to-end optical flow estimation method based on multistage loss |
CN110310299B (en) * | 2019-07-03 | 2021-11-19 | 北京字节跳动网络技术有限公司 | Method and apparatus for training optical flow network, and method and apparatus for processing image |
CN110378936B (en) * | 2019-07-30 | 2021-11-05 | 北京字节跳动网络技术有限公司 | Optical flow calculation method and device and electronic equipment |
CN110599542A (en) * | 2019-08-30 | 2019-12-20 | 北京影谱科技股份有限公司 | Method and device for local mapping of adaptive VSLAM (virtual local area model) facing to geometric area |
CN112648997A (en) * | 2019-10-10 | 2021-04-13 | 成都鼎桥通信技术有限公司 | Method and system for positioning based on multitask network model |
CN111192312B (en) * | 2019-12-04 | 2023-12-26 | 中广核工程有限公司 | Depth image acquisition method, device, equipment and medium based on deep learning |
CN111127557B (en) * | 2019-12-13 | 2022-12-13 | 中国电子科技集团公司第二十研究所 | Visual SLAM front-end attitude estimation method based on deep learning |
CN111260680B (en) * | 2020-01-13 | 2023-01-03 | 杭州电子科技大学 | RGBD camera-based unsupervised pose estimation network construction method |
CN111539988B (en) * | 2020-04-15 | 2024-04-09 | 京东方科技集团股份有限公司 | Visual odometer implementation method and device and electronic equipment |
CN111833400B (en) * | 2020-06-10 | 2023-07-28 | 广东工业大学 | Camera pose positioning method |
CN112344922B (en) * | 2020-10-26 | 2022-10-21 | 中国科学院自动化研究所 | Monocular vision odometer positioning method and system |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9761008B2 (en) * | 2014-05-08 | 2017-09-12 | The Trustees Of The University Of Pennsylvania | Methods, systems, and computer readable media for visual odometry using rigid structures identified by antipodal transform |
US9427874B1 (en) * | 2014-08-25 | 2016-08-30 | Google Inc. | Methods and systems for providing landmarks to facilitate robot localization and visual odometry |
US20160349379A1 (en) * | 2015-05-28 | 2016-12-01 | Alberto Daniel Lacaze | Inertial navigation unit enhaced with atomic clock |
2016
- 2016-12-21 CN CN201611191845.9A patent/CN106658023B/en active Active
Non-Patent Citations (3)
Title |
---|
"Exploring Representation Learning With CNNs for Frame-to-Frame Ego-Motion Estimation"; Gabriele Costante et al.; IEEE Robotics and Automation Letters; 2016-01-31; Section 3, pages 20-22, Figs. 3-4 * |
"FlowNet: Learning Optical Flow with Convolutional Networks"; Alexey Dosovitskiy et al.; IEEE International Conference on Computer Vision; 2015-12-31; abstract, Figs. 1-3, paragraphs 2579-2764 * |
"High Accuracy Optical Flow Estimation Based on a Theory for Warping"; Thomas Brox et al.; European Conference on Computer Vision; 2004-05-31; entire document * |
Also Published As
Publication number | Publication date |
---|---|
CN106658023A (en) | 2017-05-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106658023B (en) | End-to-end visual odometer and method based on deep learning | |
CN101625768B (en) | Three-dimensional human face reconstruction method based on stereoscopic vision | |
CN106600583B (en) | Parallax picture capturing method based on end-to-end neural network | |
CN104661010B (en) | Method and device for establishing three-dimensional model | |
CN103003846B (en) | Articulation region display device, joint area detecting device, joint area degree of membership calculation element, pass nodular region affiliation degree calculation element and joint area display packing | |
CN104408760B (en) | A kind of high-precision virtual assembly system algorithm based on binocular vision | |
CN111340868B (en) | Unmanned underwater vehicle autonomous decision control method based on visual depth estimation | |
CN105144196A (en) | Method and device for calculating a camera or object pose | |
CN108986166A (en) | A kind of monocular vision mileage prediction technique and odometer based on semi-supervised learning | |
CN106780592A (en) | Kinect depth reconstruction algorithms based on camera motion and image light and shade | |
CN106296812A (en) | Synchronize location and build drawing method | |
CN105225269A (en) | Based on the object modelling system of motion | |
CN106780543A (en) | A kind of double framework estimating depths and movement technique based on convolutional neural networks | |
CN103413352A (en) | Scene three-dimensional reconstruction method based on RGBD multi-sensor fusion | |
CN101976455A (en) | Color image three-dimensional reconstruction method based on three-dimensional matching | |
CN109272493A (en) | A kind of monocular vision odometer method based on recursive convolution neural network | |
CN106780631A (en) | A kind of robot closed loop detection method based on deep learning | |
CN104123747A (en) | Method and system for multimode touch three-dimensional modeling | |
Aliakbarian et al. | Flag: Flow-based 3d avatar generation from sparse observations | |
CN109708654A (en) | A kind of paths planning method and path planning system | |
CN110264526A (en) | A kind of scene depth and camera position posture method for solving based on deep learning | |
CN104966320A (en) | Method for automatically generating camouflage pattern based on three-order Bezier curve | |
Liu et al. | Atvio: Attention guided visual-inertial odometry | |
Liao et al. | Maptrv2: An end-to-end framework for online vectorized hd map construction | |
Carvalho et al. | Long-term prediction of motion trajectories using path homology clusters |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||