CN110189390A

CN110189390A - Monocular vision SLAM method and system

Info

Publication number: CN110189390A
Application number: CN201910279226.2A
Authority: CN
Inventors: 杨吉多才; 程月华; 徐贵力; 董文德; 谢瑒
Original assignee: Nanjing University of Aeronautics and Astronautics
Current assignee: Nanjing University of Aeronautics and Astronautics
Priority date: 2019-04-09
Filing date: 2019-04-09
Publication date: 2019-08-30
Anticipated expiration: 2039-04-09
Also published as: CN110189390B

Abstract

The invention discloses a monocular vision SLAM method and system, relating to the technical field of simultaneous localization and mapping (SLAM) in computer vision. The method includes the following steps: obtaining the current frame from the currently input image, performing key frame screening on the obtained current frame, and sending the current frame into a frame queue; successively obtaining each key frame in the frame queue, and performing initialization and local map optimization on each obtained key frame; successively obtaining each key frame in the frame queue, performing point-line feature extraction on each obtained key frame, and sending the point-line features as seed points into a depth filter, where the depth filter traverses the seed points, performs epipolar search and depth filtering on them over each frame in the frame queue, estimates the depth of the point-line features, and establishes new map information. While maintaining high real-time performance, the invention builds a point-line map that is better suited to navigation and has better robustness.

Description

Monocular vision SLAM method and system

Technical field

The present invention relates to the technical field of simultaneous localization and mapping in computer vision, and in particular to a monocular vision SLAM method.

Background technique

Autonomous navigation of robots has received great attention in recent years and has become an inevitable trend of development. Simultaneous localization and mapping (SLAM) technology can effectively solve two problems for a robot in an unknown environment: localizing itself and, at the same time, perceiving the surrounding environment.

Early researchers modeled the visual SLAM problem as motion and observation equations and performed state estimation for localization and mapping by filtering; a representative system is MonoSLAM. More recently, researchers have used nonlinear optimization to optimize the state variables of the SLAM problem. PTAM (Parallel Tracking and Mapping), proposed by Klein et al. in 2007, introduced the key frame strategy and for the first time split localization/tracking and mapping optimization into two threads, forming the concept of a front end and a back end in the visual SLAM system framework. The front end mainly performs the association of sensor input data, and its localization can be implemented by the feature-point method or the direct method: among SLAM systems based on the feature-point method, ORB-SLAM is a classic example; among SLAM systems based on the direct method, LSD-SLAM is a classic example. SVO (2014) is a semi-direct visual odometry applied to unmanned aerial vehicle platforms, which first extracts feature points and then localizes using the direct method. The back end mainly performs nonlinear optimization of the system state, represented chiefly by graph optimization methods.

The feature-point method adapts to large inter-frame motion and has good robustness, but feature extraction and matching take a large amount of time, making it difficult to achieve high real-time performance; it can also only build a relatively sparse point cloud map and cannot provide scene structure information. The direct method uses more of the information in the image and can achieve dense reconstruction on a GPU, but its robustness is insufficient under conditions such as illumination change and motion blur. The semi-direct method has good real-time performance but also shares the disadvantages of the direct method.

Summary of the invention

The purpose of the present invention is to provide a monocular vision SLAM method, so as to solve one or more of the above-mentioned defects of the prior art.

To achieve the above purpose, the present invention adopts the following technical solutions:

A monocular vision SLAM method includes the following steps: obtaining the current frame from the currently input image, performing key frame screening on the obtained current frame, and sending the current frame into a frame queue; successively obtaining each key frame in the frame queue, and performing initialization and local map optimization on each obtained key frame; successively obtaining each key frame in the frame queue, performing point-line feature extraction on each obtained key frame, and sending the point-line features as seed points into a depth filter, where the depth filter traverses the seed points, performs epipolar search and depth filtering on them over each frame in the frame queue, estimates the depth of the point-line features, and establishes new map information.

A monocular vision SLAM system, comprising:

Motion estimation thread: for obtaining the current frame from the currently input image, performing key frame screening on the obtained current frame, and sending the screened key frames into a frame queue;

Back-end optimization thread: for successively obtaining each key frame in the frame queue, and performing initialization and local map optimization on each obtained key frame;

Mapping thread: for successively obtaining each frame in the frame queue, performing point-line feature extraction on the key frames, and sending the point-line features as seed points into a depth filter, where the depth filter traverses the seed points, performs epipolar search and depth filtering on them over each frame, estimates the depth of the point-line features, and establishes new map information.
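The three-thread layout can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the use of one queue per consumer thread, and the stand-in `is_keyframe` rule are all hypothetical and chosen only to keep the example self-contained.

```python
import queue
import threading

STOP = object()  # sentinel that tells a consumer thread to exit


def is_keyframe(frame_id):
    # Hypothetical stand-in for key frame screening: every 3rd frame passes.
    return frame_id % 3 == 0


def motion_estimation(frames, outputs):
    # Screens incoming frames and forwards key frames to both consumers.
    for frame_id in frames:
        if is_keyframe(frame_id):
            for q in outputs:
                q.put(frame_id)
    for q in outputs:
        q.put(STOP)


def consumer(name, inbox, log):
    # Shared shape of the back-end and mapping threads: drain the queue in order.
    while True:
        item = inbox.get()
        if item is STOP:
            break
        log.append((name, item))


def run_pipeline(frames):
    # Assumption: each consumer gets its own queue so both see every key frame.
    backend_q, mapping_q = queue.Queue(), queue.Queue()
    backend_log, mapping_log = [], []
    threads = [
        threading.Thread(target=motion_estimation,
                         args=(frames, [backend_q, mapping_q])),
        threading.Thread(target=consumer, args=("backend", backend_q, backend_log)),
        threading.Thread(target=consumer, args=("mapping", mapping_q, mapping_log)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return backend_log, mapping_log
```

Giving each consumer its own queue is one simple way to let both the back end and the mapper see every key frame in order; a shared container with per-thread read positions would serve equally well.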

The advantages of the present invention are as follows: by using a key frame strategy that combines vision with spatial transformation, scene information is preserved more completely and system robustness is improved; by building a point-line map with the semi-direct method, high real-time performance is ensured, navigation in low-texture scenes is supported, and more structural information about the scene is provided; and by using a back-end optimization strategy based on graph optimization, higher localization accuracy is achieved.

Brief description of the drawings

Fig. 1 is a framework diagram of the SLAM system of a specific embodiment of the invention;

Fig. 2 is a flowchart of the key frame strategy of a specific embodiment of the invention;

Fig. 3 shows the mapping result of a specific embodiment of the invention.

Specific embodiment

To make the technical means, creative features, objectives, and effects achieved by the present invention easy to understand, the present invention is further explained below with reference to specific embodiments.

As shown in Fig. 1 to Fig. 3, the present invention provides a monocular vision SLAM method which, as shown in Fig. 1, includes the following specific process:

Step 1) The system runs three threads in parallel: the motion estimation thread, the back-end optimization thread, and the mapping thread.

Step 2) The current input image is obtained from the monocular camera.

Step 3) The motion estimation thread obtains the current frame. The system has three states: initial, normal, and lost. SLAM initialization is completed first; after successful initialization the system enters the normal state and performs motion estimation on subsequent input frames, completing the motion estimation of the current frame in three steps: sparse image alignment, feature refinement, and pose and structure optimization. If the system is in the lost state, relocalization is performed.

The initialization and relocalization methods in step 3) are the same as those of the semi-direct monocular visual odometry SVO proposed in the paper "SVO: Fast Semi-Direct Monocular Visual Odometry".

In step 3), different features participate in motion estimation in different ways; the features are divided into three kinds: corner features, gradient point features, and gradient line features. For corner features, the motion estimation method is the same as in SVO. In feature refinement, a least-squares problem based on gray-level invariance is constructed to optimize the pixel position of the current feature point, expressed as:

Where p_i and p′_i denote the feature point on the reference key frame and the feature point on the current frame, respectively, and δI(p_i, p′_i) denotes the gray-level residual between the two points. For the feature refinement of gradient point features, the optimization direction is restricted to the gradient direction, so the optimization is one-dimensional. The optimization amount along the gradient direction is expressed as:

Where [Δu Δv] denotes the Jacobian matrix involved in the optimization and [d_x d_y] denotes the unit gradient vector. To express the reprojection error of a gradient point feature, the original error must likewise be projected onto the gradient vector. Gradient line features participate in motion estimation as follows: gradient point features are extracted in the regions at the two ends of the line segment and associated with each other to represent the gradient line feature, and they then participate in motion estimation together as a pair of gradient point features. In cases such as occlusion or a spurious line segment, each can instead participate in motion estimation independently as a gradient point feature.

Step 4) The motion estimation thread uses a strategy that combines vision with spatial transformation to screen the current frame as a key frame and sends it into the frame queue for use by the other two threads. As shown in Fig. 2, the key frame strategy is as follows:

Step 4-1) If the difference in the number of feature points between the current frame and the previous frame is greater than the quantity threshold of 20, tracking is considered about to be lost and a new key frame is inserted immediately; otherwise, step 4-2) is executed.

Step 4-2) Determine whether the average parallax of the matched feature points between the current frame and the previous key frame is greater than the parallax threshold of 40 pixels. If so, go to step 4-3) for further screening; if not, terminate directly.

Step 4-3) During the feature refinement of motion estimation, the current frame image is divided into an image grid of nrows rows and ncols columns, and a local map is constructed from the key frames. Count the number of grid cells, ncells, that contain a projected point of the local map, and let r = ncells / (nrows × ncols). If r is less than the ratio threshold of 0.7, a new key frame is inserted; otherwise, go to the next step.

Step 4-4) Considering the spatial transformation, calculate the mean depth d_min of the map points of the current scene, traverse the key frames of the local map, and obtain the displacement of each from the current frame. Determine whether these displacements all exceed the displacement threshold; if so, insert a new key frame. The displacement threshold is set to 12% of d_min.
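Steps 4-1) to 4-4) form a short decision cascade, which might be sketched as below. The function and parameter names are hypothetical; the sketch also assumes that the feature-count difference is taken as an absolute value and that the displacement test must hold for every key frame of the local map.

```python
def should_insert_keyframe(n_cur, n_prev, mean_parallax_px,
                           ncells, nrows, ncols,
                           displacements, mean_depth,
                           count_thresh=20, parallax_thresh=40.0,
                           ratio_thresh=0.7, disp_ratio=0.12):
    # Step 4-1: a large change in feature count suggests tracking is about to fail.
    if abs(n_cur - n_prev) > count_thresh:
        return True
    # Step 4-2: too little parallax means the view has barely changed; stop here.
    if mean_parallax_px <= parallax_thresh:
        return False
    # Step 4-3: fraction of grid cells that contain a projected local-map point.
    r = ncells / (nrows * ncols)
    if r < ratio_thresh:
        return True
    # Step 4-4: every displacement must exceed 12% of the mean scene depth.
    # Note: with an empty local map, all() evaluates to True.
    return all(d > disp_ratio * mean_depth for d in displacements)
```

The ordering puts the cheapest tests first: the feature-count and parallax checks need no map projection, while the grid coverage and displacement checks are only evaluated for frames that already show enough parallax.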

Step 5) The back-end optimization thread successively obtains each key frame in the frame queue and uses a graph optimization method based on the g2o framework to complete the optimization of initialization and of the local map. The optimization variables are the map points and lines and the poses saved by the key frames. The graph optimization object of initialization is the two key frames that participate in the visual odometry initialization, and the graph optimization object of the local map is the local map constructed from key frames. The local map refers to the N key frames that have a co-visibility relationship with the current key frame and are nearest to it in displacement, where N is 10.
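The selection of the local map described in step 5) can be sketched as follows. The dictionary layout of a key frame, the `covisible_ids` argument, and the function name are assumptions made for this example; the graph optimization itself (performed with g2o in the patent) is not shown.

```python
import math


def select_local_map(current, keyframes, covisible_ids, n=10):
    # current / keyframes: dicts with an "id" and a 3-D "position" (assumed layout).
    # covisible_ids: ids of key frames sharing map observations with the current one.
    candidates = [kf for kf in keyframes
                  if kf["id"] in covisible_ids and kf["id"] != current["id"]]
    # Keep the n co-visible key frames that are nearest in displacement.
    candidates.sort(key=lambda kf: math.dist(kf["position"], current["position"]))
    return candidates[:n]
```

Limiting the optimization window to N = 10 nearby co-visible key frames keeps the size of the graph, and therefore the cost of each back-end iteration, bounded.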

Step 6) The mapping thread successively obtains each frame in the frame queue and performs point-line feature extraction on the key frames. The features include corner features, gradient point features, and gradient line features. Corner features are extracted by the FAST algorithm, and gradient line features are obtained by the LSD line extraction algorithm. The gradient point feature extraction algorithm is as follows:

Step 6-1) Obtain the grayscale image of the image region to be extracted.

Step 6-2) Denoise the grayscale image by Gaussian filtering.

Step 6-3) Compute the image gradient with an operator such as Roberts, Prewitt, Sobel, or Laplacian. If the gradient is less than the gradient threshold of 20, set it to 0; otherwise retain it, yielding the edge map.

Step 6-4) Traverse the edge map and take the pixel with the maximum gradient value as the current feature point.
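Steps 6-1) to 6-4) might be sketched in plain Python as below, with the image region given as a list of rows of gray values. The 3×3 Gaussian kernel, the choice of the Sobel operator among those listed, the handling of border pixels, and all function names are assumptions made for the example.

```python
def gaussian_blur3(img):
    # 3x3 Gaussian kernel (1 2 1; 2 4 2; 1 2 1) / 16; border pixels are left unchanged.
    h, w = len(img), len(img[0])
    out = [list(row) for row in img]
    k = ((1, 2, 1), (2, 4, 2), (1, 2, 1))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = sum(k[j][i] * img[y + j - 1][x + i - 1]
                      for j in range(3) for i in range(3))
            out[y][x] = acc / 16.0
    return out


def sobel_magnitude(img):
    # Gradient magnitude from the 3x3 Sobel operator; border pixels are set to 0.
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def extract_gradient_point(gray, grad_thresh=20.0):
    # Steps 6-2 to 6-4: denoise, compute the gradient, threshold, take the maximum.
    mag = sobel_magnitude(gaussian_blur3(gray))
    best, best_xy = 0.0, None
    for y, row in enumerate(mag):
        for x, g in enumerate(row):
            g = g if g >= grad_thresh else 0.0  # suppress weak gradients
            if g > best:
                best, best_xy = g, (x, y)
    return best_xy  # (x, y) of the strongest edge pixel, or None if none passes
```

Calling `extract_gradient_point` on one grid cell at a time yields at most one gradient point per cell, which matches the one-feature-per-cell extraction strategy described in step 7).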

Step 7) The feature extraction strategy is as follows:

7-1) Divide the image into a grid with cells of 25 pixels, and extract one feature in each cell; cells already occupied by existing features on the key frame are not used for further extraction.

7-2) Traverse the remaining image grid cells and extract corners using the FAST-12 algorithm, with an algorithm threshold of 7 to 20.

7-3) Traverse the remaining image grid cells, extract gradient line features using the LSD algorithm, and represent each by the gradient points extracted at the two ends of the line; the grid cells in which these gradient points lie are marked as occupied.

7-4) Traverse the remaining image grid cells and extract gradient point features.
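The grid bookkeeping behind steps 7-1) to 7-4) can be sketched as follows. The detectors themselves (FAST-12, LSD, gradient point extraction) are assumed to have produced candidate lists already; the function names and the rule that a line is accepted only when both endpoint cells are free are assumptions for illustration.

```python
def cell_of(x, y, cell_size=25):
    # Grid cell (column, row) containing pixel (x, y).
    return int(x) // cell_size, int(y) // cell_size


def select_features(existing, corners, line_endpoints, gradient_points, cell_size=25):
    # existing, corners, gradient_points: lists of (x, y) pixels.
    # line_endpoints: list of ((x1, y1), (x2, y2)) endpoint pairs.
    occupied = {cell_of(x, y, cell_size) for x, y in existing}  # step 7-1
    selected = {"corner": [], "line": [], "gradient_point": []}

    for x, y in corners:  # step 7-2: corners take free cells first
        c = cell_of(x, y, cell_size)
        if c not in occupied:
            occupied.add(c)
            selected["corner"].append((x, y))

    for p, q in line_endpoints:  # step 7-3: a line occupies both endpoint cells
        cp, cq = cell_of(*p, cell_size), cell_of(*q, cell_size)
        if cp not in occupied and cq not in occupied:
            occupied.update((cp, cq))
            selected["line"].append((p, q))

    for x, y in gradient_points:  # step 7-4: gradient points fill what is left
        c = cell_of(x, y, cell_size)
        if c not in occupied:
            occupied.add(c)
            selected["gradient_point"].append((x, y))
    return selected
```

Processing the feature types in this fixed order gives corners priority, then lines, then gradient points, so the weaker feature types are used mainly in regions where no corner is available, such as low-texture areas.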

The features are then sent as seed points into the depth filter. The depth filter traverses the seed points and performs epipolar search and depth filtering on them over each frame to estimate the feature depths and establish new map information.
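The filtering idea can be illustrated with a deliberately simplified sketch. The text does not spell out the filter model, so the example below assumes a plain Gaussian over inverse depth that is fused with each new measurement; the epipolar search that produces the measurements is omitted, and the class name, prior, and convergence threshold are all hypothetical.

```python
class Seed:
    # Depth hypothesis for one feature, kept as a Gaussian over inverse depth.

    def __init__(self, mean_depth, min_depth):
        self.mu = 1.0 / mean_depth                    # inverse-depth mean
        self.sigma2 = (1.0 / min_depth) ** 2 / 36.0   # broad prior variance (assumed)

    def update(self, z, tau2):
        # Fuse a new inverse-depth measurement z with variance tau2
        # (product of two Gaussians).
        s2 = self.sigma2
        self.mu = (tau2 * self.mu + s2 * z) / (s2 + tau2)
        self.sigma2 = s2 * tau2 / (s2 + tau2)

    def converged(self, thresh=1e-4):
        # A seed is promoted to a map point once its uncertainty is small enough.
        return self.sigma2 < thresh

    @property
    def depth(self):
        return 1.0 / self.mu


def filter_seed(seed, measurements):
    # measurements: iterable of (inverse_depth, variance), one per observed frame.
    for z, tau2 in measurements:
        seed.update(z, tau2)
        if seed.converged():
            break
    return seed
```

Each frame in which the seed is found along its epipolar line contributes one measurement, so the variance shrinks monotonically until the seed can be inserted into the map.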

Fig. 3 shows the mapping result of an embodiment of the present invention, in which points mark the map point structure, lines mark the map line structure, and the curve marks the key frame trajectory.

As is known from common technical knowledge, the present invention can be realized by other embodiments without departing from its spirit or essential features. Therefore, the embodiments disclosed above are, in all respects, merely illustrative and not the only ones. All changes within the scope of the present invention or within a scope equivalent to that of the present invention are encompassed by the present invention.

Claims

1. A monocular vision SLAM method, characterized by comprising the following steps:

obtaining the current frame from the currently input image, performing key frame screening on the obtained current frame, and sending the current frame into a frame queue;

successively obtaining each key frame in the frame queue, and performing initialization and local map optimization on each obtained key frame;

successively obtaining each key frame in the frame queue, performing point-line feature extraction on each obtained key frame, and sending the point-line features as seed points into a depth filter, wherein the depth filter traverses the seed points, performs epipolar search and depth filtering on them over each frame in the frame queue, estimates the depth of the point-line features, and establishes new map information.

2. The monocular vision SLAM method according to claim 1, characterized in that, before key frame screening is performed on the current frame obtained from the image, the method further comprises the following steps:

performing SLAM initialization on the current frame in the initial state;

performing motion estimation on subsequent input frames in the normal state, wherein the motion estimation of the current frame is performed through sparse image alignment, feature refinement, and pose and structure optimization;

performing relocalization on subsequent input frames in the lost state.

3. The monocular vision SLAM method according to claim 2, characterized in that both the SLAM initialization of the current frame in the initial state and the relocalization on subsequent input frames in the lost state use semi-direct monocular visual odometry.

4. The monocular vision SLAM method according to claim 2, characterized in that, during the motion estimation of the current frame, different motion estimation methods are selected according to the type of feature, wherein the features include corner features, gradient point features, and gradient line features.

5. The monocular vision SLAM method according to claim 1, characterized in that performing key frame screening on the obtained current frame comprises the following steps:

Step 5.1, determining whether the difference in the number of feature points between the current frame and the previous frame is greater than the quantity threshold of 20; if so, considering that tracking is about to be lost and inserting a new key frame immediately; if not, executing step 5.2;

Step 5.2, determining whether the average parallax of the matched feature points between the current frame and the previous key frame is greater than the parallax threshold of 40 pixels; if so, going to step 5.3 for further screening; if not, terminating directly;

Step 5.3, during the feature refinement of motion estimation, dividing the current frame image into an image grid of nrows rows and ncols columns, and constructing a local map from the key frames; counting the number of grid cells, ncells, that contain a projected point of the local map, and letting r = ncells / (nrows × ncols); if r is less than the ratio threshold of 0.7, inserting a new key frame; otherwise going to the next step;

Step 5.4, considering the spatial transformation, calculating the mean depth d_min of the local map, traversing the key frames of the local map and obtaining the displacement of each from the current frame, and determining whether these displacements all exceed the displacement threshold; if so, inserting a new key frame; wherein the displacement threshold is set to 10% to 18% of d_min.

6. The monocular vision SLAM method according to claim 5, characterized in that the optimization variables in the initialization and local map optimization performed on each obtained key frame are the map points and lines and the poses saved by the key frames; wherein the graph optimization object of initialization is the two key frames that participate in the visual odometry initialization, and the graph optimization object of the local map is the local map constructed from the key frames.

7. The monocular vision SLAM method according to claim 1, characterized in that the features in the point-line feature extraction performed on the key frames include corner features, gradient point features, and gradient line features; the corner features are extracted by the FAST algorithm, and the gradient line features are extracted by the LSD algorithm; the gradient point feature extraction algorithm comprises the following steps:

Step 7.1, obtaining the grayscale image of the image region to be extracted;

Step 7.2, denoising the grayscale image by Gaussian filtering;

Step 7.3, computing the image gradient with the Roberts, Prewitt, Sobel, or Laplacian operator; if the gradient is less than the gradient threshold of 20, setting it to 0; otherwise retaining it, yielding the edge map;

Step 7.4, traversing the edge map and taking the pixel with the maximum gradient value as the current feature point.

8. The monocular vision SLAM method according to claim 7, characterized in that the feature extraction in the point-line feature extraction performed on the key frames comprises the following steps:

dividing the image into a grid with cells of 25 to 50 pixels, and extracting one feature in each cell; wherein cells already occupied by existing features on the key frame are not used for further extraction;

traversing the remaining grid cells and extracting corners using the FAST-12 algorithm, with an algorithm threshold of 7 to 20;

traversing the remaining grid cells, extracting gradient line features using the LSD algorithm, and representing each by the gradient points extracted at the two ends of the line; wherein the grid cells in which the gradient points lie are marked as occupied;

traversing the remaining grid cells and extracting gradient point features.

9. A monocular vision SLAM system, characterized by comprising:

Motion estimation thread: for obtaining the current frame from the currently input image, performing key frame screening on the obtained current frame, and sending the screened key frames into a frame queue;

Back-end optimization thread: for successively obtaining each key frame in the frame queue, and performing initialization and local map optimization on each obtained key frame;

Mapping thread: for successively obtaining each frame in the frame queue, performing point-line feature extraction on the key frames, and sending the point-line features as seed points into a depth filter, wherein the depth filter traverses the seed points, performs epipolar search and depth filtering on them over each frame, estimates the depth of the point-line features, and establishes new map information.

10. The monocular vision SLAM system according to claim 9, characterized in that the motion estimation thread further comprises:

Initialization module: for performing SLAM initialization on the current frame in the initial state;

Localization module: for performing motion estimation on subsequent input frames in the normal state, wherein the motion estimation of the current frame is performed through sparse image alignment, feature refinement, and pose and structure optimization;

Relocalization module: for performing relocalization on subsequent input frames in the lost state.