CN112381890B - RGB-D visual SLAM method based on point-line features - Google Patents

RGB-D visual SLAM method based on point-line features

Info

Publication number: CN112381890B (application CN202011358438.9A)
Authority: CN (China)
Prior art keywords: frame, camera, point, pose, key
Legal status: Active (granted)
Original language: Chinese (zh)
Other versions: CN112381890A (application publication)
Inventors: 方志军, 卢俊鑫, 高永彬, 杨淑群, 方荣辉
Original and current assignee: Shanghai University of Engineering Science
Application filed by Shanghai University of Engineering Science; priority to CN202011358438.9A; published as CN112381890A, then granted and published as CN112381890B.

Classifications

    • G06T7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration (G06T7/00 Image analysis; G06T Image data processing or generation, in general; G06 Computing; calculating or counting; G Physics)
    • G01C21/005: Navigation; navigational instruments not provided for in groups G01C1/00-G01C19/00, with correlation of navigation data from several sources, e.g. map or contour matching (G01C Measuring distances, levels or bearings; surveying; navigation; G01 Measuring; testing; G Physics)
    • G06T7/521: Depth or shape recovery from laser ranging, e.g. using interferometry; from the projection of structured light (G06T7/50 Depth or shape recovery)
    • G06T2207/20068: Projection on vertical or horizontal image axis (G06T2207/20 Special algorithmic details; G06T2207/00 Indexing scheme for image analysis or image enhancement)
    • G06T2207/30244: Camera pose (G06T2207/30 Subject of image; context of image processing)

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Optics & Photonics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an RGB-D visual SLAM method based on point-line features, which comprises the following steps: S1, calibrating the intrinsic parameters of the RGB-D camera; S2, extracting point and line features; S3, acquiring the initial transformation pose of the camera; S4, projecting key-frame lines onto the current frame; S5, forming the error function of the visual odometry module; S6, iteratively optimizing the initial transformation pose of the current-frame camera and solving the current-frame camera pose; S7, key-frame decision; S8, local BA optimization; S9, loop detection; S10, relocalization; S11, map construction; and S12, result output. The SLAM method provided by the invention uses an RGB-D camera as the image acquisition device and uses the points and edge lines of objects in the scene as localization features, and achieves better accuracy in challenging low-semantic, low-texture scenes.

Description

RGB-D visual SLAM method based on point-line features
Technical Field
The invention relates to an RGB-D visual SLAM method based on point-line features and belongs to the technical field of visual SLAM (simultaneous localization and mapping).
Background
SLAM (simultaneous localization and mapping) can be summarized as follows: a mobile device (such as a robot, an unmanned aerial vehicle or a mobile phone) starts from an unknown position in an unknown environment, observes and localizes its own position, attitude and trajectory with sensors (such as lidar or cameras) during motion, and incrementally builds a map based on its own position, thereby achieving simultaneous localization and mapping. SLAM technology has now been developed for more than 30 years, spans many related technical fields, and has become a popular research direction in robotics and computer vision.
Generally speaking, a SLAM system typically includes multiple sensors and multiple functional modules. Current robot SLAM systems, distinguished by their core functional module, generally take two forms: lidar-based SLAM (laser SLAM) and vision-based SLAM (visual SLAM, or VSLAM). Visual SLAM is currently a research hotspot in many research institutes; it uses a visual sensor as the carrier to obtain geometric, texture, color and other features of the environment for modeling, solves the motion state of the camera in the real world, and finally realizes autonomous navigation and mapping of a robot in an unknown environment.
Therefore, the extraction of visual features becomes the key to visual SLAM, and good visual features should have rotation invariance and scale invariance. In environments with rich semantic information, little illumination change and no dynamic objects, points, lines, surfaces and the like can be extracted to achieve high-precision self-localization. However, in low-semantic, strongly varying-illumination and dynamic environments such as white walls, floors and scenes with moving people, the precision of current visual SLAM algorithms drops sharply.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides an RGB-D visual SLAM method based on point-line features, so as to address the low precision of existing visual SLAM technology in low-semantic, strongly varying-illumination and dynamic environments.
In order to achieve the above object, the invention adopts the following technical scheme:
An RGB-D visual SLAM method based on point-line features comprises the following steps:
S1, calibrating the intrinsic parameters of the RGB-D camera: using an RGB-D camera as the image acquisition device and calibrating its intrinsic parameters, which include the principal point, focal length and distortion coefficients of the camera;
S2, point and line feature extraction: performing edge detection and line fitting on the image frame acquired by the RGB-D camera at the current moment, screening the fitted lines and sampling line points, with the collected points forming a point set;
S3, acquiring the initial transformation pose of the camera: initializing the transformation pose for different image frames acquired by the RGB-D camera to obtain the camera's initial transformation pose;
S4, projecting key-frame lines onto the current frame: projecting the lines obtained in step S2 on the key frame onto the current frame according to the initial camera pose obtained in step S3;
S5, constructing the adaptation error based on the local gradient of line segments to form the error function of the visual odometry module: with the projected lines on the current frame from step S4, obtaining a fitness (adaptation degree) value for each line on the current frame from the relation between each projected line and the image data in its neighborhood, and accumulating the fitness values of all projected lines to form the error function of the visual odometry (VO) module;
S6, iterative optimization of the current-frame camera's initial transformation pose and solution of the current-frame camera pose: according to the error function finally formed in step S5, iteratively optimizing the camera's initial transformation pose with a nonlinear optimization method; the optimized transformation pose ξ_ck* between the current frame and the key frame is left-multiplied by the key-frame pose ξ_k to solve the current-frame camera pose ξ_c, thereby obtaining the specific position of the camera at the current-frame moment and realizing camera localization;
S7, key-frame decision: judging whether the current frame is a key frame based on the line fitness value of the current frame obtained in step S5; if the current frame is judged to be a key frame, it is stored as a new key frame, inserted into the back-end sliding window, and local BA optimization is performed; if the current frame is judged to be a non-key frame, no key-frame-related operation is performed and the method returns to step S2;
S8, local BA optimization: when the current frame is judged to be a key frame in step S7, it is inserted into the back-end local-BA sliding window, and a sliding-window-based co-visibility local BA optimization is constructed from the point and line features extracted in step S2, the camera transformation pose solved in step S6, all key frames in the sliding window, and the point and line features belonging to them;
S9, loop detection: performing loop detection and global graph optimization on the new key frame obtained in step S8 against all existing key frames, specifically: performing loop detection between each newly inserted key frame and all previous key frames based on a dictionary model; if a loop frame is detected, performing graph optimization on all key frames between the new key frame and the loop frame using the error function constructed by the visual odometry module in step S5;
S10, relocalization: when the camera transformation pose solved with the error function of the visual odometry module in step S6 deviates from the set threshold by too much, the pose solution of the current-frame camera is judged to have failed, and relocalization is performed; in the relocalization process, the loop detection of step S9 is first performed to search for the loop frame corresponding to the current frame, then the method jumps to step S3 to estimate the initial transformation pose, and the subsequent operations are executed in order;
S11, map construction: constructing a sparse map from the pose of each key frame and the point and line features obtained in the preceding steps;
S12, result output: storing the poses, drawing the camera's motion path, and storing the map with synchronous real-time display.
Preferably, in step S1, the intrinsic parameter matrix K of the RGB-D camera is calibrated using Zhang Zhengyou's camera calibration method.
Further preferably, in step S1, the calibration of the intrinsic parameters K of the RGB-D camera includes the following steps:
S11, using the RGB-D camera as the image acquisition device, shooting a fixed-size calibration board (checkerboard) from different angles to obtain the image data required for calibration;
S12, computing the camera intrinsic parameters K from the acquired image data using Zhang Zhengyou's camera calibration method, and storing the intrinsic coefficients (a minimal calibration sketch is given below).
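As an illustration of S11-S12, the following minimal Python sketch turns the checkerboard shots into the intrinsic matrix K and the distortion coefficients using OpenCV's cv2.calibrateCamera, which is based on Zhang's method; the board size, square size and file paths are hypothetical:

```python
import glob
import cv2
import numpy as np

# Hypothetical 9x6 inner-corner checkerboard with 25 mm squares.
BOARD = (9, 6)
SQUARE = 0.025

# 3-D corner coordinates on the board plane (Z = 0), identical for every shot.
board_pts = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
board_pts[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_pts, img_pts, img_size = [], [], None
for path in glob.glob("calib/*.png"):           # shots of the board from different angles
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        continue
    img_size = gray.shape[::-1]                 # (width, height)
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if not found:
        continue
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    obj_pts.append(board_pts)
    img_pts.append(corners)

# K holds the focal lengths and principal point; dist holds the distortion coefficients.
rms, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, img_size, None, None)
np.savez("intrinsics.npz", K=K, dist=dist, rms=rms)
```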
Preferably, in step S2, the edge detection and line fitting of the video image frame acquired by the RGB-D camera are performed by first applying preprocessing operations including denoising and edge enhancement to the video image frame; then extracting edges from the preprocessed image with an edge detector; and finally fitting the extracted edges to obtain the required lines, screening the fitted lines and sampling line points, with the collected points forming a point set.
Further preferably, in step S2, the extraction of point and line features from the image frame acquired by the RGB-D camera at the current moment includes the following steps:
S21, first performing preprocessing operations including denoising and edge enhancement on the video image frame acquired by the RGB-D camera, and then extracting edges from the image with a Canny edge extractor;
S22, directly collecting collinear point clusters from the edges extracted in step S21 using an edge-linking and segmentation technique, and then fitting initial line segments in the point clusters with a least-squares fitting method;
S23, further extending and merging the initial line segments fitted in step S22 to generate longer and more complete segments, fitting the required lines, screening the fitted lines and sampling line points, with the collected points forming a point set.
Preferably, the acquisition of the initial camera pose in step S3 includes the following steps:
S31, first determining which kind of frame the image frame currently acquired by the RGB-D camera is, then, according to the result, executing the camera-pose initialization scheme of step S32, S33 or S34 for the corresponding image frame to obtain its initial transformation pose (this accelerates the iterative convergence of the subsequent error function to a certain extent and avoids falling into a local optimum);
S32, if the current frame is the first image frame, initializing the transformation pose with the identity matrix I;
S33, if the current frame is the second image frame or a relocalization image frame, solving the initial camera transformation pose with the perspective-n-point (PnP) method of geometric transformation using the point set collected in step S2;
S34, if the current frame is not the first image frame, the second image frame or a relocalization image frame, initializing the transformation pose with a continuous motion model.
Further preferably, step S33 includes the following operations:
S331, for the point sets of the current frame and the key frame obtained in step S2, defining a fixed-size region around each feature-point position and computing the gradient magnitude and direction of every pixel in the region to form the feature descriptor of each key point;
S332, computing the descriptor distance between every feature point of the key frame and the feature points of the current frame, sorting by descriptor distance, and taking the two points with the smallest descriptor distance as a point pair to determine the best matches;
S333, rejecting erroneous pairs among the obtained best matching pairs using RANSAC (random sample consensus);
S334, solving the initial camera transformation pose from the matching pairs obtained in S333 with the perspective-n-point (PnP) method of geometric transformation.
Further preferably, step S334 includes the following calculation:
S3341, projecting the matched points on the key frame into three-dimensional space according to the key-frame pose to obtain three-dimensional coordinate points P_i = (X_i, Y_i, Z_i)^T and, with the camera intrinsics K obtained in step S1 and the two-dimensional points p_i = (u_i, v_i)^T on the current frame corresponding to the three-dimensional points P_i, solving the camera transformation pose between the two frames

T = [ R  t ; 0  1 ],  R ∈ SO(3),  t ∈ R^3,

whose Lie-algebra form is ξ; the transformation relation between the current-frame pixel p_i and the three-dimensional point P_i is given by formula (1):

s_i · p_i = K · exp(ξ^) · P_i        (1)

where s_i is the scale (depth) factor of the current image frame and ξ^ is the antisymmetric matrix form of ξ;
S3342, because the camera transformation pose is unknown and noise is present, equation (1) does not hold exactly; the errors are therefore summed to construct a least-squares problem, and the error function is minimized to find the best initial camera transformation pose, as shown in formula (2):

ξ* = argmin_ξ (1/2) Σ_{i=1}^{n} || p_i − (1/s_i) · K · exp(ξ^) · P_i ||²        (2)

where ξ* is the initial camera transformation obtained by the solution, and the error term e_i = p_i − (1/s_i) · K · exp(ξ^) · P_i is called the reprojection error.
Preferably, in step S34, the camera transformation is initialized with the continuous motion model, specifically: the transformation pose ξ_{c-1,k} between the previous frame F_{c-1} of the current frame F_c (the newly arrived image frame, which the invention subsequently refers to as the current frame) and its corresponding key frame F_k (a key frame is a frame selected from a series of local ordinary frames to represent the local frames and record the local information; it carries distinctive feature information) is composed with the transformation pose ξ_{c-2,c-1} between the two previous frames F_{c-1} and F_{c-2} of the current frame F_c, giving the initial transformation pose ξ_ck^init of the current frame F_c, i.e., as shown in formula (3):

T_ck^init = T_{c-2,c-1} · T_{c-1,k}        (3)

where T = exp(ξ^) denotes the homogeneous transform corresponding to a pose ξ.
further preferably, step S34 includes the following operations:
s341, solving current frame F c Its previous frame F c-1 And its key frame F k Change the pose between
Figure GDA0003704659550000057
S342, solving the current frame F c First two frames F c-1 And F c-2 Position xi of transformation between c-2,c-1
S343, solving the current frame F c Initial pose transformation of
Figure GDA0003704659550000058
Solving the following equation (3):
Figure GDA0003704659550000059
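As an illustration of this constant-velocity initialization, the following minimal Python sketch composes the two transforms of formula (3) as 4x4 homogeneous matrices; the matrix representation and the composition order are assumptions consistent with the description above, not a verbatim reproduction of the patent's formula:

```python
import numpy as np

def initial_pose_motion_model(T_prev_key, T_prev2_prev):
    """Constant-velocity initial guess for the key-frame -> current-frame transform.

    T_prev_key   : 4x4 transform T_{c-1,k} between the previous frame and its key frame.
    T_prev2_prev : 4x4 inter-frame transform T_{c-2,c-1} between the two previous frames.
    """
    return T_prev2_prev @ T_prev_key   # assumed composition order of formula (3)
```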
preferably, in step S5, constructing an adaptive error based on the local gradient of the line segment to form an error function of the visual odometer module, includes the following steps:
s51, obtaining the projection straight lines on the current frame according to the step S4, and calculating the direction alpha of each projection straight line;
s52, sampling each projection straight line at equal intervals, wherein the number of the sampled points of each projection straight line satisfies the following formula:
Figure GDA00037046595500000510
wherein: n is the total number of sampling points of each projection line, L is the corresponding length of each projection line, and L is the total number of sampling points of each projection line s Represents the length of the shortest projected line in the current frame, L l The length of the longest projection straight line in the current frame is taken as the length of the longest projection straight line in the current frame;
s53, defining a neighborhood with fixed size by taking each sampling point as a center, and calculating gradient amplitudes and gradient directions corresponding to all pixel points in the neighborhood;
s54, screening neighborhood points according to the gradient amplitude and the gradient direction of the pixel points in the neighborhood of each sampling point obtained in the step S53, and when the gradient amplitude of the neighborhood points is lower than a set threshold value, considering that the neighborhood points do not contribute to straight lines and abandoning the points, or else, keeping the points;
s55, calculating the distance from each neighborhood point screened in the step S54 to the subordinate projection straight line thereof, and forming the weight value of each neighborhood point by setting that the distance meets the standard Gaussian normal distribution;
s56, subtracting the direction of the straight line from the gradient direction of all the pixel points in the neighborhood of each sampling point to obtain a direction difference value, if the direction difference value is larger than a set threshold value, further screening and rejecting the neighborhood points, and if the direction difference value is in the range of the set threshold value, multiplying the direction difference value by the gradient amplitude of the pixel points in the neighborhood to obtain an adaptive contribution value of each neighborhood point;
s57, multiplying the direction difference value calculated by all sampling points of each projection straight line solved in the step S56 by the weight value of the step S55, accumulating and summing to form the adaptation degree value of the straight line and the neighborhood image data;
and S58, accumulating the fit degree values of the projection straight lines obtained in the step S57 to form an error function of a Visual Odometer (VO) module.
Preferably, in step S7, the determination of whether the current frame is the key frame is performed based on the linear fitness value in step S5, specifically: and based on the error function formed by accumulating the adaptation degree values of each projection straight line obtained in the step S6, calculating the adaptation rate of the current frame according to an image frame adaptation rate calculation formula, and if the calculated adaptation rate of the current frame is higher than a preset threshold, determining that the current frame is a key frame.
Preferably, in step S8, the local BA optimization includes the following steps:
S81, when the current frame is judged to be the latest key frame in step S7, a sliding-window-based co-visibility local BA optimization is constructed from the point and line features extracted in step S2, the camera transformation pose obtained in step S6, all key frames in the sliding window, and the point and line features belonging to them; the pose nodes of the co-visibility graph are the poses of all key frames contained in the current sliding window, and the point and line landmarks are the co-visible landmarks of all key frames in the window; the co-visible landmarks between two key frames are computed using landmark-point matching based on ORB (Oriented FAST and Rotated BRIEF) features and line-feature matching based on the LBD (Line Band Descriptor) descriptor;
S82, because the sliding window has a fixed size, if the window is already full before the current new key frame is inserted, the old key frames and landmark points are marginalized (specifically, the old key frames and old landmarks are marginalized using the Schur complement);
S83, the reprojection errors of the point and line features in the co-visibility graph are solved separately, the two errors are combined to construct the error function of the back-end local BA optimization, and the Levenberg-Marquardt (L-M) method is then used for iterative optimization, optimizing the key-frame poses and co-visible landmarks in the co-visibility graph;
Preferably, in step S9, the loop detection and global graph optimization over all key frames obtained in step S8 includes the following steps:
S91, based on a dictionary model, first finding the corresponding word types in the dictionary K-ary tree from the descriptors of all point and line features of the current key frame, and computing the weight of each word to obtain the word vector of the current key frame;
S92, computing the similarity (e.g., an L1-norm score) between the word vector of the current key frame and the word vectors of other key frames to obtain the inter-frame similarity, and selecting the key frame whose similarity is highest and exceeds a preset threshold as the loop frame of the current key frame;
S93, for the loop frame obtained in step S92, solving the transformation pose between the loop frame and the current key frame with the error function of step S5;
S94, constructing a pose graph of all key frames within the loop from the transformation poses solved in step S93, where the nodes of the pose graph are the poses of the key frames and the edges are the co-visible landmark points between key frames, thereby realizing global pose-graph optimization.
Compared with the prior art, the invention has the following remarkable beneficial effects:
The invention provides an RGB-D visual SLAM method based on point-line features. It uses an RGB-D camera as the image acquisition device and uses the points and edge lines of objects in the scene as localization features to realize real-time localization and mapping (SLAM). The images acquired by the RGB-D camera are first processed by the visual odometry module (including intrinsic calibration of the RGB-D camera, point and line feature extraction, acquisition of the initial transformation pose of the camera, projection of key-frame lines onto the current frame, construction of the adaptation error from line-segment local gradients to form the visual-odometry error function, and iterative solution of the camera pose); local BA optimization is then performed on the key frames of the camera image stream, loop detection is further performed on the locally-BA-optimized key frames followed by global graph optimization, and a relocalization module is provided to restart the visual odometry after a failure. The whole SLAM method is real-time, robust and efficient; it fully combines points, lines and image data, reduces drift error, and improves the accuracy of visual SLAM in challenging scenes such as low-semantic and low-texture environments, achieving remarkable progress and unexpected effects compared with the prior art.
Drawings
FIG. 1 is a flow chart of the RGB-D visual SLAM method based on point-line features provided by the present invention;
FIG. 2 is a flow chart of the visual odometry module of the present invention;
FIG. 3 is an edge map extracted from an image frame according to the present invention;
FIG. 4 is a straight line obtained by fitting an edge map of an image frame according to the present invention;
FIG. 5 is a schematic diagram of points and lines extracted from an edge map according to the present invention;
FIG. 6 is a schematic diagram of the transformation of point and line features in an image and a three-dimensional world according to the present invention;
FIG. 7 is a schematic diagram of an image neighborhood generated by sampling points of each projection line in the present invention;
FIG. 8 is a diagram of the camera localization trajectory result of the present invention;
fig. 9 is a three-dimensional reconstruction result diagram of the present invention.
Detailed Description
The technical solution of the invention is described in further detail below with reference to specific embodiments.
Example 1
As shown in Figs. 1 and 2, the RGB-D visual SLAM method based on point-line features provided by the invention includes the following steps:
S1, calibrating the intrinsic parameters of the RGB-D camera: using the RGB-D camera as the image acquisition device and calibrating its intrinsic parameters, which include the principal point, focal length and distortion coefficients of the camera:
S11, using the RGB-D camera as the image acquisition device, shooting a fixed-size calibration board (checkerboard) from different angles to obtain the image data required for calibration; the invention adopts an RGB-D camera as the image acquisition device because it provides a pair of registered RGB and depth images in real time, and the depth image supplies accurate depth information for converting subsequent pixel coordinates into three-dimensional space;
S12, computing the camera intrinsic parameters K from the acquired image data using Zhang Zhengyou's camera calibration method, and storing the intrinsic coefficients.
S2, point and line feature extraction: performing edge detection and line fitting on the image frame acquired by the RGB-D camera at the current moment, further screening the fitted lines and sampling line points, with the collected points forming a point set:
S21, first performing preprocessing operations including denoising and edge enhancement on the video image frame acquired by the RGB-D camera, then extracting edges from the image with a Canny edge extractor whose low and high thresholds are set adaptively instead of using the fixed thresholds of the traditional Canny operator; the extracted edge map is shown in Fig. 3;
S22, directly collecting collinear point clusters from the edges extracted in step S21 using an edge-linking and segmentation technique, and then fitting initial line segments in the point clusters with a least-squares fitting method; traditional line-based RGB-D visual SLAM methods mainly use an existing LSD line-segment extractor or Hough line detector to extract line features, whereas the invention fits lines from the more robust edge features, which effectively addresses the poor extraction of feature points and feature lines by traditional SLAM in low-semantic, low-texture and varying-illumination environments;
S23, further extending and merging the initial line segments fitted in step S22 to generate longer and more complete segments and thereby fit the required lines; the finally fitted lines are shown in Fig. 4;
S24, collecting the endpoint features of the lines fitted in S23, each line yielding two point features; all line-feature endpoints are screened to obtain the feature point set of the method, and the extracted point-set features are shown in Fig. 5; a minimal sketch of this edge-based line extraction is given below.
s3, acquiring an initial pose transformation position of the camera:
s31, firstly, judging that the image frame acquired by the current RGB-D camera belongs to the frame, then respectively executing the camera pose initialization scheme in the step S32, the step S33 or the step S34 for different image frames according to the judgment result, and acquiring the initial transformation pose of the corresponding image frame (the iterative convergence speed of the subsequent error function can be accelerated to a certain extent, and the method is prevented from falling into a local optimal value);
s32, if the current frame image is the first frame image, the pose is transformed and initialized by using a unit array I;
s33, if the current frame is the second frame image or the repositioned image frame, using the point set collected in step S2 to solve the initial transformation pose of the camera by a perspective N-point projection (PNP) method in geometric transformation, where the projection transformation of the point and line features between the image frames is as shown in fig. 6:
s331, for the point sets of the current frame and the key frame obtained in the step S2, respectively defining a region with a fixed size around each feature point position, and calculating the gradient size and direction of each pixel point in the region to form a feature descriptor corresponding to each key point;
s332, calculating a descriptor distance between each feature point of the key frame and a feature point of the current frame, sequentially sorting according to the distance of the descriptors, extracting two points with the nearest descriptor distance as a point pair, and determining a best matching point; in the embodiment, the descriptors are sequentially sorted according to the distance, and the top sixty percent is taken as the best matching point;
s333, screening out error points of the obtained best matching point pairs by using RANSAC;
s334, firstly, the matching points on the key frame obtained in the S323 are projected to a three-dimensional space according to the posture of the key frame to obtain a three-dimensional space coordinate point P i =(X i ,Y i ,Z i ) T And solving the camera internal parameter K and the three-dimensional space point P obtained in the step 1 i Corresponding two-dimensional point p on the current frame i =(u i ,v i ) T Solving the transformation pose of the camera between two frames
Figure GDA0003704659550000091
R∈ SO(3),t∈R 3 The lie algebraic form of T is xi; wherein the current frame pixel point p i And three-dimensional space point P i The transformation relationship between the two is shown in formula (1):
Figure GDA0003704659550000092
in the formula (1), s i The xi ^ is an antisymmetric matrix of xi which is a scale factor of the current image frame;
s335, due to the fact that the transformation pose of the camera is unknown and noise exists, an error exists in the equation (1), therefore, the errors are summed to construct a least square problem, and then the error function is minimized to find the best initial transformation pose of the camera, as shown in the following equation (2):
Figure GDA0003704659550000101
in the formula (2), the first and second groups,
Figure GDA0003704659550000102
i.e. the initial transformation of the camera, error term, obtained by solving
Figure GDA0003704659550000103
Referred to as reprojection error;
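A simplified illustration of S331-S335 is given below; the brute-force matcher, the sixty-percent best-match cut, the depth-image back-projection helper and cv2.solvePnPRansac stand in for the descriptor matching, RANSAC screening and PnP solution described above, and the function and variable names are hypothetical:

```python
import cv2
import numpy as np

def backproject(pts_2d, depth, K):
    """Lift key-frame pixels (u, v) to 3-D camera coordinates using the depth image."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    out = []
    for u, v in pts_2d:
        z = float(depth[int(v), int(u)])
        out.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return np.array(out, dtype=np.float64)

def initial_pose_pnp(kf_pts, kf_desc, cur_pts, cur_desc, kf_depth, K):
    # S332: nearest-descriptor matching, keep the top 60% by descriptor distance.
    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(kf_desc, cur_desc)
    matches = sorted(matches, key=lambda m: m.distance)[:int(0.6 * len(matches))]
    # S333 + S334: RANSAC outlier rejection inside solvePnPRansac, then the PnP solution.
    P3d = backproject([kf_pts[m.queryIdx] for m in matches], kf_depth, K)     # P_i
    p2d = np.array([cur_pts[m.trainIdx] for m in matches], dtype=np.float64)  # p_i
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(P3d, p2d, K, None,
                                                 reprojectionError=3.0)
    R, _ = cv2.Rodrigues(rvec)
    T = np.eye(4); T[:3, :3] = R; T[:3, 3] = tvec.ravel()   # initial T_ck (key frame -> current)
    return ok, T
```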
s34, if the current frame is not the first, the second and the repositioning image frame, initializing the transformation pose based on the continuous motion model:
s341, solving current frame F c Its previous frame F c-1 And its key frame F k Change the pose between
Figure GDA0003704659550000104
S342, solving the current frame F c The first two frames F c-1 And F c-2 Position xi of transformation between c-2,c-1
S343, solving the current frame F c Initial pose transformation of
Figure GDA0003704659550000105
Solving the following equation (3):
Figure GDA0003704659550000106
s4, straight line projection of the key frame to the current frame: projecting the straight line on the key frame obtained in the step S2 to the current frame according to the initial pose of the camera obtained in the step S3;
s5, constructing an adaptive error based on the local gradient of the line segment to form an error function of the visual odometer module: obtaining projection straight lines on the current frame according to the step S4, obtaining the adaptation degree value of each straight line on the current frame by using the relation between the projection straight lines and the projection straight line neighborhood image data, and accumulating the adaptation degree value of each projection straight line to form an error function of a Visual Odometer (VO) module;
s51, obtaining the projection straight lines on the current frame according to the step S4, and calculating the direction alpha of each projection straight line;
s52, sampling each projection straight line at equal intervals, wherein the number of the sampled points of each projection straight line satisfies the following formula (4):
Figure GDA0003704659550000107
in the formula (4), n is the total number of sampling points of each projection line, L is the corresponding length of each projection line, and L is s Represents the length of the shortest projected line in the current frame, L l The length of the longest projection straight line in the current frame;
s53, defining a neighborhood with the size of Lx 2w by taking each sampling point as the center, and calculating all pixel points p in the neighborhood i (u i ,v i ) Corresponding gradient amplitude m i (u i ,v i ) Direction of gradient beta i (u i ,v i ) The image neighborhood schematic diagram generated by each projection straight line sampling point is shown in fig. 7, and the calculation formulas of the gradient amplitude and the gradient direction are as follows:
Figure GDA0003704659550000111
in the formula (5), A i (u i ,v i ) Represents a point p i (u i ,v i ) Change in luminosity in the x-axis direction, B i (u i ,v i ) Represents a point p i (u i ,v i ) Photometric change in the y-axis direction, I (u) i Θ1,v i Θ 1), Θ e { +, - } represents the point p i (u i Θ1,v i Θ 1) corresponding pixel values;
s54, screening neighborhood points according to the gradient amplitude and the gradient direction of the pixel points in the neighborhood of each sampling point obtained in the step S53, and when the gradient amplitude of the neighborhood points is lower than a set threshold value, considering that the neighborhood points do not contribute to straight lines and abandoning the points, or else, keeping the points; the contribution of the screened points to the projection adaptation degree of the straight line is different, not all pixel points in the rectangular neighborhood have the same weight to carry out adaptability evaluation, and the pixel points closer to the projection line segment should make more contribution to an adaptability evaluation function (FEF);
s55, calculating the distance d from each neighborhood point screened in the step S54 to the subordinate projection straight line i At a distance d i The setting of a normal gaussian distribution satisfying the criteria μ ═ 0 and σ ═ w forms the weight value G for each neighborhood point u,σ (d i );
S56, subtracting the line direction α from the gradient direction β of every pixel in each sample point's neighborhood to obtain a direction difference; if the direction difference is larger than a set threshold, the neighborhood point is further rejected; if it lies within the threshold, the direction difference is combined with the gradient magnitude m_i(u_i, v_i) of the neighborhood pixel to obtain the adaptation contribution of each neighborhood point, computed by formula (6):

E_pi = | m_i(u_i, v_i) · cos( β_i(u_i, v_i) − α ) |        (6)

where E_pi denotes the adaptation contribution of each neighborhood point to the line;
s57, multiplying the adaptive contribution values of all the sampling domain points of each projection straight line solved in the step S56 by the weight value G in the step S55 u,σ (d i ) And accumulating and summing to form the adaptation degree value of the straight line and the neighborhood image data, wherein the calculation formula is shown as a formula (7):
Figure GDA0003704659550000114
in the formula (7), the first and second groups,
Figure GDA0003704659550000115
is a projected straight line on the current frame,
Figure GDA0003704659550000116
the adaptive degree value corresponding to the projection straight line;
s58, accumulating the fit degree value of each projection straight line obtained in the step S57 to form an error function of a Visual Odometer (VO) module, wherein the obtained error function formula is shown as a formula (8):
Figure GDA0003704659550000117
in the formula (8), wherein E (ξ) ck ) Calculating the error of the current image frame transformation pose by the vision odometer; xi ck A lie algebra form for transforming the pose from the key frame to the current frame;
Figure GDA0003704659550000121
detecting straight lines on the key frames;
Figure GDA0003704659550000122
a projection function that is a function that projects a straight line from the key frame to the current frame;
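The per-line fitness of formula (7) can be sketched as follows; the Sobel gradients, neighborhood half-width w, sample count and thresholds are illustrative assumptions, and the function returns the fitness of one projected line (the VO error of formula (8) accumulates these over all projected key-frame lines):

```python
import cv2
import numpy as np

def line_fitness(gray, p1, p2, w=3, n_samples=20, mag_thresh=20.0, ang_thresh=np.pi / 4):
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)        # photometric change along x (A_i)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)        # photometric change along y (B_i)
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    d = p2 - p1
    alpha = np.arctan2(d[1], d[0])                # direction of the projected line (S51)
    normal = np.array([-d[1], d[0]]) / (np.linalg.norm(d) + 1e-9)

    score = 0.0
    for t in np.linspace(0.0, 1.0, n_samples):    # equally spaced sample points (S52)
        s = p1 + t * d
        for du in range(-w, w + 1):               # neighborhood around the sample (S53)
            for dv in range(-w, w + 1):
                u, v = int(round(s[0] + du)), int(round(s[1] + dv))
                if not (0 <= v < gray.shape[0] and 0 <= u < gray.shape[1]):
                    continue
                m = np.hypot(gx[v, u], gy[v, u])
                if m < mag_thresh:                # S54: weak gradients contribute nothing
                    continue
                beta = np.arctan2(gy[v, u], gx[v, u])
                diff = np.arctan2(np.sin(beta - alpha), np.cos(beta - alpha))
                if abs(diff) > ang_thresh:        # S56: direction-difference check
                    continue
                dist = abs(np.dot(np.array([u, v]) - p1, normal))
                weight = np.exp(-dist**2 / (2 * w**2))      # S55: Gaussian distance weight
                score += weight * abs(m * np.cos(diff))     # S56-S57: E_pi contributions
    return score
```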
s6, iterative optimization of the initial pose of the current frame camera and pose solution of the current frame camera: according to the error function finally formed in the step S5, carrying out iterative optimization of the initial pose transformation of the camera by using a nonlinear error function optimization method; obtaining the transformation pose of the current frame and the key frame by optimization
Figure GDA0003704659550000123
Left-handed key frame pose xi k Solving the pose xi of the current frame camera c Therefore, the specific position of the camera at the current frame moment is realized, and the positioning of the camera is realized:
s61, constructing an optimization formula of the error function, as shown in formula (9):
Figure GDA0003704659550000124
in the formula (9), the reaction mixture,
Figure GDA0003704659550000125
representing the camera transformation pose obtained by iterative optimization,
Figure GDA0003704659550000126
as a function of error E (ξ) for visual odometry ck ) Minimizing, and solving the optimal transformation pose xi between the current frame and the key frame ck
S62, carrying out iterative optimization solution by adopting a Levenberg-Marquardt (Levenberg-Marquardt L-M) nonlinear optimization method, and realizing the positioning of the camera by updating the parameters of the camera, wherein the parameters of the camera at least comprise the position and the posture of the camera in a world coordinate system;
in steps S5 and S6 of this embodiment, a novel and effective error function is constructed by using a gradient relationship between a straight line and image data in its neighborhood, and accurate positioning of a mobile device and three-dimensional map reconstruction of peripheral edge structure information can be realized by nonlinear optimization iterative optimization; particularly, the construction error function is evaluated by the adaptation degree value formed by the line features and the image data which commonly exist in the environment, so that the drift error caused by the matching error of the past point features is well relieved, and the uncertainty of the aperture and the line segment end point of the straight line is solved by constructing the reprojection error no longer by the end point of the straight line;
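A pose-refinement sketch under stated assumptions: the se(3) update and Levenberg-Marquardt loop of S61-S62 are replaced here by a 6-vector pose (Rodrigues rotation plus translation) and a generic scipy minimizer, the hypothetical helpers project_keyframe_line and line_fitness sketched earlier are reused, and the sign flip assumes that a larger fitness means a better fit:

```python
import cv2
import numpy as np
from scipy.optimize import minimize

def pose_to_T(xi):
    """6-vector (Rodrigues rotation, then translation) -> 4x4 homogeneous transform."""
    R, _ = cv2.Rodrigues(np.asarray(xi[:3], dtype=np.float64))
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = xi[3:]
    return T

def refine_pose(xi0, keyframe_lines, kf_depth, gray_cur, K):
    """Iteratively refine the key-frame -> current-frame pose (steps S61-S62)."""
    def cost(xi):
        T_ck = pose_to_T(xi)
        total = 0.0
        for p1, p2 in keyframe_lines:
            q1, q2 = project_keyframe_line(p1, p2, kf_depth, K, T_ck)  # step S4 sketch
            total += line_fitness(gray_cur, q1, q2)                    # formula (7) sketch
        return -total   # assumption: larger fitness = better alignment, so negate to minimize
    res = minimize(cost, xi0, method="Powell")   # stand-in for the L-M iteration of S62
    return pose_to_T(res.x), res
```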
s7, judging key frames: judging whether the current frame is a key frame or not based on the straight line adaptation degree value of the step S5: if the current frame is judged to be the key frame, replacing and storing the current frame as a new key frame, inserting the new key frame into a rear-end sliding window, and performing local BA optimization; if the current frame is judged to be a non-key frame, the relevant operation of the key frame is not carried out, and the step S2 is returned to:
regarding the current frame, the current frame is determined as a new key frame when any one of the following conditions is satisfied:
1) the maximum frame number limit for key frame replacement is reached or 25 frames have been processed from the last global relocation;
2) for the inter-frame transformation pose obtained in the step S6
Figure GDA0003704659550000127
Decomposing the rotation and translation components, solving the distance between the current frame and the key frame by using the components obtained by decomposition, and if the distance is greater than a set threshold value, determining the current frame as a new key frame;
3) the image-frame adaptation rate θ is computed from the error function of the current frame obtained in step S5 according to formula (10); if the adaptation rate θ obtained for the current frame is higher than a preset threshold (e.g., 0.85), the current frame is considered a new key frame; in formula (10), E_{k+j,k} is the visual odometry (VO) module error value obtained from the key frame F_k to the current frame F_{k+j}, E_{k+1,k} is the VO module error value obtained from the key frame F_k to its next frame F_{k+1}, and the lines detected on frames F_{k+j} and F_{k+1} also enter formula (10); a key-frame decision sketch is given below;
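An illustrative key-frame test combining the three conditions is sketched below; the distance threshold, the 25-frame limits and, in particular, the adaptation-rate expression are assumptions (formula (10) itself is not reproduced here, and a line-count-normalized error ratio is used as a stand-in):

```python
import numpy as np

def is_new_keyframe(frames_since_kf, frames_since_reloc, T_ck,
                    E_cur, n_lines_cur, E_first, n_lines_first,
                    dist_thresh=0.15, rate_thresh=0.85, max_gap=25):
    # Condition 1: frame-count limits.
    if frames_since_kf >= max_gap or frames_since_reloc >= max_gap:
        return True
    # Condition 2: translation magnitude of the key-frame -> current transform.
    if np.linalg.norm(T_ck[:3, 3]) > dist_thresh:
        return True
    # Condition 3: adaptation rate theta (illustrative stand-in for formula (10)):
    # VO error of the current frame vs. the error right after the key frame,
    # each normalized by the number of detected lines.
    theta = (E_cur / max(n_lines_cur, 1)) / (E_first / max(n_lines_first, 1) + 1e-9)
    return theta > rate_thresh
```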
s8, local BA optimization: when the current frame is judged to be the key frame in the step S7, inserting the current frame into the rear-end local BA optimization sliding window, and constructing the common-view local BA optimization based on the sliding window by using the point and line features extracted in the step S2 and the camera transformation pose solved in the step S6, all the key frames in the sliding window and the point and line features belonging thereto:
s81, when the current frame is judged to be the latest key frame in the step S7, point and line features extracted in the step S2 are utilized, the camera transformation pose obtained in the step S6 is utilized, all key frames in the sliding window and the point and line features to which the key frames belong construct a co-view local BA optimization based on the sliding window, then pose nodes in the co-view (co-visibility graph) are the corresponding poses of all key frames contained in the current sliding window, and all the point and line landmarks are the co-view landmarks of all the key frames in the sliding window; calculating a co-visual landmark between two key frames, and adopting landmark point matching based on an Oriented Fast and Rotated Brief (ORB) characteristic and Line characteristic matching based on a Line Band Descriptor (LBD) Descriptor;
s82, because the size of the sliding window is fixed, if the sliding window is full before the current new key frame is inserted, the marginalization of the old key frame and the road mark point is carried out (concretely, the marginalization of the old key frame, the old road mark and the road mark is carried out by using Schur supplement);
s83, re-projection errors of the point and line characteristics in the co-view are solved respectively, an error function of rear-end local BA optimization is constructed by combining the two errors, and then iterative optimization is carried out by adopting a Levenberg-Marquardt (L-M) method, so that the optimization of the pose of the key frame in the co-view and the co-view landmark is realized:
s831, solving for the reprojection error of the point characteristics according to the formula (11):
Figure GDA0003704659550000134
in formula (11), P W,j ∈R 3 Corresponding a three-dimensional coordinate for the jth characteristic point in the common view, wherein pi is a projection function and pi (xi) i ,P W,j ) Representing a projection P W,j Image planes to the ith key frame, e ij Then represents the projection point coordinates and the observation coordinates p i,j The difference in distance between;
s832, solving the reprojection error of the line characteristics according to the formula (12):
Figure GDA0003704659550000141
in the formula (12), wherein P W,k ,Q W,k Three-dimensional coordinates corresponding to two end points of the line feature respectively,
Figure GDA0003704659550000142
line coefficients for the kth line feature observed on the ith keyframe;
wherein l_{i,k} is given by formula (13):

l_{i,k} = p_{i,k}^s × p_{i,k}^e        (13)

in which p_{i,k}^s and p_{i,k}^e are the two-dimensional homogeneous image coordinates of the left and right endpoints of the k-th line feature on the i-th key frame;
s833, combination e ij And e ik An error function for the back-end local BA optimization of the invention is constructed, and is specifically shown in formula (14):
Figure GDA0003704659550000147
in equation (14), Ψ is a vector formed by variables optimized for the back-end local BA, and ρ is a robust huber loss function, Ω ij And Ω ik Is a covariance matrix corresponding to the point feature and the line feature,
Figure GDA0003704659550000148
and
Figure GDA0003704659550000149
common view set as key frameCollecting lines;
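The point and line residuals of formulas (11)-(13) can be sketched as follows (pinhole projection with homogeneous coordinates; the Huber weighting and covariance terms of formula (14) are left to the back-end solver, e.g. g2o or Ceres in a real system, and the normalization of the line coefficients is an assumption):

```python
import numpy as np

def project(K, T_iw, P_w):
    """pi(xi_i, P_W): project a world point into key frame i (pixel coordinates)."""
    P_c = (T_iw @ np.append(P_w, 1.0))[:3]
    p = K @ (P_c / P_c[2])
    return p[:2]

def point_residual(K, T_iw, P_w, p_obs):
    """e_ij of formula (11): observed pixel minus projected landmark."""
    return p_obs - project(K, T_iw, P_w)

def line_residual(K, T_iw, P_w, Q_w, p_start, p_end):
    """e_ik of formula (12): distances of the projected endpoints to the observed 2-D line,
    whose coefficients l_ik come from the cross product of the observed homogeneous
    endpoints (formula (13))."""
    l = np.cross(np.append(p_start, 1.0), np.append(p_end, 1.0))
    l = l / (np.linalg.norm(l[:2]) + 1e-9)        # normalize so l.x gives point-line distance
    eP = l @ np.append(project(K, T_iw, P_w), 1.0)
    eQ = l @ np.append(project(K, T_iw, Q_w), 1.0)
    return np.array([eP, eQ])
```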
s9, loop detection: performing loop detection and global graph optimization on the new key frame obtained in step S8 and all existing key frames, specifically: performing loop detection on each newly inserted key frame and all previous key frames based on a dictionary model, and if the loop frame is detected, performing graph optimization on all key frames between the new key frame and the loop frame by using an error function constructed by the visual odometry module in the step S5; accumulated drift errors can be further eliminated through loop detection and global graph optimization, and poses and landmark points are further optimized:
s91, based on the dictionary model, firstly, finding out the corresponding word type in the dictionary K-ary tree according to the descriptors corresponding to all points and line features of the current key frame;
counting the times of each word appearing in the picture, and calculating the weight corresponding to each word according to the formula (15):
Figure GDA00037046595500001410
in the formula (15), tf (I, I) t ) The corresponding weight of each word is represented,
Figure GDA00037046595500001411
indicating the number of times the word i appears in the current picture,
Figure GDA00037046595500001412
and (3) representing the number of all words in the picture, and calculating the word vector of the current key frame according to the calculated weight and the formula (16):
Figure GDA00037046595500001413
in the formula (16), the first and second groups,
Figure GDA0003704659550000151
the word vector representing the current key frame, tf (I, I) t ) To representThe weight of each word in the current keyframe, idf (i) represents the frequency of occurrence of each word in the lexicon;
s92, calculating the preselected similarity or L1 norm between the word vector of the current key frame and the word vectors of other key frames to obtain the similarity between frames, and selecting the key frame with the highest similarity and exceeding a preset threshold value as the loop frame of the current key frame;
s93, solving the transformation pose between the loop frame and the current key frame by adopting the error function in the step S5 according to the loop frame obtained by calculation in the step S92;
s94, constructing a pose graph of all key frames in a loop according to the transformation pose obtained by solving in the step S93 and a graph optimization error function shown in a formula (17), wherein nodes of the pose graph are poses of each key frame, and edges of the pose graph are co-view landmark points between the key frames, so that global pose graph optimization is realized:
γ i,jij )=log(exp(ξ i,j )·exp(ξ j -1 )·exp(ξ j )) (17)
xi in formula (17) i,j Is the transformation pose, xi, between key frame i and key frame j i And xi j Respectively obtaining camera poses of the key frame i and the key frame j after local BA optimization;
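The residual of formula (17) can be sketched with 4x4 homogeneous transforms and a matrix logarithm standing in for the se(3) log map; the composition order follows the reconstruction of formula (17) above and is convention-dependent:

```python
import numpy as np
from scipy.linalg import logm

def pose_graph_residual(T_ij, T_i, T_j):
    """gamma_ij of formula (17) for one pose-graph edge.

    T_ij : measured relative transform between key frames i and j (loop or odometry edge).
    T_i, T_j : key-frame poses after local BA optimization.
    Returns a 4x4 matrix in the Lie algebra; it is (close to) zero when the
    estimated relative pose agrees with the measurement."""
    return np.real(logm(T_ij @ np.linalg.inv(T_i) @ T_j))
```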
s10, repositioning: when the difference between the camera transformation pose solved by the error function of the visual odometer module in the step S6 and the set threshold value is too large, the pose solving of the current frame camera is judged to be failed, and repositioning is carried out; in the repositioning process, loop detection processing in the step S9 is firstly carried out, loop frames corresponding to the current frame are searched, then the step S3 is skipped to for estimating the initial transformation pose, and corresponding subsequent operations are continuously and sequentially executed;
s11, map construction: solving the pose of each key frame and the point and line characteristics obtained in the step to construct a sparse map;
s12, outputting the result: pose storage, path drawing and map storage and synchronous real-time display, wherein the path diagram obtained in the embodiment is shown in fig. 8, and the corresponding three-dimensional reconstruction map is shown in fig. 9.
Finally, it should be pointed out here that: the above is only a part of the preferred embodiments of the present invention and should not be construed as limiting the scope of the present invention, and the insubstantial modifications and adaptations of the present invention by those skilled in the art based on the above description are intended to be covered by the present invention.

Claims (10)

1. An RGB-D visual SLAM method based on point-line features, characterized by comprising the following steps:
S1, calibrating the intrinsic parameters of the RGB-D camera: using an RGB-D camera as the image acquisition device and calibrating its intrinsic parameters, which include the principal point, focal length and distortion coefficients of the camera;
S2, point and line feature extraction: performing edge detection and line fitting on the image frame acquired by the RGB-D camera at the current moment, further screening the fitted lines and sampling line points, with the collected points forming a point set;
S3, acquiring the initial transformation pose of the camera: initializing the transformation pose for different image frames acquired by the RGB-D camera to obtain the camera's initial transformation pose;
S4, projecting key-frame lines onto the current frame: projecting the lines obtained in step S2 on the key frame onto the current frame according to the initial camera pose obtained in step S3;
S5, constructing the adaptation error based on the local gradient of line segments to form the error function of the visual odometry module: with the projected lines on the current frame from step S4, obtaining a fitness value for each line on the current frame from the relation between each projected line and the image data in its neighborhood, and accumulating the fitness values of all projected lines to form the error function of the visual odometry module;
s6, iterative optimization of the initial pose of the current frame camera and pose solution of the current frame camera: according to the error function finally formed in the step S5, iterative optimization of the initial pose transformation of the camera is carried out by utilizing a nonlinear error function optimization method, and the current frame and the key obtained by optimizationPose transformation of frames
Figure FDA0003704659540000011
Left-handed key frame pose xi k Solving the pose xi of the current frame camera c Thereby acquiring the specific position of the camera at the current frame moment and realizing the positioning of the camera;
s7, judging key frames: judging whether the current frame is a key frame or not based on the straight line adaptation degree value of the current frame obtained in the step S5, if so, replacing and storing the current frame as a new key frame, inserting the new key frame into a rear-end sliding window, and performing local BA optimization; if the current frame is judged to be a non-key frame, the relevant operation of the key frame is not carried out, and the step S2 is returned;
S8, local BA optimization: when the current frame is judged to be a key frame in step S7, it is inserted into the back-end local BA optimization sliding window, and a co-visibility local BA optimization based on the sliding window is constructed using the point and line features extracted in step S2, the camera transformation pose solved in step S6, all the key frames in the sliding window and the point and line features belonging to them;
S9, loop detection: loop detection and global graph optimization are performed on the new key frame obtained in step S8 together with all existing key frames, specifically: loop detection between each newly inserted key frame and all previous key frames is performed based on a dictionary model, and if a loop frame is detected, graph optimization is performed on all key frames between the new key frame and the loop frame using the error function constructed by the visual odometry module in step S5;
S10, relocalization: when the camera transformation pose solved from the error function of the visual odometry module in step S6 deviates from the set threshold by too much, the pose solution of the current-frame camera is judged to have failed and relocalization is performed; in the relocalization process, the loop-detection processing of step S9 is first performed to find the loop frame corresponding to the current frame, then the method jumps to step S3 to estimate the initial transformation pose, and the corresponding subsequent operations are executed in order;
S11, map construction: a sparse map is constructed from the solved pose of each key frame and the point and line features obtained in the preceding steps;
S12, outputting the results: the poses are stored, the camera motion path is drawn, and the map is stored and displayed synchronously in real time.
2. The RGB-D visual SLAM method based on point-line features of claim 1, wherein: in step S1, the intrinsic parameter matrix K of the RGB-D camera is calibrated using a camera intrinsic calibration method.
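As an illustration of the intrinsic calibration referred to in claim 2, the following is a minimal Python/OpenCV sketch that estimates the camera matrix K (principal point and focal length) and the distortion coefficients from chessboard views; the board size, square size and image paths are assumptions made for the example, not part of the claimed method.

import glob
import cv2
import numpy as np

def calibrate_rgb_camera(image_glob="calib/*.png", board=(9, 6), square=0.025):
    """Estimate the camera matrix K and distortion coefficients from chessboard views."""
    objp = np.zeros((board[0] * board[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square
    obj_pts, img_pts, size = [], [], None
    for path in glob.glob(image_glob):
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        if gray is None:
            continue
        size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, board)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
    # Returns K (focal length and principal point) plus the distortion coefficients.
    rms, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
    return K, dist, rms

Any standard calibration routine that yields K and the distortion coefficients could be substituted here.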
3. The RGB-D visual SLAM method based on point-line features of claim 1, wherein: in step S2, the edge detection and straight-line fitting performed on the video image frame acquired by the RGB-D camera comprise: first, preprocessing operations including denoising and edge enhancement are performed on the video image frame acquired by the RGB-D camera; then, edge extraction is performed on the preprocessed image using an edge detector; finally, the extracted edges are fitted to obtain the required straight lines, the fitted lines are screened and line points are sampled on them, and the collected points form a point set.
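A minimal sketch of the preprocessing, edge-detection, line-fitting and line-point-sampling chain described in claim 3. The Gaussian blur, Canny detector and probabilistic Hough fitter are illustrative stand-ins for the unspecified detector and fitter of the patent, and all thresholds are assumptions.

import cv2
import numpy as np

def extract_line_points(bgr, min_len=30, step=5):
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (3, 3), 0)            # preprocessing: denoising
    edges = cv2.Canny(gray, 50, 150)                     # edge detector
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 50,
                            minLineLength=min_len, maxLineGap=5)   # line fitting
    point_set = []
    for x1, y1, x2, y2 in (lines.reshape(-1, 4) if lines is not None else []):
        length = np.hypot(x2 - x1, y2 - y1)
        n = max(2, int(length // step))                  # sample line points at intervals
        for t in np.linspace(0.0, 1.0, n):
            point_set.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return lines, np.array(point_set, dtype=np.float32)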
4. The RGB-D visual SLAM method according to claim 1, wherein obtaining the initial transformation pose of the camera in step S3 comprises the following steps:
S31, first, determining which kind of frame the image frame currently acquired by the RGB-D camera is; then, according to the result, the camera pose initialization scheme of step S32, step S33 or step S34 is executed for the different image frames respectively, to obtain the initial transformation pose of the corresponding image frame;
S32, if the current frame is the first image frame, the transformation pose is initialized with the identity matrix I;
S33, if the current frame is the second image frame or a relocalization image frame, the initial transformation pose of the camera is solved from the point set acquired in step S2 by the perspective-n-point projection method in geometric transformation;
S34, if the current frame is not the first image frame, the second image frame or a relocalization image frame, the transformation pose is initialized based on a continuous motion model.
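A minimal sketch of the frame-type dispatch of steps S31-S34 in claim 4. The helper names, the PnP callback and the use of 4x4 homogeneous matrices are assumptions made for illustration only.

import numpy as np

def initial_transform(frame_idx, is_reloc_frame, solve_pnp=None,
                      T_c2_c1=None, T_k_c1=None):
    if frame_idx == 0:                       # S32: first frame -> identity matrix I
        return np.eye(4)
    if frame_idx == 1 or is_reloc_frame:     # S33: perspective-n-point from point matches
        return solve_pnp()
    # S34: continuous motion model (see claim 7): compose the last inter-frame
    # motion with the previous frame's key-frame-relative pose.
    return T_c2_c1 @ T_k_c1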
5. The RGB-D visual SLAM method according to claim 4, wherein the step S33 includes the following operations:
S331, for the point sets of the current frame and the key frame obtained in step S2, a region of fixed size is defined around each feature-point position, and the gradient magnitude and direction of each pixel in the region are calculated to form the feature descriptor corresponding to each key point;
S332, the descriptor distance between each feature point of the key frame and the feature points of the current frame is calculated, the descriptors are sorted by distance, the two points with the smallest descriptor distance are taken as a point pair, and the best matching points are determined;
S333, erroneous point pairs are rejected using RANSAC;
S334, using the matching point pairs obtained in S333, the initial transformation pose of the camera is solved by the perspective-n-point projection method in geometric transformation.
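A hedged sketch of steps S331-S334 in claim 5 using OpenCV. The patent's descriptor is built from gradient magnitudes and directions around sampled line points; ORB is substituted here purely so the example runs with standard library calls, while the RANSAC rejection and perspective-n-point solution follow the claimed order.

import cv2
import numpy as np

def initial_pose_pnp(key_gray, cur_gray, key_depth, K):
    orb = cv2.ORB_create(1000)
    kp1, des1 = orb.detectAndCompute(key_gray, None)     # S331: descriptors (ORB stand-in)
    kp2, des2 = orb.detectAndCompute(cur_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)   # S332
    obj_pts, img_pts = [], []
    for m in matches[:200]:
        u, v = kp1[m.queryIdx].pt
        z = key_depth[int(v), int(u)]
        if z <= 0:
            continue
        # Back-project the key-frame point to 3-D using depth and intrinsics.
        x = (u - K[0, 2]) * z / K[0, 0]
        y = (v - K[1, 2]) * z / K[1, 1]
        obj_pts.append([x, y, z])
        img_pts.append(kp2[m.trainIdx].pt)
    obj_pts = np.array(obj_pts, np.float32)
    img_pts = np.array(img_pts, np.float32)
    # S333 + S334: RANSAC rejects outlier pairs, PnP solves the initial pose.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj_pts, img_pts, K, None)
    R, _ = cv2.Rodrigues(rvec)
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, tvec.ravel()
    return T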
6. The RGB-D visual SLAM method according to claim 5, wherein the step S334 includes the following operations:
S3341, the matching points on the key frame are projected into three-dimensional space according to the key-frame pose, giving three-dimensional space points P_i = (X_i, Y_i, Z_i)^T; from the camera intrinsics K obtained in step S1, the three-dimensional points P_i and the corresponding two-dimensional points p_i = (u_i, v_i)^T on the current frame, the camera transformation pose between the two frames, T = [R | t] with R ∈ SO(3) and t ∈ R^3, is solved, and the Lie-algebra form of T is ξ; the transformation relationship between the current-frame pixel point p_i and the three-dimensional space point P_i is given by formula (1):

s_i · p_i = K · exp(ξ^) · P_i      (1)

in formula (1), s_i is the scale factor of the current image frame and ξ^ is the antisymmetric matrix of ξ;

S3342, because the camera transformation pose is unknown and noise is present, formula (1) is satisfied only up to an error; the errors are therefore summed to construct a least-squares problem, and the error function is minimized to find the best initial camera transformation pose, as shown in formula (2):

ξ* = argmin_ξ (1/2) Σ_i || p_i − (1/s_i) · K · exp(ξ^) · P_i ||²      (2)

in formula (2), ξ* is the initial camera transformation obtained by the solution, and the error term p_i − (1/s_i) · K · exp(ξ^) · P_i is called the reprojection error.
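A small numerical sketch of the least-squares problem of formulas (1)-(2). The pose is parametrized by a rotation vector and a translation instead of the Lie-algebra element ξ, and scipy's generic least-squares solver stands in for the nonlinear optimization named in the claim; both substitutions are assumptions made for illustration.

import numpy as np
import cv2
from scipy.optimize import least_squares

def reprojection_residuals(pose6, P, p, K):
    """pose6 = (rotation vector, translation); returns stacked residuals p_i - project(P_i)."""
    R, _ = cv2.Rodrigues(pose6[:3])
    Pc = P @ R.T + pose6[3:]                 # transform the 3-D points into the camera frame
    proj = Pc @ K.T
    proj = proj[:, :2] / proj[:, 2:3]        # divide by the scale factor s_i (the depth)
    return (p - proj).ravel()

def solve_initial_pose(P, p, K, pose0=np.zeros(6)):
    """Minimize the summed reprojection error, as in formula (2)."""
    res = least_squares(reprojection_residuals, pose0, args=(P, p, K))
    return res.x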
7. The RGB-D visual SLAM method according to claim 4, wherein the initialization of the camera transformation pose in step S34 is performed using a continuous motion model, specifically: the transformation pose ξ_{k,c-1} between the previous frame F_{c-1} of the current frame F_c and its corresponding key frame F_k is composed with the transformation pose ξ_{c-2,c-1} between the two frames F_{c-2} and F_{c-1} preceding the current frame F_c, to obtain the initial transformation pose ξ^{init}_{k,c} of the current frame F_c, as shown in formula (3):

ξ^{init}_{k,c} = ξ_{c-2,c-1} · ξ_{k,c-1}      (3)
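A one-line sketch of formula (3), with 4x4 homogeneous matrices standing in for the poses; the composition order shown follows the left-multiplication convention of step S6 and is an assumption about how the original formula reads.

import numpy as np

def constant_motion_init(T_c2_c1, T_k_c1):
    """Initial T_{k,c} = T_{c-2,c-1} @ T_{k,c-1} -- assumed ordering (formula (3))."""
    return T_c2_c1 @ T_k_c1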
8. The RGB-D visual SLAM method according to claim 1, wherein constructing the adaptation error based on the local gradients of the line segments to form the error function of the visual odometry module in step S5 comprises the following steps:
S51, the projected lines on the current frame are obtained from step S4, and the direction α of each projected line is calculated;
S52, each projected line is sampled at equal intervals, the number of sampling points of each projected line satisfying a formula in which N, the total number of sampling points of the projected line, is determined by L, the length of the corresponding projected line, L_s, the length of the shortest projected line in the current frame, and L_l, the length of the longest projected line in the current frame;
S53, a neighborhood of fixed size is defined centered on each sampling point, and the gradient magnitude and gradient direction of every pixel in the neighborhood are calculated;
S54, the neighborhood points are screened according to the gradient magnitudes and gradient directions obtained in step S53: when the gradient magnitude of a neighborhood point is below a set threshold, the point is considered to contribute nothing to the line and is discarded; otherwise it is retained;
S55, the distance from each neighborhood point retained in step S54 to the projected line it belongs to is calculated, and the weight of each neighborhood point is formed by assuming this distance follows a standard Gaussian normal distribution;
S56, the direction of the line is subtracted from the gradient direction of every pixel in the neighborhood of each sampling point to obtain a direction difference; if the direction difference is larger than a set threshold, the neighborhood point is further screened out and rejected; if the direction difference is within the set threshold, it is multiplied by the gradient magnitude of the neighborhood pixel to obtain the adaptation contribution of that neighborhood point;
S57, the direction-difference contributions calculated in step S56 for all sampling points of each projected line are multiplied by the weights of step S55 and summed to form the adaptation value between the line and its neighborhood image data;
S58, the adaptation values of the projected lines obtained in step S57 are accumulated to form the error function of the visual odometry module.
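A minimal sketch of the adaptation value of steps S51-S57 for a single projected line: neighbourhood pixels around equally spaced sample points are kept if their gradient magnitude and direction pass the thresholds, weighted by a Gaussian of their distance to the line, and accumulated. The window size, thresholds and sampling count are assumptions (the claimed sampling formula relating N to L, L_s and L_l is not reproduced here).

import numpy as np
import cv2

def line_adaptation(gray, x1, y1, x2, y2, n_samples=10, win=2,
                    mag_thresh=10.0, dir_thresh=np.pi / 6):
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    mag, ang = np.hypot(gx, gy), np.arctan2(gy, gx)
    alpha = np.arctan2(y2 - y1, x2 - x1)                  # S51: line direction
    score = 0.0
    for t in np.linspace(0.0, 1.0, n_samples):            # S52: equal-interval samples
        cx, cy = x1 + t * (x2 - x1), y1 + t * (y2 - y1)
        for dy in range(-win, win + 1):                    # S53: fixed neighbourhood
            for dx in range(-win, win + 1):
                px, py = int(cx) + dx, int(cy) + dy
                if not (0 <= py < gray.shape[0] and 0 <= px < gray.shape[1]):
                    continue
                if mag[py, px] < mag_thresh:               # S54: weak gradients discarded
                    continue
                d = abs(np.cos(alpha) * (py - cy) - np.sin(alpha) * (px - cx))
                w = np.exp(-0.5 * d * d)                   # S55: Gaussian distance weight
                diff = abs(ang[py, px] - alpha) % np.pi
                diff = min(diff, np.pi - diff)             # undirected direction difference
                if diff > dir_thresh:                      # S56: reject misaligned points
                    continue
                score += w * diff * mag[py, px]            # S56-S57: accumulate contribution
    return score                                           # adaptation value of one line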
9. The RGB-D visual SLAM method based on point-line features of claim 1, wherein the local BA optimization in step S8 comprises the following steps:
S81, when the current frame is judged to be the latest key frame in step S7, a co-visibility local BA optimization based on the sliding window is constructed using the point and line features extracted in step S2, the camera transformation pose obtained in step S6, all the key frames in the sliding window and the point and line features belonging to them; the pose nodes of the co-visibility graph are the poses of all key frames contained in the current sliding window, and the point and line landmarks are the co-visible landmarks of all key frames in the sliding window; the co-visible landmarks between two key frames are computed using landmark-point matching based on Oriented FAST and Rotated BRIEF features and line-feature matching based on the Line Band Descriptor;
S82, because the size of the sliding window is fixed, if the sliding window is already full before the current new key frame is inserted, the oldest key frame and its landmark points are marginalized;
S83, the reprojection errors of the point and line features in the co-visibility graph are solved respectively, the two errors are combined to construct the error function of the back-end local BA optimization, and iterative optimization is then performed with the Levenberg-Marquardt method, thereby optimizing the key-frame poses and the co-visible landmarks in the co-visibility graph.
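A toy sliding-window bundle-adjustment sketch in the spirit of claim 9: key-frame poses and co-visible landmark points are optimized jointly by minimizing point reprojection error with Levenberg-Marquardt. The line reprojection terms and the marginalization of step S82 are omitted for brevity, and the observation layout and variable packing are assumptions (Levenberg-Marquardt also assumes there are at least as many residuals as variables).

import numpy as np
import cv2
from scipy.optimize import least_squares

def ba_residuals(x, n_kf, n_pts, obs, K):
    """obs: list of (kf_index, pt_index, u, v) observations of landmarks in key frames."""
    poses = x[:6 * n_kf].reshape(n_kf, 6)        # rotation vector + translation per key frame
    pts = x[6 * n_kf:].reshape(n_pts, 3)         # landmark points
    res = []
    for k, j, u, v in obs:
        R, _ = cv2.Rodrigues(poses[k, :3])
        Pc = R @ pts[j] + poses[k, 3:]
        proj = K @ Pc
        res.extend([u - proj[0] / proj[2], v - proj[1] / proj[2]])
    return np.asarray(res)

def local_ba(init_poses, init_points, obs, K):
    x0 = np.hstack([init_poses.ravel(), init_points.ravel()])
    sol = least_squares(ba_residuals, x0, method="lm",
                        args=(len(init_poses), len(init_points), obs, K))
    n_kf = len(init_poses)
    return sol.x[:6 * n_kf].reshape(-1, 6), sol.x[6 * n_kf:].reshape(-1, 3)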
10. The RGB-D visual SLAM method according to claim 1, wherein in step S9, the loop detection and global graph optimization performed on all the key frames obtained in step S8 comprise the following steps:
S91, based on a dictionary model, the word types corresponding to the descriptors of all point and line features of the current key frame are first looked up in the k-ary tree of the dictionary, and the weight corresponding to each word is calculated to obtain the word vector of the current key frame;
S92, the similarity or L1 norm between the word vector of the current key frame and the word vectors of the other key frames is calculated to obtain the inter-frame similarity, and the key frame with the highest similarity that also exceeds a preset threshold is selected as the loop frame of the current key frame;
S93, using the loop frame obtained in step S92, the transformation pose between the loop frame and the current key frame is solved with the error function of step S5;
S94, a pose graph of all key frames in the loop is constructed from the transformation poses solved in step S93, in which the nodes are the poses of each key frame and the edges are the co-visible landmark points between key frames, thereby realizing global pose-graph optimization.
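A minimal sketch of the similarity test of steps S91-S92: each key frame is summarized by a bag-of-words vector and compared against earlier key frames with an L1-based score; the vocabulary, the word-vector construction and the threshold value are assumed to be given.

import numpy as np

def l1_score(v1, v2):
    """1 - 0.5*|v1 - v2|_1 on normalized word vectors, a common bag-of-words score."""
    v1 = v1 / (np.abs(v1).sum() + 1e-12)
    v2 = v2 / (np.abs(v2).sum() + 1e-12)
    return 1.0 - 0.5 * np.abs(v1 - v2).sum()

def detect_loop(cur_vec, keyframe_vecs, threshold=0.6):
    """Return the index of the best-scoring earlier key frame above the threshold, else None."""
    best_idx, best_score = None, threshold
    for i, v in enumerate(keyframe_vecs):
        s = l1_score(cur_vec, v)
        if s > best_score:
            best_idx, best_score = i, s
    return best_idx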
CN202011358438.9A 2020-11-27 2020-11-27 RGB-D vision SLAM method based on dotted line characteristics Active CN112381890B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011358438.9A CN112381890B (en) 2020-11-27 2020-11-27 RGB-D vision SLAM method based on dotted line characteristics


Publications (2)

Publication Number Publication Date
CN112381890A CN112381890A (en) 2021-02-19
CN112381890B (en) 2022-08-02

Family

ID=74587478

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011358438.9A Active CN112381890B (en) 2020-11-27 2020-11-27 RGB-D vision SLAM method based on dotted line characteristics

Country Status (1)

Country Link
CN (1) CN112381890B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344995A (en) * 2021-06-22 2021-09-03 平安普惠企业管理有限公司 Data registration method, device and equipment based on linear features and storage medium
CN113465617B (en) * 2021-07-08 2024-03-19 上海汽车集团股份有限公司 Map construction method and device and electronic equipment
CN113536024B (en) * 2021-08-11 2022-09-09 重庆大学 ORB-SLAM relocation feature point retrieval acceleration method based on FPGA
CN114004808B (en) * 2021-10-29 2024-04-16 上海应用技术大学 Loop detection method based on differential manifold
CN116468786B (en) * 2022-12-16 2023-12-26 中国海洋大学 Semantic SLAM method based on point-line combination and oriented to dynamic environment


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105096386B (en) * 2015-07-21 2017-11-17 中国民航大学 A wide range of complicated urban environment geometry map automatic generation method
US11187536B2 (en) * 2018-01-12 2021-11-30 The Trustees Of The University Of Pennsylvania Probabilistic data association for simultaneous localization and mapping
CN109934862A (en) * 2019-02-22 2019-06-25 上海大学 A kind of binocular vision SLAM method that dotted line feature combines

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110060277A (en) * 2019-04-30 2019-07-26 哈尔滨理工大学 A kind of vision SLAM method of multiple features fusion
CN110782494A (en) * 2019-10-16 2020-02-11 北京工业大学 Visual SLAM method based on point-line fusion
CN110853100A (en) * 2019-10-24 2020-02-28 东南大学 Structured scene vision SLAM method based on improved point-line characteristics
CN111982103A (en) * 2020-08-14 2020-11-24 北京航空航天大学 Point-line comprehensive visual inertial odometer method with optimized weight

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Improved Point-Line Feature Based Visual SLAM Method for Indoor Scenes;Runzhi Wang et al.;《sensors》;20181020;第1-20页 *
Monocular Visual Simultaneous Localization and Mapping Algorithm Based on Point-Line Features; Wang Dan et al.; Robot; 2019-05-31; Vol. 41, No. 3; pp. 392-403 *

Also Published As

Publication number Publication date
CN112381890A (en) 2021-02-19

Similar Documents

Publication Publication Date Title
CN112381890B (en) RGB-D vision SLAM method based on dotted line characteristics
CN111156984B (en) Monocular vision inertia SLAM method oriented to dynamic scene
CN110533722B (en) Robot rapid repositioning method and system based on visual dictionary
CN107869989B (en) Positioning method and system based on visual inertial navigation information fusion
CN110070615B (en) Multi-camera cooperation-based panoramic vision SLAM method
CN112598757B (en) Multi-sensor time-space calibration method and device
Prescott et al. Line-based correction of radial lens distortion
CN101383899A (en) Video image stabilizing method for space based platform hovering
CN112396656B (en) Outdoor mobile robot pose estimation method based on fusion of vision and laser radar
CN112233177A (en) Unmanned aerial vehicle pose estimation method and system
CN109272577B (en) Kinect-based visual SLAM method
CN112419497A (en) Monocular vision-based SLAM method combining feature method and direct method
Ribera et al. Estimating phenotypic traits from UAV based RGB imagery
CN113506342B (en) SLAM omni-directional loop correction method based on multi-camera panoramic vision
CN113743385A (en) Unmanned ship water surface target detection method and device and unmanned ship
CN111899345B (en) Three-dimensional reconstruction method based on 2D visual image
CN115240089A (en) Vehicle detection method of aerial remote sensing image
CN117218350A (en) SLAM implementation method and system based on solid-state radar
CN114529615A (en) Radar calibration method, device and storage medium
CN117036404A (en) Monocular thermal imaging simultaneous positioning and mapping method and system
CN114419259B (en) Visual positioning method and system based on physical model imaging simulation
Svedman et al. Structure from stereo vision using unsynchronized cameras for simultaneous localization and mapping
CN111383354B (en) SFM-based three-dimensional point cloud orientation correction method
CN113379738A (en) Method and system for detecting and positioning epidemic trees based on images
Majidi et al. Aerial tracking of elongated objects in rural environments

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant