CN112258600A - Simultaneous positioning and map construction method based on vision and laser radar


Info

Publication number
CN112258600A
CN112258600A
Authority
CN
China
Prior art keywords
point
points
laser
depth information
visual
Prior art date
Legal status
Withdrawn
Application number
CN202011117756.6A
Other languages
Chinese (zh)
Inventor
章国锋
鲍虎军
王宇伟
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202011117756.6A
Publication of CN112258600A
Status: Withdrawn


Classifications

    • G06T11/206 — 2D image generation; drawing from basic elements: drawing of charts or graphs
    • G06T11/006 — 2D image generation; reconstruction from projections: inverse problem, transformation from projection-space into object-space
    • G06T5/30 — Image enhancement or restoration by local operators: erosion or dilatation, e.g. thinning
    • G06T7/73 — Image analysis; determining position or orientation of objects or cameras using feature-based methods
    • G01S17/06 — Systems using reflection of electromagnetic waves other than radio waves (e.g. lidar): determining position data of a target
    • G01S17/86 — Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • G01S17/89 — Lidar systems specially adapted for mapping or imaging
    • G06T2207/10028 — Image acquisition modality: range image; depth image; 3D point clouds
    • G06T2207/10044 — Image acquisition modality: radar image

Abstract

The invention discloses a simultaneous positioning and map construction method based on vision and laser radar, belonging to the field of SLAM. A laser odometer and a visual odometer are operated simultaneously: the visual odometer helps the laser point cloud to better remove motion distortion, and the distortion-free point cloud is projected onto the image as depth information for the motion calculation of the next frame. With a good initial value, the laser odometer avoids falling into degenerate scenes caused by the limitations of a single sensor, and the added visual information gives the odometer higher positioning accuracy. A loop detection and repositioning module is realized through a visual bag-of-words model, forming a complete visual-laser SLAM system; after a loop is detected, the system performs pose graph optimization according to the loop constraints, which eliminates to a certain extent the accumulated errors of long-term motion, so that high-precision positioning and point cloud map construction can be completed in complex environments.

Description

Simultaneous positioning and map construction method based on vision and laser radar
Technical Field
The invention relates to the field of simultaneous localization and mapping (SLAM), in particular to a simultaneous localization and mapping method based on vision and laser radar.
Background
With the rise of artificial intelligence concepts and the maturation of computer vision related technologies, unmanned devices and robotics are receiving more and more attention. Meanwhile, Simultaneous Localization and Mapping (SLAM) is an indispensable module in the related art of robots. Depending on the sensors used, SLAM technology can be largely classified into camera-based visual SLAM, which can be subdivided into monocular SLAM, binocular SLAM, and RGBD-SLAM, and LiDAR-based laser SLAM, which can be further classified into 2D LiDAR-based laser SLAM and 3D LiDAR-based laser SLAM.
Visual SLAM is a low-cost solution that is currently widely used in augmented reality and robotics. It acquires surrounding environment information with cameras, extracts image features, completes inter-frame data association, solves the pose of each camera frame using multi-view geometry, and finally constructs a three-dimensional map. Because the camera's imaging result is strongly affected by illumination, it is difficult for visual SLAM to operate continuously in environments with unstable lighting conditions. Furthermore, for texture-less and weakly textured regions, tracking loss is likely to occur, since it is difficult to obtain a sufficient number of feature points from visual information alone.
A mature visual SLAM scheme generally consists of a front end and a back end. The front end is mainly responsible for real-time positioning, i.e. the Visual Odometer (VO) function. The back end is an optimization module: it typically performs further processing of the frame data passed from the front end, such as extracting key frames, and then optimizes the camera poses and map points of the key frames; in addition, a loop detection and relocation module is maintained to preserve the global consistency of the map and to ensure that the system can relocalize after a tracking loss.
Academic research on visual SLAM has continued for more than ten years. In 2007 the MonoSLAM system was proposed; it is based on a filter framework and builds a probabilistic model that predicts the posterior distribution of the system state through Kalman filtering. The system adopts a serial architecture and alternately updates the camera pose and the map point information in a single thread. Later, filter-based SLAM schemes were gradually replaced by optimization-based visual SLAM schemes; the earliest optimization-based visual SLAM was PTAM, proposed by Klein et al. Its biggest difference from MonoSLAM is the introduction of Bundle Adjustment (BA): the camera poses and map points are optimized simultaneously by nonlinear least squares so as to minimize the reprojection error. In addition, PTAM proposed the concept of key frames (keyframes), in which higher-quality images are added to the global optimization as key frames. These innovative concepts were heavily borrowed by later work, making PTAM the prototype of many mature optimization-based visual SLAM schemes. Bundle Adjustment itself became an independent research topic, and researchers have proposed many more efficient and robust BA schemes, such as Segment-Based BA, EIBA, and ICE-BA, which was later applied in visual-inertial SLAM systems. With the great improvement of hardware performance, optimization-based schemes have surpassed filter-based schemes in both accuracy and speed. ORB-SLAM, proposed by Mur-Artal et al. in 2015, is one of the most advanced SLAM systems at present. One of its most distinctive characteristics is the use of a uniform ORB feature throughout, which greatly improves the robustness of the system compared with prior work. In addition to the tracking thread and the mapping thread introduced in PTAM, ORB-SLAM opens a separate thread for loop detection and relocalization, which not only eliminates the accumulated error of long-term operation but also allows positioning and mapping to continue after tracking is lost, once the system returns to the original trajectory.
Laser SLAM uses a 2D or 3D lidar to measure distances to discrete points by emitting laser pulses and receiving the light reflected by the measured objects, obtaining the structural information of the surrounding environment; the data are returned as point cloud sets. The point clouds are registered online with ICP (Iterative Closest Point) and improved ICP methods (e.g. Point-to-Line ICP) to obtain real-time pose information and construct an environment map. At present, most sweeping robots on the market are based on 2D laser SLAM technology. For positioning and mapping of three-dimensional scenes, multi-line lidars are expensive, but are characterized by a wide viewing angle (the horizontal viewing angle of a common rotating lidar reaches 360 degrees), long measuring range and high precision (generally more than 100 meters, with errors within 2 centimeters), so with good data processing, 3D laser SLAM is generally superior to visual SLAM schemes in positioning accuracy. This is also reflected in the public KITTI data set, where the top 3 schemes in positioning accuracy are all based on lidar.
Lidar SLAM generally performs frame-to-frame or frame-to-local-map pose solution with ICP (Iterative Closest Point) and improved ICP algorithms at its core. Because a single frame of a multi-line lidar contains a large number of points, usually tens of thousands, direct ICP solution is computationally inefficient and, being easily trapped in a local optimum, also yields inaccurate pose estimates. Similar to visual methods, the point cloud data are therefore preprocessed by extracting feature points with high information content, and only loss terms between the extracted feature points are added during ICP so as to improve the computational efficiency. A typical example is LOAM, proposed by Zhang et al. in 2014, which estimates the curvature of each point from the distance relationship between the point and its neighbors on the same scan line of the multi-line lidar, then extracts edge points and face points according to the curvature; the error terms between these features constitute the objective to be optimized. In addition, since the lidar acquires the point cloud continuously while the lidar coordinate system keeps changing during motion, and the lidar itself cannot sense this movement, the lidar point cloud exhibits motion distortion, and the degree of distortion increases with the motion speed. In LOAM, Zhang uses a uniform-motion hypothesis to reduce the influence of motion distortion as much as possible: a timestamp is obtained for each swept point by calculating its deflection angle relative to the rotation center, the true pose of the laser at that moment is interpolated from the uniform motion model, and each point is converted into the same laser coordinate system according to the relative poses, thereby removing the point cloud motion distortion.
Many later 3D laser SLAM approaches borrow, to varying degrees, from the ideas of LOAM. LeGO-LOAM, proposed in 2018, is one of them: it consists of point cloud segmentation, feature extraction, a laser odometer, laser mapping and pose fusion, and its algorithmic idea is somewhat similar to that of RGBD-SLAM. The system receives input from a three-dimensional lidar and outputs a 6-Degree-of-Freedom (DoF) pose estimate. The whole system is divided into five modules. The first is segmentation: a single scanned point cloud is projected onto a fixed-range image for segmentation. The second sends the segmented point cloud to the feature extraction module. Third, the lidar odometer uses the features extracted by the previous module to find the transformation of the robot pose between consecutive scans; this information is finally used in the lidar point cloud mapping. The fifth module fuses the pose estimation results of the lidar odometry and the mapping and outputs the final pose estimate. In the pose estimation, in order to reduce the amount of computation and accelerate the optimization, the 6-DoF optimization is split into two steps: first the vertical offset (z), roll angle (roll) and pitch angle (pitch) are optimized by the constraints between face points, and then the two horizontal offsets (x and y) and the yaw angle (yaw) are optimized by the constraints between edge points. A strong prior, however, is that the point cloud necessarily contains a level, open road surface, so the scheme is more of a solution for specific road scenes and lacks strong generality.
IMLS-SLAM is another pure laser positioning solution. It represents surfaces with an Implicit Moving Least Squares (IMLS) formulation and completes the registration of feature points to the surface after feature screening, achieving high positioning accuracy. Its drawback is obvious: because the amount of computation is too large, real-time positioning cannot be achieved, so IMLS-SLAM is generally used for offline mapping.
The conventional improved ICP methods are prone to degeneracy in scenes with repeated structure (e.g. moving through a long, narrow corridor) or scenes lacking structural information, and loop detection based on point cloud structural information still has no robust solution at present; moreover, laser SLAM and visual SLAM cannot simply be combined directly.
Therefore, how to combine the visual and laser information to further improve the accuracy of positioning and mapping and improve the robustness of the algorithm is still a problem worthy of study.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a simultaneous positioning and map construction method based on vision and laser radar. In the visual odometer, the calibrated extrinsic parameters are used to convert the structural information of the laser point cloud into depth information of the image, so that the visual odometer no longer lacks scale constraints and can estimate accurate displacement information. A visual odometer running alongside the laser odometer is added as a constraint, so that the motion distortion in the laser point cloud is removed and a good initial motion value is provided for the laser odometer, preventing it from falling into local optima in certain specific environments. A loop detection and repositioning module is added at the back end as a further constraint: accumulated errors caused by long-term motion are eliminated through pose graph optimization after loop detection, so that high-precision positioning and point cloud map construction can be completed in complex environments.
In order to achieve the purpose, the invention adopts the following technical scheme:
a vision and laser radar based simultaneous localization and map construction method comprises the following steps:
step 1: simultaneously operating a visual odometer and a laser odometer, wherein the visual odometer acquires image information of the surrounding environment in real time by using a camera to obtain image data; the laser odometer utilizes a laser radar to emit laser and receive reflected light of a peripheral measured object for ranging, and obtains structured information of a peripheral environment to obtain laser point cloud data;
step 2: projecting the laser point cloud to a pixel plane by using an external reference result calibrated by combining a laser radar and a camera, and converting the laser point cloud into depth information of an image; taking the depth information as the scale constraint of the visual odometer, matching visual feature points from a previous frame to a current frame in the image data, and calculating visual motion information;
Step 3: removing the motion distortion of the laser point cloud of the current frame using the visual motion information from the previous frame to the current frame; taking the visual motion information obtained in step 2 as the initial pose, registering the laser point cloud against the local point cloud map, constructing the environment map, and continuously performing loop detection and repositioning with key frames to optimize the key frame poses and map points.
Further, after the laser point cloud is projected to the pixel plane in the step 2, a sparse depth map is generated, and the sparse depth map is completed to obtain depth information of the image; the completion comprises the following steps:
Step 2.1.1: invert the valid depth information in the sparse depth map:

$$\hat{D}_x = \begin{cases} D_{max} - D_x, & D_x \neq 0 \\ 0, & D_x = 0 \end{cases}$$

where $\hat{D}_x$ is the inverted depth value of a visual feature point, $D_{max}$ is a preset maximum depth value, and $D_x$ is the depth value of the visual feature point before inversion; $D_x \neq 0$ indicates that depth information exists, i.e. the depth is valid, and $D_x = 0$ denotes no depth information;
Step 2.1.2: dilate the sparse depth map so that the valid depth values fill the regions without depth information;
Step 2.1.3: fill the regions that still lack depth information after dilation, then apply median filtering and Gaussian filtering, and finally re-invert the depth to recover the true depth information.
Further, in the visual feature point matching process described in step 2, the error terms of two different types of matching points are weighted to form the final reprojection error; specifically:

for a feature point $p_1$ with depth information, it is back-projected into three-dimensional space and re-projected onto the other image plane to obtain a point $\tilde{p}_1$; if the position of the true matching point is $p_1'$, the distance $\|\tilde{p}_1 - p_1'\|$ constitutes the error term of the feature point $p_1$ with depth information;

for a feature point $p_2$ without depth information, its back-projection into three-dimensional space may lie anywhere on the ray $Op_2$, where $O$ is the origin of the three-dimensional space; the ray $Op_2$ is projected onto the other image plane to obtain a straight line, and the distance from the position of the matching point $p_2'$ to this line constitutes the error term of the feature point $p_2$ without depth information;

the final reprojection error is expressed as:

$$T^{*}=\arg\min_{T}\left\{\sum_{i=1}^{n}\left\|\pi\!\left(T\,\pi^{-1}(u_{(1,i)},d_i)\right)-v_{(1,i)}\right\|_2+\lambda\sum_{i=1}^{m}\mathrm{dist}\!\left(v_{(2,i)},\,T\,\phi^{-1}(u_{(2,i)})\right)\right\}$$

where $T^{*}$ denotes the optimum of the error function and $T$ is the relative pose between the two frames; $n$ is the number of feature points with depth information and $m$ the number of feature points without depth information; $\pi$ is the projection function mapping a three-dimensional point to the pixel coordinate system, and $\pi^{-1}$ is the back-projection function that restores a pixel coordinate and its corresponding depth to a three-dimensional point; $u_{(1,i)}$ is the pixel coordinate of the $i$-th point of the first type (points with depth information) in the left view, $d_i$ its depth, and $v_{(1,i)}$ its matching pixel coordinate in the right view; $u_{(2,i)}$ is the pixel coordinate of the $i$-th point of the second type (points without depth information) in the left view and $v_{(2,i)}$ its matching pixel coordinate in the right view; $\phi^{-1}$ is the back-projection function that maps a pixel to a three-dimensional ray; $\|\cdot\|_2$ denotes the Euclidean norm; $t$ is a threshold value; $\lambda$ is a weight coefficient adjusting the confidence between the two types of points; $\mathrm{dist}(\cdot)$ is the point-to-ray distance function.
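For illustration only (not part of the claimed method), the following Python sketch shows one way the two weighted error terms could be evaluated, assuming a pinhole intrinsic matrix K and hypothetical matched pixels; sampling two depths (0.1 and 50.0) along the ray to obtain its image-plane projection is an implementation choice, not taken from the specification.

```python
import numpy as np

def project(K, X):
    """Project a 3D point in the camera frame to pixel coordinates."""
    x = K @ X
    return x[:2] / x[2]

def backproject(K, u, d):
    """Back-project pixel u with depth d into a 3D point."""
    return d * (np.linalg.inv(K) @ np.array([u[0], u[1], 1.0]))

def reprojection_cost(K, R, t, with_depth, without_depth, lam=1.0):
    """with_depth: list of (u1, d, v1); without_depth: list of (u2, v2)."""
    cost = 0.0
    # Type 1: points with depth -> standard 3D-to-2D reprojection error.
    for u1, d, v1 in with_depth:
        X = R @ backproject(K, u1, d) + t            # transform into the second frame
        cost += np.linalg.norm(project(K, X) - v1)
    # Type 2: points without depth -> distance from the matching point to the
    # image-plane projection of the transformed viewing ray.
    for u2, v2 in without_depth:
        ray = np.linalg.inv(K) @ np.array([u2[0], u2[1], 1.0])
        p_near = project(K, R @ (0.1 * ray) + t)     # two samples along the ray
        p_far = project(K, R @ (50.0 * ray) + t)
        line = p_far - p_near
        line /= np.linalg.norm(line)
        diff = v2 - p_near
        cost += lam * abs(diff[0] * line[1] - diff[1] * line[0])  # point-to-line distance
    return cost
```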
Further, the calculating of the visual motion information in step 2 specifically includes:
step 2.2.1: extracting and matching the characteristics of the previous frame image and the current frame image;
step 2.2.2: recovering the depth of the matching feature points of the previous frame by using the previous frame image and the laser point cloud of the previous frame after the motion distortion is removed;
step 2.2.3: counting the number of feature points with depth information in the previous frame, if the number of feature points is less than a threshold value, returning error information, and using a uniform motion hypothesis; otherwise, carrying out PnP calculation on the feature points with depth information in the previous frame and the feature points without depth information in the current frame to obtain visual motion information from the previous frame to the current frame;
step 2.2.4: and (5) optimizing the reprojection error to obtain the optimal pose estimation.
Further, the step 3 of removing the motion distortion of the laser point cloud of the current frame by using the visual motion information from the previous frame to the current frame specifically includes:
step 3.1: for each three-dimensional point in the current frame laser point cloud, determining a horizontal deflection angle theta of each three-dimensional point relative to the initial direction;
step 3.2: performing linear interpolation between a previous frame and a current frame according to the horizontal deflection angle theta, and projecting distorted point clouds to the positions where motion distortion is removed according to the result of the linear interpolation;
Step 3.3: further optimize the result of step 3.2 by extracting edge point features and face point features for matching and calculating the error terms, wherein the error term of an edge point feature is:

$$d_{\varepsilon} = \frac{\left|\left(T X^{L}_{(k,i)} - X^{L}_{(k-1,j)}\right)\times\left(T X^{L}_{(k,i)} - X^{L}_{(k-1,l)}\right)\right|}{\left|X^{L}_{(k-1,j)} - X^{L}_{(k-1,l)}\right|}$$

where $X^{L}_{(k,i)}$, $X^{L}_{(k-1,j)}$ and $X^{L}_{(k-1,l)}$ are the three-dimensional coordinates of the points $i$, $j$ and $l$ in the laser coordinate system $\{L\}$; the subscripts $k$ and $k-1$ denote the current frame and the previous frame; $T$ denotes the relative pose between the two frames;

the error term of a face point feature is:

$$d_{\mathcal{H}} = \frac{\left|\left(T X^{L}_{(k,i)} - X^{L}_{(k-1,j)}\right)\cdot\left(\left(X^{L}_{(k-1,j)} - X^{L}_{(k-1,l)}\right)\times\left(X^{L}_{(k-1,j)} - X^{L}_{(k-1,m)}\right)\right)\right|}{\left|\left(X^{L}_{(k-1,j)} - X^{L}_{(k-1,l)}\right)\times\left(X^{L}_{(k-1,j)} - X^{L}_{(k-1,m)}\right)\right|}$$

where $X^{L}_{(k-1,m)}$ is the three-dimensional coordinate of the point $m$ in the laser coordinate system $\{L\}$; $l$, $m$ and $j$ are the three face points nearest to the point $i$;

the optimization loss function is expressed as:

$$T^{*} = \arg\min_{T}\left\{\sum_{i\in\mathcal{E}_{k}} d_{\varepsilon}(i, T) + \lambda\sum_{i\in\mathcal{H}_{k}} d_{\mathcal{H}}(i, T)\right\}$$

where $\mathcal{E}_{k}$ denotes the point set composed of edge points in the $k$-th frame laser coordinate system $\{L\}$, $\mathcal{H}_{k}$ denotes the point set composed of face points in the $k$-th frame laser coordinate system $\{L\}$, and $\lambda$ denotes the relative weight coefficient between the two types of error terms; both error terms contain the parameter $T$, which represents the relative pose relationship between the two frames.
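A minimal numpy sketch of the point-to-line and point-to-plane residuals reconstructed above; the function and variable names are hypothetical, and the pose is applied as a plain 4x4 homogeneous transform.

```python
import numpy as np

def edge_residual(p_i, p_j, p_l):
    """Point-to-line distance: transformed edge point p_i of the current frame
    against the line through p_j, p_l from the previous frame."""
    num = np.linalg.norm(np.cross(p_i - p_j, p_i - p_l))
    den = np.linalg.norm(p_j - p_l)
    return num / den

def plane_residual(p_i, p_j, p_l, p_m):
    """Point-to-plane distance: p_i against the plane through p_j, p_l, p_m."""
    n = np.cross(p_j - p_l, p_j - p_m)               # plane normal
    return abs(np.dot(p_i - p_j, n)) / np.linalg.norm(n)

def total_loss(T, edge_pairs, plane_pairs, lam=1.0):
    """T: 4x4 relative pose; *_pairs hold raw points of frame k plus their
    nearest features found in frame k-1."""
    def transform(p):
        return (T[:3, :3] @ p) + T[:3, 3]
    loss = sum(edge_residual(transform(p), j, l) for p, j, l in edge_pairs)
    loss += lam * sum(plane_residual(transform(p), j, l, m) for p, j, l, m in plane_pairs)
    return loss
```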
Further, the method for extracting the edge point features and the face point features comprises the following steps:
Step 3.3.1: divide the three-dimensional lidar point cloud into a number of two-dimensional lidar point cloud sets with different pitch angles, and define a smoothness $c$ for every point in each beam of the laser point cloud through the positional relationship between the point and its neighboring points:

$$c = \frac{1}{\left|S\right|\cdot\left\|X^{(L,n)}_{(k,i)}\right\|}\left\|\sum_{j\in S,\,j\neq i}\left(X^{(L,n)}_{(k,i)} - X^{(L,n)}_{(k,j)}\right)\right\|$$

where $X^{(L,n)}_{(k,i)}$ is the three-dimensional coordinate, in the laser coordinate system $\{L\}$, of the $i$-th point of the $n$-th beam of the $k$-th frame point cloud, and $S$ is the set of all points within the polar-angle neighborhood of the $i$-th point;

Step 3.3.2: sort the points in each beam of the point cloud according to the smoothness $c$, and set two thresholds $T_p$ and $T_e$:

when the smoothness of a point is less than $T_p$, the point is sufficiently close to its neighboring points and is taken as a face point;

when the smoothness of a point is greater than $T_e$, the point is far from its neighboring points and is taken as an edge point;

when the numbers of face points and edge points exceed the thresholds, only the top $N_p$ face point features and the top $N_e$ edge point features are extracted, where $N_p$ and $N_e$ are the maximum numbers of face points and edge points.
Further, in the loop detection and repositioning using the key frame strategy in step 3, after similar key frames are found, the relative pose of the two frames also needs to be solved through feature matching and PnP, and the visual motion information of step 2 is optimized. The optimization target is the sum of the reprojection errors between the two frames after the depth information of the successfully matched features of both frames has been restored, expressed as:

$$T^{*} = \arg\min_{T}\sum_{i}\left(\left\|\pi\!\left(T\,\pi^{-1}(u_i, d_{1,i})\right) - v_i\right\|_2 + \left\|\pi\!\left(T^{-1}\,\pi^{-1}(v_i, d_{2,i})\right) - u_i\right\|_2\right)$$

where $T^{*}$ denotes the optimum of the error function, $T$ is the relative pose between the two frames, $u_i$ is the pixel coordinate of the $i$-th matching point in the first image, $v_i$ is the pixel coordinate of the $i$-th matching point in the second image, and $d$ denotes the depth information of the pixel points in the two images; $\pi$ is the projection function mapping a three-dimensional point to the pixel coordinate system, and $\pi^{-1}$ is the back-projection function restoring a pixel coordinate and its corresponding depth to a three-dimensional point; $d_{1,i}$ is the depth of the $i$-th matching point in the first image, $d_{2,i}$ the depth of the $i$-th matching point in the second image, and $T^{-1}$ is the inverse of the relative pose;

the final optimization result is the pose constraint between the two frames.
Further, the key frames in step 3 satisfy the following conditions (a minimal selection sketch follows the list):

1) the number of feature points matched with the previous frame is greater than $N_{min}$, ensuring the tracking quality, where $N_{min}$ is the minimum matching threshold;

2) the number of feature points matched with the last key frame is less than $N_{max}$, avoiding redundant storage, where $N_{max}$ is the maximum matching threshold;

3) at least M frames have passed since the last key frame, limiting the minimum interval and reducing storage, where M is the minimum interval threshold.
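As an illustration, a small predicate for the three conditions; the numeric defaults are placeholders, not values prescribed by the invention.

```python
def is_keyframe(matches_prev_frame, matches_last_keyframe, frames_since_keyframe,
                N_min=50, N_max=200, M=5):
    """All three conditions must hold simultaneously; thresholds are placeholders."""
    return (matches_prev_frame > N_min          # 1) tracking quality
            and matches_last_keyframe < N_max   # 2) avoid redundant storage
            and frames_since_keyframe >= M)     # 3) minimum interval
```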
Compared with the prior art, the invention has the advantages that:
1) the invention builds a SLAM system with visual laser information fusion on the basis of a laser odometer system LOAM. The front end of the system mainly comprises a visual odometer and a laser odometer. In the visual odometer, sparse depth information is obtained by projection from laser point cloud to an image, then depth completion is carried out on the sparse depth image, and an accurate initial pose is estimated through feature matching between two frames and minimized reprojection errors.
2) The method is mainly characterized in that the experimental visual motion information replaces the previous uniform motion hypothesis to remove the point cloud motion distortion, and the precision of the algorithm is improved on the whole. And a good initial value not only avoids entering a degradation environment during point cloud registration, but also avoids falling into a local optimal solution in some specific environments, thereby improving the robustness of the algorithm, accelerating the iterative convergence speed during point cloud registration and realizing real-time positioning (less than 100 milliseconds).
3) According to the invention, a loop detection and relocation module is added at the rear end of the odometer, a visual bag-of-words model is adopted, and a characteristic dictionary formed by a multi-branch tree is maintained through hierarchical clustering, so that the retrieval robustness is ensured and the retrieval speed of the characteristics is accelerated. And when the loop is detected, the integral optimization of the inside of the whole track loop is completed through the pose graph optimization. The module not only eliminates the accumulated error caused by long-time movement and reduces the absolute error, but also ensures that the system has the function of repositioning after returning to the original track after being hijacked, thereby being beneficial to the positioning in long-time movement.
Drawings
FIG. 1 is a flow chart of the SLAM method of the present invention;
FIG. 2 is a schematic diagram of the present invention for calculating the reprojection error of two types of points;
FIG. 3 is a schematic diagram of pose graph optimization;
FIG. 4 is a trace of the present invention under KITTI data set, wherein FIGS. 4-1 to 4-3 show traces over KITTI-00 sequence, KITTI-05 sequence, and KITTI-09 sequence, respectively;
FIG. 5 is the mapping result of the present invention at KITTI-07.
Detailed Description
The invention is described in detail below with reference to the accompanying drawings. The technical features of the embodiments of the present invention can be combined correspondingly without mutual conflict.
The invention relates to a vision and laser radar based simultaneous positioning and map building method, which comprises the steps of simultaneously operating a vision odometer and a laser odometer, wherein the vision odometer acquires image information of the surrounding environment in real time by using a camera to obtain image data; the laser odometer utilizes a laser radar to emit laser and receive reflected light of a peripheral measured object for ranging, and obtains structured information of a peripheral environment to obtain laser point cloud data;
as shown in fig. 1, the method mainly comprises the following steps:
Step one: projecting the lidar point cloud into the image using the calibrated extrinsic parameters, densifying the sparse depth, and running a visual odometer with depth information at the front end;
Step two: adding the visual motion information as an initial value into the lidar odometer, removing the motion distortion of the lidar point cloud, and solving the lidar motion information through local point cloud map matching.
Step three: running a loop detection and repositioning module at the back end, optimizing the closed loop and smoothing the trajectory with a pose graph, and reducing accumulated errors.
In one embodiment of the present invention, the implementation of step one is described.
Obtaining a dense depth map from a sparse depth map is mainly divided into the following 7 steps:
1) Depth Inversion: the dilation performed later is a "bright area expansion", i.e. larger depth values would cover smaller depth values in their surroundings. Since the closer (smaller) depths are the ones that should be expanded, the valid depth information must first be inverted:

$$\hat{D}_x = \begin{cases} D_{max} - D_x, & D_x \neq 0 \\ 0, & D_x = 0 \end{cases}$$

where $\hat{D}_x$ is the inverted depth value of a visual feature point and $D_{max}$ is a preset maximum depth value. In the KITTI data the depth values are within 80 meters, so the invention takes $D_{max} = 100$. $D_x$ is the depth value of the visual feature point before inversion; $D_x \neq 0$ indicates that valid depth information exists, and $D_x = 0$ means no depth information.
2) Dilation with a custom convolution kernel (Custom Kernel Dilation): a suitable kernel is designed so that the dilation spreads valid depths into the regions without depth information.
3) Closing of small holes (Small Hole Closure): after the previous dilation, some regions still lack depth information; a morphological closing (dilation followed by erosion) with a full 5 × 5 kernel is applied to fill these small holes.
4) Small hole filling (Small Hole Fill): regions that still lack depth information are filled using a 7 × 7 full-kernel dilation.
5) Large hole filling (Large Hole Fill): apart from areas the laser cannot reach, such as the sky, the regions that still lack depth information at this point are filled with a large 31 × 31 full convolution kernel.
6) Median/Gaussian filtering (Median and Gaussian Blur): to reduce the noise introduced by the dilation operations, a 5 × 5 median filter first removes outliers at local edges, and a 5 × 5 Gaussian filter then smooths sharp edges.
7) Depth re-inversion (Depth Re-Inversion): the true depth information is recovered.
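For illustration, a simplified OpenCV sketch of the seven-step completion pipeline. The kernel sizes follow the values mentioned above, but other details are simplifications of the described procedure (e.g. an elliptical custom kernel is assumed, and the blurs are applied to all pixels rather than only valid ones).

```python
import cv2
import numpy as np

def complete_sparse_depth(sparse, d_max=100.0):
    """sparse: HxW float32 depth map, 0 = no measurement."""
    depth = sparse.astype(np.float32)
    valid = depth > 0.1

    # 1) Invert valid depths so that closer points dominate during dilation.
    depth[valid] = d_max - depth[valid]

    # 2) Dilate with a custom (here: elliptical) kernel.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    depth = cv2.dilate(depth, kernel)

    # 3) Close small holes (dilate then erode) with a full 5x5 kernel.
    depth = cv2.morphologyEx(depth, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8))

    # 4) Fill remaining small holes with a 7x7 dilation, only where still empty.
    empty = depth < 0.1
    depth[empty] = cv2.dilate(depth, np.ones((7, 7), np.uint8))[empty]

    # 5) Fill large holes with a 31x31 dilation, only where still empty.
    empty = depth < 0.1
    depth[empty] = cv2.dilate(depth, np.ones((31, 31), np.uint8))[empty]

    # 6) Median then Gaussian blur to remove outliers and smooth edges.
    depth = cv2.medianBlur(depth, 5)
    depth = cv2.GaussianBlur(depth, (5, 5), 0)

    # 7) Re-invert to recover metric depth.
    valid = depth > 0.1
    depth[valid] = d_max - depth[valid]
    return depth
```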
After the depth completion, depth information is obtained for part of the feature points; however, because the visible area of the lidar is limited, roughly the upper third of the image has no depth information, so two different types of matching-point error terms are designed and weighted to obtain the final error term.
As shown in FIG. 2, for a point with depth, the error term is the three-dimensional-to-two-dimensional reprojection error: the point $p_1$ with depth information is back-projected into three-dimensional space and re-projected onto the other image plane to obtain the point $\tilde{p}_1$; if the position of the true matching point is $p_1'$, the distance $\|\tilde{p}_1 - p_1'\|$ constitutes the reprojection error term of a point with depth. For a point without depth, the error term must be designed with a different geometric model: for a point $p_2$ without depth, its three-dimensional position may appear anywhere on the ray $Op_2$, and the projection of this ray in the second camera is a straight line; the distance from the position of the true matching point $p_2'$ to this projected line is used as the reprojection error term. Two error terms are thus constructed for the different types of matching points:

$$T^{*}=\arg\min_{T}\left\{\sum_{i=1}^{n}\left\|\pi\!\left(T\,\pi^{-1}(u_{(1,i)},d_i)\right)-v_{(1,i)}\right\|_2+\lambda\sum_{i=1}^{m}\mathrm{dist}\!\left(v_{(2,i)},\,T\,\phi^{-1}(u_{(2,i)})\right)\right\}$$

where $T^{*}$ denotes the optimum of the error function and $T$ is the relative pose between the two frames; $n$ is the number of feature points with depth information and $m$ the number of feature points without depth information; $\pi$ is the projection function mapping a three-dimensional point to the pixel coordinate system, and $\pi^{-1}$ is the back-projection function restoring a pixel coordinate and its corresponding depth to a three-dimensional point; $u_{(1,i)}$ is the pixel coordinate of the $i$-th point of the first type (points with depth information) in the left view, $d_i$ its depth, and $v_{(1,i)}$ its matching pixel coordinate in the right view; $u_{(2,i)}$ is the pixel coordinate of the $i$-th point of the second type (points without depth information) in the left view and $v_{(2,i)}$ its matching pixel coordinate in the right view; $\phi^{-1}$ is the back-projection function mapping a pixel to a three-dimensional ray; $\|\cdot\|_2$ denotes the Euclidean norm; $t$ is a threshold value; $\lambda$ is a weight coefficient adjusting the confidence between the two types of points; $\mathrm{dist}(\cdot)$ is the point-to-ray distance function.
The invention adopts the feature point method to complete the relative motion estimation between two frames. This relative motion can be used to remove the motion distortion of the current frame point cloud, and can also be passed to the laser odometer as an initial relative motion estimate, preventing the laser odometer from falling into a degenerate environment. In the following embodiment, the process of calculating visual motion information is described, taking the motion from the k-th frame to the (k+1)-th frame as an example.
Assuming that the motion information from the (k-1)-th frame to the k-th frame is already known, the motion from the k-th frame to the (k+1)-th frame is now estimated:
1) and completing feature extraction and matching of the kth frame image and the (k + 1) th frame image.
2) And removing the motion distortion of the point cloud of the k frame by using the motion information from the k-1 frame to the k frame.
3) And restoring the depth of the matching feature points of the kth frame by using the image of the kth frame and the point cloud information of the kth frame after the motion distortion is removed. The depth information of the (k + 1) th frame point cloud is not recovered here, because there is no motion information at this time, and motion distortion cannot be removed for the point cloud.
4) Counting the number of characteristic points with depth in the kth frame, if the number of the characteristic points is less than 20, judging that the characteristic points are fewer at the moment, the robust motion estimation cannot be obtained, returning error information, and continuously using a uniform motion hypothesis; otherwise, the next step is continued.
5) And carrying out PnP calculation on the feature points with depth in the kth frame and the feature points without depth in the (k + 1) th frame to obtain the motion information from the kth frame to the (k + 1) th frame.
6) And (5) optimizing the reprojection error to obtain the optimal pose estimation.
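Steps 4)–6) could be prototyped with OpenCV roughly as below; using RANSAC PnP and assuming no lens distortion are implementation choices, the fallback behaviour is a placeholder, and the refinement of step 6) would reuse the two-term reprojection cost described earlier.

```python
import cv2
import numpy as np

def estimate_motion(pts3d_prev, pts2d_curr, K, min_points=20):
    """pts3d_prev: Nx3 points of frame k with recovered depth (camera frame);
    pts2d_curr: Nx2 matched pixels in frame k+1; K: 3x3 intrinsics."""
    if len(pts3d_prev) < min_points:
        return None                          # too few points: fall back to constant velocity
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d_prev.astype(np.float64),
        pts2d_curr.astype(np.float64),
        K.astype(np.float64), None)          # None = no lens distortion assumed
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)               # rotation vector -> rotation matrix
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, tvec.ravel()
    return T                                 # maps frame-k coordinates into frame k+1
```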
In one embodiment of the present invention, the implementation of step two is described.
The invention regards the three-dimensional lidar point cloud as a set of two-dimensional lidar point clouds at several different pitch angles. For all points in each beam of the laser point cloud, the returns usually arrive in clockwise or counterclockwise order (depending on the rotation direction of the lidar); that is, if a two-dimensional polar coordinate system is constructed with the laser emission center as the pole and the initial laser direction as the polar axis, the points are already sorted by polar angle when they are returned. According to the positional relationship between each point and its neighboring points on the same line, the invention extracts edge points and face points; adjacent points in the index are therefore typically points with close polar angles.
The method for extracting the edge point features and the face point features comprises the following steps:
1) A smoothness $c$ is first defined for each point through its positional relationship with its neighboring points:

$$c = \frac{1}{\left|S\right|\cdot\left\|X^{(L,n)}_{(k,i)}\right\|}\left\|\sum_{j\in S,\,j\neq i}\left(X^{(L,n)}_{(k,i)} - X^{(L,n)}_{(k,j)}\right)\right\|$$

where $X^{(L,n)}_{(k,i)}$ is the three-dimensional coordinate, in the laser coordinate system $\{L\}$, of the $i$-th point of the $n$-th beam of the $k$-th frame point cloud, and $S$ is the set of all points within the polar-angle neighborhood of the $i$-th point.
2) The points in each beam of the point cloud are sorted according to their smoothness $c$, and two thresholds $T_p$ and $T_e$ are set.
3) When the smoothness of a point is less than $T_p$, it is close enough to its neighboring points and is considered a face (planar) point.
4) When the smoothness of a point is greater than $T_e$, it is far from its neighboring points and is considered an edge point.
5) The numbers of edge points and face points are also limited: when too many feature points exceed the thresholds, only the top $N_p$ face point features and the top $N_e$ edge point features are taken.
Two further cases also need to be considered:
Case one: when the laser is nearly parallel to a smooth surface, i.e. the angle between the incident ray and the surface is small, only a few points on the plane can be detected, which would affect subsequent feature matching, so such points are not selected as face points.
Case two: after enough edge points and face points have been obtained, the features must be matched between adjacent frames and a reasonable error term designed to estimate the current motion state.
In one embodiment of the invention, in the absence of other sensor information, a uniform-motion assumption is generally made first and the estimated motion state is then finely registered. Under the uniform-motion assumption, the points in the next frame's laser coordinate system $\{L_{k+1}\}$ are projected into the previous frame's laser coordinate system $\{L_k\}$.
The error term of the edge points is considered first. The edge points $i$ in the (k+1)-th frame coordinate system $\{L_{k+1}\}$ form a point set $\mathcal{E}_{k+1}$, which is projected into the k-th frame coordinate system $\{L_k\}$ to form the point set $\tilde{\mathcal{E}}_{k+1}$. For each edge point $i$ in $\tilde{\mathcal{E}}_{k+1}$, the two edge points nearest to $i$ are found in $\mathcal{E}_k$, where $\mathcal{E}_k$ denotes the point set composed of the k-th frame edge points. Since a real edge in three-dimensional space has at most one intersection with any single laser beam, it must be verified that the two edge points found do not belong to the same beam. When the two points $j$ and $l$ nearest to point $i$ belong to different beams, the point-to-line distance is used as the error term:

$$d_{\varepsilon} = \frac{\left|\left(\tilde{X}^{L}_{(k+1,i)} - X^{L}_{(k,j)}\right)\times\left(\tilde{X}^{L}_{(k+1,i)} - X^{L}_{(k,l)}\right)\right|}{\left|X^{L}_{(k,j)} - X^{L}_{(k,l)}\right|}$$

where $\tilde{X}^{L}_{(k+1,i)}$, $X^{L}_{(k,j)}$ and $X^{L}_{(k,l)}$ are the three-dimensional coordinates of the points $i$, $j$ and $l$ in the laser coordinate system $\{L\}$; the subscripts $k$ and $k+1$ denote the two adjacent frames; $T$ denotes the relative pose between the two frames, on which the projection of point $i$ depends.
the error term for the face point is then considered. For the k +1 th frame coordinate system { Lk+1Set of points formed by surface points i in
Figure BDA0002730917950000135
Project it to the k frame coordinate system { L }kIn the preceding, form a point set
Figure BDA0002730917950000136
Then for each of the points i of the face,
Figure BDA0002730917950000137
in that
Figure BDA0002730917950000138
Finding three face points nearest to point i, wherein
Figure BDA0002730917950000139
A point set composed of k-th frame edge points is shown. In order to be able to construct a plane without degradation, the three points cannot be located on the same beam, while it is also verified whether the three points are on the same line. Only three surface points which simultaneously satisfy non-identical wire harnesses and non-identical straight lines can form an error term of one surface point:
Figure BDA00027309179500001310
wherein the content of the first and second substances,
Figure BDA0002730917950000141
representing the three-dimensional coordinates of the point m under the laser coordinate system { L }; l, m and j are three face points nearest to the point i;
projection error term d when determining edge points and surface pointsεAnd
Figure BDA0002730917950000142
then, the invention accumulates the error terms and uses a nonlinear optimization method to solve an optimal pose.
The invention combines two types of error terms through a parameter lambda:
Figure BDA0002730917950000143
1) $\lambda$ represents the relative weight coefficient between the two types of error terms and can be adjusted for the specific scene. For scenes with rich planar structure, $\lambda$ is increased ($\lambda > 1$) so that the registration of the face feature points carries a higher weight; for more complex regions where face feature points are scarcer, the relative weight of the edge points can be raised by decreasing $\lambda$ ($\lambda < 1$). In the experiments, the invention sets $\lambda = 1$.
2) Both error terms contain the parameter $T$, which represents the relative pose relationship between the two frames; adjusting $T$ changes the projected position of point $i$ in the previous frame, and the neighboring points are then searched to evaluate the error.
The final optimization objective is:

$$T^{*} = \arg\min_{T}\left\{\sum_{i\in\tilde{\mathcal{E}}_{k+1}} d_{\varepsilon}(i, T) + \lambda\sum_{i\in\tilde{\mathcal{H}}_{k+1}} d_{\mathcal{H}}(i, T)\right\}$$

During the iterative optimization, the projected position of point $i$ changes every time the parameter $T$ is updated, and the neighboring feature points should then be recomputed to evaluate the optimal error term. Recomputing the neighboring feature points at every iteration, however, greatly increases the time complexity of the algorithm. In the experiments it was found that, given a good initial value and a small number of iterations, the neighbors of most re-projected points do not change. Therefore, in the practical implementation of the algorithm, the mutual iteration is completed through two loops:

the outer loop re-searches the neighboring points at most MaxSearchTimes times;

the inner loop iteratively optimizes $T$, with the neighboring points held fixed, at most MaxIterTimes times.
In addition, to improve the efficiency of the neighbor search, the invention maintains the three-dimensional point cloud in a kd-tree, reducing the complexity of a linear search from O(N) to O(K log₂ N), where K = 3 is the coordinate dimension of a point and N is the number of feature points.
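A sketch of the two-level iteration combined with a kd-tree neighbor search. The function `optimize_step` stands for one update of T over the edge/face residuals and must be supplied by the caller, so the names and default counts here are illustrative only.

```python
import numpy as np
from scipy.spatial import cKDTree

def register(source_pts, target_pts, T_init, optimize_step,
             max_search_times=5, max_iter_times=10):
    """Outer loop: re-search the nearest neighbors with a kd-tree after the pose
    has moved; inner loop: optimize T with the correspondences held fixed.
    optimize_step(source_pts, corr, T) -> updated 4x4 T (e.g. one Gauss-Newton
    step over the edge/face residuals above)."""
    tree = cKDTree(target_pts)                         # fast nearest-neighbor queries
    T = np.array(T_init, dtype=float)
    for _ in range(max_search_times):                  # MaxSearchTimes
        moved = source_pts @ T[:3, :3].T + T[:3, 3]    # transform sources by current T
        _, idx = tree.query(moved, k=1)                # current nearest neighbors
        corr = target_pts[idx]
        for _ in range(max_iter_times):                # MaxIterTimes
            T = optimize_step(source_pts, corr, T)
    return T
```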
The inter-frame motion estimation can obtain an overall motion track through the relative motion relationship between every two frames, but due to the existence of observation errors and matching errors, the overall track obtained through the relative motion relationship between the two frames often has larger accumulated errors after moving for a long time. In order to reduce accumulated errors, the method based on the sliding window is adopted, the feature points of each frame are matched with the feature point set of the nearest N frames, and all the feature points are represented by a world coordinate system { W }, so that the result obtained after final optimization is directly the pose of the world coordinate system { W }. In one embodiment of the present application, N ≡ 10 is taken.
In absolute motion estimation, the feature point set is the superposition of the nearest N frames of point clouds under their respective motion states. The kd-tree neighbor search is enlarged by a factor of N, and relying too heavily on the single nearest neighbor is prone to overfitting because of observation errors. Therefore, the invention does not construct error terms from the nearest neighbor alone, but finds the principal directions of the edges and planes from the several nearest points.
The invention also uses a kd-tree to maintain the feature point set of the nearest N frames of point clouds to be matched. As in the frame-to-frame relative motion estimation, the initial pose $T_0$ is determined through a uniform motion model.
In one embodiment of the present invention, a process for removing motion distortion of a lidar using visual motion information is described.
The angular speed of the rotating lidar is assumed to be uniform, so the specific time corresponding to each point can be calculated quickly and the pose at any time can be obtained; the edge points and face points are then projected to their distortion-free positions according to the relative poses.
In one embodiment of the present invention, the motion compensation for the point cloud between the time k and the time k +1 includes the following steps:
1) For each three-dimensional point, calculate the horizontal deflection angle θ of the point relative to the initial direction from its x and y coordinates.
2) Linearly interpolate the pose between times k and k+1 according to θ.
3) Project the distorted point cloud to the distortion-free position according to the result of the linear interpolation (a small sketch of this interpolation is given after the error-term discussion below).
4) The result of the linear interpolation is further optimized. In the original lidar odometer, edge points and face points are extracted for matching, the error terms are calculated, and the optimal lidar pose is solved by optimizing these error terms. The error term of an edge point is:

$$d_{\varepsilon} = \frac{\left|\left(X^{L}_{(k+1,i)} - X^{L}_{(k,j)}\right)\times\left(X^{L}_{(k+1,i)} - X^{L}_{(k,l)}\right)\right|}{\left|X^{L}_{(k,j)} - X^{L}_{(k,l)}\right|}$$

where $X^{L}_{(k+1,i)}$, $X^{L}_{(k,j)}$ and $X^{L}_{(k,l)}$ are the three-dimensional coordinates of the points $i$, $j$ and $l$ in the $\{L\}$ coordinate system.

The error term of a face point is:

$$d_{\mathcal{H}} = \frac{\left|\left(X^{L}_{(k+1,i)} - X^{L}_{(k,j)}\right)\cdot\left(\left(X^{L}_{(k,j)} - X^{L}_{(k,l)}\right)\times\left(X^{L}_{(k,j)} - X^{L}_{(k,m)}\right)\right)\right|}{\left|\left(X^{L}_{(k,j)} - X^{L}_{(k,l)}\right)\times\left(X^{L}_{(k,j)} - X^{L}_{(k,m)}\right)\right|}$$

Assume the position of each point in the undistorted point cloud is $\tilde{X}^{L}_{(k+1,i)}(T)$; it is no longer a fixed value as in ordinary ICP, but one that changes constantly as the relative pose $T$ is updated. The edge point error term is then rewritten as:

$$d_{\varepsilon} = \frac{\left|\left(\tilde{X}^{L}_{(k+1,i)}(T) - X^{L}_{(k,j)}\right)\times\left(\tilde{X}^{L}_{(k+1,i)}(T) - X^{L}_{(k,l)}\right)\right|}{\left|X^{L}_{(k,j)} - X^{L}_{(k,l)}\right|}$$

Likewise, the face point error term is rewritten as:

$$d_{\mathcal{H}} = \frac{\left|\left(\tilde{X}^{L}_{(k+1,i)}(T) - X^{L}_{(k,j)}\right)\cdot\left(\left(X^{L}_{(k,j)} - X^{L}_{(k,l)}\right)\times\left(X^{L}_{(k,j)} - X^{L}_{(k,m)}\right)\right)\right|}{\left|\left(X^{L}_{(k,j)} - X^{L}_{(k,l)}\right)\times\left(X^{L}_{(k,j)} - X^{L}_{(k,m)}\right)\right|}$$

The final loss function to be optimized is:

$$T^{*} = \arg\min_{T}\left\{\sum_{i\in\mathcal{E}_{k}} d_{\varepsilon}(i, T) + \lambda\sum_{i\in\mathcal{H}_{k}} d_{\mathcal{H}}(i, T)\right\}$$

where $\mathcal{E}_{k}$ denotes the set of edge points in the k-th frame laser coordinate system $\{L\}$, $\mathcal{H}_{k}$ denotes the set of face points in the k-th frame laser coordinate system $\{L\}$, and $\lambda$ is the relative weight coefficient between the two types of error terms; both error terms contain the parameter $T$, which represents the relative pose relationship between the two frames.
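A simplified sketch of the distortion removal by interpolation, assuming T_rel is the relative motion over one sweep (here supplied by the visual odometer) expressed as the pose of the sweep end in the sweep-start frame; mapping the deflection angle to a sweep fraction by min-max normalization is a simplification of the timestamp computation described above.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def remove_motion_distortion(points, T_rel):
    """points: Nx3 float points of one lidar sweep; T_rel: 4x4 relative motion
    over the sweep. Each point is moved into the sweep-start frame according
    to its horizontal deflection angle."""
    theta = np.arctan2(points[:, 1], points[:, 0])            # horizontal deflection angle
    s = (theta - theta.min()) / (theta.max() - theta.min())   # sweep fraction in [0, 1]

    rots = Rotation.from_matrix(np.stack([np.eye(3), T_rel[:3, :3]]))
    slerp = Slerp([0.0, 1.0], rots)                           # interpolate the rotation
    undistorted = np.empty_like(points)
    for i, p in enumerate(points):
        R_i = slerp(s[i]).as_matrix()
        t_i = s[i] * T_rel[:3, 3]                             # linearly interpolate translation
        undistorted[i] = R_i @ p + t_i                        # point in the sweep-start frame
    return undistorted
```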
In one embodiment of the present invention, the process of step three is described.
According to the invention, a loop detection and repositioning module is added and realized through a visual bag-of-words model, forming a complete visual-laser SLAM system. After a loop is detected, the system performs pose graph optimization according to the loop constraints, which eliminates to a certain extent the accumulated errors of long-term motion.
The pose graph optimization process specifically comprises the following steps:
after the loop constraint is obtained through the visual bag-of-words model, the relative position relationship between two loop frames can be calculated, so that new constraint relationships except for the adjacent frames before and after are obtained. If the current k frame and the previous i frame are determined to have a loop through loop detection, a large number of similar features must exist between the two frames, the similar features are matched, meanwhile, the depth information of feature points can be obtained through projection and depth completion of laser point cloud on an image, and the estimation of the relative pose between the two frames is completed through calculating PnP.
Suppose the poses estimated by the odometer are $\{T_i \mid i \in [0, 9]\}$, and the loop detection module detects loops between $T_0$ and $T_7$ and between $T_2$ and $T_9$. Taking $T_0$ and $T_7$ as an example, the relative pose between the two frames solved through PnP is $T_{0,7}$; then

$$T_{0,7} = T_0^{-1} T_7$$
Theoretically, if the odometer estimates and the pose calculation were exact, this equation would hold; in visual problems, however, uncertainties such as observation errors inevitably exist, so the equation is not strictly satisfied and a least-squares error has to be constructed. A common method is Pose Graph Optimization (PGO). As shown in FIG. 3, the nodes represent the pose of each frame, and the directed edges between two nodes represent the relative constraints between the poses.
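A toy pose-graph-optimization sketch using a rotation-vector-plus-translation parameterization rather than a full SE(3) Lie-algebra treatment; the gauge is fixed by anchoring the first pose, and all names and the solver choice are illustrative.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def to_T(x):
    """6-vector (rotation vector, translation) -> 4x4 pose."""
    T = np.eye(4)
    T[:3, :3] = Rotation.from_rotvec(x[:3]).as_matrix()
    T[:3, 3] = x[3:]
    return T

def log_T(T):
    """4x4 pose -> 6-vector (rotation vector, translation)."""
    return np.concatenate([Rotation.from_matrix(T[:3, :3]).as_rotvec(), T[:3, 3]])

def pgo_residuals(x, edges):
    """x: flattened 6-DoF parameters of all poses; edges: (i, j, T_ij) constraints,
    including both odometry edges (i, i+1) and loop edges such as (0, 7), (2, 9)."""
    poses = [to_T(x[6 * k:6 * k + 6]) for k in range(len(x) // 6)]
    res = []
    for i, j, T_ij in edges:
        T_err = np.linalg.inv(T_ij) @ np.linalg.inv(poses[i]) @ poses[j]
        res.append(log_T(T_err))         # zero when T_i^-1 T_j equals the measurement
    res.append(log_T(poses[0]))          # gauge: anchor the first pose at identity
    return np.concatenate(res)

def optimize_pose_graph(initial_poses, edges):
    x0 = np.concatenate([log_T(T) for T in initial_poses])
    sol = least_squares(pgo_residuals, x0, args=(edges,))
    return [to_T(sol.x[6 * k:6 * k + 6]) for k in range(len(initial_poses))]
```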
The loop detection needs to store historical information of image frames, and if the information of each previous frame is stored indiscriminately, when the track is gradually increased, the consumption of memory is increased, and the time loss of loop detection is also seriously affected, so a key frame strategy is generally adopted in the most advanced SLAM system.
To complete loop detection and relocation more efficiently, i.e. to retrieve from the historical images the picture most similar to the current frame, the invention also borrows the idea of key frames and stores only key frame information, which greatly reduces the memory consumption at run time. Regarding the key frame strategy, SLAM systems differ slightly in implementation details but are largely similar: frames with rich features are selected as key frames. The key frame policy used in the embodiments of the invention requires the following three conditions to be satisfied simultaneously:
1) the number of feature points matched with the previous frame is greater than N_min, which ensures tracking quality;
2) the number of feature points matched with the last keyframe is less than N_max, which avoids redundant storage;
3) at least M frames have passed since the last keyframe, which limits the minimum spacing and further reduces storage.
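A minimal sketch of the three-condition check is given below; the function name and the threshold values N_MIN, N_MAX and M are placeholders, not values specified by the invention.

```python
# Illustrative keyframe test following the three conditions listed above.
N_MIN, N_MAX, M = 50, 200, 5   # placeholder tuning parameters

def is_keyframe(matches_prev_frame: int,
                matches_last_keyframe: int,
                frames_since_last_keyframe: int) -> bool:
    """Return True only when all three keyframe conditions hold."""
    tracked_well  = matches_prev_frame > N_MIN           # condition 1: tracking quality
    not_redundant = matches_last_keyframe < N_MAX        # condition 2: avoid redundancy
    far_enough    = frames_since_last_keyframe >= M      # condition 3: minimum spacing
    return tracked_well and not_redundant and far_enough

print(is_keyframe(120, 80, 7))    # True
print(is_keyframe(120, 250, 7))   # False: too similar to the last keyframe
```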
After similar keyframes are found by the loop detection and relocalization module, the relative pose between them still needs to be solved through feature matching and PnP.
Unlike PnP between consecutive frames, the motion distortion of the two frames forming the loop has already been removed using the motion information of their neighbouring frames, and their depth information is therefore reliable, so the pose solved by PnP can be further optimized. The optimization target is the sum of the reprojection errors between the two frames after the depth information has been recovered:
T^* = \arg\min_T \sum_i \left( \left\| u_i - \pi\left(T^{-1}\,\pi^{-1}(v_i, d_{2,i})\right) \right\|^2 + \left\| v_i - \pi\left(T\,\pi^{-1}(u_i, d_{1,i})\right) \right\|^2 \right)

wherein T* denotes the result of the error-term optimization, T denotes the relative pose between the two frames, u_i denotes the pixel coordinates of the i-th matching point in the first image, v_i denotes the pixel coordinates of the i-th matching point in the second image, d denotes the depth information of the pixel points in the two images, π is the projection function that projects a three-dimensional point into the pixel coordinate system, π^{-1} is the back-projection function that restores pixel coordinates and the corresponding depth to a three-dimensional point, d_{1,i} denotes the depth of the i-th matching point in the first image, d_{2,i} denotes the depth of the i-th matching point in the second image, and T^{-1} is the inverse of the relative pose.
the final optimization result is the pose constraint between two frames.
Examples
To further demonstrate the effect of the present invention, this embodiment uses the raw (not undistorted) point cloud data from the KITTI raw data set, and selects 10 of the first 11 sequences (00-10) of the KITTI odometry benchmark as experimental data (sequence 03 has no corresponding raw point cloud data).
Evaluation indexes are as follows:
according to different positioning requirements, the relative error and the absolute error are adopted to respectively evaluate the positioning accuracy.
For the relative error evaluation, sub-trajectories with lengths of 100 m, 200 m, 300 m, ..., 800 m (eight lengths) are taken starting from each frame, the estimated poses are compared with the ground-truth poses computed in the same way, the error is divided by the actual length of each segment, and the average rotation error R (unit: degree/100 m) and translation error t (unit: %) are computed. By capping the segment length, the relative error evaluation avoids the error accumulated over long-term motion and is used to evaluate relative positioning accuracy.
For the absolute error evaluation, all estimated poses are compared directly with the ground-truth poses, the error is divided by the total trajectory length, and the average rotation error R (unit: degree/100 m) and translation error t (unit: %) are then computed. The absolute error evaluation fully accounts for accumulated error and is used to evaluate the overall positioning accuracy of the whole sequence.
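The following simplified sketch illustrates the two ideas; it is not the official KITTI evaluation code, which aligns each segment in a local frame and also reports rotation error. est_pos and gt_pos are assumed to be time-aligned Nx3 arrays of positions.

```python
# Simplified relative and absolute translation error sketches (illustrative only).
import numpy as np

def traj_length(positions):
    return np.sum(np.linalg.norm(np.diff(positions, axis=0), axis=1))

def absolute_translation_error(est_pos, gt_pos):
    """Mean position error divided by total trajectory length, as a percentage."""
    err = np.mean(np.linalg.norm(est_pos - gt_pos, axis=1))
    return 100.0 * err / traj_length(gt_pos)

def relative_translation_error(est_pos, gt_pos, seg_len=100.0):
    """Average end-point drift of segments roughly seg_len metres long, in %."""
    steps = np.linalg.norm(np.diff(gt_pos, axis=0), axis=1)
    dists = np.concatenate([[0.0], np.cumsum(steps)])          # travelled distance per frame
    errs = []
    for i in range(len(gt_pos)):
        j = np.searchsorted(dists, dists[i] + seg_len)         # frame ~seg_len metres ahead
        if j >= len(gt_pos):
            break
        drift_est = est_pos[j] - est_pos[i]
        drift_gt = gt_pos[j] - gt_pos[i]
        errs.append(np.linalg.norm(drift_est - drift_gt) / seg_len)
    return 100.0 * np.mean(errs)
```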
Note that the relative error and the absolute error measure different aspects and are not necessarily positively correlated; a small relative error does not imply a small absolute error. In addition, VLO below denotes the laser-visual fusion method proposed by the present invention with the loop detection step removed from the overall scheme, and VL-SLAM denotes VLO with loop detection added, i.e., the complete invention.
The odometer method provided by the present invention differs from a pure laser odometer in that the motion information computed by the visual method assists in removing the motion distortion of the laser point cloud and provides a better initial value for the laser odometer, preventing the laser odometer from falling into a local optimum.
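As an illustration of this distortion removal, the sketch below re-projects each point of a sweep with a pose interpolated from the visual frame-to-frame motion according to the point's horizontal angle. The angle-to-ratio mapping, the use of scipy's Slerp and the function signature are assumptions made for the example, not details fixed by the invention.

```python
# Sketch of removing motion distortion from one LiDAR sweep using the visual motion estimate.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def undistort_sweep(points, R_motion, t_motion):
    """points: Nx3 raw sweep; (R_motion, t_motion): frame-to-frame motion from the visual odometer."""
    theta = np.arctan2(points[:, 1], points[:, 0])                       # horizontal deflection angle
    ratio = (theta - theta.min()) / (theta.max() - theta.min() + 1e-9)   # 0 at sweep start, 1 at end

    key_rots = Rotation.from_matrix(np.stack([np.eye(3), R_motion]))
    slerp = Slerp([0.0, 1.0], key_rots)                                  # interpolate the rotation
    out = np.empty_like(points)
    for i, (p, s) in enumerate(zip(points, ratio)):
        R_i = slerp([s]).as_matrix()[0]
        t_i = s * t_motion                                               # interpolate the translation
        out[i] = R_i @ p + t_i                                           # move point to a common sweep frame
    return out
```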
In the experiments, the proposed VLO (Visual-LiDAR Odometry) is compared with LOAM, the most advanced open-source pure laser odometer. In addition, to better verify positioning accuracy, VLO is also compared with ORB-SLAM2, the current state-of-the-art open-source SLAM system.
Experiment 1: comparison of positioning accuracy of original point cloud data
TABLE 1 comparison of positioning accuracy of VLO (raw Point clouds) and LOAM (raw Point clouds) on KITTI
This experiment uses the raw point cloud data; the positioning accuracy comparison is shown in Table 1. Compared with the pure laser odometer LOAM, the proposed VLO, which fuses laser and visual information, greatly improves both relative and absolute positioning accuracy. On the one hand, reliable visual motion information replaces the uniform-motion assumption and helps remove the motion distortion of the laser point cloud; on the other hand, it provides a good initial value for the ICP in the laser odometer, which accelerates convergence and avoids falling into degenerate environments.
Experiment 2: verification of distortion removal effect
To verify the effect of removing the laser point cloud motion distortion with the visual method, the distortion-removal switch in LOAM is turned off and the already undistorted point clouds from the odometry data set are used instead; the resulting positioning accuracy is then compared with that of VLO, which removes motion distortion by the visual method, and the relative and absolute errors are computed on each sequence. The VLO experiment is not performed on sequence 03 because the raw point cloud is missing. The results are shown in Table 2.
TABLE 2 comparison of positioning accuracy of VLO (original point cloud) and LOAM (undistorted point cloud) on KITTI
As can be seen from Table 2, LOAM using the undistorted point clouds is comparable in accuracy on most sequences to the proposed VLO using the raw point clouds, with the difference lying within the error range, which indicates that the distortion removal in VLO is already close to the ground truth.
Experiment 3: run time comparison
Besides positioning accuracy, time performance is also one of the targets of the present invention. In the motion estimation, the matching relationship between edge points and face points changes after each completed iteration, and the ICP iteration must be run again after re-matching. Lacking a good initial motion estimate, LOAM often needs more re-matching and ICP rounds; its maximum number of re-matchings is set to 5. Since VLO already has a good visual motion estimate, its maximum number of re-matchings is set to 3; as Table 2 shows, the smaller number of re-matchings does not affect the positioning accuracy of VLO.
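The edge points and face (surface) points referred to here are selected with a LOAM-style smoothness measure, the same idea formalized in claim 6 below. A simplified per-scan-line sketch follows; the neighbourhood size, the thresholds and the feature counts are placeholder values.

```python
# Illustrative extraction of edge and surface points from one ordered scan line.
import numpy as np

def smoothness(scan_line, k=5):
    """scan_line: Nx3 points ordered by scan angle; returns one smoothness value per point."""
    n = len(scan_line)
    c = np.full(n, np.nan)
    for i in range(k, n - k):
        neighbours = np.vstack([scan_line[i - k:i], scan_line[i + 1:i + 1 + k]])
        diff = np.sum(scan_line[i] - neighbours, axis=0)            # sum of offsets to neighbours
        c[i] = np.linalg.norm(diff) / (2 * k * np.linalg.norm(scan_line[i]) + 1e-9)
    return c

def select_features(scan_line, t_plane=0.01, t_edge=0.1, n_plane=4, n_edge=2):
    c = smoothness(scan_line)
    order = np.argsort(c)                                           # ascending smoothness
    planes = [i for i in order if c[i] < t_plane][:n_plane]         # smoothest -> surface points
    edges = [i for i in order[::-1] if c[i] > t_edge][:n_edge]      # sharpest -> edge points
    return planes, edges
```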
For LOAM, the average running time per frame is considered; for VLO, since visual motion information is needed as a prior, the average running times of the visual odometer and the laser odometer are considered separately. The comparison is shown in Table 3.
TABLE 3 LOAM vs VLO runtime on KITTI
As can be seen from Table 3, although the vision module added to VLO brings extra time consumption to the system, the good initial solution reduces the burden of the laser odometer to a certain extent: a good pose can be obtained with only a few iterations of optimization, so the total time cost increases little and real-time performance (less than 100 milliseconds per frame) is achieved on KITTI.
Experiment 4: comparison with the accuracy of a pure vision odometer
This experiment compares VLO with the most advanced open-source system ORB-SLAM2 (with its loop closure module disabled). Because a pure monocular odometer lacks scale information and cannot be compared fairly, the binocular (stereo) configuration of ORB-SLAM2 is used; it estimates depth from stereo disparity with a calibrated binocular camera, but compared with laser measurements the depth obtained from binocular vision has lower confidence and requires continuous iterative optimization.
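The lower confidence of binocular depth can be seen from the disparity-to-depth relation z = f·b/d: a fixed disparity error causes a depth error that grows roughly quadratically with depth. The focal length and baseline in the snippet below are placeholder values, not the actual KITTI calibration.

```python
# Quick illustration of stereo depth uncertainty from the relation z = f*b/d.
f_px, baseline_m = 700.0, 0.54          # placeholder calibration values

def stereo_depth(disparity_px: float) -> float:
    return f_px * baseline_m / disparity_px

for d in (60.0, 6.0):                   # large vs small disparity (near vs far point)
    z = stereo_depth(d)
    z_err = stereo_depth(d - 0.5) - z   # effect of a 0.5 px disparity error
    print(f"disparity {d:>5.1f}px -> depth {z:6.2f} m, error from 0.5px: {z_err:5.2f} m")
```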
TABLE 4 comparison of location accuracy of VLO (raw Point cloud) and ORB-SLAM2-stereo on KITTI
As can be seen from Table 4, the two methods each perform better on different sequences because their main data sources differ: binocular vision adds visual information from the right camera, while the lidar adds point cloud structure information of the scene, so each has advantageous scenes (for example, the binocular odometer of ORB-SLAM2 performs better on KITTI-04 and KITTI-10, while VLO is more accurate on KITTI-00 and KITTI-07). Moreover, the translation error of the proposed VLO is smaller than that of ORB-SLAM2-stereo on most of the sequences.
Similarly, with the loop constraint added, the positioning accuracy of the proposed VL-SLAM is further improved, as shown in Table 5.
TABLE 5 location accuracy comparison of VL-SLAM on KITTI Loop data
Therefore, after loop detection is added, the positioning precision is obviously improved.
In addition, comparisons between the VL-SLAM trajectories and the ground-truth trajectories are shown in FIGS. 4-1 to 4-3, which show the trajectories on the KITTI-00, KITTI-05 and KITTI-09 sequences respectively; the estimated trajectories essentially coincide with the ground truth, indicating very high accuracy. FIG. 5 shows the mapping result of the present invention on KITTI-07.
The present invention builds a SLAM system with visual-laser information fusion on the basis of the laser odometer system LOAM. The front end mainly consists of a visual odometer and a laser odometer. In the visual odometer, sparse depth is obtained by projecting the laser point cloud onto the image, the sparse depth map is then completed, and a better initial pose is estimated through inter-frame feature matching and minimization of the reprojection error. This initial pose is fed into the laser odometer for further point cloud distortion removal and registration; the good initial value both prevents the point cloud registration from falling into degenerate configurations and accelerates its iterative convergence.
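A compact sketch of the invert / dilate / fill / filter / re-invert completion of the sparse depth map is shown below; the kernel sizes, the 0.1 m validity threshold and the 100 m depth cap are assumptions for the example and are not values specified by the invention.

```python
# Sketch of classical morphological completion of a sparse LiDAR depth map.
import cv2
import numpy as np

def complete_sparse_depth(sparse, d_max=100.0):
    depth = sparse.astype(np.float32).copy()
    valid = depth > 0.1
    depth[valid] = d_max - depth[valid]                                   # invert valid depths

    depth = cv2.dilate(depth, np.ones((5, 5), np.uint8))                  # grow measurements into gaps
    empty = depth < 0.1
    depth[empty] = cv2.dilate(depth, np.ones((9, 9), np.uint8))[empty]    # fill larger holes

    depth = cv2.medianBlur(depth, 5)                                      # remove outliers
    depth = cv2.GaussianBlur(depth, (5, 5), 0)                            # smooth the result
    valid = depth > 0.1
    depth[valid] = d_max - depth[valid]                                   # re-invert to metric depth
    return depth
```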
In addition, a loop detection and relocalization module is added at the back end of the odometer. It adopts a visual bag-of-words model and maintains a feature dictionary organized as a multi-branch tree built by hierarchical clustering, which guarantees retrieval robustness while accelerating feature retrieval. When a loop is detected, the entire trajectory loop is optimized globally by pose graph optimization. This module not only eliminates the error accumulated over long-term motion, but also gives the system the ability to relocalize after being "kidnapped" and returning to a previously visited part of the trajectory.
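As a toy illustration of bag-of-words loop detection, the sketch below clusters ORB descriptors into a small flat vocabulary and compares normalized word histograms by cosine similarity. A real system, like the hierarchical multi-branch tree described above, clusters the vocabulary recursively and adds TF-IDF weighting, which is omitted here for brevity.

```python
# Toy bag-of-visual-words similarity for loop candidates (flat k-means vocabulary).
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans

orb = cv2.ORB_create(nfeatures=500)

def describe(gray):
    _, desc = orb.detectAndCompute(gray, None)   # ORB keypoints and binary descriptors
    return desc.astype(np.float32)

def build_vocabulary(descriptor_sets, n_words=200):
    all_desc = np.vstack(descriptor_sets)        # descriptors from many keyframes
    return MiniBatchKMeans(n_clusters=n_words, n_init=3).fit(all_desc)

def bow_vector(desc, vocab):
    words = vocab.predict(desc)                  # assign each descriptor to a visual word
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(np.float32)
    return hist / (np.linalg.norm(hist) + 1e-9)

def similarity(v1, v2):
    return float(np.dot(v1, v2))                 # cosine similarity of normalized histograms
```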
The following conclusions can be drawn from comparative experiments:
1) With raw point clouds, the proposed VLO greatly improves positioning accuracy compared with the pure laser odometer, mainly because the visual motion information replaces the previous uniform-motion assumption when removing point cloud motion distortion, improving the overall accuracy of the algorithm.
2) By adding visual prior motion information, VLO can prevent the laser odometer from falling into degenerate environments in some special scenes (such as KITTI-02), improving the robustness of the algorithm.
3) VLO runs in real time on a PC (less than 100 ms per frame), consuming 88 ms per frame on KITTI, only slightly slower than the 82 ms of the pure laser odometer LOAM; the difference is essentially negligible.
4) Adding the loop detection module reduces the absolute error to a certain extent and benefits positioning during long-term motion.
The foregoing merely illustrates specific embodiments of the present invention. It is obvious that the invention is not limited to the above embodiments and that many variations are possible. All modifications that a person skilled in the art can derive or infer directly from the disclosure of the present invention are to be considered within the scope of the invention.

Claims (8)

1. A simultaneous localization and map construction method based on vision and laser radar is characterized by comprising the following steps:
step 1: operating a visual odometer and a laser odometer simultaneously, wherein the visual odometer uses a camera to acquire image information of the surrounding environment in real time to obtain image data, and the laser odometer uses a laser radar to emit laser and receive the light reflected by surrounding measured objects for ranging, obtaining structured information of the surrounding environment as laser point cloud data;
step 2: projecting the laser point cloud to a pixel plane by using an external reference result calibrated by combining a laser radar and a camera, and converting the laser point cloud into depth information of an image; taking the depth information as the scale constraint of the visual odometer, matching visual feature points from a previous frame to a current frame in the image data, and calculating visual motion information;
and step 3: removing the motion distortion of the laser point cloud of the current frame by using the visual motion information from the previous frame to the current frame; taking the visual motion information obtained in step 2 as an initial pose, registering the laser point cloud against the local point cloud map to construct an environment map, and continuously performing loop detection and relocalization using keyframes to optimize the poses of the keyframes and the map points.
2. The vision and lidar based simultaneous localization and mapping method according to claim 1, wherein after the laser point cloud is projected onto the pixel plane in step 2, a sparse depth map is generated and then completed to obtain the depth information of the image; the completion comprises the following steps:
step 2.1.1: inverting effective depth information in the sparse depth map:
\hat{D}_x = \begin{cases} D_{\max} - D_x, & D_x \neq 0 \\ 0, & D_x = 0 \end{cases}

in the formula, \hat{D}_x is the inverted depth value of a visual feature point, D_max is a preset maximum depth value, and D_x is the depth value of the visual feature point before inversion; D_x ≠ 0 indicates that depth information exists, i.e., the depth is valid, and D_x = 0 indicates that there is no depth information;
step 2.1.2: expanding a depth information-free region in the sparse depth map;
step 2.1.3: and continuously filling the expanded region without the depth information, then performing median filtering and Gaussian filtering, and finally performing depth re-inversion to recover the depth information.
3. The simultaneous localization and mapping method based on vision and lidar of claim 1, wherein in the visual feature point matching process of step 2, a weighted combination of two different types of matching-point error terms is used as the final reprojection error; specifically:
for a feature point p_1 with depth information, it is back-projected into three-dimensional space and re-projected onto the other image plane to obtain a point \hat{p}_1'; with the true matching point located at p_1', the distance ‖\hat{p}_1' − p_1'‖ constitutes the error term of the feature point p_1 with depth information;
for a feature point p_2 without depth information, its back-projection into three-dimensional space may lie anywhere on the ray Op_2, where O is the origin of the three-dimensional space; the ray Op_2 is projected onto the other image plane to obtain a straight line, and the distance from the position of the matching point p_2' to this straight line constitutes the error term of the feature point p_2 without depth information;
the final reprojection error is expressed as:
T^* = \arg\min_T \left[ \sum_{i=1}^{n} \left\| v_{(1,i)} - \pi\left(T\,\pi^{-1}(u_{(1,i)}, d_i)\right) \right\|^2 + \lambda \sum_{i=1}^{m} \mathrm{dist}\left( v_{(2,i)},\; T\,\phi^{-1}(u_{(2,i)}) \right) \right]

in the formula, T* denotes the result of the error-term optimization and T denotes the relative pose between the two frames; n denotes the number of feature points with depth information and m the number of feature points without depth information; π is the projection function that projects a three-dimensional point into the pixel coordinate system, and π^{-1} is the back-projection function that restores pixel coordinates and the corresponding depth to a three-dimensional point; u_{(1,i)} denotes the pixel coordinates in the left view of the i-th point of the first type, i.e., points with depth information, d_i denotes the depth information of the i-th point of the first type, and v_{(1,i)} denotes the pixel coordinates in the right view of the i-th point of the first type; u_{(2,i)} denotes the pixel coordinates in the left view of the i-th point of the second type, i.e., points without depth information, and φ^{-1} is the back-projection function that projects a pixel point into a three-dimensional ray; ‖·‖² denotes the squared distance (modular length), t is a threshold value, λ is a weight coefficient adjusting the trust between the two types of points, and dist(·) denotes the point-to-ray distance function.
4. The vision and lidar based simultaneous localization and mapping method according to claim 3, wherein the calculating of the vision motion information in step 2 is specifically:
step 2.2.1: extracting and matching the characteristics of the previous frame image and the current frame image;
step 2.2.2: recovering the depth of the matching feature points of the previous frame by using the previous frame image and the laser point cloud of the previous frame after the motion distortion is removed;
step 2.2.3: counting the number of feature points with depth information in the previous frame, if the number of feature points is less than a threshold value, returning error information, and using a uniform motion hypothesis; otherwise, carrying out PnP calculation on the feature points with depth information in the previous frame and the feature points without depth information in the current frame to obtain visual motion information from the previous frame to the current frame;
step 2.2.4: and (5) optimizing the reprojection error to obtain the optimal pose estimation.
5. The vision and lidar based simultaneous localization and mapping method according to claim 1, wherein the removing of the motion distortion of the laser point cloud of the current frame by using the vision motion information from the previous frame to the current frame in step 3 specifically comprises:
step 3.1: for each three-dimensional point in the current frame laser point cloud, determining a horizontal deflection angle theta of each three-dimensional point relative to the initial direction;
step 3.2: performing linear interpolation between a previous frame and a current frame according to the horizontal deflection angle theta, and projecting distorted point clouds to the positions where motion distortion is removed according to the result of the linear interpolation;
step 3.3: and (3) further optimizing the result of the step (3.2), extracting edge point features and surface point features for matching, and calculating an error term, wherein the error term of the edge point features is as follows:
d_E = \frac{\left\| \left(T\,X^L_{(k,i)} - X^L_{(k-1,j)}\right) \times \left(T\,X^L_{(k,i)} - X^L_{(k-1,l)}\right) \right\|}{\left\| X^L_{(k-1,j)} - X^L_{(k-1,l)} \right\|}

in the formula, X^L_{(k,i)}, X^L_{(k-1,j)} and X^L_{(k-1,l)} are the three-dimensional coordinates of the points i, j and l in the laser coordinate system {L}; the subscripts k and k-1 denote the current frame and the previous frame; T denotes the relative pose between the two frames;
the error term of the face point feature is:
d_H = \frac{\left| \left(T\,X^L_{(k,i)} - X^L_{(k-1,j)}\right) \cdot \left( \left(X^L_{(k-1,j)} - X^L_{(k-1,l)}\right) \times \left(X^L_{(k-1,j)} - X^L_{(k-1,m)}\right) \right) \right|}{\left\| \left(X^L_{(k-1,j)} - X^L_{(k-1,l)}\right) \times \left(X^L_{(k-1,j)} - X^L_{(k-1,m)}\right) \right\|}

in the formula, X^L_{(k-1,m)} denotes the three-dimensional coordinates of the point m in the laser coordinate system {L}; l, m and j are the three face points nearest to the point i;
the optimization loss function is expressed as:
T^* = \arg\min_T \left( \sum_{i \in \mathcal{E}_k} d_{E,i}^2 + \lambda \sum_{i \in \mathcal{H}_k} d_{H,i}^2 \right)

in the formula, \mathcal{E}_k denotes the set of edge points in the k-th frame laser coordinate system {L}, and \mathcal{H}_k denotes the set of face points in the k-th frame laser coordinate system {L}; λ denotes the relative weight coefficient between the two types of error terms; both error terms contain the parameter T, which denotes the relative pose between the two frames.
6. The vision and lidar based simultaneous localization and mapping method according to claim 5, wherein the method for extracting the edge point feature and the face point feature comprises:
step 3.3.1: dividing the three-dimensional laser radar point cloud into a plurality of two-dimensional laser radar point cloud sets with different pitch angles, and defining a smoothness c for all points in each beam of laser point cloud through the position relation between each point and adjacent points thereof:
c = \frac{1}{|S| \cdot \left\| X^L_{(k,n,i)} \right\|} \left\| \sum_{j \in S,\, j \neq i} \left( X^L_{(k,n,i)} - X^L_{(k,n,j)} \right) \right\|

in the formula, X^L_{(k,n,i)} denotes the three-dimensional coordinates, in the laser coordinate system {L}, of the i-th point of the n-th beam of the k-th frame point cloud, and S denotes the set of all points within the polar-angle neighbourhood of the i-th point;
step 3.3.2: sorting the points in a beam of point cloud according to the smoothness c, and setting two thresholds T_p and T_e;
when the smoothness of a point is less than T_p, the point is sufficiently close to its neighbouring points and is taken as a face point;
when the smoothness of a point is greater than T_e, the point is far from its neighbouring points and is taken as an edge point;
when the numbers of face points and edge points exceed the thresholds, only the top N_p face point features and the top N_e edge point features are extracted, N_p and N_e being the maximum numbers of face points and edge points.
7. The vision and lidar based simultaneous localization and mapping method according to claim 1, wherein in step 3 the keyframe strategy is used for loop detection and relocalization; after similar keyframes are found, the relative pose of the two frames is solved through feature matching and PnP, the visual motion information of step 2 is optimized, and the sum of the reprojection errors between the two frames after the depth information is recovered is expressed as:
T^* = \arg\min_T \sum_i \left( \left\| u_i - \pi\left(T^{-1}\,\pi^{-1}(v_i, d_{2,i})\right) \right\|^2 + \left\| v_i - \pi\left(T\,\pi^{-1}(u_i, d_{1,i})\right) \right\|^2 \right)

wherein T* denotes the result of the error-term optimization, T denotes the relative pose between the two frames, u_i denotes the pixel coordinates of the i-th matching point in the first image, v_i denotes the pixel coordinates of the i-th matching point in the second image, d denotes the depth information of the pixel points in the two images, π is the projection function that projects a three-dimensional point into the pixel coordinate system, π^{-1} is the back-projection function that restores pixel coordinates and the corresponding depth to a three-dimensional point, d_{1,i} denotes the depth of the i-th matching point in the first image, d_{2,i} denotes the depth of the i-th matching point in the second image, and T^{-1} is the inverse of the relative pose;
the final optimization result is the pose constraint between two frames.
8. The simultaneous vision and lidar-based localization and mapping method according to claim 1, wherein the keyframe of step 3 satisfies the following condition:
1) the number of feature points matched with the previous frame is greater than N_min, ensuring tracking quality, N_min being the minimum matching threshold;
2) the number of feature points matched with the last keyframe is less than N_max, avoiding redundant storage, N_max being the maximum matching threshold;
3) at least M frames have passed since the last keyframe, limiting the minimum spacing and reducing storage, M being the minimum interval threshold.
CN202011117756.6A 2020-10-19 2020-10-19 Simultaneous positioning and map construction method based on vision and laser radar Withdrawn CN112258600A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011117756.6A CN112258600A (en) 2020-10-19 2020-10-19 Simultaneous positioning and map construction method based on vision and laser radar

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011117756.6A CN112258600A (en) 2020-10-19 2020-10-19 Simultaneous positioning and map construction method based on vision and laser radar

Publications (1)

Publication Number Publication Date
CN112258600A true CN112258600A (en) 2021-01-22

Family

ID=74244771

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011117756.6A Withdrawn CN112258600A (en) 2020-10-19 2020-10-19 Simultaneous positioning and map construction method based on vision and laser radar

Country Status (1)

Country Link
CN (1) CN112258600A (en)

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112925322A (en) * 2021-01-26 2021-06-08 哈尔滨工业大学(深圳) Autonomous positioning method of unmanned vehicle in long-term scene
CN112965063A (en) * 2021-02-11 2021-06-15 深圳市安泽智能机器人有限公司 Robot mapping and positioning method
CN113012191A (en) * 2021-03-11 2021-06-22 中国科学技术大学 Laser mileage calculation method based on point cloud multi-view projection graph
CN113012197A (en) * 2021-03-19 2021-06-22 华南理工大学 Binocular vision odometer positioning method suitable for dynamic traffic scene
CN113031002A (en) * 2021-02-25 2021-06-25 桂林航天工业学院 SLAM running car based on Kinect3 and laser radar
CN113066152A (en) * 2021-03-18 2021-07-02 内蒙古工业大学 AGV map construction method and system
CN113126115A (en) * 2021-04-06 2021-07-16 北京航空航天大学杭州创新研究院 Semantic SLAM method and device based on point cloud, electronic equipment and storage medium
CN113327296A (en) * 2021-06-28 2021-08-31 浙江大学 Laser radar and camera online combined calibration method based on depth weighting
CN113359810A (en) * 2021-07-29 2021-09-07 东北大学 Unmanned aerial vehicle landing area identification method based on multiple sensors
CN113379841A (en) * 2021-06-21 2021-09-10 上海仙工智能科技有限公司 Laser SLAM method based on phase correlation method and factor graph and readable storage medium thereof
CN113450410A (en) * 2021-06-29 2021-09-28 浙江大学 Monocular depth and pose joint estimation method based on epipolar geometry
CN113506369A (en) * 2021-07-13 2021-10-15 阿波罗智能技术(北京)有限公司 Method, apparatus, electronic device, and medium for generating map
CN113587933A (en) * 2021-07-29 2021-11-02 山东山速机器人科技有限公司 Indoor mobile robot positioning method based on branch-and-bound algorithm
CN113625288A (en) * 2021-06-15 2021-11-09 中国科学院自动化研究所 Camera and laser radar pose calibration method and device based on point cloud registration
CN113624221A (en) * 2021-06-30 2021-11-09 同济人工智能研究院(苏州)有限公司 2.5D map construction method fusing vision and laser
CN113625271A (en) * 2021-07-29 2021-11-09 中汽创智科技有限公司 Millimeter wave radar and binocular camera based simultaneous positioning and image building method
CN113689485A (en) * 2021-08-25 2021-11-23 北京三快在线科技有限公司 Method and device for determining depth information of unmanned aerial vehicle, unmanned aerial vehicle and storage medium
CN113777615A (en) * 2021-07-19 2021-12-10 派特纳(上海)机器人科技有限公司 Positioning method and system of indoor robot and cleaning robot
CN113781582A (en) * 2021-09-18 2021-12-10 四川大学 Synchronous positioning and map creating method based on laser radar and inertial navigation combined calibration
CN113820697A (en) * 2021-09-09 2021-12-21 中国电子科技集团公司第五十四研究所 Visual positioning method based on urban building characteristics and three-dimensional map
CN113837277A (en) * 2021-09-24 2021-12-24 东南大学 Multisource fusion SLAM system based on visual point-line feature optimization
CN113947639A (en) * 2021-10-27 2022-01-18 北京斯年智驾科技有限公司 Self-adaptive online estimation calibration system and method based on multi-radar-point cloud line characteristics
CN113947665A (en) * 2021-09-14 2022-01-18 广西大学 Method for constructing map of spherical hedge trimmer based on multi-line laser radar and monocular vision
CN114088087A (en) * 2022-01-21 2022-02-25 深圳大学 High-reliability high-precision navigation positioning method and system under unmanned aerial vehicle GPS-DENIED
CN114154117A (en) * 2021-06-15 2022-03-08 元橡科技(苏州)有限公司 SLAM method
CN114170280A (en) * 2021-12-09 2022-03-11 北京能创科技有限公司 Laser odometer method, system and device based on double windows
CN114199235A (en) * 2021-11-29 2022-03-18 珠海一微半导体股份有限公司 Positioning system and positioning method based on sector depth camera
CN114440892A (en) * 2022-01-27 2022-05-06 中国人民解放军军事科学院国防科技创新研究院 Self-positioning method based on topological map and odometer
WO2022170847A1 (en) * 2021-02-09 2022-08-18 中国科学院深圳先进技术研究院 Online calibration method based on laser and visual fusion
CN115015955A (en) * 2022-05-23 2022-09-06 天津卡尔狗科技有限公司 Method, apparatus, device, storage medium and program product for determining motion information
CN115079199A (en) * 2022-08-22 2022-09-20 山东省科学院海洋仪器仪表研究所 Underwater target multi-mode information sensing system and method
CN115267796A (en) * 2022-08-17 2022-11-01 深圳市普渡科技有限公司 Positioning method, positioning device, robot and storage medium
CN115631319A (en) * 2022-11-02 2023-01-20 北京科技大学 Loopback detection method based on cross attention network
CN115631314A (en) * 2022-12-19 2023-01-20 中汽研(天津)汽车工程研究院有限公司 Point cloud map construction method based on multiple features and self-adaptive key frames
WO2023004956A1 (en) * 2021-07-30 2023-02-02 西安交通大学 Laser slam method and system in high-dynamic environment, and device and storage medium
CN115797490A (en) * 2022-12-19 2023-03-14 广州宸境科技有限公司 Drawing construction method and system based on laser vision fusion
CN116071550A (en) * 2023-02-09 2023-05-05 安徽海博智能科技有限责任公司 Laser radar dust point cloud filtering method
CN116295457A (en) * 2022-12-21 2023-06-23 辉羲智能科技(上海)有限公司 Vehicle vision positioning method and system based on two-dimensional semantic map
CN116299500A (en) * 2022-12-14 2023-06-23 江苏集萃清联智控科技有限公司 Laser SLAM positioning method and device integrating target detection and tracking
CN116859360A (en) * 2023-09-04 2023-10-10 深圳安智杰科技有限公司 Motion estimation method and system based on single-frame millimeter wave Lei Dadian cloud
CN112881999B (en) * 2021-01-25 2024-02-02 上海西虹桥导航技术有限公司 Semi-automatic calibration method for multi-line laser radar and vision sensor
CN113379841B (en) * 2021-06-21 2024-04-30 上海仙工智能科技有限公司 Laser SLAM method based on phase correlation method and factor graph and readable storage medium thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446815A (en) * 2016-09-14 2017-02-22 浙江大学 Simultaneous positioning and map building method
CN110009739A (en) * 2019-01-29 2019-07-12 浙江省北大信息技术高等研究院 The extraction and coding method of the motion feature of the digital retina of mobile camera

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446815A (en) * 2016-09-14 2017-02-22 浙江大学 Simultaneous positioning and map building method
CN110009739A (en) * 2019-01-29 2019-07-12 浙江省北大信息技术高等研究院 The extraction and coding method of the motion feature of the digital retina of mobile camera

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LI JINYU,YANG BANGBANG,AND ETC,.: "Survey and evaluation of monocular visual-inertial SLAM algorithms for augmented reality", 《VIRTUAL REALITY & INTELLIGENT HARDWARE》 *
刘浩敏,章国锋,鲍虎军: "基于单目视觉的同时定位与地图构建方法综述", 《计算机辅助设计与图形学学报》 *
刘浩敏,章国锋,鲍虎军: "面向大尺度场景的单目同时定位与地图构建", 《中国科学:信息科学》 *
王宇伟: "基于视觉与激光雷达融合的鲁棒同时定位与地图构建", 《万方学位论文》 *

Cited By (69)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112881999B (en) * 2021-01-25 2024-02-02 上海西虹桥导航技术有限公司 Semi-automatic calibration method for multi-line laser radar and vision sensor
CN112925322A (en) * 2021-01-26 2021-06-08 哈尔滨工业大学(深圳) Autonomous positioning method of unmanned vehicle in long-term scene
WO2022170847A1 (en) * 2021-02-09 2022-08-18 中国科学院深圳先进技术研究院 Online calibration method based on laser and visual fusion
CN112965063A (en) * 2021-02-11 2021-06-15 深圳市安泽智能机器人有限公司 Robot mapping and positioning method
CN113031002B (en) * 2021-02-25 2023-10-24 桂林航天工业学院 SLAM accompany running trolley based on Kinect3 and laser radar
CN113031002A (en) * 2021-02-25 2021-06-25 桂林航天工业学院 SLAM running car based on Kinect3 and laser radar
CN113012191A (en) * 2021-03-11 2021-06-22 中国科学技术大学 Laser mileage calculation method based on point cloud multi-view projection graph
CN113012191B (en) * 2021-03-11 2022-09-02 中国科学技术大学 Laser mileage calculation method based on point cloud multi-view projection graph
CN113066152A (en) * 2021-03-18 2021-07-02 内蒙古工业大学 AGV map construction method and system
CN113012197B (en) * 2021-03-19 2023-07-18 华南理工大学 Binocular vision odometer positioning method suitable for dynamic traffic scene
CN113012197A (en) * 2021-03-19 2021-06-22 华南理工大学 Binocular vision odometer positioning method suitable for dynamic traffic scene
CN113126115B (en) * 2021-04-06 2023-11-17 北京航空航天大学杭州创新研究院 Semantic SLAM method and device based on point cloud, electronic equipment and storage medium
CN113126115A (en) * 2021-04-06 2021-07-16 北京航空航天大学杭州创新研究院 Semantic SLAM method and device based on point cloud, electronic equipment and storage medium
CN113625288A (en) * 2021-06-15 2021-11-09 中国科学院自动化研究所 Camera and laser radar pose calibration method and device based on point cloud registration
CN114154117B (en) * 2021-06-15 2022-08-23 元橡科技(苏州)有限公司 SLAM method
CN114154117A (en) * 2021-06-15 2022-03-08 元橡科技(苏州)有限公司 SLAM method
CN113379841B (en) * 2021-06-21 2024-04-30 上海仙工智能科技有限公司 Laser SLAM method based on phase correlation method and factor graph and readable storage medium thereof
CN113379841A (en) * 2021-06-21 2021-09-10 上海仙工智能科技有限公司 Laser SLAM method based on phase correlation method and factor graph and readable storage medium thereof
CN113327296A (en) * 2021-06-28 2021-08-31 浙江大学 Laser radar and camera online combined calibration method based on depth weighting
CN113327296B (en) * 2021-06-28 2022-04-12 浙江大学 Laser radar and camera online combined calibration method based on depth weighting
CN113450410B (en) * 2021-06-29 2022-07-26 浙江大学 Monocular depth and pose joint estimation method based on epipolar geometry
CN113450410A (en) * 2021-06-29 2021-09-28 浙江大学 Monocular depth and pose joint estimation method based on epipolar geometry
WO2023273169A1 (en) * 2021-06-30 2023-01-05 同济人工智能研究院(苏州)有限公司 Vision and laser-fused 2.5d map construction method
CN113624221A (en) * 2021-06-30 2021-11-09 同济人工智能研究院(苏州)有限公司 2.5D map construction method fusing vision and laser
CN113624221B (en) * 2021-06-30 2023-11-28 同济人工智能研究院(苏州)有限公司 2.5D map construction method integrating vision and laser
CN113506369A (en) * 2021-07-13 2021-10-15 阿波罗智能技术(北京)有限公司 Method, apparatus, electronic device, and medium for generating map
CN113777615A (en) * 2021-07-19 2021-12-10 派特纳(上海)机器人科技有限公司 Positioning method and system of indoor robot and cleaning robot
CN113777615B (en) * 2021-07-19 2024-03-29 派特纳(上海)机器人科技有限公司 Positioning method and system of indoor robot and cleaning robot
CN113625271A (en) * 2021-07-29 2021-11-09 中汽创智科技有限公司 Millimeter wave radar and binocular camera based simultaneous positioning and image building method
CN113625271B (en) * 2021-07-29 2023-10-27 中汽创智科技有限公司 Simultaneous positioning and mapping method based on millimeter wave radar and binocular camera
CN113359810A (en) * 2021-07-29 2021-09-07 东北大学 Unmanned aerial vehicle landing area identification method based on multiple sensors
CN113587933A (en) * 2021-07-29 2021-11-02 山东山速机器人科技有限公司 Indoor mobile robot positioning method based on branch-and-bound algorithm
CN113587933B (en) * 2021-07-29 2024-02-02 山东山速机器人科技有限公司 Indoor mobile robot positioning method based on branch-and-bound algorithm
WO2023004956A1 (en) * 2021-07-30 2023-02-02 西安交通大学 Laser slam method and system in high-dynamic environment, and device and storage medium
CN113689485A (en) * 2021-08-25 2021-11-23 北京三快在线科技有限公司 Method and device for determining depth information of unmanned aerial vehicle, unmanned aerial vehicle and storage medium
CN113820697B (en) * 2021-09-09 2024-03-26 中国电子科技集团公司第五十四研究所 Visual positioning method based on city building features and three-dimensional map
CN113820697A (en) * 2021-09-09 2021-12-21 中国电子科技集团公司第五十四研究所 Visual positioning method based on urban building characteristics and three-dimensional map
CN113947665B (en) * 2021-09-14 2022-10-28 广西大学 Method for constructing map of spherical hedge trimmer based on multi-line laser radar and monocular vision
CN113947665A (en) * 2021-09-14 2022-01-18 广西大学 Method for constructing map of spherical hedge trimmer based on multi-line laser radar and monocular vision
CN113781582A (en) * 2021-09-18 2021-12-10 四川大学 Synchronous positioning and map creating method based on laser radar and inertial navigation combined calibration
CN113781582B (en) * 2021-09-18 2023-09-19 四川大学 Synchronous positioning and map creation method based on laser radar and inertial navigation combined calibration
CN113837277A (en) * 2021-09-24 2021-12-24 东南大学 Multisource fusion SLAM system based on visual point-line feature optimization
CN113837277B (en) * 2021-09-24 2022-11-18 东南大学 Multisource fusion SLAM system based on visual point-line feature optimization
CN113947639B (en) * 2021-10-27 2023-08-18 北京斯年智驾科技有限公司 Self-adaptive online estimation calibration system and method based on multi-radar point cloud line characteristics
CN113947639A (en) * 2021-10-27 2022-01-18 北京斯年智驾科技有限公司 Self-adaptive online estimation calibration system and method based on multi-radar-point cloud line characteristics
CN114199235B (en) * 2021-11-29 2023-11-03 珠海一微半导体股份有限公司 Positioning system and positioning method based on sector depth camera
CN114199235A (en) * 2021-11-29 2022-03-18 珠海一微半导体股份有限公司 Positioning system and positioning method based on sector depth camera
CN114170280A (en) * 2021-12-09 2022-03-11 北京能创科技有限公司 Laser odometer method, system and device based on double windows
CN114170280B (en) * 2021-12-09 2023-11-28 北京能创科技有限公司 Laser odometer method, system and device based on double windows
CN114088087A (en) * 2022-01-21 2022-02-25 深圳大学 High-reliability high-precision navigation positioning method and system under unmanned aerial vehicle GPS-DENIED
CN114088087B (en) * 2022-01-21 2022-04-15 深圳大学 High-reliability high-precision navigation positioning method and system under unmanned aerial vehicle GPS-DENIED
CN114440892B (en) * 2022-01-27 2023-11-03 中国人民解放军军事科学院国防科技创新研究院 Self-positioning method based on topological map and odometer
CN114440892A (en) * 2022-01-27 2022-05-06 中国人民解放军军事科学院国防科技创新研究院 Self-positioning method based on topological map and odometer
CN115015955A (en) * 2022-05-23 2022-09-06 天津卡尔狗科技有限公司 Method, apparatus, device, storage medium and program product for determining motion information
CN115267796B (en) * 2022-08-17 2024-04-09 深圳市普渡科技有限公司 Positioning method, positioning device, robot and storage medium
CN115267796A (en) * 2022-08-17 2022-11-01 深圳市普渡科技有限公司 Positioning method, positioning device, robot and storage medium
CN115079199A (en) * 2022-08-22 2022-09-20 山东省科学院海洋仪器仪表研究所 Underwater target multi-mode information sensing system and method
CN115631319B (en) * 2022-11-02 2023-06-23 北京科技大学 Loop detection method based on cross attention network
CN115631319A (en) * 2022-11-02 2023-01-20 北京科技大学 Loopback detection method based on cross attention network
CN116299500B (en) * 2022-12-14 2024-03-15 江苏集萃清联智控科技有限公司 Laser SLAM positioning method and device integrating target detection and tracking
CN116299500A (en) * 2022-12-14 2023-06-23 江苏集萃清联智控科技有限公司 Laser SLAM positioning method and device integrating target detection and tracking
CN115631314A (en) * 2022-12-19 2023-01-20 中汽研(天津)汽车工程研究院有限公司 Point cloud map construction method based on multiple features and self-adaptive key frames
CN115797490A (en) * 2022-12-19 2023-03-14 广州宸境科技有限公司 Drawing construction method and system based on laser vision fusion
CN115631314B (en) * 2022-12-19 2023-06-09 中汽研(天津)汽车工程研究院有限公司 Point cloud map construction method based on multi-feature and self-adaptive key frames
CN116295457A (en) * 2022-12-21 2023-06-23 辉羲智能科技(上海)有限公司 Vehicle vision positioning method and system based on two-dimensional semantic map
CN116071550A (en) * 2023-02-09 2023-05-05 安徽海博智能科技有限责任公司 Laser radar dust point cloud filtering method
CN116071550B (en) * 2023-02-09 2023-10-20 安徽海博智能科技有限责任公司 Laser radar dust point cloud filtering method
CN116859360A (en) * 2023-09-04 2023-10-10 深圳安智杰科技有限公司 Motion estimation method and system based on single-frame millimeter wave Lei Dadian cloud
CN116859360B (en) * 2023-09-04 2023-11-24 深圳安智杰科技有限公司 Motion estimation method and system based on single-frame millimeter wave Lei Dadian cloud

Similar Documents

Publication Publication Date Title
CN112258600A (en) Simultaneous positioning and map construction method based on vision and laser radar
CN110223348B (en) Robot scene self-adaptive pose estimation method based on RGB-D camera
CN109166149B (en) Positioning and three-dimensional line frame structure reconstruction method and system integrating binocular camera and IMU
CN107301654B (en) Multi-sensor high-precision instant positioning and mapping method
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
Yang et al. Monocular object and plane slam in structured environments
CN107025668B (en) Design method of visual odometer based on depth camera
US11270148B2 (en) Visual SLAM method and apparatus based on point and line features
CN111076733B (en) Robot indoor map building method and system based on vision and laser slam
CN107980150B (en) Modeling three-dimensional space
Clipp et al. Parallel, real-time visual SLAM
CN108615246B (en) Method for improving robustness of visual odometer system and reducing calculation consumption of algorithm
CN110009732B (en) GMS feature matching-based three-dimensional reconstruction method for complex large-scale scene
EP3293700B1 (en) 3d reconstruction for vehicle
CN113223045A (en) Vision and IMU sensor fusion positioning system based on dynamic object semantic segmentation
CN112419497A (en) Monocular vision-based SLAM method combining feature method and direct method
CN111144213A (en) Object detection method and related equipment
CN113994376A (en) Detection, three-dimensional reconstruction and tracking of a plurality of rigid objects moving relative to one another
Ceriani et al. Pose interpolation slam for large maps using moving 3d sensors
CN111998862A (en) Dense binocular SLAM method based on BNN
CN116449384A (en) Radar inertial tight coupling positioning mapping method based on solid-state laser radar
CN113658337A (en) Multi-mode odometer method based on rut lines
CN116878501A (en) High-precision positioning and mapping system and method based on multi-sensor fusion
CN112945233A (en) Global drift-free autonomous robot simultaneous positioning and map building method
CN116468786B (en) Semantic SLAM method based on point-line combination and oriented to dynamic environment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
WW01 Invention patent application withdrawn after publication

Application publication date: 20210122