CN116299525A - Dynamic environment RGB-D vision SLAM method based on point cloud region correlation
Classifications
- G01S17/86—Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
- G01S17/89—Lidar systems specially adapted for specific applications for mapping or imaging
- G01S17/93—Lidar systems specially adapted for specific applications for anti-collision purposes
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering
- G06T2207/30241—Trajectory
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change
Abstract
A dynamic environment RGB-D vision SLAM method based on point cloud region correlation can serve as front-end preprocessing for an ORB-SLAM2/3 system, filtering out dynamic regions and thereby reducing erroneous data associations. The system takes the RGB image and depth map of the current frame as input to generate a point cloud; once a frame has been processed, it is temporarily stored in the system as a reference frame and is not processed a second time. The depth map is clustered into sub point clouds with a clustering algorithm, and the sub-pose change between each sub point cloud and the reference frame is then computed in turn. An idea of dynamic-static correlation among sub point clouds is also proposed: combining this idea with the sub-poses effectively identifies which sub point clouds are dynamic, without relying on epipolar geometric constraints and without requiring an approximate camera pose to be known in advance. The invention substantially reduces the influence of dynamic objects on robot localization and improves the robustness of the system and its generalization across scenes.
Description
Technical Field
The invention relates to the field of global positioning of robots, in particular to a dynamic environment RGB-D vision SLAM method based on point cloud region correlation.
Background
Visual simultaneous localization and mapping (SLAM) is now widely studied, and as the core of localization it is being applied to a variety of devices such as aids for the blind, augmented reality, robots, and autonomous driving. Over the past two decades the overall framework of visual SLAM has gradually matured, and many advanced visual positioning systems with high accuracy and robustness have been proposed. These vision-based systems can estimate the camera's own motion solely from successive images of the surrounding environment and generate a continuous trajectory representing the camera's motion. However, to simplify the estimation problem, most of these systems are designed under the assumption that the surrounding environment is static. If dynamic objects exist in the environment, irregularly moving landmarks are introduced into the observations, so the system cannot distinguish whether it is itself moving or whether objects in the environment are moving, which ultimately leads to inaccurate positioning and severe deviations in map construction. To address the poor performance of visual SLAM in dynamic environments, researchers have taken a number of measures: for example, patent CN115393538A extracts features from the image with a GCNv2-based algorithm and removes dynamic feature points with an optical-flow dynamic point removal algorithm, while patent CN114283198A detects dynamic targets in the image with the YOLOv5 object detection network and, combined with image depth information, achieves pixel-level segmentation. These existing methods are based on deep learning; however, deep-learning methods need to know in advance which objects move and which objects appear in the device's field of view, and the inference models are computationally expensive, making them difficult to apply in low-cost, real-time applications.
At present, robot localization in dynamic environments has the following deficiencies: 1) Unknown feature points can only be further identified and judged on the basis of pose estimation information, and in severe environments this can directly cause the system to fail. 2) Deep-learning methods need to know in advance which objects move and which objects appear in the device's field of view; in addition, the inference models are computationally expensive and are difficult to apply in low-cost, real-time applications. 3) Geometry-based methods are prone to false negatives (marking static regions as dynamic). 4) Dynamic objects are unavoidable in real environments, and their presence greatly degrades the positioning accuracy of the device; this is one of the main problems hindering the further development of visual SLAM.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention provides a dynamic environment RGB-D vision SLAM method based on point cloud region correlation, which improves positioning accuracy and robustness in dynamic real-world environments. The invention is an independent segmentation method that does not need to consider the motion of the camera and dispenses with the projective transformation that most current algorithms rely on; in addition, dynamic regions can be identified without training samples or prior information about dynamic objects.
In order to achieve the above purpose, the present invention provides the following technical solutions: a dynamic environment RGB-D vision SLAM method based on point cloud region correlation comprises the following steps:
s1: performing two-dimensional plane downsampling on the RGB image and generating a point cloud by combining a depth map;
the specific step of generating the point cloud comprises the following steps:
s11: and carrying out downsampling on a two-dimensional plane on the gray level diagram to generate a point cloud by simulating a data acquisition form of the rotary laser radar on the current frame so as to accelerate construction of the KD tree.
S12: the downsampled point cloud is rasterized by a grid with a side length of l, and an ith grid G is calculated i Gray average value of all points in the gridIntensity value of center of gravity:
S2: minimizing the point cloud generated in the step S1 into sub point clouds;
the generating of the sub-point cloud specifically comprises the following steps:
s21: and removing the depth unstable region, and setting the depth value of a certain point in the depth image to be 0 when the depth value is larger than a certain value according to the noise model of the depth camera and the empirical value obtained in the experiment.
S22: since the point cloud obtained by the original depth of the RGBD camera has a multi-layer structure or a cavity, the point cloud obtained by the S1 is smoothed by using the least square of the movement.
S23: and clustering the point clouds by using a K-Means clustering algorithm to obtain K sub-point clouds.
S3: setting the previous frames of the current frame as reference frames;
s4: an accurate transformation matrix is obtained by adopting a method of fusion of gray level auxiliary nearest neighbor iteration and traditional nearest neighbor iteration, comprising the following steps: carrying out gray nearest neighbor iteration on each sub-point cloud of the S2 and the target point cloud of the reference frame to calculate a rough transformation matrix and a corresponding point, and then calculating an accurate transformation matrix by utilizing the conventional nearest neighbor iteration to register the point cloud;
the intensity-assisted nearest neighbor iterative registration method specifically comprises the following steps:
s41: estimating and predicting initial pose according to uniform motion model of ORBSLAM2Providing the initial pose can effectively prevent the sinking into the locally optimal solution and accelerate the iteration speed:
t in (2) (·) Are all T in the world coordinate system pre Representing the increment of pose transformation between the last two common frames, T last Representing the camera pose of the previous frame, T ref Pose of reference frame.
S42: according to the initial poseSearch for sub-point cloud->Alpha closest points closest to the target point cloud as candidate corresponding points +.>
S43: calculating candidate pointsSelecting the highest score S (i, k) as the corresponding point +.>
Delta T represents the incremental matrix for each iteration, r G Representing geometric residual error, r I And (×) represents the intensity residual, w is the weight conversion function.
S431: wherein the weight transformation function of S43 is assumed herein to obey the t-distribution, which weight transformation function can be expressed as:
v represents the degree of freedom of the distribution, μ (*) Mean value representing geometric or intensity residual, and the same thing sigma (*) Represented as variance.
S44: and S4, calculating the pairing points of the points, and calculating a gain matrix by using the pairing points to ensure that the total Euclidean distance sum is minimum:
s45: updating transformation matrix T * ←△TT * 。
S5: transforming the source point clouds according to the transformation matrix calculated in the step S4, and estimating the degree of correlation between every two sub point clouds;
the correlation degree calculation includes:
s51: find the i-th pointIs the closest point P of (2) tgt (j) Besides, the Euclidean distance is calculated as a main index, an additional intensity value is used as a punishment item, and the score of all points is accumulated and averaged to be used as a correlation degree score:
pose transformation matrix T for representing source point cloud passing through jth sub point cloud Bj In the point cloud obtained later->Ith point, ++>Representation dot->Intensity values of (2).
S6: calculating a threshold divides all sub-point clouds into two groups: a dynamic correlation group and a static correlation group;
s61: a correlation degree score { R (j) } of the sub-point cloud calculated according to S5 j∈K Setting a threshold epsilon as a zone static sub-point cloud, the threshold can be determined by the following formula:
s62: according to the threshold value, which point clouds belong to the static point clouds can be effectively judged:
s represents a static sub-point cloud set, D belongs to a dynamic sub-point cloud set, and all point clouds in the dynamic sub-point cloud set are projected onto a camera plane to generate a mask.
S7: projecting the point cloud of the dynamic correlation group to a camera plane, drawing a circle by taking the point cloud as a center, and filling the point cloud by morphological geometry to finally form a required pixel mask;
s8: the feature points in the mask are not added into pose estimation of the camera;
s9: the processed frame is stored in the system as a reference frame and is used as an alternative frame for updating the reference frame;
s10: and refining the mask generated by the key frames by using the historical observation data to generate more stable map points.
S101: When a key frame is generated, take the previous N frames (not all historical data) as historical observation data, and estimate the pixel weight by the following formula:
I_i(x, y) denotes the pixel at coordinate (x, y) in the i-th observation and ρ is the weight assigned to the pixel; a pixel whose accumulated weight exceeds λ times the total observation weight is regarded as a dynamic pixel:
S102: Since feature points are easily generated at object edges but these edge points are extremely unstable, the mask generated for the refined key frame is appropriately dilated so that the edge points do not participate in the camera pose estimation process.
The invention can be used as front-end preprocessing for an ORB-SLAM2/3 system to filter out dynamic regions, thereby reducing erroneous data associations. The system takes the RGB image and depth map of the current frame as input to generate a point cloud; once a frame has been processed, it is temporarily stored in the system as a reference frame and is not processed a second time. K-Means clustering is used to divide the depth map into sub point clouds, and the sub-pose change between each sub point cloud and the reference frame is then computed in turn. The invention further proposes the idea of dynamic-static correlation among sub point clouds; combining this idea with the sub-poses effectively identifies which sub point clouds are dynamic, without depending on epipolar geometric constraints and without requiring an approximate camera pose to be known in advance. The invention substantially reduces the influence of dynamic objects on robot localization and improves the robustness of the system and its generalization across scenes.
Compared with the prior art, the invention has the beneficial effects that:
the invention discloses a dynamic environment RGB-D vision SLAM method based on the correlation of point cloud areas, which solves the problem that the traditional dynamic vision SLAM can judge the dynamic areas only by considering the movement of a camera, and in addition, the prior knowledge of training samples and dynamic objects is not needed. Meanwhile, the positioning accuracy and the robustness of the visual SLAM in a dynamic environment are improved.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
fig. 2a to 2b are schematic diagrams of detection effects of a dynamic region according to an embodiment of the disclosure, where fig. 2a is an original image and fig. 2b is a schematic diagram of detection results of the dynamic region;
FIGS. 3 a-3 b are schematic diagrams of the present invention in comparison to a true trajectory and a trajectory of the ORB-SLAM2 algorithm on a TUM dataset, wherein FIG. 3a is a translational movement of the camera along the X-Y-Z axis, FIG. 3b is a movement of the camera along a hemisphere with a radius of 1 meter, a blue trajectory is a positioning result trajectory of the present invention, a dashed line is a movement trajectory of the true camera, and a green trajectory is a positioning result trajectory of the ORB-SLAM 2;
FIGS. 4a to 4c are diagrams illustrating the effect of keyframe refinement in accordance with embodiments of the present disclosure, wherein FIG. 4a is a sequence of historical observation data frames, FIG. 4b is the original keyframe prior to refinement, and FIG. 4c is the keyframe after refinement;
fig. 5a to 5b are schematic views of the dynamic region detection effect under an actual scene according to the disclosed embodiment of the invention, wherein fig. 5a is an original image, and fig. 5b is a schematic view of the dynamic region detection result.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings. It should be understood that the examples are presented for the purpose of illustrating the invention only and are not intended to limit the invention.
The flow chart of the dynamic environment RGB-D vision SLAM method based on point cloud region correlation disclosed by the invention is shown in FIG. 1; the method comprises the following steps.
S1: performing two-dimensional plane downsampling on the RGB image and generating a point cloud by combining a depth map;
the specific steps of the point cloud generation are as follows:
s11: and (3) carrying out data acquisition form of simulating a rotary laser radar on the current frame, extracting a gray level image at intervals of 4 pixels and 4 pixels, and generating a point cloud by combining a depth image, so as to realize downsampling on a two-dimensional plane and accelerate construction of a KD tree.
S12: the downsampled point cloud is rasterized by a grid with a side length of 0.05, and an ith grid G is calculated i Gray average value of all points in the grid, and the gray average value is taken as the intensity value of the center of gravity of the grid:
S2: minimizing the point cloud generated in the step S1 into sub point clouds;
the specific steps of generating the sub-point cloud are as follows:
s21: and removing the depth unstable region, and setting the depth value of a certain point in the depth image to be 0 when the depth value is larger than 6 meters according to the noise model of the depth camera and the empirical value obtained in the experiment.
S22: since the point cloud obtained by the original depth of the RGBD camera has a multi-layer structure or a cavity, the point cloud obtained by the S1 is smoothed by using the least square of the movement.
S23: and clustering the point clouds by using a K-Means clustering algorithm to obtain K=10 sub-point clouds.
S3: setting the first 3 frames of the current frame as reference frames;
s4: an accurate transformation matrix is obtained by adopting a method of fusion of gray level auxiliary nearest neighbor iteration and traditional nearest neighbor iteration, comprising the following steps: performing 5 gray nearest neighbor iterations on each sub-point cloud of the S2 and the target point cloud of the reference frame to calculate a rough transformation matrix and a corresponding point, and performing 5 iteration to point cloud registration by using the traditional nearest neighbor iterations to calculate an accurate transformation matrix;
the intensity-assisted nearest neighbor iterative registration method specifically comprises the following steps:
s41: estimating and predicting initial pose according to uniform motion model of ORBSLAM2Providing the initial pose can effectively prevent the sinking into the locally optimal solution and accelerate the iteration speed:
t in (2) (·) Are all T in the world coordinate system pre Representing the increment of pose transformation between the last two common frames, T last Representing the camera pose of the previous frame, T ref Pose of reference frame.
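The prediction formula itself appears only as an image in the original, so the sketch below is an assumed constant-velocity composition of the symbols defined above (T_pre, T_last, T_ref as 4x4 homogeneous matrices); the composition order is a guess, not the patent's equation:

```python
import numpy as np

def predict_initial_pose(T_pre, T_last, T_ref):
    """Assumed constant-velocity prediction: extrapolate the last inter-frame
    motion to the current frame, then express the result relative to the
    reference frame to obtain the initial registration guess."""
    T_curr = T_pre @ T_last                      # predicted current camera pose
    return np.linalg.inv(T_ref) @ T_curr         # current pose seen from the reference frame
```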
S42: according to the initial poseSearch for sub-point cloud->α=10 closest points closest to the target point cloud as candidate corresponding points +.>
S43: calculating candidate pointsSelecting the highest score S (i, k) as the corresponding point +.>
Delta T represents the incremental matrix for each iteration, r G Representing geometric residual error, r I And (×) represents the intensity residual, w is the weight conversion function.
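A sketch of the candidate search in S42, assuming a KD-tree over the target cloud (scipy) and α = 10 nearest neighbours per source point:

```python
import numpy as np
from scipy.spatial import cKDTree

def candidate_correspondences(src_pts, T_init, tgt_pts, alpha=10):
    """S42: transform the sub cloud with the initial pose and, for every point,
    fetch the alpha nearest target points as candidate correspondences."""
    src_h = np.hstack([src_pts, np.ones((len(src_pts), 1))])
    src_w = (T_init @ src_h.T).T[:, :3]
    dists, idx = cKDTree(tgt_pts).query(src_w, k=alpha)   # shape (N, alpha)
    return src_w, dists, idx
```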
S431: The weight conversion function in S43 is assumed to follow a t-distribution and can be expressed as:
ν denotes the degrees of freedom of the distribution and is set to 5 in the experiments; μ_(*) denotes the mean of the geometric or intensity residual, and likewise σ_(*) denotes its variance.
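The exact weight expression is shown only as an image; the sketch below assumes a standard Student-t robust weight built from the mean and variance defined above:

```python
def t_weight(r, mu, sigma, nu=5.0):
    """Assumed Student-t style weight: residuals near the mean keep a weight
    close to 1, while outliers are progressively down-weighted."""
    z = (r - mu) / max(sigma, 1e-9)
    return (nu + 1.0) / (nu + z * z)
```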
S44: Using the corresponding points computed above, calculate the incremental (gain) matrix so that the total sum of Euclidean distances between the pairs is minimized:
S45: Update the transformation matrix: T* ← ΔT·T*.
S5: transforming the source point clouds according to the transformation matrix calculated in the step S4, and estimating the degree of correlation between every two sub point clouds;
the correlation degree calculation includes:
find the i-th pointIs the closest point P of (2) tgt (j) Besides, the Euclidean distance is calculated as a main index, an additional intensity value is used as a punishment item, and the score of all points is accumulated and averaged to be used as a correlation degree score:
pose transformation matrix representing source point cloud passing through jth sub point cloud>In the point cloud obtained later->Ith point, ++>Representation dot->Intensity values of (2).
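A sketch of the correlation degree score of S5; the relative weighting `beta` between the distance term and the intensity penalty is an assumption, since the scoring formula is given only as an image:

```python
import numpy as np
from scipy.spatial import cKDTree

def correlation_score(src_pts, src_intensity, T_Bj, tgt_pts, tgt_intensity, beta=1.0):
    """Transform the source cloud with the j-th sub cloud's pose T_Bj, use the
    nearest-neighbour Euclidean distance as the main term plus an intensity
    difference penalty, and average the per-point scores."""
    src_h = np.hstack([src_pts, np.ones((len(src_pts), 1))])
    moved = (T_Bj @ src_h.T).T[:, :3]
    d, j = cKDTree(tgt_pts).query(moved, k=1)
    return float(np.mean(d + beta * np.abs(src_intensity - tgt_intensity[j])))
```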
S6: calculating a threshold divides all sub-point clouds into two groups: a dynamic correlation group and a static correlation group;
s61: a correlation degree score { R (j) } of the sub-point cloud calculated according to S5 j∈K Setting a threshold epsilon as a zone static sub-point cloud, the threshold can be determined by the following formula:
s62: according to the threshold value, which point clouds belong to the static point clouds can be effectively judged:
s represents a static sub-point cloud set, D belongs to a dynamic sub-point cloud set, and all point clouds in the dynamic sub-point cloud set are projected onto a camera plane to generate a mask.
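Because the threshold formula is shown only as an image, the sketch below uses a simple stand-in rule (mean plus one standard deviation of the scores) to illustrate the grouping of S61-S62:

```python
import numpy as np

def split_dynamic_static(scores):
    """S61/S62 sketch: sub clouds whose correlation score exceeds the (assumed)
    threshold are treated as dynamic, the rest as static."""
    scores = np.asarray(scores, dtype=np.float64)
    eps = scores.mean() + scores.std()           # assumed threshold rule
    return np.where(scores <= eps)[0], np.where(scores > eps)[0]   # static ids, dynamic ids
```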
S7: projecting the point cloud of the dynamic correlation group to a camera plane, drawing a circle by taking the point cloud as a center, and filling the point cloud by morphological geometry to finally form a required pixel mask;
s8: the feature points in the mask are not added into pose estimation of the camera;
s9: the processed frames are stored in a system and used as alternative frames updated by reference frames;
s10: and refining the mask generated by the key frames by using the historical observation data to generate more stable map points.
S101: When a key frame is generated, the previous N = 7 frames (not all historical data) are taken as historical observation data, and the pixel weight is estimated by the following formula:
I_i(x, y) denotes the pixel at coordinate (x, y) in the i-th observation and ρ is the weight assigned to the pixel. In the experiments, ρ = 3 for key frames, ρ = 2 for ordinary frames, and N is set to 7. A pixel whose accumulated weight exceeds λ = 0.6 times the total observation weight is regarded as a dynamic pixel:
S102: Since feature points are easily generated at object edges but these edge points are extremely unstable, the mask generated for the refined key frame is appropriately dilated so that the edge points do not participate in the camera pose estimation process.
Fig. 2a and 2b show dynamic region identification for a pedestrian on the TUM dataset; the pedestrian is well identified by the present invention and the result is little affected by camera rotation. Fig. 3a and 3b compare the absolute positioning accuracy of the present invention against the ground-truth trajectory and the trajectory of the ORB-SLAM2 algorithm: the trajectory of the present invention closely fits the ground truth, whereas the trajectory of ORB-SLAM2 deviates severely, which demonstrates that the present invention improves the positioning accuracy and robustness of a SLAM system in dynamic environments. Fig. 4a to 4c are visual effect diagrams of the key frame refinement: the original key frame of Fig. 4b is filtered by the historical observation data frame sequence of Fig. 4a, finally yielding the refined key frame of Fig. 4c. Since the mask area of the refined key frame is greatly reduced, the influence of false negatives is effectively reduced and the robustness of the system is improved. Fig. 5a and 5b visualize dynamic region identification in a real environment, which fully demonstrates the reliability and practicality of the present invention.
It should be emphasized that the embodiments described herein are illustrative rather than limiting, and that this invention includes, by way of example, but is not limited to the specific embodiments described herein, as other embodiments similar thereto will occur to those of skill in the art based upon the teachings herein and fall within the scope of this invention.
Claims (8)
1. The dynamic environment RGB-D vision SLAM method based on point cloud region correlation is characterized by comprising the following steps:
S1: performing two-dimensional plane downsampling on the RGB image and generating a point cloud by combining a depth map;
S2: dividing the point cloud generated in step S1 into sub point clouds;
S3: setting the frames preceding the current frame as reference frames;
S4: obtaining an accurate transformation matrix by fusing intensity-assisted nearest-neighbor iteration with conventional nearest-neighbor iteration, comprising: performing intensity-assisted nearest-neighbor iterations between each sub point cloud of S2 and the target point cloud of the reference frame to compute a coarse transformation matrix and corresponding points, then computing an accurate transformation matrix with conventional nearest-neighbor iterations to register the point clouds;
S5: transforming the source point clouds according to the transformation matrix calculated in step S4, and estimating the degree of correlation between every two sub point clouds;
S6: calculating a threshold to divide all sub point clouds into two groups: a dynamic correlation group and a static correlation group;
S7: projecting the point cloud of the dynamic correlation group onto the camera plane, drawing a circle centered at each projected point, and filling by morphological operations to finally form the required pixel mask;
S8: excluding the feature points inside the mask from the pose estimation of the camera;
S9: storing the processed frame in the system as a candidate frame for updating the reference frame;
S10: refining the mask generated for key frames by using historical observation data to generate more stable map points.
2. The dynamic environment RGB-D visual SLAM method based on point cloud region correlation of claim 1, wherein the point cloud generation in step S1 comprises:
S11: simulating the data acquisition pattern of a rotating laser radar on the current frame and downsampling the grayscale image on the two-dimensional plane to generate a point cloud, so as to accelerate construction of the KD-tree;
S12: rasterizing the downsampled point cloud with a grid of side length l, computing the gray-value average of all points in the i-th cell G_i, and taking this average as the intensity value of the cell's center of gravity:
3. The dynamic environment RGB-D visual SLAM method based on point cloud region correlation of claim 1, wherein generating the sub point clouds in step S2 comprises:
S21: removing depth-unstable regions: based on the noise model of the depth camera and empirical values obtained in experiments, setting the depth value of a point in the depth image to 0 when it exceeds a given threshold;
S22: since the point cloud obtained from the raw depth of the RGB-D camera has a multi-layer structure or holes, smoothing the point cloud obtained in S1 using moving least squares;
S23: clustering the point cloud with the K-Means clustering algorithm to obtain K sub point clouds.
4. The dynamic environment RGB-D vision SLAM method based on point cloud region correlation of claim 1, wherein the intensity-assisted nearest-neighbor iterative computation of the transformation matrix in step S4 comprises:
S41: estimating a predicted initial pose according to the uniform (constant-velocity) motion model of ORB-SLAM2; providing the initial pose effectively prevents falling into a local optimum and speeds up the iteration:
all poses T_(·) are expressed in the world coordinate system; T_pre denotes the pose transformation increment between the last two ordinary frames, T_last denotes the camera pose of the previous frame, and T_ref denotes the pose of the reference frame;
S42: according to the initial pose, searching, for each point of the sub point cloud, the α points of the target point cloud closest to it as candidate corresponding points;
S43: computing the score of each candidate point and selecting the one with the highest score S(i, k) as the corresponding point;
ΔT denotes the incremental matrix of each iteration, r_G(·) denotes the geometric residual, r_I(·) denotes the intensity residual, and w is the weight conversion function;
S431: the weight conversion function in S43 is assumed to follow a t-distribution and can be expressed as:
ν denotes the degrees of freedom of the distribution, μ_(*) denotes the mean of the geometric or intensity residual, and likewise σ_(*) denotes its variance;
S44: using the corresponding points computed above, calculating the incremental (gain) matrix so that the total sum of Euclidean distances between the pairs is minimized:
S45: updating the transformation matrix T* ← ΔT·T*.
5. The dynamic environment RGB-D visual SLAM method based on point cloud region correlation of claim 1, wherein the calculating of the degree of correlation between sub point clouds in step S5 comprises:
finding, for the i-th point, its closest point P_tgt(j) in the target point cloud; computing the Euclidean distance as the main index, using the intensity value as an additional penalty term, and accumulating and averaging the scores of all points as the correlation degree score:
6. The dynamic environment RGB-D vision SLAM method based on point cloud region correlation of claim 1, wherein calculating the threshold that separates the dynamic and static correlation groups in step S6 comprises:
S61: from the correlation degree scores {R(j)}_{j∈K} of the sub point clouds computed in S5, setting a threshold ε to separate dynamic from static sub point clouds, the threshold being determined by the following formula:
S62: judging which sub point clouds belong to the static point clouds according to the threshold:
S denotes the static sub point cloud set and D denotes the dynamic sub point cloud set; all points belonging to set D are projected onto the camera plane to generate a mask.
7. The method of claim 1, wherein in step S7, all points of the dynamic correlation group are projected onto the camera plane, a circle is drawn centered at each projected point, and the circles are filled by morphological operations to finally form the required pixel mask.
8. The dynamic environment RGB-D vision SLAM method based on point cloud region correlation of claim 1, wherein step S10 comprises:
S101: when a key frame is generated, taking the previous N frames (not all historical data) as historical observation data, and estimating the pixel weight by the following formula:
I_i(x, y) denotes the pixel at coordinate (x, y) in the i-th observation and ρ is the weight assigned to the pixel; a pixel whose accumulated weight exceeds λ times the total observation weight is regarded as a dynamic pixel:
S102: since feature points are easily generated at object edges but these edge points are unstable, the mask generated for the refined key frame is dilated so that the edge points do not participate in the camera pose estimation process.
Priority application: CN202310176750.3A, filed 2023-02-28 (CN); publication CN116299525A, published 2023-06-23; legal status: pending.