CN110490900B - Binocular vision positioning method and system under dynamic environment - Google Patents
- Publication number: CN110490900B (application CN201910634021.1A)
- Authority
- CN
- China
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06F18/23: Pattern recognition; analysing; clustering techniques
- G06T7/246: Image analysis; analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/593: Image analysis; depth or shape recovery from multiple images, from stereo images
- G06T7/73: Image analysis; determining position or orientation of objects or cameras using feature-based methods
- G06T2207/10004: Image acquisition modality; still image, photographic image
- G06T2207/10012: Image acquisition modality; stereo images
Abstract
A binocular vision positioning method and system for dynamic environments, applied in the technical field of robot positioning. Feature points are extracted from the image, and the feature points are clustered to distinguish those on moving objects from those on non-moving objects. This approach effectively rejects the feature points on moving objects while avoiding the rejection of feature points on currently static objects of dynamic-attribute type, thereby retaining more usable feature points. The inter-frame pose transformation matrix obtained by an IMU pre-integration algorithm is used to screen the poses estimated from the different feature point sets for the one closest to the IMU data, thereby effectively rejecting outliers such as feature points belonging to moving objects. A pose and depth estimation method fusing monocular motion vision and binocular stereo vision further improves the estimation accuracy.
Description
Technical Field
The invention relates to the technical field of robot positioning, in particular to a binocular vision positioning method and system in a dynamic environment.
Background
Traditional GPS positioning depends heavily on satellite signals: its accuracy drops sharply in environments with dense buildings and severe occlusion, and it is essentially unusable indoors. Inertial navigation based on an inertial navigation module (gyroscope and accelerometer) does not depend on external information; the current pose of the carrier is obtained by integrating the inertial measurements, but the positioning error grows with time, and the accumulated error makes long-term accuracy poor. Vision-based positioning determines the robot pose by analyzing an image sequence. It can be used indoors and in other occluded environments, and its rich visual information also allows closed-loop detection to eliminate accumulated error and achieve long-term high-precision positioning. In the robotics field, vision-based positioning is mainly visual Simultaneous Localization and Mapping (SLAM), visual SLAM for short; visual SLAM without map building is also called visual odometry, and both are referred to collectively herein as visual SLAM. Depending on the sensor, current visual SLAM divides into monocular SLAM, binocular SLAM and RGB-D SLAM. Monocular SLAM suffers from scale ambiguity, while RGB-D SLAM is costly and of low accuracy; binocular SLAM is therefore adopted here, since it avoids the monocular scale ambiguity at a relatively low cost.
At present, most SLAM positioning techniques rest on a static-environment assumption, yet most real environments are complex dynamic scenes, in which the accuracy of SLAM algorithms based on the static assumption degrades sharply. In recent years, some semantics-based dynamic-scene SLAM techniques (semantic SLAM for short) have been proposed and used. Semantic SLAM detects object classes with dynamic attributes in the scene through deep learning or other computer vision methods, and removes the objects of those dynamic-attribute classes before pose estimation is performed on the image information, thereby reducing the influence of dynamic objects on pose estimation.
However, this method has the following disadvantages. (1) Image detection is computationally expensive, while positioning has strict real-time requirements; adding deep-learning inference and similar stages to a SLAM system severely impacts real-time performance. (2) In a SLAM system, and especially in a visual odometer without an environment map, what degrades pose-estimation accuracy during image-based pose estimation is not objects of a dynamic type but objects that are actually moving. A car parked in a parking lot, for example, belongs to a dynamic-attribute class, yet it may stay in the lot for a long time; at least for the current image it does not disturb pose estimation, and it supplies additional feature information. In feature-sparse scenes in particular, such temporarily static dynamic objects provide more features and increase pose-estimation accuracy.
Disclosure of Invention
The invention mainly aims to provide a binocular vision positioning method and system in a dynamic environment.
In order to achieve the above object, a first aspect of the embodiments of the present invention provides a binocular vision positioning method in a dynamic environment, including:
acquiring a left eye image and a right eye image under binocular vision of a camera, and extracting feature points corresponding to map points in a current frame to which the left eye image and the right eye image belong and a previous frame to which the left eye image and the right eye image belong, wherein the map points are feature points added into a map;
according to the characteristic points, obtaining the left-eye and right-eye movement visual matching characteristic point pairs and stereoscopic vision matching characteristic point pairs;
screening out the common matching feature point pairs shared by the stereoscopic vision matching feature point pairs and the motion vision matching feature point pairs;
respectively carrying out stereo depth estimation on the characteristic point pairs in the common matching characteristic point pair set to obtain a stereo depth estimation value;
clustering the left-eye and right-eye motion visual matching feature point pairs according to the optical flow velocity and the rotation angle between the motion visual matching feature point pairs by using a K-means++ clustering algorithm to obtain clustering results, and classifying the motion visual matching feature point pairs into a plurality of categories according to the clustering results;
calculating pose estimation results of the IMU of the left eye and the right eye in a time period between a current frame and a previous frame by utilizing an IMU pre-integration algorithm, wherein the pose estimation results comprise inter-frame poses between the current frame and the previous frame;
respectively carrying out EPnP (Efficient Perspective-n-Point) pose estimation and map point estimation on the motion visual matching feature point pairs under each category to obtain an EPnP pose estimation result and a map point estimation result;
selecting the pose estimation result of the EPNP closest to the pose estimation result of the IMU as an optimal EPNP pose estimation result and a map point estimation result, and recording effective matching feature point pairs corresponding to the optimal EPNP pose estimation result and the map point estimation result;
and selecting a characteristic point pair corresponding to the effective matching characteristic point pair from the common matching characteristic point pair, and performing Kalman filtering fusion on a stereo depth estimation value corresponding to the characteristic point pair and an EPNP estimated depth value to obtain an estimated pose and an estimated map point.
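The final step above fuses the stereo depth and the EPnP-estimated depth of each map point by Kalman filtering. A minimal one-dimensional sketch, assuming each depth estimate comes with a known variance (the variance values below are illustrative parameters, not taken from the patent):

```python
def fuse_depths(d_stereo, var_stereo, d_epnp, var_epnp):
    """One-step Kalman (inverse-variance) fusion of two depth
    estimates of the same map point."""
    k = var_stereo / (var_stereo + var_epnp)   # Kalman gain
    d_fused = d_stereo + k * (d_epnp - d_stereo)
    var_fused = (1.0 - k) * var_stereo         # fused variance is smaller than either input
    return d_fused, var_fused
```

For example, 2.0 m and 2.2 m estimates with equal variance fuse to 2.1 m with half the variance.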
Further, the IMU pre-integration algorithm:
$$\alpha_{c_{k+1}}^{c_k}=\iint_{t\in[t_k,\,t_{k+1}]} R_t^{c_k}\left(a_t-b_{a_t}-n_a\right)\,dt^2,\qquad \beta_{c_{k+1}}^{c_k}=\int_{t\in[t_k,\,t_{k+1}]} R_t^{c_k}\left(a_t-b_{a_t}-n_a\right)\,dt,\qquad \gamma_{c_{k+1}}^{c_k}=\int_{t\in[t_k,\,t_{k+1}]} \tfrac{1}{2}\,\Omega\!\left(w_t-b_{w_t}-n_w\right)\gamma_t^{c_k}\,dt$$

wherein $\alpha_{c_{k+1}}^{c_k}$, $\beta_{c_{k+1}}^{c_k}$ and $\gamma_{c_{k+1}}^{c_k}$ respectively represent the displacement increment, the velocity increment, and the quaternion of the rotation-angle change over the period from the k-th frame to the (k+1)-th frame in the k-th frame camera coordinate system; $a_t$ represents the accelerometer measurement and $w_t$ the angular-velocity measurement; $b_{a_t}$ and $b_{w_t}$ represent the biases of the accelerometer and the gyroscope, respectively, and $n_a$ and $n_w$ the respective noises; $R_t^{c_k}$ is the rotation transformation matrix of time $t$ relative to the k-th frame camera coordinate system, obtained from the quaternion $\gamma_t^{c_k}$; $\Omega$ is the matrix related to the quaternion calculation; $\Delta t_k$ represents the time interval from $t_k$ to $t_{k+1}$; and $g^w$ represents the gravity vector in the world coordinate system ($\Delta t_k$ and $g^w$ enter when the increments are applied to update the carrier pose).
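As an illustration of how the pre-integrated increments can be accumulated from raw IMU samples, here is a simple Euler-integration sketch in numpy. It ignores the noise terms and gravity compensation (handled elsewhere in the formulation), assumes a constant sample interval `dt`, and represents the rotation increment as a matrix rather than a quaternion:

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(w):
    """Rodrigues formula: rotation matrix for rotation vector w."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    K = skew(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def preintegrate(accels, gyros, dt, ba, bw):
    """Accumulate displacement (alpha), velocity (beta) and rotation
    increments in the frame-k camera frame by Euler integration."""
    alpha, beta, R = np.zeros(3), np.zeros(3), np.eye(3)
    for a_t, w_t in zip(accels, gyros):
        a = R @ (a_t - ba)                    # bias-corrected accel, rotated into frame k
        alpha = alpha + beta * dt + 0.5 * a * dt ** 2
        beta = beta + a * dt
        R = R @ so3_exp((w_t - bw) * dt)      # integrate gyro into the rotation increment
    return alpha, beta, R
```

With constant acceleration a, no rotation and zero biases, beta converges to a*T and alpha to a*T^2/2 over a window of length T.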
Further, after the classifying the moving visual matching feature point pairs into a plurality of categories according to the clustering result, the method includes:
performing histogram statistics on the number of the feature points in each category;
and deleting the categories and the corresponding feature points under the categories, wherein the number of the corresponding feature points under each category is less than a first preset number, and the difference between the number of the corresponding feature points under each category and the number of the corresponding feature points under other categories is greater than a second preset number.
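A sketch of the histogram-based pruning above; `min_count` and `max_gap` stand in for the first and second preset numbers, which the patent leaves as design parameters:

```python
from collections import Counter

def prune_clusters(labels, min_count=8, max_gap=40):
    """Drop clusters whose point count is below min_count AND falls
    more than max_gap below every other cluster's count; return the
    indices of the feature points that survive."""
    hist = Counter(labels)
    keep = set()
    for c, n in hist.items():
        others = [m for k, m in hist.items() if k != c]
        small = n < min_count
        isolated = bool(others) and all(m - n > max_gap for m in others)
        if not (small and isolated):
            keep.add(c)
    return [i for i, c in enumerate(labels) if c in keep]
```

Tiny clusters far below the rest typically correspond to isolated mismatches rather than a coherent rigid motion, which motivates discarding them before per-category pose estimation.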
Further, the respectively performing EPnP pose estimation and map point estimation on the motion visual matching feature point pairs under each category to obtain an EPnP pose estimation result and a map point estimation result includes:
respectively carrying out EPnP pose estimation and map point estimation on the feature points under each category, and performing outlier elimination on the pose estimation result of each category through the RANSAC (random sample consensus) algorithm, so as to obtain the EPnP pose estimation result and the map point estimation result corresponding to each category.
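The per-category EPnP results are subsequently screened against the IMU pre-integration pose, keeping the closest one. A minimal numpy sketch of that comparison, assuming 4x4 homogeneous pose matrices; the rotation weight is an assumed tuning parameter:

```python
import numpy as np

def pose_distance(T_a, T_b, w_rot=1.0):
    """Distance between two 4x4 poses: translation gap plus the
    weighted relative rotation angle."""
    dt = np.linalg.norm(T_a[:3, 3] - T_b[:3, 3])
    R = T_a[:3, :3].T @ T_b[:3, :3]
    c = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return dt + w_rot * np.arccos(c)

def select_best_pose(epnp_poses, T_imu):
    """Index of the per-cluster EPnP pose closest to the IMU prediction."""
    return min(range(len(epnp_poses)),
               key=lambda i: pose_distance(epnp_poses[i], T_imu))
```

The cluster whose estimated motion agrees with the IMU is taken as the static background; clusters whose poses disagree are treated as moving objects and rejected.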
Further, after obtaining the estimated pose and the estimated map point, the method includes:
optimizing the estimated pose and the estimated map points by utilizing a map-based optimization algorithm according to the inter-frame pose of the IMU to obtain optimized map points;
the optimization algorithm of the graph is as follows:
wherein $\hat{T}$ represents the estimated pose transformation matrix obtained by EPnP pose estimation, $\tilde{T}$ represents the pose transformation matrix obtained by the IMU pre-integration algorithm, $z_{k+1}$ represents the pixel coordinates in the camera image, $P_k$ represents the 3D coordinates of a map point in the k-th frame camera coordinate system, and $\pi_c(\cdot)$ is the projective transformation based on the camera model, used for transforming 3D coordinates in the camera coordinate system into 2D pixel coordinates in the camera image.
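The optimization expression itself is not reproduced in this text. A plausible form consistent with the symbols above, combining the reprojection error with a pose prior from the IMU pre-integration (the exact error terms and weighting in the patent may differ), is:

$$\min_{T,\;P_k}\;\sum_{k}\left\|z_{k+1}-\pi_c\!\left(T\,P_k\right)\right\|^{2}+\left\|\log\!\left(\tilde{T}^{-1}\,T\right)^{\vee}\right\|^{2}$$

where $T$ is the optimized inter-frame pose transformation and $\log(\cdot)^{\vee}$ maps the relative pose error to a 6-vector.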
Further, after the optimizing the estimated pose and the estimated map point to obtain the optimized map point, the method includes:
replacing the map points before optimization by the optimized map points, updating the local map, and eliminating map points which do not belong to the local map;
comparing whether the number of covisible feature points between the current frame and the newest key frame is higher than a preset threshold;
if it is higher than the preset threshold, replacing the newest key frame with the current frame;
and if it is lower than the preset threshold, adding the current frame into the key frame set.
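The key-frame bookkeeping above reduces to a simple branch; a sketch with an assumed covisibility threshold (the patent leaves the threshold as a preset value):

```python
def update_keyframes(keyframes, current, covis_count, threshold=60):
    """If the current frame shares many covisible features with the
    newest key frame, it replaces that key frame (the scene barely
    changed); otherwise it is appended as a new key frame."""
    if covis_count > threshold:
        keyframes[-1] = current
    else:
        keyframes.append(current)
    return keyframes
```

Replacing rather than appending keeps the key-frame set compact when the camera lingers in one place.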
Further, the obtaining the left-eye and right-eye motion visual matching feature point pairs according to the feature points includes:
carrying out ORB descriptor matching between the feature points corresponding to map points in the current frame to which the left-eye image belongs and those in the previous frame to which the left-eye image belongs, so as to obtain left-eye matching feature point pairs;
carrying out ORB descriptor matching on the feature points corresponding to the map points in the current frame to which the right-eye image belongs and the map points in the previous frame to which the right-eye image belongs to obtain right-eye matched feature point pairs; wherein the left-eye and right-eye moving visual matching feature point pairs include the left-eye matching feature point pair and the right-eye matching feature point pair.
Further, the obtaining the stereoscopic vision matching feature point pair of the left eye and the right eye according to the feature point comprises:
and respectively carrying out ORB descriptor matching and SAD stereo matching on the feature points in the current frame to which the left eye image belongs and the feature points in the current frame to which the right eye image belongs to obtain the stereo matching feature point pairs of the left eye and the right eye.
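SAD stereo matching scores each candidate disparity along the same row of the rectified right image by the sum of absolute differences over a small window, and stereo depth then follows from z = f*b/d. A minimal numpy sketch over grayscale images; the window size and search range are illustrative:

```python
import numpy as np

def sad_disparity(left, right, row, col, max_disp=32, win=2):
    """For the left-image pixel (row, col), scan the same row of the
    rectified right image and return the disparity whose window has
    the minimal sum of absolute differences (SAD)."""
    patch = left[row - win:row + win + 1, col - win:col + win + 1].astype(np.float64)
    best_d, best_cost = 0, np.inf
    for d in range(0, min(max_disp, col - win) + 1):
        cand = right[row - win:row + win + 1,
                     col - d - win:col - d + win + 1].astype(np.float64)
        cost = np.abs(patch - cand).sum()
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d

def depth_from_disparity(d, focal, baseline):
    """Stereo depth: z = f * b / d (d > 0, f in pixels, b in meters)."""
    return focal * baseline / d
```

On rectified images the epipolar search is confined to one row, which is what keeps the SAD scan cheap.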
A second aspect of the embodiments of the present invention provides a binocular vision positioning system in a dynamic environment, including:
the system comprises a main thread, a pose estimation thread, a loop thread and a back-end optimization thread;
the main thread is used for acquiring a left eye image and a right eye image under binocular vision of the camera, initializing the left eye image and the right eye image, receiving a restart signal sent by the pose estimation thread or the loop thread after the initialization is successful, and initializing the pose estimation thread or the loop thread again according to the restart signal;
the pose estimation thread is used for acquiring a left eye image and a right eye image under binocular vision of the camera, extracting feature points in a current frame to which the left eye image and the right eye image belong and feature points corresponding to a map point in a previous frame to which the left eye image and the right eye image belong, and calculating an estimated pose and an estimated map point of the current frame according to the feature points;
the loop thread is used for detecting a loop frame, estimating a pose transformation matrix between a current frame and the loop frame, and updating the pose of the current frame by using the pose of the loop frame as a reference;
and the back-end optimization thread is used for optimizing the poses of all the frames according to the pose estimation result of the IMU pre-integration algorithm and the pose transformation matrix between the current frame and the loop frame.
Further, the calculating the estimated pose and the estimated map point of the current frame according to the feature points includes:
according to the characteristic points, obtaining the left-eye and right-eye movement visual matching characteristic point pairs and stereoscopic vision matching characteristic point pairs;
screening out the common matching feature point pairs shared by the stereoscopic vision matching feature point pairs and the motion vision matching feature point pairs;
respectively carrying out stereo depth estimation on the characteristic point pairs in the common matching characteristic point pair set to obtain a stereo depth estimation value;
clustering the left-eye and right-eye motion visual matching feature point pairs according to the optical flow velocity and the rotation angle between the motion visual matching feature point pairs by using a K-means++ clustering algorithm to obtain clustering results, and classifying the motion visual matching feature point pairs into a plurality of categories according to the clustering results;
calculating pose estimation results of the IMU of the left eye and the right eye in a time period between a current frame and a previous frame by utilizing an IMU pre-integration algorithm, wherein the pose estimation results comprise inter-frame poses between the current frame and the previous frame;
respectively carrying out EPnP (Efficient Perspective-n-Point) pose estimation and map point estimation on the motion visual matching feature point pairs under each category to obtain an EPnP pose estimation result and a map point estimation result;
selecting the pose estimation result of the EPNP closest to the pose estimation result of the IMU as an optimal EPNP pose estimation result and a map point estimation result, and recording effective matching feature point pairs corresponding to the optimal EPNP pose estimation result and the map point estimation result;
and selecting a characteristic point pair corresponding to the effective matching characteristic point pair from the common matching characteristic point pair, and performing Kalman filtering fusion on a stereo depth estimation value corresponding to the characteristic point pair and an EPNP estimated depth value to obtain an estimated pose and an estimated map point.
The method extracts the image characteristic points, and distinguishes the characteristic points on the moving object points and the characteristic points on the non-moving object points in a characteristic point clustering mode, so that the method can effectively eliminate the characteristic points on the moving object, avoid eliminating the characteristic points on the static dynamic attribute object and further reserve more available characteristic points; screening poses which are closest to IMU data and estimated based on different feature point sets through an interframe pose transformation matrix obtained by an IMU pre-integration algorithm, and effectively removing external points such as feature points belonging to moving objects; the pose and depth estimation method integrating monocular motion vision and binocular stereo vision can effectively improve the estimation precision.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 2 is a general frame schematic diagram of a binocular vision positioning method under a dynamic environment according to an embodiment of the present invention;
fig. 3 is a schematic flow chart of a main thread of a binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 4 is a schematic flowchart of a pose estimation thread of a binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 5 is a partially enlarged view of a schematic flow chart of a pose estimation thread of the binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 6 is a partially enlarged view of a schematic flow chart of a pose estimation thread of the binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 7 is a partially enlarged view of a schematic flow chart of a pose estimation thread of the binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 8 is a schematic diagram of a feature point pair of a binocular vision positioning method in a dynamic environment according to an embodiment of the present invention illustrating relative speed and angle;
fig. 9 is a schematic flowchart of a looping thread of a binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of loop detection according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of a back-end optimization process according to an embodiment of the present invention;
fig. 12 is a schematic diagram of a backend optimization effect according to an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, the method for screening moving objects is applicable not only to a binocular camera but also to monocular and RGB-D cameras, so monocular and RGB-D cameras can likewise achieve the purpose of the invention. The embodiments of the application take a visual SLAM algorithm as an example, but for a laser sensor the mechanism provided by the invention, namely screening moving objects rather than dynamic-attribute objects, can also be applied in laser SLAM: by clustering similar features in laser SLAM, pose estimation in a dynamic environment can likewise be completed.
Referring to fig. 1, fig. 1 is a schematic flow chart of a binocular vision positioning method under a dynamic environment according to an embodiment of the present invention, the method mainly includes:
s101, acquiring a left eye image and a right eye image under binocular vision of a camera;
s102, extracting the feature points in the current frame to which the left eye image and the right eye image belong and the feature points corresponding to map points in the previous frame, wherein map points are feature points that have been added into a map;
s103, obtaining the left-eye and right-eye movement visual matching characteristic point pairs and stereoscopic vision matching characteristic point pairs according to the characteristic points;
s104, screening a common matching characteristic point pair of the stereoscopic vision matching characteristic point pair and the motion vision matching characteristic point pair;
s105, respectively carrying out stereo depth estimation on the characteristic point pairs in the common matching characteristic point pair set to obtain a stereo depth estimation value;
s106, clustering the left-eye and right-eye motion visual matching feature point pairs according to the optical flow velocity and the rotation angle between the motion visual matching feature point pairs by using a K-means + + clustering algorithm to obtain a clustering result, and classifying the motion visual matching feature point pairs into a plurality of categories according to the clustering result;
s107, calculating pose estimation results of the IMU of the left eye and the right eye in a time period between the current frame and the previous frame by utilizing an IMU pre-integration algorithm;
s108, respectively carrying out EPnP (Efficient Perspective-n-Point) pose estimation and map point estimation on the motion vision matching feature point pairs under each category to obtain an EPnP pose estimation result and a map point estimation result;
s109, selecting the pose estimation result of the EPNP closest to the pose estimation result of the IMU as an optimal EPNP pose estimation result and a map point estimation result, and recording effective matching feature point pairs corresponding to the optimal EPNP pose estimation result and the map point estimation result;
s110, selecting a characteristic point pair corresponding to the effective matching characteristic point pair from the common matching characteristic point pair, and performing Kalman filtering fusion on a stereo depth estimation value corresponding to the characteristic point pair and an EPNP estimated depth value to obtain an estimated pose and an estimated map point.
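Step s106 can be sketched with a small self-contained K-means++ implementation over per-match feature vectors. Here the clustering features are the optical-flow speed and the flow direction angle; interpreting the patent's "rotation angle" as the flow vector's direction is an assumption of this sketch:

```python
import numpy as np

def flow_features(pts_prev, pts_cur, dt=1.0):
    """Per-match feature vector: optical-flow speed and direction angle."""
    flow = pts_cur - pts_prev
    speed = np.linalg.norm(flow, axis=1) / dt
    angle = np.arctan2(flow[:, 1], flow[:, 0])
    return np.stack([speed, angle], axis=1)

def kmeans_pp(points, k, iters=20, seed=0):
    """K-means with k-means++ seeding: later seeds are drawn with
    probability proportional to the squared distance to the nearest
    seed, then standard Lloyd iterations refine the centers."""
    rng = np.random.default_rng(seed)
    centers = [points[rng.integers(len(points))]]
    while len(centers) < k:
        d2 = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[rng.choice(len(points), p=d2 / d2.sum())])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers
```

Matched pairs moving rigidly together share similar flow speed and direction, so they fall into the same cluster, which is what lets a moving object be separated from the static background.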
Further, the IMU pre-integration algorithm:
$$\alpha_{c_{k+1}}^{c_k}=\iint_{t\in[t_k,\,t_{k+1}]} R_t^{c_k}\left(a_t-b_{a_t}-n_a\right)\,dt^2,\qquad \beta_{c_{k+1}}^{c_k}=\int_{t\in[t_k,\,t_{k+1}]} R_t^{c_k}\left(a_t-b_{a_t}-n_a\right)\,dt,\qquad \gamma_{c_{k+1}}^{c_k}=\int_{t\in[t_k,\,t_{k+1}]} \tfrac{1}{2}\,\Omega\!\left(w_t-b_{w_t}-n_w\right)\gamma_t^{c_k}\,dt$$

wherein $\alpha_{c_{k+1}}^{c_k}$, $\beta_{c_{k+1}}^{c_k}$ and $\gamma_{c_{k+1}}^{c_k}$ respectively represent the displacement increment, the velocity increment, and the quaternion of the rotation-angle change over the period from the k-th frame to the (k+1)-th frame in the k-th frame camera coordinate system; $a_t$ represents the accelerometer measurement and $w_t$ the angular-velocity measurement; $b_{a_t}$ and $b_{w_t}$ represent the biases of the accelerometer and the gyroscope, respectively, and $n_a$ and $n_w$ the respective noises; $R_t^{c_k}$ is the rotation transformation matrix of time $t$ relative to the k-th frame camera coordinate system, obtained from the quaternion $\gamma_t^{c_k}$; $\Omega$ is the matrix related to the quaternion calculation; $\Delta t_k$ represents the time interval from $t_k$ to $t_{k+1}$; and $g^w$ represents the gravity vector in the world coordinate system ($\Delta t_k$ and $g^w$ enter when the increments are applied to update the carrier pose).
Further, after the classifying the moving visual matching feature point pairs into a plurality of categories according to the clustering result, the method includes:
performing histogram statistics on the number of the feature points in each category;
and deleting the category and the corresponding feature points under the category, wherein the number of the corresponding feature points under each category is less than a first preset number, and the difference value between the number of the corresponding feature points under each category and the number of the corresponding feature points under other categories is greater than a second preset number.
Further, the step of respectively performing EPnP pose estimation and map point estimation on the motion visual matching feature point pairs under each category to obtain an EPnP pose estimation result and a map point estimation result includes:
respectively carrying out EPnP pose estimation and map point estimation on the feature points under each category, and performing outlier elimination on the pose estimation result of each category through the RANSAC (random sample consensus) algorithm, so as to obtain the EPnP pose estimation result and the map point estimation result corresponding to each category.
Further, after obtaining the estimated pose and the estimated map point, the method includes:
optimizing the estimated pose and the estimated map points by utilizing a map-based optimization algorithm according to the pose estimation result of the IMU to obtain optimized map points;
the optimization algorithm of the graph is as follows:
wherein $\hat{T}$ represents the estimated pose transformation matrix obtained by EPnP pose estimation, $\tilde{T}$ represents the pose transformation matrix obtained by the IMU pre-integration algorithm, $z_{k+1}$ represents the pixel coordinates in the camera image, $P_k$ represents the 3D coordinates of a map point in the k-th frame camera coordinate system, and $\pi_c(\cdot)$ is the projective transformation based on the camera model, used for transforming 3D coordinates in the camera coordinate system into 2D pixel coordinates in the camera image.
Further, the optimizing the estimated pose and the estimated map point to obtain an optimized map point includes:
replacing the map points before optimization by the optimized map points, updating the local map, and eliminating points which do not belong to the local map;
comparing whether the number of co-visible feature points between the current frame and the newest key frame is higher than a preset threshold;
if it is higher than the preset threshold, replacing the newest key frame with the current frame;
if it is lower than the preset threshold, adding the current frame to the key frame set.
Further, the obtaining the pair of left-eye and right-eye motion visual matching feature points according to the feature points comprises:
carrying out ORB descriptor matching between the feature points in the current frame to which the left eye image belongs and the feature points corresponding to the map points in the previous frame to which the left eye image belongs, to obtain left eye matching feature point pairs;
carrying out ORB descriptor matching on the feature point in the current frame to which the right-eye image belongs and the feature point corresponding to the map point in the previous frame to which the right-eye image belongs to obtain a right-eye matching feature point pair; wherein the left-eye and right-eye moving visual matching feature point pairs include the left-eye matching feature point pair and the right-eye matching feature point pair.
Further, the obtaining the pair of stereoscopic vision matching feature points of the left eye and the right eye according to the feature points comprises:
and respectively carrying out ORB descriptor matching and SAD stereo matching on the feature points in the current frame to which the left eye image belongs and the feature points in the current frame to which the right eye image belongs to obtain the stereo matching feature point pairs of the left eye and the right eye.
In the embodiment of the invention, moving objects, rather than all dynamic objects, are rejected. Moving and non-moving objects are distinguished by clustering: the velocities and angles of the inter-frame feature points are clustered, so that the matching feature point set of moving objects is separated from the feature point set of non-moving objects. Using the inter-frame pose transformation matrix obtained by the IMU pre-integration algorithm, the pose estimated from the different feature point sets that is closest to the IMU data is selected, which effectively removes outliers such as feature points belonging to moving objects. The pose and depth estimation method integrating monocular motion vision and binocular stereo vision can effectively improve the estimation accuracy.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a binocular vision positioning system under a dynamic environment according to another embodiment of the present invention, the system mainly includes:
the system comprises a main thread, a pose estimation thread, a loop thread and a back-end optimization thread;
the main thread is used for acquiring a left eye image and a right eye image under binocular vision of the camera, initializing the images, receiving a restart signal sent by the pose estimation thread or the loop thread after the initialization is successful, and initializing the images again according to the restart signal;
specifically, as shown in fig. 3, the main thread of the system mainly functions to start other threads and initialize a map, and provide initial trusted data for the subsequent other threads, and the main flow and the working principle are as follows:
Images are acquired from the left eye and the right eye of the binocular camera respectively, and feature points of the left eye image and the right eye image are extracted using FAST corner detection. Each image is divided into 10 sub-image areas, and in each sub-image area the N corner points with the largest Harris response values are selected as the final feature point set. The rotation direction of each feature point is then calculated; it is defined as the direction vector from the geometric center of the image block centered on the feature point to its centroid. Finally, a descriptor of each feature point is computed, using the same binary BRIEF descriptor as ORB (Oriented FAST and Rotated BRIEF).
ORB descriptor matching is first performed between the feature points of the left eye image and those of the right eye image, followed by SAD (sum of absolute differences) sub-pixel matching; valid feature points are then screened using the rotation angle consistency of the feature points, yielding accurate matching feature point pairs between the left and right camera images.
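The SAD refinement step can be illustrated with a minimal sketch. The window and search sizes are assumptions, and the sub-pixel interpolation stage is omitted for brevity; this only shows the integer-pixel SAD minimisation along the horizontal (epipolar) direction.

```python
import numpy as np

def sad_refine(left, right, xl, y, x0, win=3, search=5):
    """Refine a right-image match candidate x0 for the left-image
    feature (xl, y) by minimising the sum of absolute differences
    over a horizontal search range; returns (best column, best cost)."""
    patch = left[y - win:y + win + 1, xl - win:xl + win + 1].astype(np.float32)
    best_cost, best_x = np.inf, x0
    for dx in range(-search, search + 1):
        x = x0 + dx
        cand = right[y - win:y + win + 1, x - win:x + win + 1].astype(np.float32)
        cost = np.abs(patch - cand).sum()  # SAD cost of this candidate
        if cost < best_cost:
            best_cost, best_x = cost, x
    return best_x, best_cost
```

In a real pipeline this runs only on ORB-matched pairs, so the search window around the descriptor match can stay small.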
The true depth of each matched feature point is calculated using the binocular camera model z = f·b / (u_L − u_R), where z represents the true coordinate of the feature point along the z-axis in the camera reference frame, f represents the focal length of the camera, b is the camera baseline, and u_L and u_R are the pixel abscissae of the feature point in the left eye image and the right eye image, respectively. Whether z is positive is then checked: if the depth estimate is positive, the 3D coordinates of the feature point, called a map point, are recovered; otherwise the depth estimation of the feature point has failed and the feature point pair is rejected.
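The depth model and the positivity check can be sketched directly; the pinhole back-projection used to recover the map point is standard, with focal lengths and principal point (fx, fy, cx, cy) as assumed intrinsics.

```python
def stereo_depth(uL, uR, f, b):
    """Depth from the binocular model z = f*b/(uL - uR); returns None
    when the disparity is non-positive, mirroring the rejection of
    feature pairs whose depth check fails."""
    d = uL - uR
    if d <= 0:
        return None
    return f * b / d

def backproject(u, v, z, fx, fy, cx, cy):
    """Recover the 3D map-point coordinates from the pinhole model."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)
```

For example, with f = 700 px, b = 0.12 m and a 20-pixel disparity, the point lies 4.2 m in front of the camera.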
If the number of successfully matched feature point pairs is larger than a certain threshold, initialization is considered successful; otherwise, initialization is performed again.
Once the initialization validation is successful, map points are added to the local map and global map classes and the current frame is added as a key frame and marked as the previous frame for subsequent processing. The local map is a map point set composed of map points corresponding to all matched feature point pairs on the current image frame and the past frame having a common view relationship with the current image frame, and the global map is a map point set composed of all past map points. The key frame set is used for maintaining the data set size, and only the image frames meeting the conditions are stored locally as key frames for subsequent closed-loop detection and back-end optimization.
After the whole system is initialized successfully, the system enters a state of polling signals of other threads, and if restart signals sent by other threads are received, the system enters the state of acquiring images from the left eye and the right eye of the binocular camera again to start to initialize the system again.
The pose estimation thread is used for acquiring a left eye image and a right eye image under binocular vision of the camera, extracting feature points in a current frame to which the left eye image and the right eye image belong and feature points corresponding to a map point in a previous frame to which the left eye image and the right eye image belong, and calculating an estimated pose and an estimated map point of the current frame according to the feature points;
specifically, referring to fig. 4, 5, 6, and 7, the working process of the pose estimation thread is as follows:
Images are acquired from the left eye and the right eye of the binocular camera respectively, and feature points of the left eye image and the right eye image are extracted using FAST corner detection. Each image is divided into 10 sub-image areas, and in each sub-image area the N corner points with the largest Harris response values are selected as the final feature point set. The rotation direction of each feature point is then calculated; it is defined as the direction vector from the geometric center of the image block centered on the feature point to its centroid. A descriptor of each feature point is computed, using the same binary BRIEF descriptor as ORB (Oriented FAST and Rotated BRIEF). The acceleration a_t and angular velocity w_t are read from a preset inertial navigation module (IMU module), and the acceleration and angular velocity are filtered.
The IMU inertial navigation module is used as an auxiliary means for screening a suitable visual positioning result in a dynamic environment; sensors that can obtain the displacement, velocity, angular velocity and angle of the robot, such as code discs, encoders and electronic compasses, can likewise serve as auxiliary means to achieve the same cooperative effect with the vision sensor in a dynamic environment.
ORB descriptor matching is performed between the feature points corresponding to the map points in the previous frame and the feature points in the current left eye and right eye frames respectively, accelerated with the DBoW3 bag-of-words technique; valid feature points are then screened using feature point rotation angle consistency, yielding left eye and right eye matching feature point pairs, together called the left-eye and right-eye motion visual matching feature point pairs. ORB descriptor matching and SAD stereo matching optimization are also performed between the current left eye frame and the current right eye frame to obtain more accurate left-eye and right-eye stereo matching feature point pairs, namely the stereoscopic vision matching feature point pairs. The common set of the motion visual matching feature point pairs and the stereoscopic vision matching feature point pairs is then screened out, stereo depth estimation is performed on the corresponding feature points, and their 3D coordinates are recovered.
Calculating the optical flow velocity v and the rotation angle theta between the left-eye and right-eye motion visual matching feature point pairs, wherein the principle of the optical flow velocity and the rotation angle is schematically shown in fig. 8, and the calculation formula is as follows:
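As a minimal sketch using the standard definitions v = ‖Δp‖/Δt and θ = atan2(Δy, Δx), which may differ in detail from the exact patented formula:

```python
import numpy as np

def flow_velocity_angle(p_prev, p_curr, dt):
    """Per-pair optical-flow speed and direction between the previous
    and current frames for N matched feature points (N x 2 arrays)."""
    d = np.asarray(p_curr, float) - np.asarray(p_prev, float)
    v = np.linalg.norm(d, axis=1) / dt          # pixel speed per pair
    theta = np.arctan2(d[:, 1], d[:, 0])        # motion direction per pair
    return v, theta
```

The (v, θ) pairs produced here are exactly the quantities fed to the K-means++ clustering described next.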
The left-eye and right-eye motion visual matching feature point pairs are clustered into M categories according to the optical flow velocity and the rotation angle, using the K-means++ clustering algorithm. In a real environment, the velocity and direction of a moving object differ greatly from those of background elements, so clustering can isolate moving objects of different velocities from the feature points of static objects. After the clustering result is obtained, histogram statistics are performed on the feature points of each category, and categories whose counts are too small and differ greatly from the other data, together with their corresponding feature point pairs, are removed.
The IMU pre-integration technology is utilized to calculate the interframe displacement of the left eye camera and the right eye camera in the time period of the previous frame and the current frame, and the core formula of the IMU pre-integration principle is as follows
Wherein α, β and γ respectively represent the displacement increment, the velocity increment, and the quaternion of the rotation change over the period from the k-th frame to the (k+1)-th frame, expressed in the k-th frame camera coordinate system; a_t represents the accelerometer measurement and w_t the angular velocity measurement; b_{a_t} and b_{w_t} represent the biases of the accelerometer and the gyroscope, respectively; n_a and n_w represent the respective noise terms; R_t^k represents the rotation transformation matrix of time t relative to the k-th frame camera coordinate system, obtained from the quaternion recurrence formula; Ω is the matrix involved in the quaternion calculation; Δt_k represents the time interval from t_k to t_{k+1}; and g^w represents the gravity vector in the world coordinate system.
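As an illustrative sketch of the pre-integration idea only: plain Euler integration in the k-th frame coordinates, with a rotation matrix standing in for the quaternion γ, fixed biases, and the noise terms omitted. The patent's exact recurrence is not reproduced here.

```python
import numpy as np

def preintegrate(acc, gyr, dt, ba, bw):
    """Integrate bias-corrected IMU samples into a displacement
    increment alpha, velocity increment beta, and rotation R, all
    expressed in the frame-k body coordinates."""
    alpha = np.zeros(3)
    beta = np.zeros(3)
    R = np.eye(3)
    for a_t, w_t in zip(acc, gyr):
        a = a_t - ba                      # bias-corrected acceleration
        w = w_t - bw                      # bias-corrected angular rate
        alpha += beta * dt + 0.5 * (R @ a) * dt**2
        beta += (R @ a) * dt
        # first-order rotation update via the skew-symmetric matrix of w*dt
        wx = np.array([[0, -w[2], w[1]],
                       [w[2], 0, -w[0]],
                       [-w[1], w[0], 0]])
        R = R @ (np.eye(3) + wx * dt)
    return alpha, beta, R
```

With a constant 1 m/s² forward acceleration and no rotation over one second, this recovers the expected β = 1 m/s and α = 0.5 m.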
Using the different clusters of the obtained motion visual matching feature point sets, EPNP (Efficient Perspective-n-Point) pose estimation and map point estimation are performed for each category of matching feature point set, and outliers are rejected from each category's pose estimation result using the RANSAC (random sample consensus) algorithm. After the pose estimation result corresponding to each category is obtained, it is compared with the IMU pose result. In general, the filtered and optimized short-term IMU pose estimate is accurate, so the IMU inter-frame pose estimate can effectively screen out the true EPNP result and eliminate the pose estimation results of moving-object feature point sets, making the visual pose estimation more accurate and reducing the influence of moving objects on the result. Combined with this IMU-based screening mechanism, accurate EPNP pose and map point estimation results are obtained; these are called the optimal EPNP pose and map point estimation results, and the corresponding matching feature point set is called the set of valid motion visual matching feature point pairs.
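The IMU-based screening of per-cluster EPNP hypotheses can be sketched minimally. Only the translation part is compared here, which is a simplification of the patent's comparison; the candidate list structure is hypothetical.

```python
import numpy as np

def select_pose(candidates, t_imu):
    """Among per-cluster pose hypotheses (label, translation), pick the
    one whose translation is closest to the IMU pre-integrated
    inter-frame translation; returns the winning label and its error."""
    errs = [np.linalg.norm(np.asarray(t) - t_imu) for _, t in candidates]
    best = int(np.argmin(errs))
    return candidates[best][0], errs[best]
```

A cluster dominated by a moving object yields a pose far from the IMU prediction and is therefore rejected, while the static-background cluster wins.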
From the common matching feature point set, the feature points belonging to the set of valid matching feature point pairs are screened out, i.e., the feature point pairs belonging to the feature point classes corresponding to the optimal EPNP pose estimation result are selected. Kalman filter fusion is then performed between the stereo depth estimate corresponding to each such feature point pair and the depth value estimated by EPNP, yielding more accurate map points.
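The per-point depth fusion can be written as a one-dimensional Kalman update. The variance inputs are assumptions, since the text does not specify how the uncertainties of the two estimates are obtained.

```python
def fuse_depth(z_stereo, var_stereo, z_epnp, var_epnp):
    """Fuse the stereo depth and the EPNP-estimated depth of one
    feature point; returns the fused depth and its variance."""
    k = var_stereo / (var_stereo + var_epnp)   # Kalman gain
    z = z_stereo + k * (z_epnp - z_stereo)     # fused depth estimate
    var = (1 - k) * var_stereo                 # reduced fused variance
    return z, var
```

With equal variances the fusion reduces to the mean of the two depths, and the fused variance is half of either input variance.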
The obtained pose and map points are optimized using a graph optimization algorithm. The result of IMU pre-integration is incorporated into the optimization as one of the constraint conditions, achieving a fused optimization. The core formula of the optimization is as follows:
wherein T̂ represents the estimated pose transformation matrix obtained by EPNP pose estimation, T̃ represents the pose transformation matrix obtained by the IMU pre-integration algorithm, z_{k+1} represents the pixel coordinates in the camera image, P_k represents the 3D coordinates of the map point in the k-th frame camera coordinate system, and π_c(·) is a projective transformation based on the camera model for transforming 3D coordinates in the camera coordinate system into 2D pixel coordinates in the camera image.
After pose estimation is completed, the correct map points are added to the local map and the global map, the local map is updated and replaced once, and points that do not belong to the local map are removed. At the same time, the number of co-visible feature points between the current frame and the newest key frame is compared with a threshold. If it is higher than the threshold, the overlap between the current frame and the newest key frame is excessive and no new key frame is needed; the newest key frame is replaced by the current frame, and the local and global map points are partially replaced and deleted. If the information difference between the current frame and the newest key frame is large (co-visibility below the threshold), the current frame is added to the key frame set to increase the information capacity of the environment.
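The key-frame maintenance rule just described can be sketched as:

```python
def update_keyframes(keyframes, current, covis_count, threshold):
    """Too much co-visible overlap with the newest key frame means the
    current frame carries redundant information, so it replaces the
    newest key frame; little overlap means new information, so the
    current frame is appended as a new key frame."""
    if covis_count > threshold:
        keyframes[-1] = current    # replace the newest key frame
    else:
        keyframes.append(current)  # add the current frame as a key frame
    return keyframes
```

This keeps the key-frame set compact while still covering newly observed parts of the environment.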
Bundle Adjustment optimization and outlier rejection are performed on the poses of the key frames that share enough co-visible feature points with the newest key frame, and on the map points associated with those key frames. The previous frame and the key frames having a co-visibility relation with the current frame are added to the co-visibility graph of the current frame, and the map data structure is updated. The next round of the loop then begins.
The above process assumes that enough feature points are matched between the map points of the previous frame and the current frame. If the matched feature points are insufficient, they must be supplemented, mainly in two ways: one is to search the local map and add map points that can be matched; the other is to perform feature matching between the previous frame and the current frame again, with the matching range no longer limited to the feature points of the previous frame that correspond to map points, but extended to all feature points in the previous frame, effectively compensating for the shortage. Once the feature points meet the requirement, pose estimation continues. If they still do not, the IMU records the short-term pose transformation to maintain the continuity of the trajectory, and the next round of the loop proceeds until enough feature points are found. If enough feature points cannot be found for a long time, positioning has failed; by then the IMU is also inaccurate due to accumulated error, and the system must be re-initialized.
The loop thread is used for detecting a loop frame, estimating a pose transformation matrix between a current frame and the loop frame, and updating the pose of the current frame by using the pose of the loop frame as a reference;
specifically, referring to fig. 9 and 10, loop detection means detecting whether the robot has returned to a place it has already visited. If it has, a geometric constraint can be established between the current frame and the loop frame, and the accumulated error between them can be eliminated through optimization. Loop detection is similar to the feature matching process and mainly uses DBoW3 bag-of-words acceleration: among all past key frames, frames with a very high matching rate to the newest key frame are detected. Such a frame is likely a loop frame, and whether it truly is one is determined by a series of screening mechanisms such as geometric consistency, group matching and temporal consistency. Once a loop frame is found in the key frame set, the pose transformation matrix between the current frame and the loop frame is estimated, and Bundle Adjustment optimization and outlier rejection are applied to this transformation and to the map points of the current frame. Finally, taking the pose of the loop frame as the reference, the pose of the current frame is updated through the pose transformation matrix between the loop frame and the current frame; the poses of the frames with a high degree of co-visibility with the current frame are updated, and the map points associated with the related frames are updated. The whole loop optimization eliminates part of the accumulated error, establishes loop constraints, and provides constraint conditions for globally consistent optimization.
The back-end optimization thread is used for optimizing the poses of all the frames according to the pose estimation result of the IMU pre-integration algorithm and the pose transformation matrix between the current frame and the loop frame.
Specifically, referring to fig. 11 and 12, back-end optimization is the key to improving positioning accuracy and reducing the global accumulated error. The back-end optimization module optimizes all key frames, the co-visible frames of the current frame, and the global map points. The whole optimization process is divided into two major steps: the first step optimizes the poses of the key frames, and then all poses are optimized by introducing the inter-frame pose constraints of IMU pre-integration and the constraints between the current frame and the loop frame.
The invention changes the existing way of handling dynamic environments. By rejecting moving objects rather than dynamic objects, a large number of valid feature points can be retained; in particular, the static feature points of currently stationary dynamic objects are kept, which improves the accuracy of pose estimation. Compared with rejecting all dynamic objects, the invention can also adapt to extreme cases, such as a scene containing only stationary dynamic objects: rejecting all dynamic objects would then leave too few feature points, whereas rejecting only moving objects retains the static feature points of those objects for pose estimation.
The invention also departs from existing SLAM methods that handle dynamic environments through deep-learning-based semantic label screening. It proposes clustering the velocity and rotation direction of inter-frame matching feature point pairs to distinguish moving objects from non-moving objects (static objects and stationary dynamic objects), and uses IMU data as auxiliary constraint information: among the pose estimation results of the different clustered data sets, the one closest to the IMU data is selected as the optimal pose estimate, eliminating the interference of moving objects with pose estimation in dynamic scenes. Compared with deep-learning-based semantic approaches, this greatly reduces computational complexity and improves the real-time performance of the positioning algorithm.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
It should be noted that, for the sake of simplicity, the above-mentioned method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present invention is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no acts or modules are necessarily required of the invention.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The above description is provided for the binocular vision positioning method and system in a dynamic environment, and for those skilled in the art, there may be variations in the specific implementation and application scope according to the ideas of the embodiments of the present invention, and in summary, the content of the present description should not be construed as limiting the present invention.
Claims (10)
1. A binocular vision positioning method under a dynamic environment is characterized by comprising the following steps:
acquiring a left eye image and a right eye image under binocular vision of a camera, and extracting feature points corresponding to map points in a current frame to which the left eye image and the right eye image belong and a previous frame to which the left eye image and the right eye image belong, wherein the map points are feature points added into a map;
according to the characteristic points, obtaining the left-eye and right-eye movement visual matching characteristic point pairs and stereoscopic vision matching characteristic point pairs;
screening the shared matching characteristic point pairs of the stereoscopic vision matching characteristic point pair and the motion vision matching characteristic point pair;
respectively carrying out stereo depth estimation on the characteristic point pairs in the common matching characteristic point pair set to obtain a stereo depth estimation value;
clustering the left-eye and right-eye motion visual matching feature point pairs according to the optical flow velocity and the rotation angle between the motion visual matching feature point pairs by using a K-means++ clustering algorithm to obtain clustering results, and classifying the motion visual matching feature point pairs into a plurality of categories according to the clustering results;
calculating pose estimation results of the IMU of the left eye and the right eye in a time period between a current frame and a previous frame by utilizing an IMU pre-integration algorithm, wherein the pose estimation results comprise inter-frame poses between the current frame and the previous frame;
respectively carrying out EPNP (Efficient Perspective-n-Point) pose estimation and map point estimation on the motion visual matching feature point pairs under each category to obtain a pose estimation result and a map point estimation result of the EPNP;
selecting the pose estimation result of the EPNP closest to the pose estimation result of the IMU as an optimal EPNP pose estimation result and a map point estimation result, and recording effective matching feature point pairs corresponding to the optimal EPNP pose estimation result and the map point estimation result;
and selecting a characteristic point pair corresponding to the effective matching characteristic point pair from the common matching characteristic point pair, and performing Kalman filtering fusion on a stereo depth estimation value corresponding to the characteristic point pair and an EPNP estimated depth value to obtain an estimated pose and an estimated map point.
2. The binocular vision positioning method under the dynamic environment of claim 1, wherein the IMU pre-integration algorithm:
wherein α, β and γ respectively represent the displacement increment, the velocity increment, and the quaternion of the rotation change over the period from the k-th frame to the (k+1)-th frame, expressed in the k-th frame camera coordinate system; a_t represents the accelerometer measurement and w_t the angular velocity measurement; b_{a_t} and b_{w_t} represent the biases of the accelerometer and the gyroscope, respectively; n_a and n_w represent the respective noise terms; R_t^k represents the rotation transformation matrix of time t relative to the k-th frame camera coordinate system, obtained from the quaternion recurrence formula; Ω is the matrix involved in the quaternion calculation; Δt_k represents the time interval from t_k to t_{k+1}; and g^w represents the gravity vector in the world coordinate system.
3. The binocular vision positioning method under the dynamic environment according to claim 1, wherein after the classifying the moving vision matching feature point pairs into a plurality of categories according to the clustering result, the method comprises:
performing histogram statistics on the number of the feature points in each category;
and deleting any category, together with its corresponding feature points, whose feature point count is less than a first preset number and differs from the feature point counts of the other categories by more than a second preset number.
4. The binocular vision positioning method under the dynamic environment of claim 3, wherein the performing EPNP pose estimation and map point estimation on the pairs of the moving vision matching feature points under each category respectively to obtain EPNP pose estimation results and map point estimation results comprises:
performing EPNP pose estimation and map point estimation on the feature points under each category respectively, and performing outlier rejection on the pose estimation result of each category using the RANSAC (random sample consensus) algorithm, so as to obtain the EPNP pose estimation result and the map point estimation result corresponding to each category.
5. The binocular vision positioning method under the dynamic environment according to claim 4, wherein after obtaining the estimated pose and the estimated map point, the method comprises:
optimizing the estimated pose and the estimated map points by using a graph optimization algorithm, according to the inter-frame pose of the IMU, to obtain optimized map points;
the graph optimization algorithm is as follows:
wherein T̂ represents the estimated pose transformation matrix obtained by EPNP pose estimation, T̃ represents the pose transformation matrix obtained by the IMU pre-integration algorithm, z_{k+1} represents the pixel coordinates in the camera image, P_k represents the 3D coordinates of the map point in the k-th frame camera coordinate system, and π_c(·) is a projective transformation based on the camera model for transforming 3D coordinates in the camera coordinate system into 2D pixel coordinates in the camera image.
6. The binocular vision positioning method under the dynamic environment according to claim 5, wherein the optimizing the estimated pose and the estimated map points to obtain the optimized map points comprises:
replacing the map points before optimization by the optimized map points, updating a local map, and eliminating map points which do not belong to the local map;
comparing whether the common-view feature point between the current frame and the next new key frame is higher than a preset threshold value or not;
if the current key frame is higher than the preset threshold, replacing the latest key frame with the current key frame;
and if the current frame is lower than the preset threshold, adding the current frame into a key frame set.
7. The binocular vision positioning method under the dynamic environment according to claim 1, wherein the obtaining the left-eye and right-eye moving vision matching feature point pairs according to the feature points comprises:
carrying out ORB descriptor matching on the feature points corresponding to the map points in the current frame to which the left target image belongs and the map points in the previous frame to which the left target image belongs to obtain a left target matching feature point pair;
carrying out ORB descriptor matching on the feature points corresponding to the map points in the current frame to which the right-eye image belongs and the map points in the previous frame to which the right-eye image belongs to obtain right-eye matched feature point pairs; wherein the left-eye and right-eye moving visual matching feature point pairs include the left-eye matching feature point pair and the right-eye matching feature point pair.
8. The binocular vision positioning method under the dynamic environment according to claim 1, wherein the obtaining the stereoscopic vision matching feature point pairs of the left and right eyes according to the feature points comprises:
and respectively carrying out ORB descriptor matching and SAD stereo matching on the feature points in the current frame to which the left eye image belongs and the feature points in the current frame to which the right eye image belongs to obtain the stereo matching feature point pairs of the left eye and the right eye.
9. A binocular vision positioning system in a dynamic environment, comprising:
the system comprises a main thread, a pose estimation thread, a loop thread and a back-end optimization thread;
the main thread is used for acquiring the left-eye image and the right-eye image from the binocular camera, initializing the left-eye image and the right-eye image, and, after successful initialization, receiving a restart signal sent by the pose estimation thread or the loop thread and re-initializing the pose estimation thread or the loop thread according to the restart signal;
the pose estimation thread is used for acquiring the left-eye image and the right-eye image from the binocular camera, extracting the feature points in the current frames of the left-eye and right-eye images and the feature points corresponding to map points in the previous frames of the left-eye and right-eye images, and calculating the estimated pose and estimated map points of the current frame from the feature points;
the loop thread is used for detecting a loop frame, estimating the pose transformation matrix between the current frame and the loop frame, and updating the pose of the current frame with the pose of the loop frame as a reference;
the back-end optimization thread is used for calculating the IMU pose estimation result over the time period between the current frame and the previous frame by means of an IMU pre-integration algorithm, the pose estimation result comprising the inter-frame pose between the current frame and the previous frame, and optimizing the poses of all frames according to the IMU pre-integration pose estimation result and the pose transformation matrix between the current frame and the loop frame.
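The IMU pre-integration used by the back-end thread can be sketched as accumulating rotation, velocity, and position increments from the gyroscope and accelerometer samples between two camera frames. This is a simplified Euler-integration sketch (gravity already removed, first-order rotation update, no bias or noise terms), not the patent's exact formulation.

```python
import numpy as np

def preintegrate(gyro, accel, dt):
    """Accumulate IMU increments over the interval between two frames.

    gyro:  (N, 3) angular-velocity samples (rad/s), body frame
    accel: (N, 3) specific-force samples (m/s^2), body frame, gravity removed
    dt:    sample period (s)
    Returns (delta_R, delta_v, delta_p): rotation matrix, velocity and
    position increments expressed in the first frame's body frame.
    """
    dR = np.eye(3)
    dv = np.zeros(3)
    dp = np.zeros(3)
    for w, a in zip(gyro, accel):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt ** 2  # position before velocity
        dv = dv + (dR @ a) * dt
        # First-order (small-angle) exponential map for the rotation update.
        wx = np.array([[0.0, -w[2], w[1]],
                       [w[2], 0.0, -w[0]],
                       [-w[1], w[0], 0.0]])
        dR = dR @ (np.eye(3) + wx * dt)
    return dR, dv, dp
```

With constant 1 m/s² acceleration along x for one second and no rotation, the sketch recovers a 1 m/s velocity increment and a 0.5 m displacement, as expected from uniform acceleration.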
10. The binocular vision positioning system in a dynamic environment according to claim 9, wherein the calculating the estimated pose and estimated map points of the current frame from the feature points comprises:
obtaining the left-eye and right-eye motion-vision matched feature point pairs and stereoscopic-vision matched feature point pairs from the feature points;
screening the matched feature point pairs shared by the stereoscopic-vision matched feature point pairs and the motion-vision matched feature point pairs;
performing stereo depth estimation on each feature point pair in the shared matched feature point pair set to obtain stereo depth estimates;
clustering the left-eye and right-eye motion-vision matched feature point pairs with the K-means++ clustering algorithm according to the optical-flow velocity and rotation angle between the matched points, and dividing the motion-vision matched feature point pairs into a plurality of categories according to the clustering result;
performing EPnP (Efficient Perspective-n-Point) pose estimation and map point estimation on the motion-vision matched feature point pairs in each category to obtain EPnP pose estimation results and map point estimation results;
selecting the EPnP pose estimation result closest to the IMU pose estimation result as the optimal EPnP pose and map point estimation result, and recording the valid matched feature point pairs corresponding to the optimal EPnP pose and map point estimation result;
and selecting, from the shared matched feature point pairs, the feature point pairs corresponding to the valid matched feature point pairs, and performing Kalman-filter fusion of the stereo depth estimates corresponding to those feature point pairs with the EPnP-estimated depth values to obtain the estimated pose and estimated map points.
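The final Kalman-filter fusion step can be sketched as a scalar Kalman update fusing, per map point, the stereo depth estimate (as prior) with the EPnP-derived depth (as measurement). The variance inputs are assumptions; the claim does not state how the measurement uncertainties are obtained.

```python
def fuse_depths(z_stereo, var_stereo, z_epnp, var_epnp):
    """Fuse two independent depth estimates of the same map point via the
    scalar Kalman update (equivalently, inverse-variance weighting).
    Returns (fused_depth, fused_variance)."""
    k = var_stereo / (var_stereo + var_epnp)   # Kalman gain
    z = z_stereo + k * (z_epnp - z_stereo)     # corrected depth
    var = (1.0 - k) * var_stereo               # reduced uncertainty
    return z, var
```

With equal variances the fused depth is simply the mean of the two estimates, and the fused variance is halved; a near-certain estimate dominates the result.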
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910634021.1A CN110490900B (en) | 2019-07-12 | 2019-07-12 | Binocular vision positioning method and system under dynamic environment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110490900A CN110490900A (en) | 2019-11-22 |
CN110490900B true CN110490900B (en) | 2022-04-19 |
Family
ID=68546103
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910634021.1A Active CN110490900B (en) | 2019-07-12 | 2019-07-12 | Binocular vision positioning method and system under dynamic environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110490900B (en) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111091621A (en) * | 2019-12-11 | 2020-05-01 | 东南数字经济发展研究院 | Binocular vision synchronous positioning and composition method, device, equipment and storage medium |
CN111016887A (en) * | 2019-12-23 | 2020-04-17 | 深圳市豪恩汽车电子装备股份有限公司 | Automatic parking device and method for motor vehicle |
CN111060113B (en) * | 2019-12-31 | 2022-04-08 | 歌尔股份有限公司 | Map updating method and device |
CN111476812A (en) * | 2020-04-03 | 2020-07-31 | 浙江大学 | Map segmentation method and device, pose estimation method and equipment terminal |
CN111598939B (en) * | 2020-05-22 | 2021-01-26 | 中原工学院 | Human body circumference measuring method based on multi-vision system |
CN111639658B (en) * | 2020-06-03 | 2023-07-21 | 北京维盛泰科科技有限公司 | Method and device for detecting and eliminating dynamic feature points in image matching |
CN111683203B (en) | 2020-06-12 | 2021-11-09 | 达闼机器人有限公司 | Grid map generation method and device and computer readable storage medium |
CN111862150B (en) * | 2020-06-19 | 2024-06-14 | 杭州易现先进科技有限公司 | Image tracking method, device, AR equipment and computer equipment |
CN111914784B (en) * | 2020-08-10 | 2022-02-08 | 北京大成国测科技有限公司 | Method and device for detecting intrusion of trackside obstacle in real time and electronic equipment |
CN113759384B (en) * | 2020-09-22 | 2024-04-05 | 北京京东乾石科技有限公司 | Method, device, equipment and medium for determining pose conversion relation of sensor |
CN112241983A (en) * | 2020-10-19 | 2021-01-19 | 深圳市目心智能科技有限公司 | Perception system and robot based on initiative binocular vision |
CN112444246B (en) * | 2020-11-06 | 2024-01-26 | 北京易达恩能科技有限公司 | Laser fusion positioning method in high-precision digital twin scene |
CN112734842B (en) * | 2020-12-31 | 2022-07-01 | 武汉第二船舶设计研究所(中国船舶重工集团公司第七一九研究所) | Auxiliary positioning method and system for centering installation of large ship equipment |
CN113220818B (en) * | 2021-05-27 | 2023-04-07 | 南昌智能新能源汽车研究院 | Automatic mapping and high-precision positioning method for parking lot |
CN113503873B (en) * | 2021-07-14 | 2024-03-12 | 北京理工大学 | Visual positioning method for multi-sensor fusion |
CN113920194B (en) * | 2021-10-08 | 2023-04-21 | 电子科技大学 | Positioning method of four-rotor aircraft based on visual inertia fusion |
CN114066824B (en) * | 2021-10-28 | 2024-05-14 | 华南理工大学 | Binocular vision odometer method with dynamic target detection function |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105469405A (en) * | 2015-11-26 | 2016-04-06 | 清华大学 | Visual ranging-based simultaneous localization and map construction method |
CN105953796A (en) * | 2016-05-23 | 2016-09-21 | 北京暴风魔镜科技有限公司 | Stable motion tracking method and stable motion tracking device based on integration of simple camera and IMU (inertial measurement unit) of smart cellphone |
CN107193279A (en) * | 2017-05-09 | 2017-09-22 | 复旦大学 | Robot localization and map structuring system based on monocular vision and IMU information |
CN109166149A (en) * | 2018-08-13 | 2019-01-08 | 武汉大学 | A kind of positioning and three-dimensional wire-frame method for reconstructing and system of fusion binocular camera and IMU |
CN109307508A (en) * | 2018-08-29 | 2019-02-05 | 中国科学院合肥物质科学研究院 | A kind of panorama inertial navigation SLAM method based on more key frames |
CN109465832A (en) * | 2018-12-18 | 2019-03-15 | 哈尔滨工业大学(深圳) | High-precision vision and the tight fusion and positioning method of IMU and system |
CN109631855A (en) * | 2019-01-25 | 2019-04-16 | 西安电子科技大学 | High-precision vehicle positioning method based on ORB-SLAM |
CN109993113A (en) * | 2019-03-29 | 2019-07-09 | 东北大学 | A kind of position and orientation estimation method based on the fusion of RGB-D and IMU information |
Non-Patent Citations (3)
Title |
---|
Simultaneous localization and mapping: Present, future, and the robust-perception age; CADENA C; 《IEEE Transactions on Robotics》; 2016-12-31; full text * |
Vision-IMU based simultaneous localization and mapping algorithm for robots; Yao Erliang; 《Chinese Journal of Scientific Instrument (仪器仪表学报)》; 2018-01-15; Vol. 39, No. 4; full text * |
Research on binocular vision SLAM fusing IMU information; Xu Kuan; 《China Master's Theses Full-text Database, Information Science & Technology》; 2019-01-15; No. 01; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN110490900A (en) | 2019-11-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110490900B (en) | Binocular vision positioning method and system under dynamic environment | |
CN109993113B (en) | Pose estimation method based on RGB-D and IMU information fusion | |
CN112734852B (en) | Robot mapping method and device and computing equipment | |
CN112634451B (en) | Outdoor large-scene three-dimensional mapping method integrating multiple sensors | |
Tanskanen et al. | Live metric 3D reconstruction on mobile phones | |
CN112233177B (en) | Unmanned aerial vehicle pose estimation method and system | |
CN111024066A (en) | Unmanned aerial vehicle vision-inertia fusion indoor positioning method | |
CN114623817B (en) | Self-calibration-contained visual inertial odometer method based on key frame sliding window filtering | |
US12073630B2 (en) | Moving object tracking method and apparatus | |
CN112115980A (en) | Binocular vision odometer design method based on optical flow tracking and point line feature matching | |
CN113313763B (en) | Monocular camera pose optimization method and device based on neural network | |
CN111882602B (en) | Visual odometer implementation method based on ORB feature points and GMS matching filter | |
CN113223045A (en) | Vision and IMU sensor fusion positioning system based on dynamic object semantic segmentation | |
CN112419497A (en) | Monocular vision-based SLAM method combining feature method and direct method | |
CN111623773B (en) | Target positioning method and device based on fisheye vision and inertial measurement | |
CN113012224B (en) | Positioning initialization method and related device, equipment and storage medium | |
CN115406447A (en) | Autonomous positioning method of quad-rotor unmanned aerial vehicle based on visual inertia in rejection environment | |
CN115661341A (en) | Real-time dynamic semantic mapping method and system based on multi-sensor fusion | |
CN112907633B (en) | Dynamic feature point identification method and application thereof | |
CN112731503B (en) | Pose estimation method and system based on front end tight coupling | |
CN116147618B (en) | Real-time state sensing method and system suitable for dynamic environment | |
WO2023130842A1 (en) | Camera pose determining method and apparatus | |
CN116045965A (en) | Multi-sensor-integrated environment map construction method | |
CN112509006A (en) | Sub-map recovery fusion method and device | |
CN113011212A (en) | Image recognition method and device and vehicle |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||