CN110490900B - Binocular vision positioning method and system under dynamic environment - Google Patents

Binocular vision positioning method and system under dynamic environment

Info

Publication number
CN110490900B
CN110490900B CN201910634021.1A CN201910634021A
Authority
CN
China
Prior art keywords
map
pose
frame
matching
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910634021.1A
Other languages
Chinese (zh)
Other versions
CN110490900A (en)
Inventor
秦家虎
余雷
陈宇航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN201910634021.1A priority Critical patent/CN110490900B/en
Publication of CN110490900A publication Critical patent/CN110490900A/en
Application granted granted Critical
Publication of CN110490900B publication Critical patent/CN110490900B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G06T 7/55 Depth or shape recovery from multiple images
    • G06T 7/593 Depth or shape recovery from multiple images from stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10004 Still image; Photographic image
    • G06T 2207/10012 Stereo images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

A binocular vision positioning method and system in a dynamic environment, applied to the technical field of robot positioning. Feature points of the image are extracted, and feature points on moving objects are distinguished from feature points on non-moving objects by clustering the feature points; this approach can effectively reject the feature points on moving objects while avoiding the rejection of feature points on stationary objects that merely have dynamic attributes, thereby retaining more usable feature points. The inter-frame pose transformation matrix obtained by an IMU pre-integration algorithm is used to screen, among the poses estimated from different feature point sets, the one closest to the IMU data, so that outliers such as feature points belonging to moving objects are effectively rejected. The pose and depth estimation method fusing monocular motion vision and binocular stereo vision can effectively improve estimation accuracy.

Description

Binocular vision positioning method and system under dynamic environment
Technical Field
The invention relates to the technical field of robot positioning, in particular to a binocular vision positioning method and system in a dynamic environment.
Background
The traditional GPS positioning technology depends strongly on satellite signals; its accuracy drops sharply in environments with many buildings and severe occlusion, and it is essentially unusable indoors. Inertial navigation based on an inertial navigation module (gyroscope and accelerometer) does not depend on external information, and the current pose of the carrier can be obtained by integrating the inertial measurements, but the inertial positioning error grows over time, and the accumulated error leads to poor long-term accuracy. Vision-based positioning determines the pose of the robot by analyzing an image sequence; it can be used indoors and in other occluded environments, and the rich visual information can also be used for closed-loop detection to eliminate accumulated error and achieve long-term high-precision positioning. In the field of robotics, vision-based positioning is mainly based on visual Simultaneous Localization and Mapping (SLAM), referred to as visual SLAM for short; visual SLAM without map building is also called visual odometry, and both are collectively referred to as visual SLAM herein. Current visual SLAM is divided into monocular SLAM, binocular SLAM and RGB-D SLAM according to the sensor used. Monocular SLAM suffers from scale uncertainty, and RGB-D SLAM is costly and of low precision, so binocular SLAM is adopted here: it does not have the monocular scale-uncertainty problem and its cost is relatively low. At present, most SLAM positioning technologies are based on a static-environment assumption; however, most practical environments are complex dynamic scenes, in which the accuracy of SLAM algorithms based on the static assumption drops greatly. In recent years, some semantic-based dynamic-scene SLAM techniques, referred to as semantic SLAM for short, have been proposed and used. Semantic SLAM detects object classes with dynamic attributes in the scene through deep learning or other computer vision methods, and removes objects of these dynamic-attribute classes before pose estimation is performed with the image information, thereby reducing the influence of dynamic objects on pose estimation.
However, this approach has the following disadvantages: (1) Image detection has high computational complexity, while positioning has strict real-time requirements; adding deep-learning stages and the like to a SLAM system greatly degrades real-time performance. (2) In a SLAM system, and in particular in a visual odometer without an environment map, what affects pose estimation accuracy when estimating pose from image information is not the object with dynamic attributes but the object that is actually moving. For example, a car parked in a parking lot has dynamic attributes, but it may stay in the parking lot for a long time; at least for the current image it does not affect pose estimation, and it even provides more feature information for pose estimation. Especially in feature-poor scenes, a temporarily static dynamic-attribute object can provide more features and increase pose estimation accuracy.
Disclosure of Invention
The invention mainly aims to provide a binocular vision positioning method and system in a dynamic environment.
In order to achieve the above object, a first aspect of the embodiments of the present invention provides a binocular vision positioning method in a dynamic environment, including:
acquiring a left eye image and a right eye image under binocular vision of a camera, and extracting feature points corresponding to map points in a current frame to which the left eye image and the right eye image belong and a previous frame to which the left eye image and the right eye image belong, wherein the map points are feature points added into a map;
according to the characteristic points, obtaining the left-eye and right-eye movement visual matching characteristic point pairs and stereoscopic vision matching characteristic point pairs;
screening the shared matching characteristic point pairs of the stereoscopic vision matching characteristic point pair and the motion vision matching characteristic point pair;
respectively carrying out stereo depth estimation on the characteristic point pairs in the common matching characteristic point pair set to obtain a stereo depth estimation value;
clustering the left-eye and right-eye motion visual matching feature point pairs according to the optical flow velocity and the rotation angle between the motion visual matching feature point pairs by using a K-means++ clustering algorithm to obtain clustering results, and classifying the motion visual matching feature point pairs into a plurality of categories according to the clustering results;
calculating pose estimation results of the IMU of the left eye and the right eye in a time period between a current frame and a previous frame by utilizing an IMU pre-integration algorithm, wherein the pose estimation results comprise inter-frame poses between the current frame and the previous frame;
respectively carrying out EPNP (Efficient Perspective-n-Point) pose estimation and map point estimation on the motion visual matching feature point pairs under each category to obtain a pose estimation result and a map point estimation result of the EPNP;
selecting the pose estimation result of the EPNP closest to the pose estimation result of the IMU as an optimal EPNP pose estimation result and a map point estimation result, and recording effective matching feature point pairs corresponding to the optimal EPNP pose estimation result and the map point estimation result;
and selecting a characteristic point pair corresponding to the effective matching characteristic point pair from the common matching characteristic point pair, and performing Kalman filtering fusion on a stereo depth estimation value corresponding to the characteristic point pair and an EPNP estimated depth value to obtain an estimated pose and an estimated map point.
Further, the IMU pre-integration algorithm:
$$\alpha^{c_k}_{c_{k+1}} = \iint_{t\in[t_k,\,t_{k+1}]} R^{c_k}_t\left(a_t - b_{a_t} - n_a\right)\,dt^2$$

$$\beta^{c_k}_{c_{k+1}} = \int_{t\in[t_k,\,t_{k+1}]} R^{c_k}_t\left(a_t - b_{a_t} - n_a\right)\,dt$$

$$\gamma^{c_k}_{c_{k+1}} = \int_{t\in[t_k,\,t_{k+1}]} \tfrac{1}{2}\,\Omega\!\left(w_t - b_{w_t} - n_w\right)\gamma^{c_k}_t\,dt$$

wherein $\alpha^{c_k}_{c_{k+1}}$, $\beta^{c_k}_{c_{k+1}}$ and $\gamma^{c_k}_{c_{k+1}}$ respectively denote the displacement increment, the velocity increment and the quaternion of the rotation-angle change over the period from the k-th frame to the (k+1)-th frame, expressed in the k-th frame camera coordinate system; $a_t$ denotes the accelerometer measurement and $w_t$ the angular velocity measurement; $b_{a_t}$ and $b_{w_t}$ denote the biases of the accelerometer and the gyroscope respectively, and $n_a$ and $n_w$ the respective noises; $R^{c_k}_t$ is the rotation transformation matrix of time t relative to the k-th frame camera coordinate system, obtained from the quaternion $\gamma^{c_k}_t$; $\Omega$ is the matrix related to the quaternion calculation; $\Delta t_k$ denotes the time interval from $t_k$ to $t_{k+1}$; and $g^w$ denotes the gravity vector in the world coordinate system.
Further, after the classifying the moving visual matching feature point pairs into a plurality of categories according to the clustering result, the method includes:
performing histogram statistics on the number of the feature points in each category;
and deleting any category, together with the feature points under it, for which the number of corresponding feature points is less than a first preset number and differs from the numbers of corresponding feature points under the other categories by more than a second preset number.
Further, the performing EPNP pose estimation and map point estimation on the motion visual matching feature point pairs under each category, respectively, to obtain a pose estimation result and a map point estimation result of the EPNP includes:
respectively carrying out EPNP pose and map point estimation on the feature points under each category, and rejecting outliers from the pose estimation result of each category by using the RANSAC (random sample consensus) algorithm, so as to obtain the EPNP pose estimation result and the map point estimation result corresponding to each category.
Further, after obtaining the estimated pose and the estimated map point, the method includes:
optimizing the estimated pose and the estimated map points by utilizing a graph-based optimization algorithm according to the inter-frame pose of the IMU to obtain optimized map points;
the optimization algorithm of the graph is as follows:
$$\min_{T^{k+1}_{k},\,P_k}\ \sum_{k}\left(\left\| z_{k+1} - \pi_c\!\left(T^{k+1}_{k}\,P_k\right)\right\|^2 + \left\|\ln\!\left(\left(\hat{T}^{k+1}_{k,\mathrm{IMU}}\right)^{-1} T^{k+1}_{k}\right)^{\vee}\right\|^2\right)$$

wherein $T^{k+1}_{k}$ is the pose transformation matrix being optimized, initialized with $\hat{T}^{k+1}_{k,\mathrm{EPNP}}$, the estimated pose transformation matrix obtained by EPNP pose estimation; $\hat{T}^{k+1}_{k,\mathrm{IMU}}$ represents the pose transformation matrix obtained by the IMU pre-integration algorithm; $z_{k+1}$ represents the pixel coordinates in the camera image; $P_k$ represents the 3D coordinates of a map point in the k-th frame camera coordinate system; and $\pi_c(\cdot)$ is the projective transformation based on the camera model that converts 3D coordinates in the camera coordinate system into 2D pixel coordinates in the camera image.
Further, after the optimizing the estimated pose and the estimated map point to obtain the optimized map point, the method includes:
replacing the map points before optimization by the optimized map points, updating the local map, and eliminating map points which do not belong to the local map;
comparing whether the number of co-visible feature points between the current frame and the latest key frame is higher than a preset threshold;
if the current key frame is higher than the preset threshold, replacing the latest key frame with the current key frame;
and if the current frame is lower than the preset threshold, adding the current frame into a key frame set.
Further, the obtaining the left-eye and right-eye motion visual matching feature point pairs according to the feature points includes:
carrying out ORB descriptor matching on the feature points corresponding to the map points in the current frame to which the left eye image belongs and the map points in the previous frame to which the left eye image belongs to obtain a left eye matching feature point pair;
carrying out ORB descriptor matching on the feature points corresponding to the map points in the current frame to which the right-eye image belongs and the map points in the previous frame to which the right-eye image belongs to obtain right-eye matched feature point pairs; wherein the left-eye and right-eye moving visual matching feature point pairs include the left-eye matching feature point pair and the right-eye matching feature point pair.
Further, the obtaining the stereoscopic vision matching feature point pair of the left eye and the right eye according to the feature point comprises:
and respectively carrying out ORB descriptor matching and SAD stereo matching on the feature points in the current frame to which the left eye image belongs and the feature points in the current frame to which the right eye image belongs to obtain the stereo matching feature point pairs of the left eye and the right eye.
A second aspect of the embodiments of the present invention provides a binocular vision positioning system in a dynamic environment, including:
the system comprises a main thread, a pose estimation thread, a loop thread and a back-end optimization thread;
the main thread is used for acquiring a left eye image and a right eye image under binocular vision of the camera, initializing the left eye image and the right eye image, receiving a restart signal sent by the pose estimation thread or the loop thread after the initialization is successful, and initializing the pose estimation thread or the loop thread again according to the restart signal;
the pose estimation thread is used for acquiring a left eye image and a right eye image under binocular vision of the camera, extracting feature points in a current frame to which the left eye image and the right eye image belong and feature points corresponding to a map point in a previous frame to which the left eye image and the right eye image belong, and calculating an estimated pose and an estimated map point of the current frame according to the feature points;
the loop thread is used for detecting a loop frame, estimating a pose transformation matrix between a current frame and the loop frame, and updating the pose of the current frame by using the pose of the loop frame as a reference;
and the back-end optimization thread is used for optimizing the poses of all the frames according to the pose estimation result of the IMU pre-integration algorithm and the pose transformation matrix between the current frame and the loop frame.
Further, the calculating the estimated pose and the estimated map point of the current frame according to the feature points includes:
according to the characteristic points, obtaining the left-eye and right-eye movement visual matching characteristic point pairs and stereoscopic vision matching characteristic point pairs;
screening the shared matching characteristic point pairs of the stereoscopic vision matching characteristic point pair and the motion vision matching characteristic point pair;
respectively carrying out stereo depth estimation on the characteristic point pairs in the common matching characteristic point pair set to obtain a stereo depth estimation value;
clustering the left-eye and right-eye motion visual matching feature point pairs according to the optical flow velocity and the rotation angle between the motion visual matching feature point pairs by using a K-means++ clustering algorithm to obtain clustering results, and classifying the motion visual matching feature point pairs into a plurality of categories according to the clustering results;
calculating pose estimation results of the IMU of the left eye and the right eye in a time period between a current frame and a previous frame by utilizing an IMU pre-integration algorithm, wherein the pose estimation results comprise inter-frame poses between the current frame and the previous frame;
respectively carrying out EPNP (Efficient Perspective-n-Point) pose estimation and map point estimation on the motion visual matching feature point pairs under each category to obtain a pose estimation result and a map point estimation result of the EPNP;
selecting the pose estimation result of the EPNP closest to the pose estimation result of the IMU as an optimal EPNP pose estimation result and a map point estimation result, and recording effective matching feature point pairs corresponding to the optimal EPNP pose estimation result and the map point estimation result;
and selecting a characteristic point pair corresponding to the effective matching characteristic point pair from the common matching characteristic point pair, and performing Kalman filtering fusion on a stereo depth estimation value corresponding to the characteristic point pair and an EPNP estimated depth value to obtain an estimated pose and an estimated map point.
The method extracts image feature points and distinguishes feature points on moving objects from feature points on non-moving objects by clustering the feature points, so that feature points on moving objects can be rejected effectively while feature points on stationary dynamic-attribute objects are not rejected, retaining more usable feature points. The inter-frame pose transformation matrix obtained by the IMU pre-integration algorithm is used to screen, among the poses estimated from the different feature point sets, the one closest to the IMU data, effectively rejecting outliers such as feature points belonging to moving objects. The pose and depth estimation method fusing monocular motion vision and binocular stereo vision can effectively improve estimation accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 2 is a general frame schematic diagram of a binocular vision positioning method under a dynamic environment according to an embodiment of the present invention;
fig. 3 is a schematic flow chart of a main thread of a binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 4 is a schematic flowchart of a pose estimation thread of a binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 5 is a partially enlarged view of a schematic flow chart of a pose estimation thread of the binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 6 is a partially enlarged view of a schematic flow chart of a pose estimation thread of the binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 7 is a partially enlarged view of a schematic flow chart of a pose estimation thread of the binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
fig. 8 is a schematic diagram of a feature point pair of a binocular vision positioning method in a dynamic environment according to an embodiment of the present invention illustrating relative speed and angle;
fig. 9 is a schematic flowchart of a looping thread of a binocular vision positioning method in a dynamic environment according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of loop detection according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of a back-end optimization process according to an embodiment of the present invention;
fig. 12 is a schematic diagram of a backend optimization effect according to an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In this application, the method for screening moving objects is not only suitable for a binocular camera but can also be used with monocular and RGB-D cameras, so that monocular and RGB-D cameras can likewise achieve the purpose of the invention. The embodiments of this application take a visual SLAM algorithm as an example; however, for a laser sensor, the mechanism provided by the invention of screening moving objects rather than dynamic-attribute objects can also be implemented in laser SLAM, so that by clustering similar features in laser SLAM, pose estimation can likewise be completed in a dynamic environment.
Referring to fig. 1, fig. 1 is a schematic flow chart of a binocular vision positioning method under a dynamic environment according to an embodiment of the present invention, the method mainly includes:
s101, acquiring a left eye image and a right eye image under binocular vision of a camera;
s102, extracting the feature points in the current frame to which the left eye image and the right eye image belong and the feature points corresponding to map points in the previous frame to which the left eye image and the right eye image belong, wherein the map points are feature points that have been added into a map;
s103, obtaining the left-eye and right-eye movement visual matching characteristic point pairs and stereoscopic vision matching characteristic point pairs according to the characteristic points;
s104, screening a common matching characteristic point pair of the stereoscopic vision matching characteristic point pair and the motion vision matching characteristic point pair;
s105, respectively carrying out stereo depth estimation on the characteristic point pairs in the common matching characteristic point pair set to obtain a stereo depth estimation value;
s106, clustering the left-eye and right-eye motion visual matching feature point pairs according to the optical flow velocity and the rotation angle between the motion visual matching feature point pairs by using a K-means + + clustering algorithm to obtain a clustering result, and classifying the motion visual matching feature point pairs into a plurality of categories according to the clustering result;
s107, calculating pose estimation results of the IMU of the left eye and the right eye in a time period between the current frame and the previous frame by utilizing an IMU pre-integration algorithm;
s108, respectively carrying out EPNP (Efficient Perspective-n-Point) pose estimation and map point estimation on the motion vision matching feature point pairs under each category to obtain an EPNP pose estimation result and a map point estimation result;
s109, selecting the pose estimation result of the EPNP closest to the pose estimation result of the IMU as an optimal EPNP pose estimation result and a map point estimation result, and recording effective matching feature point pairs corresponding to the optimal EPNP pose estimation result and the map point estimation result;
s110, selecting a characteristic point pair corresponding to the effective matching characteristic point pair from the common matching characteristic point pair, and performing Kalman filtering fusion on a stereo depth estimation value corresponding to the characteristic point pair and an EPNP estimated depth value to obtain an estimated pose and an estimated map point.
Further, the IMU pre-integration algorithm:
$$\alpha^{c_k}_{c_{k+1}} = \iint_{t\in[t_k,\,t_{k+1}]} R^{c_k}_t\left(a_t - b_{a_t} - n_a\right)\,dt^2$$

$$\beta^{c_k}_{c_{k+1}} = \int_{t\in[t_k,\,t_{k+1}]} R^{c_k}_t\left(a_t - b_{a_t} - n_a\right)\,dt$$

$$\gamma^{c_k}_{c_{k+1}} = \int_{t\in[t_k,\,t_{k+1}]} \tfrac{1}{2}\,\Omega\!\left(w_t - b_{w_t} - n_w\right)\gamma^{c_k}_t\,dt$$

wherein $\alpha^{c_k}_{c_{k+1}}$, $\beta^{c_k}_{c_{k+1}}$ and $\gamma^{c_k}_{c_{k+1}}$ respectively denote the displacement increment, the velocity increment and the quaternion of the rotation-angle change over the period from the k-th frame to the (k+1)-th frame, expressed in the k-th frame camera coordinate system; $a_t$ denotes the accelerometer measurement and $w_t$ the angular velocity measurement; $b_{a_t}$ and $b_{w_t}$ denote the biases of the accelerometer and the gyroscope respectively, and $n_a$ and $n_w$ the respective noises; $R^{c_k}_t$ is the rotation transformation matrix of time t relative to the k-th frame camera coordinate system, obtained from the quaternion $\gamma^{c_k}_t$; $\Omega$ is the matrix related to the quaternion calculation; $\Delta t_k$ denotes the time interval from $t_k$ to $t_{k+1}$; and $g^w$ denotes the gravity vector in the world coordinate system.
Further, after the classifying the moving visual matching feature point pairs into a plurality of categories according to the clustering result, the method includes:
performing histogram statistics on the number of the feature points in each category;
and deleting any category, together with the feature points under it, for which the number of corresponding feature points is less than a first preset number and differs from the numbers of corresponding feature points under the other categories by more than a second preset number.
Further, the step of respectively performing EPNP pose estimation and map point estimation on the motion visual matching feature point pairs under each category to obtain an EPNP pose estimation result and a map point estimation result includes:
respectively carrying out EPNP pose and map point estimation on the feature points under each category, and rejecting outliers from the pose estimation result of each category by using the RANSAC (random sample consensus) algorithm, so as to obtain the EPNP pose estimation result and the map point estimation result corresponding to each category.
Further, after obtaining the estimated pose and the estimated map point, the method includes:
optimizing the estimated pose and the estimated map points by utilizing a graph-based optimization algorithm according to the pose estimation result of the IMU to obtain optimized map points;
the optimization algorithm of the graph is as follows:
$$\min_{T^{k+1}_{k},\,P_k}\ \sum_{k}\left(\left\| z_{k+1} - \pi_c\!\left(T^{k+1}_{k}\,P_k\right)\right\|^2 + \left\|\ln\!\left(\left(\hat{T}^{k+1}_{k,\mathrm{IMU}}\right)^{-1} T^{k+1}_{k}\right)^{\vee}\right\|^2\right)$$

wherein $T^{k+1}_{k}$ is the pose transformation matrix being optimized, initialized with $\hat{T}^{k+1}_{k,\mathrm{EPNP}}$, the estimated pose transformation matrix obtained by EPNP pose estimation; $\hat{T}^{k+1}_{k,\mathrm{IMU}}$ represents the pose transformation matrix obtained by the IMU pre-integration algorithm; $z_{k+1}$ represents the pixel coordinates in the camera image; $P_k$ represents the 3D coordinates of a map point in the k-th frame camera coordinate system; and $\pi_c(\cdot)$ is the projective transformation based on the camera model that converts 3D coordinates in the camera coordinate system into 2D pixel coordinates in the camera image.
Further, the optimizing the estimated pose and the estimated map point to obtain an optimized map point includes:
replacing the map points before optimization by the optimized map points, updating the local map, and eliminating points which do not belong to the local map;
comparing whether the number of co-visible feature points between the current frame and the latest key frame is higher than a preset threshold;
if the current key frame is higher than the preset threshold, replacing the latest key frame with the current key frame;
if the current frame is lower than the preset threshold, adding the current frame into the key frame set.
Further, the obtaining the pair of left-eye and right-eye motion visual matching feature points according to the feature points comprises:
carrying out ORB descriptor matching on the feature points corresponding to the map points in the current frame to which the left eye image belongs and the map points in the previous frame to which the left eye image belongs to obtain a left eye matching feature point pair;
carrying out ORB descriptor matching on the feature point in the current frame to which the right-eye image belongs and the feature point corresponding to the map point in the previous frame to which the right-eye image belongs to obtain a right-eye matching feature point pair; wherein the left-eye and right-eye moving visual matching feature point pairs include the left-eye matching feature point pair and the right-eye matching feature point pair.
Further, the obtaining the pair of stereoscopic vision matching feature points of the left eye and the right eye according to the feature points comprises:
and respectively carrying out ORB descriptor matching and SAD stereo matching on the feature points in the current frame to which the left eye image belongs and the feature points in the current frame to which the right eye image belongs to obtain the stereo matching feature point pairs of the left eye and the right eye.
In the embodiment of the invention, the moving object is removed instead of the dynamic object, the moving object and the non-moving object are distinguished in a clustering mode, and the speed and the angle of the inter-frame feature point are clustered, so that the matching feature point set of the moving object is distinguished from the non-moving object feature point set; screening poses which are closest to IMU data and estimated based on different feature point sets through an interframe pose transformation matrix obtained by an IMU pre-integration algorithm, and effectively removing external points such as feature points belonging to moving objects; the pose and depth estimation method integrating monocular motion vision and binocular stereo vision can effectively improve the estimation precision.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a binocular vision positioning system under a dynamic environment according to another embodiment of the present invention, the system mainly includes:
the system comprises a main thread, a pose estimation thread, a loop thread and a back-end optimization thread;
the main thread is used for acquiring a left eye image and a right eye image under binocular vision of the camera, initializing the images, receiving a restart signal sent by the pose estimation thread or the loop thread after the initialization is successful, and initializing the images again according to the restart signal;
specifically, as shown in fig. 3, the main thread of the system mainly functions to start other threads and initialize a map, and provide initial trusted data for the subsequent other threads, and the main flow and the working principle are as follows:
the method comprises the steps of respectively obtaining images from the left eye and the right eye of a binocular camera, extracting feature points of the left eye image and the right eye image by using FAST corner detection, dividing the images into 10 sub-image areas, and selecting N corner points with the maximum response values from large to small in each sub-image area according to Harris response values to serve as a final feature point set. And calculating the rotation direction of each feature point, wherein the rotation direction of the feature point is defined as a direction vector from the geometric center to the centroid of the image block with the feature point as the center. Then, a descriptor of the feature point is calculated, which uses the same binary BRIEF descriptor as orb (organized FAST and Rotated BRIEF).
ORB descriptor matching is first carried out between the feature points of the left-eye image and those of the right-eye image, followed by SAD (sum of absolute differences) sub-pixel-level matching; valid feature points are then screened using the rotation-angle consistency of the feature points, so as to obtain accurate matching feature point pairs between the left and right camera images.
The real depth of each matched feature point is calculated using the binocular camera model

$$z = \frac{f\,b}{u_L - u_R}$$

where z represents the real coordinate of the feature point along the z-axis in the camera reference frame, f represents the focal length of the camera, b is the camera baseline, and $u_L$, $u_R$ are respectively the pixel abscissae of the feature point in the left-eye and right-eye images. It is then checked whether z is positive: if the depth estimate is valid, the 3D coordinates of the feature point are recovered, and these 3D coordinates are called a map point; if the depth estimate is invalid, depth estimation of the feature point has failed and the feature point pair is rejected.
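For illustration, a minimal sketch of this depth check follows (an assumption-laden example, not the original code); the intrinsics fx, fy, cx, cy and the helper names are hypothetical.

```python
# Minimal sketch of the stereo depth model z = f*b/(uL - uR) and the
# positivity check described above; a non-positive result rejects the pair.
def stereo_depth(uL, uR, f, b):
    """Return the metric depth of a matched point, or None if invalid."""
    disparity = uL - uR
    if disparity <= 0:          # would give a non-positive depth -> reject pair
        return None
    return f * b / disparity

def back_project(uL, vL, z, fx, fy, cx, cy):
    """Recover the 3D map-point coordinates in the left camera frame."""
    return ((uL - cx) * z / fx, (vL - cy) * z / fy, z)
```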
If the number of successfully matched feature point pairs is larger than a certain threshold, initialization is considered successful; otherwise, initialization is performed again.
Once the initialization validation is successful, map points are added to the local map and global map classes and the current frame is added as a key frame and marked as the previous frame for subsequent processing. The local map is a map point set composed of map points corresponding to all matched feature point pairs on the current image frame and the past frame having a common view relationship with the current image frame, and the global map is a map point set composed of all past map points. The key frame set is used for maintaining the data set size, and only the image frames meeting the conditions are stored locally as key frames for subsequent closed-loop detection and back-end optimization.
After the whole system is initialized successfully, the system enters a state of polling signals of other threads, and if restart signals sent by other threads are received, the system enters the state of acquiring images from the left eye and the right eye of the binocular camera again to start to initialize the system again.
The pose estimation thread is used for acquiring a left eye image and a right eye image under binocular vision of the camera, extracting feature points in a current frame to which the left eye image and the right eye image belong and feature points corresponding to a map point in a previous frame to which the left eye image and the right eye image belong, and calculating an estimated pose and an estimated map point of the current frame according to the feature points;
specifically, referring to fig. 4, 5, 6, and 7, the working process of the pose estimation thread is as follows:
Images are acquired from the left eye and the right eye of the binocular camera respectively. Feature points of the left-eye and right-eye images are extracted using FAST corner detection; each image is divided into 10 sub-image areas, and in each sub-image area the N corner points with the largest Harris response values are selected, from large to small, as the final feature point set. The rotation direction of each feature point is calculated, defined as the direction vector from the geometric center to the centroid of the image block centered on the feature point. A descriptor is then computed for each feature point, using the same binary BRIEF descriptor as ORB (Oriented FAST and Rotated BRIEF). The acceleration $a_t$ and angular velocity $w_t$ are read from a preset inertial navigation module (IMU module), and the acceleration and angular velocity are filtered.
The IMU inertial navigation module is used here as the auxiliary means for screening the correct visual positioning result in a dynamic environment; sensors such as a code disc, an encoder or an electronic compass, which can obtain the displacement, velocity, angular velocity and angle of the robot, can likewise be used as the auxiliary means to achieve the same effect of cooperating with the visual sensor in a dynamic environment.
ORB descriptor matching is performed, for the left eye and the right eye respectively, between the feature points of the current frame and the feature points corresponding to map points in the previous frame; the DBOW3 bag-of-words technique is used to accelerate feature matching, and rotation-angle consistency of the feature points is then used to screen valid matches, yielding the left-eye matching feature point pairs and the right-eye matching feature point pairs, collectively called the left-eye and right-eye motion vision matching feature point pairs. ORB descriptor matching and SAD stereo matching optimization are also performed between the current left-eye frame and the current right-eye frame to obtain more accurate left-right stereo matching feature point pairs, i.e., the stereoscopic vision matching feature point pairs. The set of matching feature point pairs common to the motion vision matching pairs and the stereoscopic vision matching pairs is then screened out, stereo depth estimation is performed on the corresponding feature points, and their 3D coordinates are recovered.
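As an illustration of the rotation-angle consistency screening mentioned above, the following sketch histograms the orientation difference of each match and keeps only the dominant bins; this ORB-SLAM-style check, the bin count and the number of kept bins are assumptions, not taken from the original text.

```python
# Sketch (assumed ORB-SLAM-style check): screen matches by rotation-angle
# consistency. Matches whose keypoint-orientation difference falls outside the
# dominant histogram bins are discarded as unreliable.
def rotation_consistent_matches(kps_prev, kps_cur, matches, n_bins=30, keep=3):
    hist = [[] for _ in range(n_bins)]
    for i, m in enumerate(matches):
        dtheta = (kps_prev[m.queryIdx].angle - kps_cur[m.trainIdx].angle) % 360.0
        hist[int(dtheta / (360.0 / n_bins))].append(i)
    best = sorted(range(n_bins), key=lambda b: len(hist[b]), reverse=True)[:keep]
    keep_idx = sorted(i for b in best for i in hist[b])
    return [matches[i] for i in keep_idx]
```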
Calculating the optical flow velocity v and the rotation angle theta between the left-eye and right-eye motion visual matching feature point pairs, wherein the principle of the optical flow velocity and the rotation angle is schematically shown in fig. 8, and the calculation formula is as follows:
$$v_x = \frac{u_{k+1}-u_k}{\Delta t},\qquad v_y = \frac{v_{k+1}-v_k}{\Delta t}$$

$$v = \sqrt{v_x^2 + v_y^2},\qquad \theta = \arctan\frac{v_y}{v_x}$$

where $(u_k, v_k)$ and $(u_{k+1}, v_{k+1})$ are the pixel coordinates of the matched feature point in the previous frame and the current frame, and $\Delta t$ is the time interval between the two frames.
The K-means++ clustering algorithm is used to cluster the left-eye and right-eye motion vision matching feature point pairs into M categories according to the optical flow velocity and the rotation angle. In a real environment the velocity and direction of a moving object differ greatly from those of the background elements, so clustering can separate moving objects with different velocities from the feature points of static objects. After the clustering result is obtained, histogram statistics is performed on the feature points of each category, and categories whose counts are too small and differ greatly from the other categories, together with their corresponding feature point pairs, are screened out.
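A minimal sketch of this clustering step is given below, assuming scikit-learn's KMeans with k-means++ seeding; the cluster count M, the pruning threshold, and the (cos, sin) encoding of the angle are illustrative assumptions rather than details of the original disclosure.

```python
# Sketch of clustering matched pairs by (optical-flow speed, rotation angle)
# and dropping sparse clusters, as described above.
import numpy as np
from sklearn.cluster import KMeans

def cluster_matches(flow_speed, flow_angle_deg, M=4, min_count=8):
    """Return {cluster id: indices of matched pairs} after pruning tiny clusters."""
    # Encode the angle as (cos, sin) so that 359 deg and 1 deg stay close.
    feats = np.column_stack([flow_speed,
                             np.cos(np.radians(flow_angle_deg)),
                             np.sin(np.radians(flow_angle_deg))])
    labels = KMeans(n_clusters=M, init="k-means++", n_init=10,
                    random_state=0).fit_predict(feats)
    clusters = {}
    for c in range(M):
        idx = np.flatnonzero(labels == c)
        if len(idx) >= min_count:        # histogram-based pruning of tiny clusters
            clusters[c] = idx
    return clusters
```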
The IMU pre-integration technology is utilized to calculate the interframe displacement of the left eye camera and the right eye camera in the time period of the previous frame and the current frame, and the core formula of the IMU pre-integration principle is as follows
$$\alpha^{c_k}_{c_{k+1}} = \iint_{t\in[t_k,\,t_{k+1}]} R^{c_k}_t\left(a_t - b_{a_t} - n_a\right)\,dt^2$$

$$\beta^{c_k}_{c_{k+1}} = \int_{t\in[t_k,\,t_{k+1}]} R^{c_k}_t\left(a_t - b_{a_t} - n_a\right)\,dt$$

$$\gamma^{c_k}_{c_{k+1}} = \int_{t\in[t_k,\,t_{k+1}]} \tfrac{1}{2}\,\Omega\!\left(w_t - b_{w_t} - n_w\right)\gamma^{c_k}_t\,dt$$

wherein $\alpha^{c_k}_{c_{k+1}}$, $\beta^{c_k}_{c_{k+1}}$ and $\gamma^{c_k}_{c_{k+1}}$ respectively denote the displacement increment, the velocity increment and the quaternion of the rotation-angle change over the period from the k-th frame to the (k+1)-th frame, expressed in the k-th frame camera coordinate system; $a_t$ denotes the accelerometer measurement and $w_t$ the angular velocity measurement; $b_{a_t}$ and $b_{w_t}$ denote the biases of the accelerometer and the gyroscope respectively, and $n_a$ and $n_w$ the respective noises; $R^{c_k}_t$ is the rotation transformation matrix of time t relative to the k-th frame camera coordinate system, obtained from the quaternion $\gamma^{c_k}_t$; $\Omega$ is the matrix related to the quaternion calculation; $\Delta t_k$ denotes the time interval from $t_k$ to $t_{k+1}$; and $g^w$ denotes the gravity vector in the world coordinate system.
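For clarity only, a rough discrete-time sketch of the pre-integration above is given below (simple Euler integration, noise terms omitted, biases assumed constant over the interval); a practical system would use mid-point integration and also propagate the covariance and bias Jacobians.

```python
# Illustrative Euler-integration sketch of the pre-integrated increments.
import numpy as np
from scipy.spatial.transform import Rotation

def preintegrate(acc, gyro, dts, ba, bw):
    """acc, gyro: (N,3) raw IMU samples between frames k and k+1; dts: (N,) steps.
    Returns (alpha, beta, gamma): displacement/velocity increments in the
    frame-k axes and the accumulated rotation as a scipy Rotation."""
    alpha = np.zeros(3)          # displacement increment in the frame-k axes
    beta = np.zeros(3)           # velocity increment in the frame-k axes
    gamma = Rotation.identity()  # rotation of time t relative to frame k
    for a, w, dt in zip(acc, gyro, dts):
        R_kt = gamma.as_matrix()           # R^{c_k}_t
        a_k = R_kt @ (a - ba)              # bias-corrected accel, rotated to frame k
        alpha += beta * dt + 0.5 * a_k * dt * dt
        beta += a_k * dt
        gamma = gamma * Rotation.from_rotvec((w - bw) * dt)
    return alpha, beta, gamma
```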
Using the different clusters of the motion vision matching feature point set obtained above, EPNP (Efficient Perspective-n-Point) pose estimation and map point estimation are carried out separately on each category of matching feature points, and outliers are rejected from the pose estimation result of each category by the RANSAC (random sample consensus) algorithm. After the pose estimation result corresponding to each category is obtained, it is compared with the IMU pose result. In general, the filtered short-term IMU pose estimate is accurate, so the IMU inter-frame pose estimate can effectively screen out the true EPNP result and eliminate the pose estimation results coming from moving-object feature point sets, making the image-based pose estimation more accurate and reducing the influence of moving objects on the estimation result. With this IMU-data screening mechanism, accurate EPNP pose and map point estimation results are obtained; these are called the optimal EPNP pose and map point estimation results, and the corresponding matching feature point set is called the set of valid motion vision matching feature point pairs.
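The sketch below illustrates this screening mechanism under stated assumptions: OpenCV's EPnP solver inside RANSAC is used per cluster, and the distance to the IMU inter-frame pose is taken as translation error plus a weighted rotation-angle error; the metric, the weight and the array layouts are illustrative choices, not details fixed by the original text.

```python
# Sketch: per-cluster EPnP + RANSAC, then keep the cluster whose pose is
# closest to the IMU inter-frame pose (R_imu, t_imu).
import cv2
import numpy as np

def best_cluster_pose(clusters, pts3d, pts2d, K, R_imu, t_imu, w_rot=1.0):
    """pts3d: (N,3) float32 map points; pts2d: (N,2) float32 pixels."""
    best = None
    for cid, idx in clusters.items():
        if len(idx) < 4:                       # EPnP needs at least 4 points
            continue
        ok, rvec, tvec, inliers = cv2.solvePnPRansac(
            pts3d[idx], pts2d[idx], K, None, flags=cv2.SOLVEPNP_EPNP)
        if not ok or inliers is None:
            continue
        R, _ = cv2.Rodrigues(rvec)
        # Distance to the IMU pose: translation error + weighted rotation angle
        dt = np.linalg.norm(tvec.ravel() - np.ravel(t_imu))
        dR = np.arccos(np.clip((np.trace(R_imu.T @ R) - 1.0) / 2.0, -1.0, 1.0))
        score = dt + w_rot * dR
        if best is None or score < best[0]:
            best = (score, cid, R, tvec, idx[inliers.ravel()])
    return best   # (score, cluster id, R, t, indices of valid matching pairs)
```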
Feature points belonging to the valid matching feature point pair set are then screened out from the common matching feature point set, i.e., the feature point pairs belonging to the feature point classes corresponding to the optimal EPNP pose estimation result are selected, and Kalman filtering fusion is performed between the stereo depth estimate corresponding to each such feature point pair and the depth value estimated by EPNP, thereby obtaining more accurate map points.
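As a one-dimensional illustration of this fusion step (an assumed per-feature formulation, with variances supplied by the respective estimators), a single scalar Kalman update merging the two depth estimates can be sketched as follows.

```python
# Scalar Kalman fusion of the stereo depth and the EPnP-estimated depth
# of the same feature point.
def fuse_depth(z_stereo, var_stereo, z_epnp, var_epnp):
    """Return the fused depth and its variance (standard scalar Kalman update)."""
    k = var_stereo / (var_stereo + var_epnp)     # Kalman gain
    z_fused = z_stereo + k * (z_epnp - z_stereo)
    var_fused = (1.0 - k) * var_stereo
    return z_fused, var_fused
```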
The obtained pose and map points are then optimized using a graph-based optimization algorithm. The result of IMU pre-integration is blended into the optimization and also used as one of the optimization constraint conditions, achieving a fused optimization. The core formula of the optimization is as follows:
$$\min_{T^{k+1}_{k},\,P_k}\ \sum_{k}\left(\left\| z_{k+1} - \pi_c\!\left(T^{k+1}_{k}\,P_k\right)\right\|^2 + \left\|\ln\!\left(\left(\hat{T}^{k+1}_{k,\mathrm{IMU}}\right)^{-1} T^{k+1}_{k}\right)^{\vee}\right\|^2\right)$$

wherein $T^{k+1}_{k}$ is the pose transformation matrix being optimized, initialized with $\hat{T}^{k+1}_{k,\mathrm{EPNP}}$, the estimated pose transformation matrix obtained by EPNP pose estimation; $\hat{T}^{k+1}_{k,\mathrm{IMU}}$ represents the pose transformation matrix obtained by the IMU pre-integration algorithm; $z_{k+1}$ represents the pixel coordinates in the camera image; $P_k$ represents the 3D coordinates of a map point in the k-th frame camera coordinate system; and $\pi_c(\cdot)$ is the projective transformation based on the camera model that converts 3D coordinates in the camera coordinate system into 2D pixel coordinates in the camera image.
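The following sketch illustrates a fused optimization of this kind under simplifying assumptions: only the inter-frame pose is refined (map points are held fixed for brevity), the pose is parameterized as a rotation vector plus translation, the IMU term compares rotation vectors directly, and the IMU weight is arbitrary. It is an illustration, not the patented optimizer.

```python
# Sketch: reprojection error + IMU pose prior, starting from the EPnP estimate.
import cv2
import numpy as np
from scipy.optimize import least_squares

def optimize_pose(rvec0, tvec0, rvec_imu, tvec_imu, pts3d, pts2d, K, w_imu=1.0):
    """rvec*, tvec*: (3,1) rotation vectors / translations; pts3d: (N,3); pts2d: (N,2)."""
    def residuals(x):
        rvec, tvec = x[:3].reshape(3, 1), x[3:].reshape(3, 1)
        proj, _ = cv2.projectPoints(pts3d, rvec, tvec, K, None)   # pi_c(T * P_k)
        r_reproj = (proj.reshape(-1, 2) - pts2d).ravel()
        r_imu = w_imu * np.concatenate([(rvec - rvec_imu).ravel(),
                                        (tvec - tvec_imu).ravel()])
        return np.concatenate([r_reproj, r_imu])
    x0 = np.concatenate([np.ravel(rvec0), np.ravel(tvec0)])   # EPnP initial value
    sol = least_squares(residuals, x0)
    return sol.x[:3], sol.x[3:]      # refined rotation vector and translation
```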
After pose estimation is completed, the correct map points are added to the local map and the global map, the local map is updated and replaced, and points that do not belong to the local map are removed. At the same time, the number of co-visible feature points between the current frame and the latest key frame is compared with a threshold: if it is higher than the threshold, the overlap between the current frame and the latest key frame is excessive, no new key frame needs to be added, the latest key frame is replaced by the current frame, and the local and global map points are partially replaced and deleted. If the information difference between the current frame and the latest key frame is larger than the threshold, the current frame is added to the key frame set to increase the information capacity of the environment.
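This key-frame decision can be summarized by the small sketch below; the co-visibility threshold is an assumed parameter and the function name is hypothetical.

```python
# Sketch of the key-frame decision described above.
def update_keyframes(keyframes, current, covisible_count, threshold=60):
    if covisible_count > threshold:
        keyframes[-1] = current      # too much overlap: replace the latest key frame
    else:
        keyframes.append(current)    # enough new information: add a new key frame
```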
Bundle Adjustment optimization and outlier rejection are performed on the poses of the key frames that have enough co-visible feature points with the latest key frame, and on the map points associated with those key frames. The previous frame and the key frames having a co-visibility relationship with the current frame are added to the co-visibility graph of the current frame, and the data and map structures are updated. The process then enters the next loop iteration.
The above process is carried out on the premise that there are enough feature points matched between the map points of the previous frame and the current frame. If the matched feature points are insufficient, feature points need to be supplemented, and there are two main ways to do so: one is to search the local map and add map points that can be matched; the other is to perform feature matching between the previous frame and the current frame again, with the matching range no longer limited to the previous-frame feature points corresponding to map points, so that all feature points in the previous frame participate in matching, which effectively fills the gap. Once the feature points meet the requirement, pose estimation continues. If they still do not meet the requirement, the IMU is used to record the short-term pose transformation so as to keep the trajectory continuous, and the loop proceeds to the next round until enough feature points are found. If enough feature points are not found for a long time, positioning fails; at that point the IMU is also inaccurate due to accumulated error, and the system needs to be re-initialized.
The loop thread is used for detecting a loop frame, estimating a pose transformation matrix between a current frame and the loop frame, and updating the pose of the current frame by using the pose of the loop frame as a reference;
Specifically, referring to fig. 9 and 10, loop detection means detecting whether the robot has returned to a place it has already visited. If so, a geometric constraint between the current frame and the loop frame can be established, and the error accumulated between the loop frame and the current frame can be eliminated through optimization. Loop detection is similar to the feature matching process and mainly uses DBOW3 bag-of-words acceleration: among all past key frames, frames with a very high matching rate with the latest key frame are detected; once such a frame is detected it is likely to be a loop frame, and whether it really is one is determined by a series of screening mechanisms such as geometric consistency, group matching and temporal consistency. Once a loop frame is found in the key frame set, the pose transformation matrix between the current frame and the loop frame is estimated, Bundle Adjustment optimization and outlier rejection are performed on this pose transformation matrix and on the map points of the current frame, and finally, taking the pose of the loop frame as the reference, the pose of the current frame is updated through the pose transformation matrix between the loop frame and the current frame; the poses of the co-visible frames with high co-visibility with the current frame are updated, and the map points associated with the related frames are updated. The whole loop optimization eliminates part of the accumulated error, establishes constraints and provides constraint conditions for globally consistent optimization.
The back-end optimization thread is used for optimizing the poses of all the frames according to the pose estimation result of the IMU pre-integration algorithm and the pose transformation matrix between the current frame and the loop frame.
Specifically, referring to fig. 11 and 12, back-end optimization is the key to improving positioning accuracy and reducing the global accumulated error; the back-end optimization module optimizes all key frames together with the co-visible frames and global map points of the current frame. The whole optimization process is divided into two major steps: in the first step, the poses of the key frames are optimized, and all poses are optimized by introducing the inter-frame pose constraints from IMU pre-integration and the constraint between the current frame and the loop frame.
The invention changes the existing way of handling a dynamic environment. By rejecting moving objects rather than dynamic-attribute objects, a large number of valid feature points can be kept, and the static feature points of dynamic-attribute objects can be retained to improve pose estimation accuracy. Compared with rejecting dynamic-attribute objects, the invention can also adapt to extreme cases, for example a scene containing only stationary dynamic-attribute objects: if dynamic-attribute objects were rejected, enough feature points could not be obtained, whereas the moving-object rejection method keeps the static feature points of the dynamic-attribute objects for pose estimation.
The invention also changes the existing SLAM approach of dealing with a dynamic environment through deep-learning-based semantic label screening. It proposes clustering the velocity and rotation direction of inter-frame matched feature point pairs to distinguish moving objects from non-moving objects (static objects and stationary dynamic-attribute objects), and uses IMU data as auxiliary constraint information, screening, among the pose estimation results of the different clustered data sets, the pose estimate closest to the IMU data as the optimal pose estimate, thereby eliminating the interference of moving objects on pose estimation in a dynamic scene. Compared with the deep-learning-based semantic approach, this greatly reduces computational complexity and improves the real-time performance of the positioning algorithm.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
It should be noted that, for the sake of simplicity, the above-mentioned method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present invention is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no acts or modules are necessarily required of the invention.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The above description is provided for the binocular vision positioning method and system in a dynamic environment, and for those skilled in the art, there may be variations in the specific implementation and application scope according to the ideas of the embodiments of the present invention, and in summary, the content of the present description should not be construed as limiting the present invention.

Claims (10)

1. A binocular vision positioning method under a dynamic environment is characterized by comprising the following steps:
acquiring a left eye image and a right eye image under binocular vision of a camera, and extracting feature points corresponding to map points in a current frame to which the left eye image and the right eye image belong and a previous frame to which the left eye image and the right eye image belong, wherein the map points are feature points added into a map;
according to the characteristic points, obtaining the left-eye and right-eye movement visual matching characteristic point pairs and stereoscopic vision matching characteristic point pairs;
screening the shared matching characteristic point pairs of the stereoscopic vision matching characteristic point pair and the motion vision matching characteristic point pair;
respectively carrying out stereo depth estimation on the characteristic point pairs in the common matching characteristic point pair set to obtain a stereo depth estimation value;
clustering the left-eye and right-eye motion visual matching feature point pairs according to the optical flow velocity and the rotation angle between the motion visual matching feature point pairs by using a K-means++ clustering algorithm to obtain clustering results, and classifying the motion visual matching feature point pairs into a plurality of categories according to the clustering results;
calculating pose estimation results of the IMU of the left eye and the right eye in a time period between a current frame and a previous frame by utilizing an IMU pre-integration algorithm, wherein the pose estimation results comprise inter-frame poses between the current frame and the previous frame;
respectively performing EPnP (Efficient Perspective-n-Point) pose estimation and map point estimation on the motion vision matching feature point pairs under each category to obtain EPnP pose estimation results and map point estimation results;
selecting the EPnP pose estimation result closest to the IMU pose estimation result as the optimal EPnP pose estimation result and map point estimation result, and recording the effective matching feature point pairs corresponding to the optimal EPnP pose estimation result and map point estimation result;
and selecting the feature point pairs corresponding to the effective matching feature point pairs from the common matching feature point pairs, and performing Kalman filtering fusion on the stereo depth estimation values corresponding to these feature point pairs and the EPnP estimated depth values to obtain an estimated pose and estimated map points.
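By way of a non-limiting sketch of the final fusion step in claim 1: when both a stereo depth and an EPnP-derived depth are available for the same feature, a one-dimensional Kalman-style update reduces to inverse-variance weighting. The variance values and the function name below are illustrative assumptions, not part of the claim.

```python
# Illustrative 1-D Kalman-style fusion of two depth estimates for one feature.
def fuse_depths(stereo_depth, stereo_var, epnp_depth, epnp_var):
    """Treat the stereo depth as the prior and the EPnP depth as the
    measurement; the Kalman gain reduces to inverse-variance weighting."""
    gain = stereo_var / (stereo_var + epnp_var)          # Kalman gain
    fused_depth = stereo_depth + gain * (epnp_depth - stereo_depth)
    fused_var = (1.0 - gain) * stereo_var                # reduced uncertainty
    return fused_depth, fused_var

# Example: a 5.2 m stereo estimate (var 0.25) fused with a 5.6 m EPnP
# estimate (var 0.75) gives 5.3 m with variance 0.1875.
depth, var = fuse_depths(5.2, 0.25, 5.6, 0.75)
```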
2. The binocular vision positioning method under the dynamic environment of claim 1, wherein the IMU pre-integration algorithm is:

$$\alpha^{c_k}_{c_{k+1}} = \iint_{t\in[t_k,\,t_{k+1}]} R^{c_k}_t \left(a_t - b_{a_t} - n_a\right)\,\mathrm{d}t^2$$

$$\beta^{c_k}_{c_{k+1}} = \int_{t\in[t_k,\,t_{k+1}]} R^{c_k}_t \left(a_t - b_{a_t} - n_a\right)\,\mathrm{d}t$$

$$\gamma^{c_k}_{c_{k+1}} = \int_{t\in[t_k,\,t_{k+1}]} \tfrac{1}{2}\,\Omega\!\left(w_t - b_{w_t} - n_w\right)\gamma^{c_k}_t\,\mathrm{d}t$$

wherein $\alpha^{c_k}_{c_{k+1}}$, $\beta^{c_k}_{c_{k+1}}$ and $\gamma^{c_k}_{c_{k+1}}$ respectively represent the displacement increment, the velocity increment and the quaternion of the rotation angle change over the period from the k-th frame to the (k+1)-th frame, expressed in the k-th frame camera coordinate system; $a_t$ represents the accelerometer measurements; $w_t$ represents the angular velocity measurements; $b_{a_t}$ and $b_{w_t}$ represent the biases of the accelerometer and the gyroscope, respectively; $n_a$ and $n_w$ respectively represent the corresponding noise; $R^{c_k}_t$ is the rotation transformation matrix of time t relative to the k-th frame camera coordinate system and is obtained from the $\gamma^{c_k}_t$ equation; $\Omega$ is the matrix related to the quaternion calculation; $\Delta t_k$ represents the time interval from $t_k$ to $t_{k+1}$; and $g^w$ represents the gravity vector in the world coordinate system.
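For readers unfamiliar with pre-integration, the discrete-time sketch below propagates the displacement, velocity and rotation increments from raw accelerometer and gyroscope samples in the spirit of the continuous-time formulas above. The Euler discretization and the quaternion helpers are simplifying assumptions for illustration, not the patent's exact discretization.

```python
# Illustrative Euler-discretized IMU pre-integration over one frame interval.
import numpy as np

def quat_mult(q, r):
    """Hamilton product of quaternions given as [w, x, y, z]."""
    w0, x0, y0, z0 = q
    w1, x1, y1, z1 = r
    return np.array([
        w0*w1 - x0*x1 - y0*y1 - z0*z1,
        w0*x1 + x0*w1 + y0*z1 - z0*y1,
        w0*y1 - x0*z1 + y0*w1 + z0*x1,
        w0*z1 + x0*y1 - y0*x1 + z0*w1,
    ])

def quat_to_rot(q):
    """3x3 rotation matrix from a unit quaternion [w, x, y, z]."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def preintegrate(acc, gyr, dt, ba, bw):
    """acc, gyr: (N, 3) samples between frame k and k+1; dt: sample period."""
    alpha = np.zeros(3)                  # displacement increment
    beta = np.zeros(3)                   # velocity increment
    gamma = np.array([1.0, 0, 0, 0])     # rotation increment as a quaternion
    for a_t, w_t in zip(acc, gyr):
        R = quat_to_rot(gamma)                         # attitude of sample t in frame k
        a = R @ (a_t - ba)                             # bias-corrected, rotated accel
        alpha += beta * dt + 0.5 * a * dt * dt
        beta += a * dt
        dq = np.concatenate([[1.0], 0.5 * (w_t - bw) * dt])
        gamma = quat_mult(gamma, dq)
        gamma /= np.linalg.norm(gamma)                 # keep unit norm
    return alpha, beta, gamma
```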
3. The binocular vision positioning method under the dynamic environment according to claim 1, wherein after the classifying the moving vision matching feature point pairs into a plurality of categories according to the clustering result, the method comprises:
performing histogram statistics on the number of the feature points in each category;
and deleting those categories, together with their corresponding feature points, in which the number of feature points is less than a first preset number and differs from the number of feature points in the other categories by more than a second preset number.
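One possible reading of this pruning rule, as a short sketch: count the matches per cluster and drop clusters that are both small in absolute terms and much smaller than the largest cluster. The two threshold values below are illustrative assumptions.

```python
# Illustrative pruning of sparsely populated clusters (thresholds are assumed).
import numpy as np

def prune_clusters(labels, min_count=10, min_gap=30):
    counts = np.bincount(labels)                 # histogram of cluster sizes
    largest = counts.max()
    keep = [c for c, n in enumerate(counts)
            if not (n < min_count and largest - n > min_gap)]
    mask = np.isin(labels, keep)                 # matches that survive pruning
    return mask, keep
```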
4. The binocular vision positioning method under the dynamic environment of claim 3, wherein the respectively performing EPnP pose estimation and map point estimation on the motion vision matching feature point pairs under each category to obtain the EPnP pose estimation results and map point estimation results comprises:
respectively performing EPnP pose estimation and map point estimation on the feature points under each category, and using the RANSAC (random sample consensus) algorithm to eliminate outliers from the pose estimation result of each category, so as to obtain the EPnP pose estimation result and the map point estimation result corresponding to each category.
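One common way to realize a per-cluster EPnP-with-RANSAC step is OpenCV's solvePnPRansac. Using OpenCV is an assumption rather than part of the claim, and the camera intrinsics and thresholds below are placeholders.

```python
# Illustrative per-cluster EPnP pose estimation with RANSAC outlier rejection.
import numpy as np
import cv2

def estimate_cluster_pose(map_pts_3d, img_pts_2d, K, dist=None):
    """map_pts_3d: (N, 3) map points; img_pts_2d: (N, 2) pixel observations."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        map_pts_3d.astype(np.float64),
        img_pts_2d.astype(np.float64),
        K, dist,
        flags=cv2.SOLVEPNP_EPNP,          # EPnP as the minimal solver
        reprojectionError=2.0,            # pixel threshold (assumed)
        iterationsCount=100,
    )
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)            # axis-angle to rotation matrix
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, tvec.ravel()
    return T, inliers.ravel()             # cluster pose plus inlier indices
```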
5. The binocular vision positioning method under the dynamic environment according to claim 4, wherein after obtaining the estimated pose and the estimated map point, the method comprises:
optimizing the estimated pose and the estimated map points by using a graph-based optimization algorithm according to the inter-frame pose of the IMU, to obtain optimized map points;
the graph optimization algorithm is as follows:

$$\min_{\hat T^{k+1}_k,\,P_k} \sum \left( \left\| z_{k+1} - \pi_c\!\left(\hat T^{k+1}_k P_k\right) \right\|^2 + \left\| \ln\!\left( \left(\tilde T^{k+1}_k\right)^{-1} \hat T^{k+1}_k \right)^{\vee} \right\|^2 \right)$$

wherein $\hat T^{k+1}_k$ represents the estimated pose transformation matrix obtained by the EPnP pose estimation, $\tilde T^{k+1}_k$ represents the pose transformation matrix obtained by the IMU pre-integration algorithm, $z_{k+1}$ represents the pixel coordinates in the camera image, $P_k$ represents the 3D coordinates of a map point in the k-th frame camera coordinate system, and $\pi_c(\cdot)$ is the projective transformation based on the camera model, used for transforming 3D coordinates in the camera coordinate system into 2D pixel coordinates in the camera image.
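As a sketch of the cost being minimized, assuming the reprojection-plus-IMU-prior form shown above, the residual below combines pixel reprojection error with the discrepancy between the optimized inter-frame pose and the IMU pre-integrated pose. The weighting and the pose parameterization are assumptions, and a real implementation would typically hand this to g2o or Ceres.

```python
# Illustrative residual for the pose/map-point refinement step.
import numpy as np

def project(K, T, P_k):
    """Project a 3-D point in frame k into frame k+1 pixels via pose T."""
    P = T[:3, :3] @ P_k + T[:3, 3]
    uvw = K @ P
    return uvw[:2] / uvw[2]

def residual(T_est, points_3d, pixels_next, K, T_imu, w_imu=1.0):
    """Reprojection error for every map point plus an IMU relative-pose prior."""
    reproj = np.concatenate([
        project(K, T_est, P) - z for P, z in zip(points_3d, pixels_next)
    ])
    # Discrepancy between the visual and IMU inter-frame poses.
    dT = np.linalg.inv(T_imu) @ T_est
    rot_err = np.arccos(np.clip((np.trace(dT[:3, :3]) - 1) / 2, -1.0, 1.0))
    imu_term = w_imu * np.concatenate([dT[:3, 3], [rot_err]])
    return np.concatenate([reproj, imu_term])   # fed to a least-squares solver
```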
6. The binocular vision positioning method under the dynamic environment according to claim 5, wherein the optimizing the estimated pose and the estimated map points to obtain the optimized map points comprises:
replacing the map points before optimization by the optimized map points, updating a local map, and eliminating map points which do not belong to the local map;
comparing whether the number of co-visible feature points between the current frame and the second newest key frame is higher than a preset threshold;
if it is higher than the preset threshold, replacing the newest key frame with the current frame;
and if it is lower than the preset threshold, adding the current frame to the key frame set.
7. The binocular vision positioning method under the dynamic environment according to claim 1, wherein the obtaining left-eye and right-eye motion vision matching feature point pairs according to the feature points comprises:
performing ORB descriptor matching between the feature points corresponding to the map points in the current frame to which the left-eye image belongs and those corresponding to the map points in the previous frame to which the left-eye image belongs, to obtain left-eye matching feature point pairs;
performing ORB descriptor matching between the feature points corresponding to the map points in the current frame to which the right-eye image belongs and those corresponding to the map points in the previous frame to which the right-eye image belongs, to obtain right-eye matching feature point pairs; wherein the left-eye and right-eye motion vision matching feature point pairs include the left-eye matching feature point pairs and the right-eye matching feature point pairs.
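A common way to perform the ORB descriptor matching described in this claim is a brute-force Hamming matcher with a ratio test; the ratio value and feature count below are assumptions.

```python
# Illustrative ORB matching between the previous and current left-eye frames.
import cv2

def match_orb(img_prev, img_curr, ratio=0.75):
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_curr, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    pairs = matcher.knnMatch(des1, des2, k=2)
    good = []
    for pair in pairs:
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])                    # Lowe ratio test survivor
    prev_pts = [kp1[m.queryIdx].pt for m in good]   # matched pixel coordinates
    curr_pts = [kp2[m.trainIdx].pt for m in good]
    return prev_pts, curr_pts
```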
8. The binocular vision positioning method under the dynamic environment according to claim 1, wherein the obtaining the stereoscopic vision matching feature point pairs of the left and right eyes according to the feature points comprises:
and respectively performing ORB descriptor matching and SAD stereo matching between the feature points in the current frame to which the left-eye image belongs and the feature points in the current frame to which the right-eye image belongs, to obtain the left-eye and right-eye stereoscopic vision matching feature point pairs.
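The SAD stereo-matching step can be illustrated by the minimal block matcher below, which scans the same row of the rectified right image for the window with the smallest sum of absolute differences. The window size and disparity range are assumed values, and border handling is omitted.

```python
# Illustrative SAD block matching for one left-image feature along its row.
import numpy as np

def sad_match(left, right, x, y, win=5, max_disp=64):
    """left/right: rectified grayscale images; (x, y): an interior left feature."""
    h = win // 2
    patch = left[y - h:y + h + 1, x - h:x + h + 1].astype(np.int32)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp):
        xr = x - d                                   # candidate column in right image
        if xr - h < 0:
            break
        cand = right[y - h:y + h + 1, xr - h:xr + h + 1].astype(np.int32)
        cost = np.abs(patch - cand).sum()            # sum of absolute differences
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d                                    # disparity of the best match
```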
9. A binocular vision positioning system in a dynamic environment, comprising:
the system comprises a main thread, a pose estimation thread, a loop thread and a back-end optimization thread;
the main thread is used for acquiring a left eye image and a right eye image under binocular vision of the camera, initializing the left eye image and the right eye image, receiving a restart signal sent by the pose estimation thread or the loop thread after the initialization is successful, and initializing the pose estimation thread or the loop thread again according to the restart signal;
the pose estimation thread is used for acquiring a left eye image and a right eye image under binocular vision of the camera, extracting feature points in a current frame to which the left eye image and the right eye image belong and feature points corresponding to a map point in a previous frame to which the left eye image and the right eye image belong, and calculating an estimated pose and an estimated map point of the current frame according to the feature points;
the loop thread is used for detecting a loop frame, estimating a pose transformation matrix between a current frame and the loop frame, and updating the pose of the current frame by using the pose of the loop frame as a reference;
the back-end optimization thread is used for calculating pose estimation results of the IMU of the left eye and the right eye in a time period between a current frame and a previous frame by utilizing an IMU pre-integration algorithm, wherein the pose estimation results comprise inter-frame poses between the current frame and the previous frame, and the poses of all the frames are optimized according to the pose estimation results of the IMU pre-integration algorithm and a pose transformation matrix between the current frame and a loop frame.
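A thread layout matching the multi-thread system of claim 9 might be organised as sketched below. The queue-based hand-off, the stub worker bodies, and the omission of the loop thread are assumptions made only to illustrate the structure.

```python
# Illustrative multi-thread skeleton mirroring the system structure of claim 9.
import queue
import threading
import time

frame_q = queue.Queue()   # main thread -> pose estimation thread
pose_q = queue.Queue()    # pose estimation thread -> back-end optimization thread

def main_thread(n_frames=5):
    for i in range(n_frames):                      # stand-in for stereo acquisition
        frame_q.put(("left_image", "right_image", i))
        time.sleep(0.01)
    frame_q.put(None)                              # poison pill shuts the pipeline down

def pose_estimation_thread():
    while (item := frame_q.get()) is not None:
        left, right, idx = item                    # feature extraction, clustering,
        pose_q.put({"frame": idx, "pose": "T_k"})  # EPnP and fusion would run here
    pose_q.put(None)

def backend_optimization_thread():
    while (item := pose_q.get()) is not None:
        pass                                       # IMU pre-integration + graph optimization

workers = [threading.Thread(target=f) for f in
           (main_thread, pose_estimation_thread, backend_optimization_thread)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```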
10. The binocular vision positioning system under dynamic environment of claim 9, wherein the calculating the estimated pose and the estimated map point of the current frame according to the feature points comprises:
obtaining left-eye and right-eye motion vision matching feature point pairs and stereoscopic vision matching feature point pairs according to the feature points;
screening the common matching feature point pairs shared by the stereoscopic vision matching feature point pairs and the motion vision matching feature point pairs;
respectively performing stereo depth estimation on the feature point pairs in the common matching feature point pair set to obtain stereo depth estimation values;
clustering the left-eye and right-eye motion vision matching feature point pairs according to the optical flow velocity and the rotation angle between the motion vision matching feature point pairs by using a K-means++ clustering algorithm to obtain clustering results, and classifying the motion vision matching feature point pairs into a plurality of categories according to the clustering results;
respectively performing EPnP (Efficient Perspective-n-Point) pose estimation and map point estimation on the motion vision matching feature point pairs under each category to obtain EPnP pose estimation results and map point estimation results;
selecting the EPnP pose estimation result closest to the IMU pose estimation result as the optimal EPnP pose estimation result and map point estimation result, and recording the effective matching feature point pairs corresponding to the optimal EPnP pose estimation result and map point estimation result;
and selecting the feature point pairs corresponding to the effective matching feature point pairs from the common matching feature point pairs, and performing Kalman filtering fusion on the stereo depth estimation values corresponding to these feature point pairs and the EPnP estimated depth values to obtain an estimated pose and estimated map points.
CN201910634021.1A 2019-07-12 2019-07-12 Binocular vision positioning method and system under dynamic environment Active CN110490900B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910634021.1A CN110490900B (en) 2019-07-12 2019-07-12 Binocular vision positioning method and system under dynamic environment


Publications (2)

Publication Number Publication Date
CN110490900A CN110490900A (en) 2019-11-22
CN110490900B true CN110490900B (en) 2022-04-19

Family

ID=68546103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910634021.1A Active CN110490900B (en) 2019-07-12 2019-07-12 Binocular vision positioning method and system under dynamic environment

Country Status (1)

Country Link
CN (1) CN110490900B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111091621A (en) * 2019-12-11 2020-05-01 东南数字经济发展研究院 Binocular vision synchronous positioning and composition method, device, equipment and storage medium
CN111016887A (en) * 2019-12-23 2020-04-17 深圳市豪恩汽车电子装备股份有限公司 Automatic parking device and method for motor vehicle
CN111060113B (en) * 2019-12-31 2022-04-08 歌尔股份有限公司 Map updating method and device
CN111476812A (en) * 2020-04-03 2020-07-31 浙江大学 Map segmentation method and device, pose estimation method and equipment terminal
CN111598939B (en) * 2020-05-22 2021-01-26 中原工学院 Human body circumference measuring method based on multi-vision system
CN111639658B (en) * 2020-06-03 2023-07-21 北京维盛泰科科技有限公司 Method and device for detecting and eliminating dynamic feature points in image matching
CN111683203B (en) 2020-06-12 2021-11-09 达闼机器人有限公司 Grid map generation method and device and computer readable storage medium
CN111862150B (en) * 2020-06-19 2024-06-14 杭州易现先进科技有限公司 Image tracking method, device, AR equipment and computer equipment
CN111914784B (en) * 2020-08-10 2022-02-08 北京大成国测科技有限公司 Method and device for detecting intrusion of trackside obstacle in real time and electronic equipment
CN113759384B (en) * 2020-09-22 2024-04-05 北京京东乾石科技有限公司 Method, device, equipment and medium for determining pose conversion relation of sensor
CN112241983A (en) * 2020-10-19 2021-01-19 深圳市目心智能科技有限公司 Perception system and robot based on initiative binocular vision
CN112444246B (en) * 2020-11-06 2024-01-26 北京易达恩能科技有限公司 Laser fusion positioning method in high-precision digital twin scene
CN112734842B (en) * 2020-12-31 2022-07-01 武汉第二船舶设计研究所(中国船舶重工集团公司第七一九研究所) Auxiliary positioning method and system for centering installation of large ship equipment
CN113220818B (en) * 2021-05-27 2023-04-07 南昌智能新能源汽车研究院 Automatic mapping and high-precision positioning method for parking lot
CN113503873B (en) * 2021-07-14 2024-03-12 北京理工大学 Visual positioning method for multi-sensor fusion
CN113920194B (en) * 2021-10-08 2023-04-21 电子科技大学 Positioning method of four-rotor aircraft based on visual inertia fusion
CN114066824B (en) * 2021-10-28 2024-05-14 华南理工大学 Binocular vision odometer method with dynamic target detection function

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105469405A (en) * 2015-11-26 2016-04-06 清华大学 Visual ranging-based simultaneous localization and map construction method
CN105953796A (en) * 2016-05-23 2016-09-21 北京暴风魔镜科技有限公司 Stable motion tracking method and stable motion tracking device based on integration of simple camera and IMU (inertial measurement unit) of smart cellphone
CN107193279A (en) * 2017-05-09 2017-09-22 复旦大学 Robot localization and map structuring system based on monocular vision and IMU information
CN109166149A (en) * 2018-08-13 2019-01-08 武汉大学 A kind of positioning and three-dimensional wire-frame method for reconstructing and system of fusion binocular camera and IMU
CN109307508A (en) * 2018-08-29 2019-02-05 中国科学院合肥物质科学研究院 A kind of panorama inertial navigation SLAM method based on more key frames
CN109465832A (en) * 2018-12-18 2019-03-15 哈尔滨工业大学(深圳) High-precision vision and the tight fusion and positioning method of IMU and system
CN109631855A (en) * 2019-01-25 2019-04-16 西安电子科技大学 High-precision vehicle positioning method based on ORB-SLAM
CN109993113A (en) * 2019-03-29 2019-07-09 东北大学 A kind of position and orientation estimation method based on the fusion of RGB-D and IMU information

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Simultaneous localization and mapping: Present, future, and the robust-perception age; CADENA C; IEEE Transactions on Robotics; 20161231; full text *
Vision-IMU based simultaneous localization and mapping algorithm for robots; Yao Erliang; Chinese Journal of Scientific Instrument; 20180115; Vol. 39 (No. 4); full text *
Research on binocular vision SLAM fusing IMU information; Xu Kuan; China Master's Theses Full-text Database, Information Science and Technology; 20190115 (No. 01); full text *

Also Published As

Publication number Publication date
CN110490900A (en) 2019-11-22

Similar Documents

Publication Publication Date Title
CN110490900B (en) Binocular vision positioning method and system under dynamic environment
CN109993113B (en) Pose estimation method based on RGB-D and IMU information fusion
CN112734852B (en) Robot mapping method and device and computing equipment
CN112634451B (en) Outdoor large-scene three-dimensional mapping method integrating multiple sensors
Tanskanen et al. Live metric 3D reconstruction on mobile phones
CN112233177B (en) Unmanned aerial vehicle pose estimation method and system
CN111024066A (en) Unmanned aerial vehicle vision-inertia fusion indoor positioning method
CN114623817B (en) Self-calibration-contained visual inertial odometer method based on key frame sliding window filtering
US12073630B2 (en) Moving object tracking method and apparatus
CN112115980A (en) Binocular vision odometer design method based on optical flow tracking and point line feature matching
CN113313763B (en) Monocular camera pose optimization method and device based on neural network
CN111882602B (en) Visual odometer implementation method based on ORB feature points and GMS matching filter
CN113223045A (en) Vision and IMU sensor fusion positioning system based on dynamic object semantic segmentation
CN112419497A (en) Monocular vision-based SLAM method combining feature method and direct method
CN111623773B (en) Target positioning method and device based on fisheye vision and inertial measurement
CN113012224B (en) Positioning initialization method and related device, equipment and storage medium
CN115406447A (en) Autonomous positioning method of quad-rotor unmanned aerial vehicle based on visual inertia in rejection environment
CN115661341A (en) Real-time dynamic semantic mapping method and system based on multi-sensor fusion
CN112907633B (en) Dynamic feature point identification method and application thereof
CN112731503B (en) Pose estimation method and system based on front end tight coupling
CN116147618B (en) Real-time state sensing method and system suitable for dynamic environment
WO2023130842A1 (en) Camera pose determining method and apparatus
CN116045965A (en) Multi-sensor-integrated environment map construction method
CN112509006A (en) Sub-map recovery fusion method and device
CN113011212A (en) Image recognition method and device and vehicle

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant