CN115218906A - Indoor SLAM-oriented visual inertial fusion positioning method and system - Google Patents

Indoor SLAM-oriented visual inertial fusion positioning method and system

Info

Publication number
CN115218906A
Authority
CN
China
Prior art keywords
data
key frame
image
optical flow
visual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210849157.6A
Other languages
Chinese (zh)
Inventor
吕艳
刘大鹏
倪益华
倪忠进
侯英岢
杨维鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang A&F University ZAFU
Original Assignee
Zhejiang A&F University ZAFU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang A&F University ZAFU filed Critical Zhejiang A&F University ZAFU
Priority to CN202210849157.6A priority Critical patent/CN115218906A/en
Publication of CN115218906A publication Critical patent/CN115218906A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20Instruments for performing navigational calculations
    • G01C21/206Instruments for performing navigational calculations specially adapted for indoor navigation
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G01C21/1656Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with passive imaging devices, e.g. cameras
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/183Compensation of inertial measurements, e.g. for temperature effects
    • G01C21/188Compensation of inertial measurements, e.g. for temperature effects for accumulated errors, e.g. by coupling inertial systems with absolute positioning systems
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C25/00Manufacturing, calibrating, cleaning, or repairing instruments or devices referred to in the other groups of this subclass
    • G01C25/005Manufacturing, calibrating, cleaning, or repairing instruments or devices referred to in the other groups of this subclass initial alignment, calibration or starting-up of inertial devices

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Manufacturing & Machinery (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of artificial intelligence and provides a visual inertial fusion positioning method and system for indoor SLAM (simultaneous localization and mapping). A binocular fisheye camera acquires video image data, features are tracked between adjacent frames of the video image data with the I-LK (improved LK) optical flow method, and an IMU (inertial measurement unit) acquires position and attitude data that undergo pre-integration processing; the multi-source data are loosely coupled and then fused to determine the system initialization parameter values, and key frame data are determined by a key frame strategy; the key frame data are aligned with the IMU pre-integration through visual pose initialization; through external parameter online calibration, visual-inertial initialization and tightly coupled nonlinear optimization, a key-frame-based sliding window constructs the six-degree-of-freedom pose of the mobile robot. With the position of the current-frame camera known, the positions of spatial feature points of the scene in the next image frame can be predicted from IMU information, which improves the stability of feature point tracking and improves the robustness of the system while achieving the positioning accuracy of the mobile robot.

Description

Indoor SLAM-oriented visual inertial fusion positioning method and system
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a visual inertial fusion positioning method and system for indoor SLAM.
Background
SLAM (simultaneous localization and mapping) technology plays a pivotal role when an indoor mobile robot executes its tasks: the robot identifies the relative position and direction of obstacles through its sensors to prevent the body from contacting or colliding with objects, and the global consistency between the map constructed by the robot and the scene environment is ensured by perceiving and estimating the spatial pose of the body. Studying SLAM technology therefore advances the spatial-information sensing and positioning capabilities of mobile robots in indoor environments and provides prior information for their autonomous navigation.
CN111429574B discloses a mobile robot positioning method and system based on fusion of a three-dimensional point cloud and vision, which performs positioning by establishing an environment map, fusing point-cloud and visual features, and matching against the map. The environment map is a feature grid map of the environment in which each grid cell stores a point set formed from feature points extracted from the point cloud and from the visual image, from which a height value, an intensity value and a normal-vector projection value are extracted. Point-cloud and visual feature fusion projects the feature points extracted from the image into the point-cloud space to form combined feature points with the point-cloud features. Map matching and positioning project the combined feature points onto a two-dimensional grid, extract feature vectors, match the feature grids against the map, determine the posterior probability of each candidate pose with a histogram filter, and determine the robot's position in the map from those posterior probabilities.
In the prior art, SLAM positioning that relies on a visual sensor has certain limitations: environmental feature information is easily lost under weak illumination, in texture-poor scenes and during rapid carrier motion; meanwhile, interference and occlusion from dynamic objects in the scene easily cause wrong association of feature data.
Disclosure of Invention
Long-term research practice shows that, in terms of the sensor spatial positioning principle, vision and IMU fusion positioning have complementary data advantages. The IMU is not affected by the objects, textures or illumination of the external scene and can observe high-precision relative displacement over a short time, so fusing IMU data enhances the adaptability of the vision sensor to dynamic objects, weak textures and dark scenes and improves robustness when the camera moves rapidly; the IMU measures high-frequency pose data, while after fusion the vision system constrains the camera pose between visual image frames; however, the IMU accumulates errors from noise and bias during long operation, and the visual positioning information can suppress this trajectory drift.
In view of the above, the present invention is directed to a visual inertial fusion positioning method for an indoor SLAM, which includes,
the method comprises the following steps of S1, acquiring video image data through a binocular fisheye camera, and tracking through I-LK optical flow method features between adjacent frames of the video image data to obtain first data;
s2, acquiring position and attitude data by an inertial measurement unit, and performing pre-integration processing to obtain second data;
s3, interactively fusing and determining a system initialization parameter value after the first data and the second data are loosely coupled, and determining key frame data by a key frame strategy; the key frame data are output as third data after being initialized through visual poses and aligned with the pre-integration of the inertial measurement unit;
and S4, constructing the six-degree-of-freedom pose of the mobile robot by the sliding window based on the key frame through external parameter online calibration, visual inertia initialization and close-coupled nonlinear optimization on the third data.
Preferably, in step S3, the key frame policy includes,
step S31, initializing after obtaining scene characteristic information by a binocular fisheye camera, and setting a first frame image which is stably obtained as a reference key frame;
step S32, determining the average parallax of the previous key frame and the current frame according to the environmental characteristic change of the image, and outputting a non-key frame when the average parallax D is smaller than a set threshold epsilon; when the average parallax D is larger than or equal to a set threshold epsilon, outputting a preselected key frame;
and step S33, when the tracked feature quantity of the preselected key frame is lower than a set threshold value delta, taking the current preselected key frame as the key frame.
Preferably, in step S2, the pre-integration process comprises,
s21, splitting an integral term of the state quantity of the inertia measurement unit into a fixed integral term at the ith moment and a pre-integral relative quantity at the ith to the j moments;
and S22, converting the measurement data of the inertia measurement unit in the discrete time into a fixed integral term and a pre-integral relative quantity after the conversion by the pre-integral model.
Preferably, in step S1, I-LK optical flow feature tracking includes,
s11, sampling an original image of the acquired video image data to form a multilayer pyramid image;
s12, decomposing the large-displacement optical flow vector into optical flows which accord with micro motion on each layer of image from top to bottom, and carrying out optical flow iterative solution in the image layer field to obtain pyramid image vector components;
and step S13, accumulating the vector components of the pyramid image to obtain an optical flow vector, and reversely tracking and eliminating mismatching in a reverse optical flow.
Preferably, in step S4, the external parameter online calibration includes calibration of relative positions and attitude parameters of the binocular fisheye camera and the inertial measurement unit; and calibrating time difference parameters between the binocular fisheye camera and the inertia measurement unit.
Preferably, the visual pose initialization comprises, after feature extraction, I-LK tracking and matching, and image key frame selection, estimating the pose of the camera and the landmark points by bundle adjustment with a visual odometer, and acquiring the trajectory pose information by the camera.
Preferably, the visual inertia initialization includes a gyroscope bias solution process, a gravity vector estimation process, a gravity direction refinement process, an accelerometer bias process, and a velocity solution process.
The invention also discloses a system for implementing the visual inertial fusion positioning method facing the indoor SLAM, which comprises,
the acquisition unit comprises a plurality of binocular fisheye cameras and is used for acquiring video image data through the binocular fisheye cameras and obtaining first data through I-LK optical flow method feature tracking between adjacent frames of the video image data;
the pre-integration unit is used for acquiring position and attitude data by the inertia measurement unit, and performing pre-integration processing to obtain second data;
the data fusion unit is used for interactively fusing the first data and the second data after loose coupling to determine a system initialization parameter value and determining key frame data according to a key frame strategy; the key frame data are output as third data after being initialized through visual poses and aligned with the pre-integration of the inertial measurement unit;
and the positioning unit is used for constructing a six-degree-of-freedom pose by a sliding window based on a key frame through external parameter online calibration, visual inertia initialization and close-coupled nonlinear optimization on the third data.
Preferably, in the system, the acquisition unit further includes a device for I-LK optical flow method feature tracking, the device including,
the composition module is used for forming a multilayer pyramid image after sampling an original image of the acquired video image data;
the component generation module is used for decomposing the large-displacement optical flow vector into an optical flow which conforms to the micro motion on each layer of image from top to bottom, and performing optical flow iterative solution in the image layer field to obtain pyramid image vector components;
and the optical flow vector module is used for accumulating each vector component of the pyramid image to obtain an optical flow vector and eliminating mismatching in reverse optical flow reverse tracking.
Compared with the prior art, in the indoor SLAM-oriented visual inertial fusion positioning method provided by the invention, video image data are obtained through the binocular fisheye camera, features are tracked between adjacent frames of the video image data with the I-LK optical flow method, and the inertial measurement unit collects position and attitude data that undergo pre-integration processing. The multi-source data are then loosely coupled and interactively fused to determine the system initialization parameter values, and key frame data are determined by the key frame strategy; the key frame data are initialized through the visual pose and aligned with the pre-integration of the inertial measurement unit; and a key-frame-based sliding window constructs the six-degree-of-freedom pose of the mobile robot through external parameter online calibration, visual-inertial initialization and tightly coupled nonlinear optimization. When visual-inertial fusion positioning is performed during rapid motion of the mobile robot, the method and system narrow the image-matching search range with the short-time observation model of the inertial measurement unit (IMU) and constrain the visual trajectory drift; with the position of the current-frame camera known, the positions of spatial feature points of the scene in the next image frame can be predicted from IMU information, which improves the stability of feature point tracking; in scenes with weak texture or few feature points, where visual image frame information is lost, the IMU recovers the relative pose of the body by integration, so the pose of the mobile robot is restored and the robustness of the system is enhanced; long-time trajectory computation by the IMU accumulates measurement errors, which the visual image frame poses can constrain and suppress; therefore, the visual inertial fusion positioning method improves the robustness of the system while achieving the positioning accuracy of the mobile robot.
Additional features and advantages of the invention will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of an embodiment of an indoor SLAM-oriented visual inertial fusion positioning method of the present invention;
FIG. 2 is a schematic flow chart of the indoor SLAM-oriented visual inertial fusion positioning method of the present invention;
FIG. 3 is a key frame calculation flow chart of the indoor SLAM-oriented visual inertial fusion positioning method of the present invention;
FIG. 4 is a key frame diagram of the indoor SLAM-oriented visual inertial fusion positioning method of the present invention;
fig. 5 is a schematic view of a reverse optical flow method of the indoor SLAM-oriented visual inertial fusion positioning method of the invention.
Detailed Description
The following describes in detail embodiments of the present invention with reference to the drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," "third," "fourth," and the like in the description and in the claims of the invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged under appropriate circumstances in order to facilitate the description of the embodiments of the invention herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention aims to solve a series of problems in the prior art: SLAM positioning that relies on a visual sensor has certain limitations, environmental feature information is easily lost under weak illumination, in texture-poor scenes and during rapid carrier motion, and interference and occlusion from dynamic objects in the scene easily cause wrong association of feature data. To this end, the invention provides a visual inertial fusion positioning method for indoor SLAM. As shown in FIGS. 1-2, the indoor SLAM-oriented visual inertial fusion positioning method comprises,
the method comprises the following steps that S1, video image data are obtained through a binocular fisheye camera, and first data are obtained through I-LK optical flow method feature tracking between adjacent frames of the video image data;
s2, acquiring position and attitude data by an inertial measurement unit, and performing pre-integration processing to obtain second data;
s3, interactively fusing and determining a system initialization parameter value after the first data and the second data are loosely coupled, and determining key frame data by a key frame strategy; the key frame data are output as third data after being initialized through visual poses and aligned with the pre-integration of the inertial measurement unit;
and S4, constructing the six-degree-of-freedom pose of the mobile robot by the sliding window based on the key frame through external parameter online calibration, visual inertia initialization and close-coupled nonlinear optimization on the third data.
According to the invention, video image data are obtained through the binocular fisheye camera, features are tracked between adjacent frames of the video image data with the I-LK optical flow method, the inertial measurement unit collects position and attitude data, and pre-integration processing is carried out. The multi-source data are then loosely coupled and interactively fused to determine the system initialization parameter values, and key frame data are determined by the key frame strategy; the key frame data are initialized through the visual pose and aligned with the pre-integration of the inertial measurement unit; and a key-frame-based sliding window constructs the six-degree-of-freedom pose of the mobile robot through external parameter online calibration, visual-inertial initialization and tightly coupled nonlinear optimization. When visual-inertial fusion positioning is performed during rapid motion of the mobile robot, the IMU short-time observation model narrows the image-matching search range and constrains the visual trajectory drift; with the position of the current-frame camera known, the positions of spatial feature points of the scene in the next image frame can be predicted from IMU information, which improves the stability of feature point tracking; in scenes with weak texture or few feature points, where visual image frame information is lost, the IMU recovers the relative pose of the body by integration, so the pose of the mobile robot is restored and the robustness of the system is enhanced; long-time trajectory computation by the IMU accumulates measurement errors, which the visual image frame poses can constrain and suppress; therefore, the visual inertial fusion positioning method improves the robustness of the system while achieving the positioning accuracy of the mobile robot.
The mobile robot carries a camera that captures scene images at a fixed frame rate, so the system obtains a large number of similar and repetitive images during motion; these redundant images occupy considerable computing resources while contributing very little to improving the system's positioning accuracy. In a preferred aspect of the present invention, in step S3, the key frame policy includes,
step S31, initializing after obtaining scene characteristic information by a binocular fisheye camera, and setting a first frame image which is stably obtained as a reference key frame;
step S32, determining the average parallax of the previous key frame and the current frame according to the environmental characteristic change of the image, and outputting a non-key frame when the average parallax D is smaller than a set threshold epsilon; when the average parallax D is larger than or equal to a set threshold epsilon, outputting a preselected key frame;
and step S33, when the tracked feature quantity of the preselected key frame is lower than a set threshold value delta, taking the current preselected key frame as the key frame.
For example, scene images with rich feature content and representative motion are selected as key frames from the continuous image sequence, which improves system operating efficiency. The selected key frames must both describe the scene information of the image sequence and satisfy the real-time requirements of system data processing. As shown in FIGS. 3-4, a key frame screening strategy is adopted: after the binocular fisheye camera obtains scene feature information and initialization is complete, the first stably acquired image frame is set as the reference key frame, and subsequent image frames are then judged. First the average parallax between the previous key frame and the current frame is determined from the environmental feature change of the image; then, to ensure similarity between image frames, a tracking-quality judgment is made; if the requirements are met, the key frame's position information is stored, otherwise the next image frame is screened. The mobile robot extracts features from the environmental information at the current moment and selects representative key frame images with rich scene content. When I-LK tracks FAST feature corners for environmental perception, the number of optical-flow-tracked feature points and the total disparity between adjacent frames are stored in the feature manager.
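Purely as an illustration of steps S31-S33, the following Python sketch implements the screening logic described above; the data structure, the parallax computation, and the threshold values epsilon and delta are assumptions introduced here for the sketch, not values fixed by the patent.

```python
# Illustrative sketch of the key frame screening strategy (thresholds assumed).
from dataclasses import dataclass

@dataclass
class Frame:
    tracked_points: list      # (u, v) positions of tracked feature points in this frame
    matched_in_last_kf: list  # positions of the same points in the last key frame

def average_parallax(frame):
    """Mean pixel displacement of the tracked points relative to the last key frame."""
    if not frame.tracked_points:
        return 0.0
    disp = [((u1 - u0) ** 2 + (v1 - v0) ** 2) ** 0.5
            for (u0, v0), (u1, v1) in zip(frame.matched_in_last_kf, frame.tracked_points)]
    return sum(disp) / len(disp)

def is_key_frame(frame, epsilon=10.0, delta=50):
    """Step S32: parallax test; step S33: tracked-feature-count test."""
    if average_parallax(frame) < epsilon:
        return False          # non-key frame
    # pre-selected key frame: keep it only when tracking quality has degraded
    return len(frame.tracked_points) < delta
```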
In global pose estimation with the IMU, the state (P_j, V_j, R_j) of the IMU estimated at each time j follows from the state (P_i, V_i, R_i) at time i; whenever the latter changes, the integration must be recomputed, which occupies a large amount of the system's computing resources. The rotation matrix R, described by quaternions in the IMU pose solution, can be expressed in continuous time by the following kinematic relationship:
[formula image not reproduced]
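The formula image is not reproduced above. For reference only, the textbook continuous-time attitude kinematics that the surrounding text appears to describe — a quaternion (or, equivalently, a rotation matrix) driven by the body angular velocity ω — can be written as below; this is the standard relation, not necessarily the patent's exact expression:

```latex
\dot{q}^{w}_{b}(t) = \tfrac{1}{2}\, q^{w}_{b}(t) \otimes
\begin{bmatrix} 0 \\ \boldsymbol{\omega}_{b}(t) \end{bmatrix},
\qquad
\dot{R}^{w}_{b}(t) = R^{w}_{b}(t)\,\big[\boldsymbol{\omega}_{b}(t)\big]_{\times}
```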
and converting the integral model into a pre-integral model mode, and splitting an integral term of the IMU state quantity into two parts, namely a fixed integral term at the ith moment and a pre-integral relative quantity from the ith moment to the j moment. Therefore, the integral quantity of the IMU at the current moment can update the pose prediction data only by fixing the moment data and the pre-integral relative quantity between the moment data and the moment.
After the IMU measurement data in discrete time are converted by the pre-integration model, they are split into a fixed integral term and a pre-integration relative quantity; no matter how many segments of discrete integral data lie between the i-th and the j-th instant, only a single integration operation is then needed, which greatly reduces redundant computation on the system data. In a preferred aspect of the invention, in step S2, the pre-integration process comprises,
s21, splitting an integral term of the state quantity of the inertia measurement unit into a fixed integral term at the ith moment and a pre-integral relative quantity at the ith to the j moments;
and S22, converting the measurement data of the inertia measurement unit in the discrete time into a fixed integral term and a pre-integral relative quantity after the conversion by the pre-integral model.
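As an illustrative sketch of steps S21-S22, the following Python function accumulates the pre-integration relative quantities between the i-th and j-th instants with a simple Euler scheme; the variable names, the integration scheme and the omission of noise and covariance propagation are assumptions made only for this sketch.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0]])

def preintegrate(gyro, accel, dts, bg, ba):
    """Accumulate the relative rotation/velocity/position between instants i and j.

    gyro, accel: lists of IMU samples between i and j (body frame)
    dts:         sample intervals
    bg, ba:      gyroscope and accelerometer biases
    The result depends only on the measurements and biases, not on the state at i.
    """
    dR = np.eye(3)        # relative rotation  Delta R_ij
    dv = np.zeros(3)      # relative velocity  Delta v_ij
    dp = np.zeros(3)      # relative position  Delta p_ij
    for w, a, dt in zip(gyro, accel, dts):
        a_corr = a - ba
        dp += dv * dt + 0.5 * (dR @ a_corr) * dt ** 2
        dv += (dR @ a_corr) * dt
        # first-order exponential map for the incremental rotation (Euler step)
        dR = dR @ (np.eye(3) + skew((w - bg) * dt))
    return dR, dv, dp
```

Because the result depends only on the measurements and the biases, the pose at instant j can be updated by combining the fixed state at instant i with (ΔR, Δv, Δp) and the gravity term, without re-integrating the raw samples whenever the state at instant i changes.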
Matching the optical-flow-tracked feature points establishes correspondences between camera image frames, from which the camera pose is recovered. Under the usual optical-flow assumptions of constant brightness and small motion, the optical flow within a local image neighborhood is consistent, i.e., all pixels in the neighborhood undergo the same motion.
The invention adds multilayer-image iterative pyramid tracking of matched feature points and adopts a reverse optical flow to eliminate mismatched or large-displacement feature outliers; this is called the I-LK (improved LK) optical flow method. The original image is down-sampled to form a multilayer image pyramid, the large-displacement optical flow vector is decomposed from top to bottom into optical flows that satisfy the small-motion assumption on each image layer, and the optical flow is solved iteratively within each layer's neighborhood. The solved optical flow vector is the sum of the components over the pyramid layers, and finally the reverse optical flow is back-tracked to eliminate wrong matches, which improves the image feature extraction capability and the real-time performance and robustness of feature matching in different environments. As shown in fig. 5, in a preferred aspect of the present invention, in step S1, I-LK optical flow feature tracking includes,
s11, sampling an original image of the acquired video image data to form a multilayer pyramid image;
s12, decomposing the large-displacement optical flow vector into optical flows which accord with micro motion on each layer of image from top to bottom, and carrying out optical flow iterative solution in the image layer field to obtain pyramid image vector components;
and S13, accumulating the vector components of the pyramid image to obtain an optical flow vector, and reversely tracking and eliminating mismatching in a reverse optical flow.
To compute the pyramid image vector components, in a more preferred case of the invention the pyramid optical flow tracking proceeds as follows: the optical flow is first computed on the top layer; the result of each upper layer provides the initial value for the layer below, where the optical flow is then refined to an accurate value; proceeding from top to bottom in this way, the optical flow of the original, bottom-layer image is finally computed.
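A compact sketch of such pyramidal forward-backward tracking, built on OpenCV's pyramidal Lucas-Kanade implementation, is given below; the pyramid depth, window size and back-tracking error threshold are assumed values, and the sketch only approximates the I-LK procedure described here.

```python
import cv2
import numpy as np

LK_PARAMS = dict(winSize=(21, 21), maxLevel=3,  # 4-level pyramid (assumed)
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

def track_i_lk(prev_img, cur_img, prev_pts, back_thresh=0.5):
    """Pyramidal LK tracking with reverse-flow (forward-backward) outlier rejection."""
    prev_pts = prev_pts.reshape(-1, 1, 2).astype(np.float32)
    # forward flow: previous frame -> current frame
    cur_pts, st_fwd, _ = cv2.calcOpticalFlowPyrLK(prev_img, cur_img, prev_pts, None, **LK_PARAMS)
    # reverse flow: current frame -> previous frame
    back_pts, st_bwd, _ = cv2.calcOpticalFlowPyrLK(cur_img, prev_img, cur_pts, None, **LK_PARAMS)
    # keep points whose back-tracked position returns close to the original position
    fb_err = np.linalg.norm(prev_pts - back_pts, axis=2).ravel()
    good = (st_fwd.ravel() == 1) & (st_bwd.ravel() == 1) & (fb_err < back_thresh)
    return prev_pts[good], cur_pts[good]
```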
For most VI-SLAM data fusion, the extrinsic parameters between vision and inertia are usually calibrated jointly and repeatedly before the task is executed in order to obtain accurate values, but this offline calibration process is tedious and time-consuming. Therefore, a method for online calibration of the vision-inertia rotation extrinsic parameter is adopted.
The extrinsic parameters between vision and inertia form the rigid-body transformation from the camera to the IMU, consisting of a relative rotation and a relative translation (the corresponding formula images are not reproduced). During initialization the influence of the relative translation is limited, so the offline visual-inertial calibration value is used for it, while the relative rotation is an important factor in the change of the vision-IMU spatial pose and is optimized online during the motion of the mobile robot. In a preferred case, in step S4, the external parameter online calibration includes calibration of the relative position and attitude parameters between the binocular fisheye camera and the inertial measurement unit, and calibration of the time-difference parameter between the binocular fisheye camera and the inertial measurement unit.
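For orientation only, online rotation-extrinsic calibration in visual-inertial systems is commonly posed, for each pair of adjacent frames, as a consistency constraint between the camera relative rotation (from vision) and the IMU relative rotation (from pre-integration); the formulation below is this standard construction, stated as an illustration rather than as the patent's own equations:

```latex
q^{b_k}_{b_{k+1}} \otimes q^{b}_{c} \;=\; q^{b}_{c} \otimes q^{c_k}_{c_{k+1}}
\;\;\Longrightarrow\;\;
\Big( \big[q^{b_k}_{b_{k+1}}\big]_{L} - \big[q^{c_k}_{c_{k+1}}\big]_{R} \Big)\, q^{b}_{c} = \mathbf{0}
```

Here [·]_L and [·]_R denote the left and right quaternion multiplication matrices; stacking the constraints over all frame pairs and taking the singular vector associated with the smallest singular value yields the rotation extrinsic q^b_c.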
After feature extraction, I-LK tracking and matching, and image key frame selection, a visual odometer is adopted to estimate the camera pose and the landmark points by bundle adjustment, and the camera obtains good trajectory pose information; after positioning-accuracy analysis, a more accurate initial pose value is obtained once the visual odometer is initialized.
Accurate initialization parameter values affect the performance of visual-inertial data fusion positioning; uncertainty in these values directly causes motion drift of the visual-inertial system and reduces trajectory-positioning precision. To solve the initialization parameters accurately, the parameter values are optimized and solved step by step and iteratively in a sliding window using loose coupling of the visual and inertial data, which provides accurate initialization parameters for the visual-inertial system and allows the mobile robot in the scene to enter its working state rapidly.
Whether the mobile robot has reached a steady pose state during visual-inertial system initialization can be judged from the convergence of the IMU bias variables in the locally optimized sliding window, which also indicates whether system initialization and visual-inertial data fusion have succeeded. The three-axis IMU bias data are analyzed to verify the timeliness of initialization and the robustness of the system in different scenes. In a preferred aspect of the present invention, the visual inertia initialization includes a gyroscope bias solution process, a gravity vector estimation process, a gravity direction refinement process, an accelerometer bias process, and a velocity solution process.
In the gyroscope bias solving process, the gyroscope error has two sources: environmental measurement noise and the random-walk bias of the gyroscope. Over the short period of system initialization, the measurement noise of vision and of the IMU is considered small and is set to zero. By the consistency of the system's rotation, the pre-integrated gyroscope rotation between adjacent frames should equal the visually observed rotation; owing to external disturbances and the IMU itself, the difference between the two is defined as the gyroscope bias b_g. A least-squares optimization model over the rotation error between adjacent frames of the whole sliding window is constructed to calibrate the gyroscope bias (the formula images of the original are not reproduced). For adjacent window frames k and k+1, ΔR_{k,k+1} is the IMU pre-integrated rotation, the Jacobian of the pre-integrated rotation with respect to b_g describes how the pre-integrated rotation changes with the bias, and R_k and R_{k+1} are the rotations of the k-th and (k+1)-th key frames between the world coordinate system and the IMU coordinate system. After solving, the gyroscope bias is substituted back into the IMU pre-integration to update the observation model.
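The formula images of the original are not reproduced. For orientation, a standard least-squares form consistent with the description above (but not necessarily identical to the patent's expression) is:

```latex
\min_{\delta b_g}\;\sum_{k\in\mathcal{W}}
\Big\|
\operatorname{Log}\!\Big(
\big(\Delta R_{k,k+1}\,\operatorname{Exp}(J^{\Delta R}_{b_g}\,\delta b_g)\big)^{\top}
\big(R^{w}_{b_k}\big)^{\top} R^{w}_{b_{k+1}}
\Big)
\Big\|^{2}
```

where ΔR_{k,k+1} is the IMU pre-integrated rotation between adjacent window frames, J^{ΔR}_{b_g} its Jacobian with respect to the gyroscope bias, and R^w_{b_k}, R^w_{b_{k+1}} the key frame rotations from the IMU (body) frame to the world frame.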
In the gravity vector estimation process, because the accelerometer bias is coupled with gravity in direction and its magnitude, when observed together with the acceleration, is very small compared with the gravity value, the accelerometer bias is temporarily set to the IMU-calibrated initial value during initialization; since the system operates for only a very short time during initialization, the noise interference is small.
In the initialization stage, with the biases solved and small noise disturbances neglected, the body velocity of the mobile robot is computed from the velocity pre-integration. The computed velocity is the IMU velocity at the timestamp of the corresponding key frame; when the mobile robot moves rapidly, the pre-integration model narrows the search range for spatial image matching.
In the gravity direction refinement, accelerometer bias, and velocity solving processes, after the gravity vector has been roughly estimated from the IMU, its magnitude is fixed to the local gravity value G and only its direction is further refined. Since the accelerometer bias is small in value, the gravity magnitude G and the inertial coordinate system I are used, which avoids estimating the gravity direction in a way that overlaps with the rotation expression of the body system B.
As the mobile robot continuously explores and executes work tasks in the scene over a long time, the scale of the state estimator formed by the key frame poses and environmental landmark points acquired by the visual-inertial system grows sharply, which restricts the real-time performance and robustness of the system. The size of the optimization variables and the accumulated error of the visual-inertial fusion localization state therefore need to be controlled. Weighing the positioning accuracy, computational efficiency and robustness of the visual-inertial system, a sliding window with a limited set of optimization variables is established, and locally optimal pose updates are computed within the window by marginalizing old key frame state information and inserting new key frame information.
Sliding window: the number of key frames in the window is fixed; when a new key frame and its observed landmark points are added, an old key frame is removed. This restricts the number of key frame images in the window, thereby limiting the pose and landmark state variables and controlling the scale of the state quantities optimized in the sliding window.
When a key frame is inserted into the sliding window, if the latest image frame outside the window is a key frame, the oldest key frame and its observed state data are marginalized, but the constraint relations of the removed state quantities are kept in an information matrix as prior information, so that the poses of the key frames in the window and the landmark information are not lost. If the latest image frame outside the sliding window is an ordinary image frame, the pre-integrated IMU measurement data are stored and passed on to the next visual image frame, and the observation data of the ordinary frame are discarded, until the next image frame is a key frame.
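A minimal Python sketch of this window-management policy follows; the window size, the class structure and the way the marginalization prior is represented are assumptions made only for illustration.

```python
from collections import deque

WINDOW_SIZE = 10  # number of key frames kept in the window (assumed value)

class SlidingWindow:
    """Minimal sketch of the key-frame window policy (names and structure assumed)."""

    def __init__(self):
        self.keyframes = deque()   # key frame states currently in the window
        self.prior = []            # constraints kept from marginalized key frames
        self.pending_imu = []      # IMU pre-integration segments since the last key frame

    def insert(self, frame, imu_segment, is_key_frame):
        # inertial data are always accumulated, so nothing is lost for non-key frames
        self.pending_imu.append(imu_segment)
        if not is_key_frame:
            return                 # visual observations of ordinary frames are discarded
        frame.imu_from_last_keyframe = self.pending_imu
        self.pending_imu = []
        self.keyframes.append(frame)
        if len(self.keyframes) > WINDOW_SIZE:
            oldest = self.keyframes.popleft()
            # keep the removed key frame's constraints as prior information instead of
            # simply discarding them (stands in for marginalization)
            self.prior.append(oldest)
```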
When the mobile robot executes a task in the scene, the system continuously inserts the visual and IMU observation data into the sliding window and performs tightly coupled nonlinear optimization over all visual key frame images and inertial pre-integrations in the window, which prevents errors between locally adjacent frames from accumulating and being passed on to the next instant, and achieves accurate positioning within the sliding window.
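For reference, tightly coupled sliding-window optimization of this kind is usually written as the minimization of a marginalization prior residual plus IMU pre-integration residuals plus visual reprojection residuals over the window states; the generic cost below is given only as an illustration of this structure, not as the patent's specific formulation:

```latex
\min_{\mathcal{X}}\;
\big\| r_{\mathrm{prior}}(\mathcal{X}) \big\|^{2}
+\sum_{k\in\mathcal{W}}
\big\| r_{\mathrm{IMU}}\big(\hat{z}_{k,k+1},\mathcal{X}\big) \big\|^{2}_{P_{k,k+1}}
+\sum_{(l,j)\in\mathcal{C}}
\rho\Big( \big\| r_{\mathrm{cam}}\big(\hat{z}_{l,j},\mathcal{X}\big) \big\|^{2}_{P_{l,j}} \Big)
```

Here X collects the key frame poses, velocities, IMU biases, extrinsic parameters and landmark (inverse) depths in the window, W is the set of adjacent key frame pairs, C the set of visual observations, and ρ a robust kernel.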
To better implement the indoor SLAM-oriented visual inertial fusion positioning method, the invention also provides a system, which comprises,
the acquisition unit comprises a plurality of binocular fisheye cameras and is used for acquiring video image data through the binocular fisheye cameras and obtaining first data through I-LK optical flow method feature tracking between adjacent frames of the video image data;
the pre-integration unit is used for acquiring position and attitude data by the inertia measurement unit, and performing pre-integration processing to obtain second data;
the data fusion unit is used for interactively fusing the first data and the second data after loose coupling to determine a system initialization parameter value and determining key frame data according to a key frame strategy; the key frame data are output as third data after being initialized through visual poses and aligned with the pre-integration of the inertial measurement unit;
and the positioning unit is used for constructing the six-degree-of-freedom pose of the mobile robot by the sliding window based on the key frame through external parameter online calibration, visual inertia initialization and close-coupled nonlinear optimization of the third data.
According to the invention, video image data are obtained through the plurality of binocular fisheye cameras in the acquisition unit, features are tracked between adjacent frames of the video image data with the I-LK optical flow method, the inertial measurement unit collects position and attitude data, and pre-integration processing is carried out in the pre-integration unit. The data fusion unit loosely couples and then interactively fuses the multi-source data to determine the system initialization parameter values, and determines key frame data according to the key frame strategy; the key frame data are initialized through the visual pose and aligned with the pre-integration of the inertial measurement unit; and external parameter online calibration, visual-inertial initialization and tightly coupled nonlinear optimization are performed in the positioning unit, where a key-frame-based sliding window constructs the six-degree-of-freedom pose of the mobile robot. When the system performs visual-inertial fusion positioning during rapid motion of the mobile robot, the IMU short-time observation model narrows the image-matching search range and constrains the visual trajectory drift; with the position of the current-frame camera known, the positions of spatial feature points of the scene in the next image frame can be predicted from IMU information, which improves the stability of feature point tracking; in scenes with weak texture or few feature points, where visual image frame information is lost, the IMU recovers the relative pose of the body by integration, so the pose of the mobile robot is restored and the robustness of the system is enhanced; long-time trajectory computation by the IMU accumulates measurement errors, which the visual image frame poses can constrain and suppress; therefore, the visual inertial fusion positioning method improves the robustness of the system while achieving the positioning accuracy of the mobile robot.
The original image is down-sampled to form a multilayer image pyramid, the large-displacement optical flow vector is decomposed from top to bottom into optical flows that satisfy the small-motion assumption on each image layer, and the optical flow is solved iteratively within each layer's neighborhood. The solved optical flow vector is the accumulated sum of the components over the pyramid layers, and finally the reverse optical flow is back-tracked to eliminate mismatches, which improves the image feature extraction capability and the real-time performance and robustness of feature matching in different environments. In a preferred case of the invention, the acquisition unit of the system further includes a device for I-LK optical flow feature tracking, the device comprising,
the composition module is used for forming a multilayer pyramid image after sampling an original image of the acquired video image data;
the component generation module is used for decomposing the large-displacement optical flow vector into an optical flow which conforms to the micro motion on each layer of image from top to bottom, and performing optical flow iterative solution in the image layer field to obtain pyramid image vector components;
and the optical flow vector module is used for accumulating each vector component of the pyramid image to obtain an optical flow vector and removing the mismatching in the reverse optical flow reverse tracking.
The embodiment of the invention also provides a storage medium, which comprises a stored program, wherein when the program runs, the device where the storage medium is located is controlled to execute the method.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art will appreciate that the embodiments described in this specification are presently preferred and that no acts or modules are required by the invention.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus can be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one type of logical functional division, and other divisions may be realized in practice, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some interfaces, devices or units, and may be an electric or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a mobile terminal, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A visual inertial fusion positioning method facing indoor SLAM is characterized by comprising the following steps,
the method comprises the following steps that S1, video image data are obtained through a binocular fisheye camera, and first data are obtained through I-LK optical flow method feature tracking between adjacent frames of the video image data;
s2, acquiring position and attitude data by an inertial measurement unit, and performing pre-integration processing to obtain second data;
s3, interactively fusing the first data and the second data after loose coupling to determine a system initialization parameter value, and determining key frame data by a key frame strategy; the key frame data are output as third data after being initialized through visual poses and aligned with the pre-integration of the inertial measurement unit;
and S4, constructing the six-degree-of-freedom pose of the mobile robot by the sliding window based on the key frame through external parameter online calibration, visual inertia initialization and close-coupled nonlinear optimization on the third data.
2. The indoor SLAM-oriented visual inertial fusion positioning method according to claim 1, wherein in step S3, the key frame strategy comprises,
step S31, initializing after obtaining scene characteristic information by a binocular fisheye camera, and setting a first frame image which is stably obtained as a reference key frame;
step S32, determining the average parallax of the previous key frame and the current frame according to the environmental characteristic change of the image, and outputting a non-key frame when the average parallax D is smaller than a set threshold epsilon; when the average parallax D is larger than or equal to a set threshold epsilon, outputting a preselected key frame;
and step S33, when the tracked feature quantity of the preselected key frame is lower than a set threshold value delta, taking the current preselected key frame as the key frame.
3. The indoor SLAM-oriented visual inertial fusion positioning method of claim 1, wherein in step S2, the pre-integration process comprises,
s21, splitting an integral term of the state quantity of the inertia measurement unit into a fixed integral term at the ith moment and a pre-integral relative quantity at the ith to the j moments;
and S22, converting the measurement data of the inertia measurement unit in the discrete time into a fixed integral term and a pre-integral relative quantity after the conversion by the pre-integral model.
4. The indoor SLAM-oriented visual inertial fusion positioning method of claim 1, wherein in step S1, I-LK optical flow method feature tracking comprises,
s11, forming a multilayer pyramid image after sampling an original image of the acquired video image data;
s12, decomposing the large-displacement optical flow vector into optical flows which accord with micro motion on each layer of image from top to bottom, and carrying out optical flow iterative solution in the image layer field to obtain pyramid image vector components;
and step S13, accumulating the vector components of the pyramid image to obtain an optical flow vector, and reversely tracking and eliminating mismatching in a reverse optical flow.
5. The indoor SLAM-oriented visual inertial fusion positioning method of claim 1, wherein in step S4, the extrinsic parameters online calibration comprises binocular fisheye camera and inertial measurement unit relative position and attitude parameter calibration; and calibrating time difference parameters between the binocular fisheye camera and the inertia measurement unit.
6. The indoor SLAM-oriented visual inertial fusion positioning method of claim 1, wherein visual pose initialization comprises estimating the camera pose and landmark points by bundle adjustment using a visual odometer after feature extraction, I-LK tracking and matching, and image keyframe selection, and acquiring trajectory pose information by the camera.
7. The indoor SLAM-oriented visual inertial fusion positioning method of any one of claims 1-6 wherein visual inertial initialization comprises a gyroscope bias solution process, a gravity vector estimation process, a gravity direction refinement process, an accelerometer bias process and a velocity solution process.
8. A system for implementing the indoor SLAM-oriented visual inertial fusion positioning method of any one of claims 1-7, the system comprising,
the acquisition unit comprises a plurality of binocular fisheye cameras and is used for acquiring video image data through the binocular fisheye cameras and obtaining first data through I-LK optical flow method feature tracking between adjacent frames of the video image data;
the pre-integration unit is used for acquiring position and attitude data by the inertia measurement unit, and performing pre-integration processing to obtain second data;
the data fusion unit is used for interactively fusing the first data and the second data after loose coupling to determine a system initialization parameter value and determining key frame data according to a key frame strategy; the key frame data are output as third data after being initialized through visual poses and aligned with the pre-integration of the inertial measurement unit;
and the positioning unit is used for constructing the six-degree-of-freedom pose of the mobile robot by the sliding window based on the key frame through external parameter online calibration, visual inertia initialization and close-coupled nonlinear optimization of the third data.
9. The system of claim 8, wherein said acquisition unit further comprises means for I-LK optical flow feature tracking, said means comprising,
the composition module is used for forming a multilayer pyramid image after sampling an original image of the acquired video image data;
the component generation module is used for decomposing the large-displacement optical flow vector into an optical flow which conforms to the micro motion on each layer of image from top to bottom, and performing optical flow iterative solution in the image layer field to obtain pyramid image vector components;
and the optical flow vector module is used for accumulating each vector component of the pyramid image to obtain an optical flow vector and removing the mismatching in the reverse optical flow reverse tracking.
10. A storage medium comprising a stored program, wherein the program, when executed, controls an apparatus in which the storage medium is located to perform the indoor SLAM-oriented visual inertial fusion positioning method of any one of claims 1-7.
CN202210849157.6A 2022-07-19 2022-07-19 Indoor SLAM-oriented visual inertial fusion positioning method and system Pending CN115218906A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210849157.6A CN115218906A (en) 2022-07-19 2022-07-19 Indoor SLAM-oriented visual inertial fusion positioning method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210849157.6A CN115218906A (en) 2022-07-19 2022-07-19 Indoor SLAM-oriented visual inertial fusion positioning method and system

Publications (1)

Publication Number Publication Date
CN115218906A true CN115218906A (en) 2022-10-21

Family

ID=83612166

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210849157.6A Pending CN115218906A (en) 2022-07-19 2022-07-19 Indoor SLAM-oriented visual inertial fusion positioning method and system

Country Status (1)

Country Link
CN (1) CN115218906A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116442248A (en) * 2023-06-19 2023-07-18 山东工程职业技术大学 Robot vision positioning module and method based on target detection
CN116758161A (en) * 2023-06-26 2023-09-15 北京道仪数慧科技有限公司 Mobile terminal space data generation method and space perception mobile terminal

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107869989A (en) * 2017-11-06 2018-04-03 东北大学 A kind of localization method and system of the fusion of view-based access control model inertial navigation information

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107869989A (en) * 2017-11-06 2018-04-03 东北大学 A kind of localization method and system of the fusion of view-based access control model inertial navigation information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘大鹏: "视觉惯性融合定位的室内SLAM研究" [Research on indoor SLAM with visual-inertial fusion positioning], 南方农机, 31 March 2022 (2022-03-31), pages 22-25 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116442248A (en) * 2023-06-19 2023-07-18 山东工程职业技术大学 Robot vision positioning module and method based on target detection
CN116758161A (en) * 2023-06-26 2023-09-15 北京道仪数慧科技有限公司 Mobile terminal space data generation method and space perception mobile terminal

Similar Documents

Publication Publication Date Title
CN110125928B (en) Binocular inertial navigation SLAM system for performing feature matching based on front and rear frames
CN111258313B (en) Multi-sensor fusion SLAM system and robot
CN109974693B (en) Unmanned aerial vehicle positioning method and device, computer equipment and storage medium
CN107255476B (en) Indoor positioning method and device based on inertial data and visual features
US9071829B2 (en) Method and system for fusing data arising from image sensors and from motion or position sensors
CN112525202A (en) SLAM positioning and navigation method and system based on multi-sensor fusion
CN110298914B (en) Method for establishing fruit tree canopy feature map in orchard
CN104501814B (en) Attitude and position estimation method based on vision and inertia information
CN110726406A (en) Improved nonlinear optimization monocular inertial navigation SLAM method
CN115218906A (en) Indoor SLAM-oriented visual inertial fusion positioning method and system
CN110553648A (en) method and system for indoor navigation
CN110533719B (en) Augmented reality positioning method and device based on environment visual feature point identification technology
CN110455301A (en) A kind of dynamic scene SLAM method based on Inertial Measurement Unit
EP3786891A1 (en) Method and system for visual localization based on dual dome cameras
CN113674412B (en) Pose fusion optimization-based indoor map construction method, system and storage medium
CN110749308B (en) SLAM-oriented outdoor positioning method using consumer-grade GPS and 2.5D building models
CN115272596A (en) Multi-sensor fusion SLAM method oriented to monotonous texture-free large scene
CN113551665A (en) High dynamic motion state sensing system and sensing method for motion carrier
CN111998862A (en) Dense binocular SLAM method based on BNN
CN114494629A (en) Three-dimensional map construction method, device, equipment and storage medium
CN114964276A (en) Dynamic vision SLAM method fusing inertial navigation
CN115451948A (en) Agricultural unmanned vehicle positioning odometer method and system based on multi-sensor fusion
CN115355901A (en) Multi-machine combined graph building method fusing dynamic target perception
CN112580683A (en) Multi-sensor data time alignment system and method based on cross correlation
CN114723811A (en) Stereo vision positioning and mapping method for quadruped robot in unstructured environment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination