CN117576218B - Self-adaptive visual inertial navigation odometer output method - Google Patents

Self-adaptive visual inertial navigation odometer output method

Info

Publication number
CN117576218B
CN117576218B (Application No. CN202410066053.7A)
Authority
CN
China
Prior art keywords
visual
inertial navigation
image
adaptive
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410066053.7A
Other languages
Chinese (zh)
Other versions
CN117576218A (en)
Inventor
苑晶 (Yuan Jing)
唐光盛 (Tang Guangsheng)
张雪波 (Zhang Xuebo)
王扬 (Wang Yang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nankai University
Original Assignee
Nankai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nankai University filed Critical Nankai University
Priority to CN202410066053.7A priority Critical patent/CN117576218B/en
Publication of CN117576218A publication Critical patent/CN117576218A/en
Application granted granted Critical
Publication of CN117576218B publication Critical patent/CN117576218B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/005Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 with correlation of navigation data from several sources, e.g. map or contour matching
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G01C21/1656Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with passive imaging devices, e.g. cameras
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/38Electronic maps specially adapted for navigation; Updating thereof
    • G01C21/3804Creation or updating of map data
    • G01C21/3833Creation or updating of map data characterised by the source of data
    • G01C21/3841Data obtained from two or more sources, e.g. probe vehicles
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01CMEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C22/00Measuring distance traversed on the ground by vehicles, persons, animals or other moving solid bodies, e.g. using odometers, using pedometers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/757Matching configurations of points or features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Automation & Control Theory (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of visual-inertial odometry optimization and provides an adaptive visual inertial navigation odometer output method, comprising the following steps: collecting data; performing front-end tracking and alignment on visual image features; selecting two adjacent frames of visual images to form a first image set, and performing feature-point-method matching and direct-method alignment to obtain a first result set; applying blur processing and illumination-change processing to one frame, forming a second image set with the other frame, and performing feature-point-method matching and direct-method alignment again to obtain a second result set; performing deep learning on the two result sets to obtain an adaptive fusion network; establishing an objective function within a sliding window with the visual feature residual, the inertial navigation data residual and the prior residual as multiple targets, and performing data fusion and solving through the adaptive fusion network; and outputting the pose of the visual inertial navigation odometer according to the optimization result. The invention can adaptively adjust the fusion weights of the feature point method and the direct method and improves positioning accuracy.

Description

Self-adaptive visual inertial navigation odometer output method
Technical Field
The invention relates to the technical field of visual inertial navigation odometer optimization, in particular to a self-adaptive visual inertial navigation odometer output method.
Background
A visual inertial navigation odometer (Visual Inertial Odometry, VIO) is a positioning and navigation technique that estimates the pose (position and orientation) of a mobile device by fusing information from visual sensors and an inertial measurement unit (IMU). Such techniques are commonly used in robots, autonomous vehicles, virtual reality headsets, drones, and the like, to help them determine their position and motion state accurately when GPS signals are unavailable or limited. According to the way the visual residual is computed, VIO can be divided into direct-method VIO and feature-point-method VIO: direct-method VIO uses the pixel information in the image to compute a photometric residual for camera motion estimation, while the feature point method computes a reprojection residual from the positions of feature points to estimate the camera motion.
Most existing visual inertial navigation odometers combine the direct method and the feature point method, but regardless of the combination scheme, the decision to switch between the direct method and the feature point method is usually made by comparing certain indicators with a preset threshold. This requires a reasonable threshold to be set for each specific situation; for example, the quantity and quality of successfully tracked features reflect the effectiveness of the direct method, and different thresholds must be set according to the scene and experience to ensure the stability of the algorithm. Whether the criterion is the number of feature points or the magnitude of the visual residual, only one residual appears in the optimization target at a time: switching schemes mainly use one method and only fall back to the other under certain conditions, while existing schemes that write both the feature-point and direct-method residuals into the optimization target simultaneously lack a fine-grained fusion strategy and do not fully consider the contributions of the direct method and the feature point method under different conditions, so the final positioning result may accumulate large errors.
Disclosure of Invention
The present invention is directed to solving at least one of the technical problems existing in the related art. Therefore, the invention provides a self-adaptive visual inertial navigation odometer output method.
The invention provides a self-adaptive visual inertial navigation odometer output method, which comprises the following steps:
s1: acquiring a visual image and inertial navigation data of a visual inertial navigation odometer;
s2: extracting an evaluation index of the visual image, and performing front-end tracking alignment on the visual image characteristics, wherein the evaluation index comprises motion blur degree, texture definition and illumination change;
s3: selecting two adjacent frames of visual images with front-end feature tracking aligned to form a first image set, and respectively performing feature point method descriptor matching and direct method alignment on the first image set to obtain a first result set;
s4: performing fuzzy processing and illumination change processing on a second frame image in the first image set to obtain a second fuzzy image, forming a second image set by the second fuzzy image and the first frame image in the first image set, and respectively performing feature point method descriptor matching and direct method alignment on the second image set to obtain a second result set;
s5: deep learning is carried out on the first result set and the second result set, and an adaptive fusion network is obtained;
s6: establishing an objective function by taking a visual characteristic residual error, an inertial navigation data residual error and an priori residual error as multiple targets based on a sliding window, weighting and fusing residual error information in the objective function through the self-adaptive fusion network, and solving the objective function through a nonlinear optimization solving library;
s7: and outputting the pose by the visual inertial navigation odometer according to the optimization result of the objective function.
According to the self-adaptive visual inertial navigation odometer output method provided by the invention, in the step S1, the visual image frequency is stable, and the inertial navigation data frequency is stable.
According to the self-adaptive visual inertial navigation odometer output method provided by the invention, the step S1 further comprises the following steps:
s11: pre-integrating the inertial navigation data.
According to the adaptive visual inertial navigation odometer output method provided by the invention, in step S2, the step of obtaining the motion blur comprises the following steps:
s211: calculating the amplitude spectrum of the visual image by a Fourier transform method;
s212: calculating an average value of the magnitude spectrum and recording the average value as the motion blur;
the texture definition is the contrast of the gray level co-occurrence matrix of the visual image;
the obtaining step of the illumination variation comprises the following steps:
s231: matching the features of the current frame visual image and the previous frame visual image to obtain a feature matching relationship;
s232: and calculating according to the characteristic matching relation to obtain the illumination change.
According to the adaptive visual inertial navigation odometer output method provided by the invention, in step S2, the front-end features include:
FAST corner features, wherein the FAST corner features are used for feature-point-method matching, and each FAST corner feature carries a BRIEF descriptor;
and Shi-Tomasi corner points, wherein the Shi-Tomasi corner points are used for direct-method tracking.
According to the adaptive visual inertial navigation odometer output method provided by the invention, the step S5 further comprises the following steps:
s51: calculating to obtain the position deviation under a characteristic point method and the pixel gradient deviation under a direct method according to the first result set and the second result set;
s52: normalizing the position deviation and the pixel gradient deviation to obtain an uncertainty true value;
s53: and taking the evaluation index corresponding to the first result set and the evaluation index corresponding to the second result set as inputs, taking the uncertainty true value as a label, and training through a deep learning network to obtain the self-adaptive fusion network.
According to the self-adaptive visual inertial navigation odometer output method provided by the invention, the step S6 further comprises the following steps:
s611: performing visual motion estimation on a visual inertial navigation odometer, and calculating to obtain visual inertial navigation alignment data according to visual motion estimation results and the inertial navigation data;
s612: and constructing a sliding window according to the visual inertial navigation alignment data.
According to the self-adaptive visual inertial navigation odometer output method provided by the invention, the step S6 further comprises the following steps:
S621: converting the data to be optimized outside the sliding window into the prior distribution of the objective function by a marginalization method.
According to the self-adaptive visual inertial navigation odometer output method provided by the invention,
the expression of the objective function in step S6 is:
wherein,for the system state to be estimated including pose, visual feature inverse depth, inertial navigation zero offset, ++>For the collection of all inertial navigation data within the sliding window, and (2)>Is in state->Lower->The residual of the individual inertial navigation data,tracking the set of acquisition points for the direct method, < +.>As a kernel function->Is->The weight of the points is obtained by each trace,is->Luminosity residual error of each tracking acquisition point, +.>To obtain a set of points by descriptor matching,first->The weight of the point is obtained by matching the descriptors, +.>Is in state->Lower->Re-projection residual of the point obtained by matching the individual descriptors, < >>Is a priori error.
According to the self-adaptive visual inertial navigation odometer output method provided by the invention, the step S7 further comprises the following steps:
s71: according to the optimization result of the objective function, outputting map points by the visual inertial navigation odometer;
s72: and constructing a sparse map according to the pose output and the map point output.
Compared with VIO that switches between the direct method and the feature point method according to a threshold, the adaptive visual inertial navigation odometer output method of the invention requires no preset threshold and has no switching process: the contributions of the direct method and the feature point method act on the optimization target simultaneously. Compared with the semi-direct method, the direct method and the feature point method remain independent in feature tracking and matching, and their residuals are used in the optimization target at the same time. Compared with directly fusing the direct method and the feature point method, the method of the invention takes into account the environmental factors that affect both methods and quantifies their effects: the mapping from motion blur, illumination change and texture condition to the weights of the direct method and the feature point method is learned by a deep learning network and acts directly on the optimization target. In particular, when the direct-method and feature-point-method residuals of each image frame participate in fusion, the fusion weights can be adjusted according to the environment, so that the complementarity of the two methods is exploited to the greatest extent under different conditions.
Considering the influence of motion blur, illumination change and texture conditions on a VIO system, the invention designs a VIO in which the direct method and the feature point method are adaptively fused; it remains robust in extreme scenes while its positioning accuracy is significantly improved compared with other VIO systems.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of an adaptive visual inertial navigation odometer output method provided by the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention. The following examples are illustrative of the invention but are not intended to limit the scope of the invention.
In the description of the embodiments of the present invention, it should be noted that the terms "center", "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc. indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, are merely for convenience in describing the embodiments of the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the embodiments of the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In describing embodiments of the present invention, it should be noted that, unless explicitly stated and limited otherwise, the terms "coupled," "coupled," and "connected" should be construed broadly, and may be either a fixed connection, a removable connection, or an integral connection, for example; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium. The specific meaning of the above terms in embodiments of the present invention will be understood in detail by those of ordinary skill in the art.
In embodiments of the invention, unless expressly specified and limited otherwise, a first feature "up" or "down" on a second feature may be that the first and second features are in direct contact, or that the first and second features are in indirect contact via an intervening medium. Moreover, a first feature being "above," "over" and "on" a second feature may be a first feature being directly above or obliquely above the second feature, or simply indicating that the first feature is level higher than the second feature. The first feature being "under", "below" and "beneath" the second feature may be the first feature being directly under or obliquely below the second feature, or simply indicating that the first feature is less level than the second feature.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the embodiments of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
For a better understanding of embodiments of the present invention, an existing VIO system type will first be described.
Whether direct-method VIO or feature-point-method VIO, a VIO system is generally composed of the following parts:
(1) Extraction and tracking of visual features. I.e. extracting feature points or feature descriptors from successive visual images, tracking these features using light flow tracking or feature matching;
(2) And (5) motion estimation. Firstly, estimating the angular velocity, linear velocity and attitude of equipment by using a motion model of an inertial sensor, and simultaneously estimating the relative pose transformation of a camera according to the position change of visual features on an image or pixel information;
(3) And (5) data fusion. The visual and IMU information is fused together by using a Kalman filter or a nonlinear optimization method to obtain more accurate pose and visual characteristic maps, and error compensation and calibration are usually required to be carried out on the IMU in the step so as to reduce the influence of sensor errors on an estimation result;
(4) And (5) map construction. The estimated camera pose and the feature point position are used for constructing an environment map so as to facilitate subsequent navigation and positioning;
(5) And outputting an estimation result. Outputting the pose and the map;
(6) The steps are repeated continuously, and the gesture and the position of the equipment are updated in real time.
Stable operation in scenes with motion blur, illumination change and missing texture has always been a pain point of VIO systems; even with an IMU, large pose estimation errors still inevitably occur in such scenes. Analyzing the feature matching and residual calculation processes of the direct method and the feature point method leads to the following conclusions: direct-method VIO uses only the pixel information of the image, so it can work without distinct features and has some robustness to motion blur, but it is sensitive to illumination changes and generally computationally heavy; feature-point-method VIO extracts only a small number of feature points in the image, performs well when texture is rich and has some robustness to illumination changes, but it is not robust in low-texture and repetitive-texture environments and usually needs to compute feature descriptors for matching, which is time-consuming. In some cases, the two methods can be combined to increase robustness and accuracy. Much research has been devoted to combining the two, with improvements and good results, which can roughly be divided into the following categories:
(1) Switching to a feature-point-method odometer when the direct-method odometer fails. FDMO determines whether to switch to the feature point method by checking the number of features in the direct method: essentially, changes in ambient illumination make direct-method feature tracking unstable and reduce the number of successfully tracked features, and switching to the feature point method preserves the robustness of the system.
(2) The semi-direct method. SVO is an odometer combining the ideas of the direct method and the feature point method. A conventional direct method usually needs to track a large number of pixels to estimate the pose, which is computationally expensive. SVO first extracts features on the image as the feature point method does, but no descriptors are computed for the pixels tracked around the features and many pixels are discarded, which reduces the computation and makes pose estimation extremely fast. In essence, SVO uses a feature extraction strategy similar to the feature point method and a feature tracking strategy similar to the direct method, reducing the computation of the direct method and shortening the matching time of the feature point method; however, in the final pose estimation the visual residual is computed from pixels, which places it among the direct methods. Subsequent related work extends SVO's strategy of assisting the direct method with ideas from the feature point method and designs different feature tracking and matching schemes for different situations, so that good performance is still achieved under illumination change and severe rotation.
(3) Selecting the direct method or the feature point method by comparing the visual residual with a preset threshold. For example, UniVIO is a visual inertial navigation odometer designed for underwater environments; images collected underwater often exhibit severe illumination changes and low texture, and even when matching is correct, the large differences in pixel gray levels caused by the changing environment lead to large visual residuals that affect the final pose estimation result. To solve this problem, this kind of VIO screens out a suitable residual calculation mode by comparing the visual residual with the set threshold, switching between the feature point method and the direct method during optimization, thereby ensuring the robustness and accuracy of the underwater VIO system.
(4) Using the direct method and feature points simultaneously. Such VIO generally picks the higher-quality points among the tracked points to compute pixel errors, while the remaining points contribute reprojection errors; the two residuals have equal contribution in the optimization target. Alternatively, variable weight functions are set for the direct method and the feature point method according to the number of features, realizing a fusion of the two.
An embodiment of the present invention is described below with reference to fig. 1.
The invention provides a self-adaptive visual inertial navigation odometer output method, which comprises the following steps:
s1: acquiring a visual image and inertial navigation data of a visual inertial navigation odometer;
in step S1, the visual image frequency is stable, and the inertial navigation data frequency is stable.
Wherein, step S1 further includes:
s11: pre-integrating the inertial navigation data.
Further, the purpose of this stage is visual-inertial timestamp alignment and inertial navigation pre-integration. First, it is checked whether the frequencies of the visual data and the inertial navigation data are stable; once the frequencies are stable, the visual images and the inertial navigation data are ordered by time, and the inertial navigation data between two image frames are pre-integrated, as sketched below.
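A minimal sketch of the pre-integration step is given below, assuming IMU biases have already been subtracted and using simple Euler integration on SO(3); the function names and the `(gyro, accel)` sample format are illustrative, not something specified by the patent.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def expm_so3(phi):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return np.eye(3)
    A = skew(phi / theta)
    return np.eye(3) + np.sin(theta) * A + (1.0 - np.cos(theta)) * (A @ A)

def preintegrate(imu_samples, dt):
    """Pre-integrate IMU samples collected between two image frames.

    imu_samples: iterable of (gyro, accel) 3-vectors in the body frame,
    with biases already subtracted.  Returns the rotation, velocity and
    position increments expressed in the frame of the first image
    (gravity is handled later, when the increments are used as residual
    measurements).
    """
    R = np.eye(3)       # accumulated rotation increment
    v = np.zeros(3)     # accumulated velocity increment
    p = np.zeros(3)     # accumulated position increment
    for gyro, accel in imu_samples:
        a = R @ np.asarray(accel)                 # acceleration in the start frame
        p = p + v * dt + 0.5 * a * dt ** 2        # integrate position
        v = v + a * dt                            # integrate velocity
        R = R @ expm_so3(np.asarray(gyro) * dt)   # integrate rotation on SO(3)
    return R, v, p
```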
S2: extracting an evaluation index of the visual image, and performing front-end tracking alignment on the visual image characteristics, wherein the evaluation index comprises motion blur degree, texture definition and illumination change;
further, the purpose of this stage is that the image evaluation index selection and front-end feature tracking, the evaluation index of the extracted visual image is used for the deep learning input of the subsequent step, and in steps S3 and S4, the evaluation index is extracted for each image before or after the processing.
In step S2, the step of obtaining the motion blur degree includes:
s211: calculating the amplitude spectrum of the visual image by a Fourier transform method;
s212: calculating an average value of the magnitude spectrum and recording the average value as the motion blur;
the texture definition is the contrast of the gray level co-occurrence matrix of the visual image;
the obtaining step of the illumination variation comprises the following steps:
s231: matching the features of the current frame visual image and the previous frame visual image to obtain a feature matching relationship;
s232: and calculating according to the characteristic matching relation to obtain the illumination change.
Further, according to the input visual image information, the amplitude spectrum of the input image is first computed with the fast Fourier transform, and the average of the amplitude spectrum is taken to represent the motion blur degree of the image. Second, the contrast of the gray level co-occurrence matrix of the image is computed to measure the texture definition of the image. Finally, the feature matching relationship between the current frame and the previous frame is computed, from which the illumination change between the two frames is obtained. The motion blur degree, texture definition and illumination change are computed for every image frame, for example as in the sketch below.
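The three evaluation indices could be computed, for example, as in the following Python sketch; the function names are illustrative, and the use of scikit-image for the gray-level co-occurrence matrix is an assumption of this sketch, not something specified by the patent.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def motion_blur_degree(gray):
    """Average of the Fourier amplitude spectrum of a grayscale image."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray.astype(np.float32))))
    return float(np.mean(spectrum))

def texture_definition(gray):
    """Contrast of the gray-level co-occurrence matrix (gray is uint8)."""
    glcm = graycomatrix(gray, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    return float(graycoprops(glcm, 'contrast')[0, 0])

def illumination_change(prev_gray, cur_gray, prev_pts, cur_pts):
    """Average absolute brightness difference over matched feature pairs."""
    diffs = [abs(float(cur_gray[int(v1), int(u1)]) -
                 float(prev_gray[int(v0), int(u0)]))
             for (u0, v0), (u1, v1) in zip(prev_pts, cur_pts)]
    return float(np.mean(diffs)) if diffs else 0.0
```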
In step S2, the front-end feature includes:
fast corner features, wherein the Fast corner features are used for feature point method matching, and the Fast corner features comprise brief descriptors;
and the Shi-Tomas corner point is used for tracking by a direct method.
Further, front-end feature extraction extracts two kinds of features: FAST corner features with BRIEF descriptors, and Shi-Tomasi corner points, where the FAST features are used for feature-point-method matching and the Shi-Tomasi corners are used for direct-method tracking; a possible implementation is sketched below.
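The following OpenCV-based sketch illustrates this front end; the detector parameters are illustrative, and the BRIEF extractor lives in the opencv-contrib `xfeatures2d` module, which is an assumption of this sketch.

```python
import cv2

def extract_front_end_features(gray):
    """Extract the two feature sets used by the front end."""
    # FAST corners + BRIEF descriptors for feature-point-method matching
    # (BriefDescriptorExtractor requires opencv-contrib; ORB would be a
    # drop-in substitute if the contrib module is unavailable).
    fast = cv2.FastFeatureDetector_create(threshold=20)
    brief = cv2.xfeatures2d.BriefDescriptorExtractor_create()
    keypoints = fast.detect(gray, None)
    keypoints, descriptors = brief.compute(gray, keypoints)

    # Shi-Tomasi corners for direct-method tracking.
    shi_tomasi = cv2.goodFeaturesToTrack(gray, maxCorners=300,
                                         qualityLevel=0.01, minDistance=10)
    return keypoints, descriptors, shi_tomasi

def track_direct(prev_gray, cur_gray, prev_pts):
    """Track Shi-Tomasi corners with pyramidal Lucas-Kanade optical flow."""
    cur_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray,
                                                     prev_pts, None)
    good = status.reshape(-1) == 1
    return prev_pts[good], cur_pts[good]
```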
S3: selecting two adjacent frames of visual images with front-end feature tracking aligned to form a first image set, and respectively performing feature point method descriptor matching and direct method alignment on the first image set to obtain a first result set;
s4: performing fuzzy processing and illumination change processing on a second frame image in the first image set to obtain a second fuzzy image, forming a second image set by the second fuzzy image and the first frame image in the first image set, and respectively performing feature point method descriptor matching and direct method alignment on the second image set to obtain a second result set;
further, the objective of the steps S3 and S4 is to create training data. The method comprises the following specific steps: (1) Firstly, original images of a plurality of scenes are collected, the collected images are guaranteed to be sufficient in illumination and rich in texture, the motion in the collection process is slow and smooth, and the collected images are used as the original images. (2) Randomly extracting two adjacent frames in an original image, respectively carrying out descriptor matching of a characteristic point method and image alignment of a direct method on the two frames of images, recording all matching results and positions of the characteristics in the images, (3) carrying out no processing on the extracted two frames, respectively carrying out fuzzy processing and illumination change processing on the next frame to different degrees, carrying out the same matching and alignment as in the step (2) on the next frame after the previous frame and the processing, and recording the results. (4) Repeating the steps (2) and (3) for a plurality of times to obtain data sets of characteristic point method matching and direct method alignment under different conditions.
S5: deep learning is carried out on the first result set and the second result set, and an adaptive fusion network is obtained;
wherein, step S5 further comprises:
s51: calculating to obtain the position deviation under a characteristic point method and the pixel gradient deviation under a direct method according to the first result set and the second result set;
s52: normalizing the position deviation and the pixel gradient deviation to obtain an uncertainty true value;
s53: and taking the evaluation index corresponding to the first result set and the evaluation index corresponding to the second result set as inputs, taking the uncertainty true value as a label, and training through a deep learning network to obtain the self-adaptive fusion network.
Further, after the result sets of steps S3 and S4 are obtained, the position deviation of feature point matching and the pixel gradient deviation of the direct method are computed for the normal case and for the cases with changed illumination and image motion blur, giving quantitative indices of how environmental change degrades direct-method alignment and feature-point-method matching. These indices are normalized to obtain the uncertainty ground-truth values of the direct method and the feature point method in the training data. The motion blur degree, texture definition and illumination change of all images in the data are then used as the input of a deep learning network, the uncertainty ground-truth values of the direct method and the feature point method are used as labels, and the weight network is obtained by training, for example as in the sketch below.
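The patent does not fix the architecture of the deep learning network; the following PyTorch sketch assumes a small fully connected network mapping the three evaluation indices to the two normalized uncertainties, with illustrative layer sizes and training hyperparameters.

```python
import torch
import torch.nn as nn

class AdaptiveFusionNet(nn.Module):
    """Maps (motion blur, texture definition, illumination change) to the
    two normalized uncertainties (direct method, feature point method)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, 32), nn.ReLU(),
            nn.Linear(32, 32), nn.ReLU(),
            nn.Linear(32, 2), nn.Sigmoid())  # uncertainties normalized to (0, 1)

    def forward(self, x):
        return self.net(x)

def train_fusion_net(indices, labels, epochs=200, lr=1e-3):
    """indices: N x 3 evaluation indices; labels: N x 2 normalized
    position / pixel-gradient deviations used as uncertainty ground truth."""
    model = AdaptiveFusionNet()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    x = torch.as_tensor(indices, dtype=torch.float32)
    y = torch.as_tensor(labels, dtype=torch.float32)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return model
```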
S6: establishing an objective function by taking a visual characteristic residual error, an inertial navigation data residual error and an priori residual error as multiple targets based on a sliding window, weighting and fusing residual error information in the objective function through the self-adaptive fusion network, and solving the objective function through a nonlinear optimization solving library;
further, the nonlinear optimization solution library is ceres.
Further, after the initialization of steps S611 to S612 succeeds, a sliding window is constructed from a certain number of visual images, and a least squares problem is built within the sliding window from the visual feature residuals, the IMU residuals and the prior residual, where the visual residual consists of direct-method residuals and feature-point residuals whose weights in the optimization target are computed by the network trained in step S5.
Wherein, step S6 further comprises:
s611: performing visual motion estimation on a visual inertial navigation odometer, and calculating to obtain visual inertial navigation alignment data according to visual motion estimation results and the inertial navigation data;
s612: and constructing a sliding window according to the visual inertial navigation alignment data.
Further, the objective of steps S611 to S612 is visual initialization and visual-inertial alignment. Specifically, pure visual motion estimation is first performed from the front-end feature tracking and matching results; the estimate is transformed into the inertial navigation coordinate system through the extrinsic parameters, the motion solved from the IMU information is combined with the motion solved by pure vision, a least squares problem is solved, and the initial zero drift of the inertial navigation, the gravity direction and so on are obtained.
Wherein, step S6 further comprises:
s621: and converting the data to be optimized outside the sliding window into prior distribution of the objective function by an marginalization method.
Furthermore, the number of frames in the sliding window must be kept stable to guarantee the real-time performance of the VIO system. States outside the sliding window no longer participate in optimization, but they cannot simply be discarded without destroying the original constraint relations; therefore marginalization is used to convert all of their constraint information into a prior distribution over the variables to be optimized. The frame to be marginalized is determined from the parallax of all points of the direct method and the feature point method: when the parallax is large, the oldest frame is marginalized, and when the parallax is small, the newest frame is marginalized. A minimal sketch of the marginalization step follows.
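The sketch below shows marginalization by Schur complement, assuming the linearized problem has already been assembled into a Hessian H and gradient b; the index arrays are hypothetical names, not part of the patented method.

```python
import numpy as np

def marginalize(H, b, keep_idx, marg_idx):
    """Fold the states indexed by marg_idx into a prior on the states
    indexed by keep_idx, using the Schur complement of the linearized
    system H * dx = b."""
    Hkk = H[np.ix_(keep_idx, keep_idx)]
    Hkm = H[np.ix_(keep_idx, marg_idx)]
    Hmm = H[np.ix_(marg_idx, marg_idx)]
    bk, bm = b[keep_idx], b[marg_idx]
    Hmm_inv = np.linalg.pinv(Hmm)            # pseudo-inverse for robustness
    H_prior = Hkk - Hkm @ Hmm_inv @ Hkm.T    # prior information matrix
    b_prior = bk - Hkm @ Hmm_inv @ bm        # prior information vector
    return H_prior, b_prior
```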
Wherein, the expression of the objective function in step S6 is:

$$\min_{\mathcal{X}} \left\{ \left\lVert r_p \right\rVert^2 + \sum_{k \in \mathcal{B}} \left\lVert r_{\mathcal{B}}\!\left(z_k, \mathcal{X}\right) \right\rVert^2 + \sum_{i \in \mathcal{D}} \rho\!\left( w_i^{d} \left\lVert r_{\mathcal{D}}\!\left(z_i, \mathcal{X}\right) \right\rVert^2 \right) + \sum_{j \in \mathcal{F}} \rho\!\left( w_j^{f} \left\lVert r_{\mathcal{F}}\!\left(z_j, \mathcal{X}\right) \right\rVert^2 \right) \right\}$$

wherein $\mathcal{X}$ is the system state to be estimated, including pose, visual feature inverse depth and inertial navigation zero offset; $\mathcal{B}$ is the set of all inertial navigation data within the sliding window; $r_{\mathcal{B}}(z_k,\mathcal{X})$ is the residual of the $k$-th inertial navigation datum under state $\mathcal{X}$; $\mathcal{D}$ is the set of direct-method tracking points; $\rho(\cdot)$ is the kernel function; $w_i^{d}$ is the weight of the $i$-th tracked point; $r_{\mathcal{D}}(z_i,\mathcal{X})$ is the photometric residual of the $i$-th tracked point; $\mathcal{F}$ is the set of points obtained by descriptor matching; $w_j^{f}$ is the weight of the $j$-th matched point, where $w_i^{d}$ and $w_j^{f}$ are obtained from the adaptive fusion network; $r_{\mathcal{F}}(z_j,\mathcal{X})$ is the reprojection residual of the $j$-th matched point under state $\mathcal{X}$; and $r_p$ is the prior error.
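The following Python sketch shows how such a weighted multi-residual cost could be evaluated; for brevity it uses a single network-predicted weight per method (w_d, w_f) rather than per-point weights, and a Huber kernel as the robust kernel, both of which are assumptions of the sketch rather than requirements of the patent.

```python
import numpy as np

def huber(s, delta=1.0):
    """Huber kernel rho(s) applied to a squared residual s."""
    return s if s <= delta ** 2 else 2.0 * delta * np.sqrt(s) - delta ** 2

def total_cost(state, prior_term, imu_terms, direct_terms, feature_terms, w_d, w_f):
    """Evaluate the sliding-window objective for a candidate state.

    prior_term and every entry of *_terms are callables returning a
    residual vector for the given state; w_d and w_f are the weights
    predicted by the adaptive fusion network for the current frame.
    """
    cost = np.sum(prior_term(state) ** 2)
    for r_imu in imu_terms:                      # inertial navigation residuals
        cost += np.sum(r_imu(state) ** 2)
    for r_photo in direct_terms:                 # direct-method photometric residuals
        cost += huber(w_d * np.sum(r_photo(state) ** 2))
    for r_reproj in feature_terms:               # feature-point reprojection residuals
        cost += huber(w_f * np.sum(r_reproj(state) ** 2))
    return cost
```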
S7: and outputting the pose by the visual inertial navigation odometer according to the optimization result of the objective function.
Wherein, step S7 further comprises:
s71: according to the optimization result of the objective function, outputting map points by the visual inertial navigation odometer;
s72: and constructing a sparse map according to the pose output and the map point output.
Furthermore, the VIO of the invention obtained through the above steps outputs its pose at 10 Hz and at the same time yields the constructed sparse map.
In some embodiments, considering the influence of motion blur, illumination change and texture conditions on a VIO system, the invention designs a VIO in which the direct method and the feature point method are adaptively fused; it remains robust in extreme scenes while its positioning accuracy is significantly improved compared with other VIO systems. The algorithm is compared with other algorithms on the EuRoC datasets featuring illumination change, low texture and motion blur to verify its effect. The specific results are as follows.
First, the positioning accuracy comparison results are shown in Table 1.
TABLE 1 comparison of the positioning accuracy of VIO obtained by the method of the invention with other VIOs
The data in Table 1 are in meters and are obtained by computing the root mean square error between the positioning trajectory and the ground truth, where Ours denotes the VIO of the invention, VINS-Mono is an optimization-based feature-point-method VIO, ROVIO is a filtering-based direct-method VIO, DVIO is an optimization-based direct-method VIO, and Ours w/o denotes the VIO of the invention without the weight network.
From the data in Table 1 it can be seen that the proposed method fully considers the influence of environmental factors on the direct method and the feature point method, effectively combines their advantages, and adaptively adjusts their contributions to pose optimization, thereby providing more accurate positioning and achieving better results on most of the datasets. Compared with VINS-Mono, ROVIO, DVIO and Ours w/o, the average positioning error is reduced by 52%, 81%, 46% and 47% respectively, a significant improvement in positioning accuracy.
The second is a real-time comparison result, as shown in table 2.
TABLE 2 real-time comparison of VIO obtained by the method with other VIOs
From the data in Table 2 it can be seen that the proposed algorithm is essentially equivalent to the optimization-based methods in pose output frequency and meets the real-time positioning requirement; ROVIO, a direct-method VIO based on the iterated extended Kalman filter, processes data noticeably faster than the optimization-based methods but has poor positioning accuracy.
The invention provides an adaptive visual inertial navigation odometer output method: a VIO in which the direct method and the feature point method are adaptively fused, taking motion blur, illumination change and texture conditions into account. Compared with the semi-direct method, the direct method and the feature point method remain independent in feature tracking and matching, and their residuals are used in the optimization target at the same time. Compared with directly fusing the direct method and the feature point method, the proposed method considers the environmental factors affecting both methods and quantifies their effects: the mapping from motion blur, illumination change and texture condition to the weights of the direct method and the feature point method is learned by a deep learning network and acts directly on the optimization target. In particular, when the direct-method and feature-point-method residuals of each image frame participate in fusion, the fusion weights can be adjusted according to the environment, so that the complementarity of the two methods under different conditions is exploited to the greatest extent.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. An adaptive visual inertial navigation odometer output method, comprising:
s1: acquiring a visual image and inertial navigation data of a visual inertial navigation odometer;
s2: extracting an evaluation index of the visual image, and performing front-end tracking alignment on the visual image characteristics, wherein the evaluation index comprises motion blur degree, texture definition and illumination change;
s3: selecting two adjacent frames of visual images with front-end feature tracking aligned to form a first image set, and respectively performing feature point method descriptor matching and direct method alignment on the first image set to obtain a first result set;
wherein the front-end feature in step S3 includes:
FAST corner features, wherein the FAST corner features are used for feature-point-method matching, and each FAST corner feature carries a BRIEF descriptor;
a Shi-Tomasi corner point, wherein the Shi-Tomasi corner point is used for direct-method tracking;
s4: performing fuzzy processing and illumination change processing on a second frame image in the first image set to obtain a second fuzzy image, forming a second image set by the second fuzzy image and the first frame image in the first image set, and respectively performing feature point method descriptor matching and direct method alignment on the second image set to obtain a second result set;
s5: deep learning is carried out on the first result set and the second result set, and an adaptive fusion network is obtained;
wherein, step S5 further comprises:
s51: calculating to obtain the position deviation under a characteristic point method and the pixel gradient deviation under a direct method according to the first result set and the second result set;
s52: normalizing the position deviation and the pixel gradient deviation to obtain an uncertainty true value;
s53: taking the evaluation index corresponding to the first result set and the evaluation index corresponding to the second result set as inputs, taking the uncertainty true value as a label, and training through a deep learning network to obtain an adaptive fusion network;
s6: establishing an objective function by taking a visual characteristic residual error, an inertial navigation data residual error and an priori residual error as multiple targets based on a sliding window, weighting and fusing residual error information in the objective function through the self-adaptive fusion network, and solving the objective function through a nonlinear optimization solving library;
s7: and outputting the pose by the visual inertial navigation odometer according to the optimization result of the objective function.
2. An adaptive visual inertial navigation odometer output method according to claim 1, wherein in step S1, the visual image frequency is stable and the inertial navigation data frequency is stable.
3. The adaptive visual inertial navigation odometer output method of claim 1, further comprising in step S1:
s11: pre-integrating the inertial navigation data.
4. An adaptive visual inertial navigation odometer output method according to claim 1, wherein in step S2, the step of obtaining the motion blur comprises:
s211: calculating the amplitude spectrum of the visual image by a Fourier transform method;
s212: calculating an average value of the magnitude spectrum and recording the average value as the motion blur;
the texture definition is the contrast of the gray level co-occurrence matrix of the visual image;
the obtaining step of the illumination variation comprises the following steps:
s231: matching the features of the current frame visual image and the previous frame visual image to obtain a feature matching relationship;
s232: and calculating according to the characteristic matching relation to obtain the illumination change.
5. An adaptive visual inertial navigation odometer output method according to claim 1, wherein step S6 further comprises:
s611: performing visual motion estimation on a visual inertial navigation odometer, and calculating to obtain visual inertial navigation alignment data according to visual motion estimation results and the inertial navigation data;
s612: and constructing a sliding window according to the visual inertial navigation alignment data.
6. An adaptive visual inertial navigation odometer output method according to claim 1, wherein step S6 further comprises:
S621: converting the data to be optimized outside the sliding window into the prior distribution of the objective function by a marginalization method.
7. An adaptive visual inertial navigation odometer output method according to claim 1, wherein the expression of the objective function in step S6 is:

$$\min_{\mathcal{X}} \left\{ \left\lVert r_p \right\rVert^2 + \sum_{k \in \mathcal{B}} \left\lVert r_{\mathcal{B}}\!\left(z_k, \mathcal{X}\right) \right\rVert^2 + \sum_{i \in \mathcal{D}} \rho\!\left( w_i^{d} \left\lVert r_{\mathcal{D}}\!\left(z_i, \mathcal{X}\right) \right\rVert^2 \right) + \sum_{j \in \mathcal{F}} \rho\!\left( w_j^{f} \left\lVert r_{\mathcal{F}}\!\left(z_j, \mathcal{X}\right) \right\rVert^2 \right) \right\}$$

wherein $\mathcal{X}$ is the system state to be estimated, including pose, visual feature inverse depth and inertial navigation zero offset; $\mathcal{B}$ is the set of all inertial navigation data within the sliding window; $r_{\mathcal{B}}(z_k,\mathcal{X})$ is the residual of the $k$-th inertial navigation datum under state $\mathcal{X}$; $\mathcal{D}$ is the set of direct-method tracking points; $\rho(\cdot)$ is the kernel function; $w_i^{d}$ is the weight of the $i$-th tracked point; $r_{\mathcal{D}}(z_i,\mathcal{X})$ is the photometric residual of the $i$-th tracked point; $\mathcal{F}$ is the set of points obtained by descriptor matching; $w_j^{f}$ is the weight of the $j$-th matched point; $r_{\mathcal{F}}(z_j,\mathcal{X})$ is the reprojection residual of the $j$-th matched point under state $\mathcal{X}$; and $r_p$ is the prior error.
8. An adaptive visual inertial navigation odometer output method according to claim 1, wherein step S7 further comprises:
s71: according to the optimization result of the objective function, outputting map points by the visual inertial navigation odometer;
s72: and constructing a sparse map according to the pose output and the map point output.
CN202410066053.7A 2024-01-17 2024-01-17 Self-adaptive visual inertial navigation odometer output method Active CN117576218B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410066053.7A CN117576218B (en) 2024-01-17 2024-01-17 Self-adaptive visual inertial navigation odometer output method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410066053.7A CN117576218B (en) 2024-01-17 2024-01-17 Self-adaptive visual inertial navigation odometer output method

Publications (2)

Publication Number Publication Date
CN117576218A (en) 2024-02-20
CN117576218B (en) 2024-03-29

Family

ID=89895991

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410066053.7A Active CN117576218B (en) 2024-01-17 2024-01-17 Self-adaptive visual inertial navigation odometer output method

Country Status (1)

Country Link
CN (1) CN117576218B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111780754A (en) * 2020-06-23 2020-10-16 南京航空航天大学 Visual inertial odometer pose estimation method based on sparse direct method
CN113566779A (en) * 2021-08-02 2021-10-29 东南大学 Vehicle course angle estimation method based on linear detection and digital map matching
CN115855064A (en) * 2023-02-15 2023-03-28 成都理工大学工程技术学院 Indoor pedestrian positioning fusion method based on IMU multi-sensor fusion

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111780754A (en) * 2020-06-23 2020-10-16 南京航空航天大学 Visual inertial odometer pose estimation method based on sparse direct method
CN113566779A (en) * 2021-08-02 2021-10-29 东南大学 Vehicle course angle estimation method based on linear detection and digital map matching
CN115855064A (en) * 2023-02-15 2023-03-28 成都理工大学工程技术学院 Indoor pedestrian positioning fusion method based on IMU multi-sensor fusion

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DVIO: An Optimization-Based Tightly Coupled Direct Visual-Inertial Odometry; Jing Yuan et al.; IEEE; 2020-11-12; full text *
Monocular visual odometry based on a deep-learning feature point method; Xiong Wei; Jin Jingyi; Wang Juan; Liu Min; Zeng Chunyan; Computer Engineering & Science; 2020-01-15 (No. 01); full text *
Semi-direct monocular visual odometry based on visual-inertial fusion; Gong Zhaohui; Zhang Xiaoli; Peng Xiafu; Li Xin; Robot (No. 05); full text *

Also Published As

Publication number Publication date
CN117576218A (en) 2024-02-20

Similar Documents

Publication Publication Date Title
CN107301654B (en) Multi-sensor high-precision instant positioning and mapping method
CN107341814B (en) Four-rotor unmanned aerial vehicle monocular vision range measurement method based on sparse direct method
CN112233177B (en) Unmanned aerial vehicle pose estimation method and system
US9071829B2 (en) Method and system for fusing data arising from image sensors and from motion or position sensors
CN109579825B (en) Robot positioning system and method based on binocular vision and convolutional neural network
CN110726406A (en) Improved nonlinear optimization monocular inertial navigation SLAM method
CN108519102B (en) Binocular vision mileage calculation method based on secondary projection
CN113916243A (en) Vehicle positioning method, device, equipment and storage medium for target scene area
CN112115980A (en) Binocular vision odometer design method based on optical flow tracking and point line feature matching
CN112419497A (en) Monocular vision-based SLAM method combining feature method and direct method
CN110119768B (en) Visual information fusion system and method for vehicle positioning
CN116205947A (en) Binocular-inertial fusion pose estimation method based on camera motion state, electronic equipment and storage medium
CN112802096A (en) Device and method for realizing real-time positioning and mapping
CN115451948A (en) Agricultural unmanned vehicle positioning odometer method and system based on multi-sensor fusion
CN115861860B (en) Target tracking and positioning method and system for unmanned aerial vehicle
CN111998862A (en) Dense binocular SLAM method based on BNN
CN113744315A (en) Semi-direct vision odometer based on binocular vision
CN110706253B (en) Target tracking method, system and device based on apparent feature and depth feature
CN115218906A (en) Indoor SLAM-oriented visual inertial fusion positioning method and system
Xian et al. Fusing stereo camera and low-cost inertial measurement unit for autonomous navigation in a tightly-coupled approach
CN103679740A (en) ROI (Region of Interest) extraction method of ground target of unmanned aerial vehicle
Zhu et al. PairCon-SLAM: Distributed, online, and real-time RGBD-SLAM in large scenarios
CN112731503A (en) Pose estimation method and system based on front-end tight coupling
CN117576218B (en) Self-adaptive visual inertial navigation odometer output method
CN115482282A (en) Dynamic SLAM method with multi-target tracking capability in automatic driving scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant