CN114199259B - Multi-source fusion navigation positioning method based on motion state and environment perception


Info

Publication number
CN114199259B
CN114199259B (application CN202210156085.7A)
Authority
CN
China
Prior art keywords
vehicle
imu
image
sliding window
visual
Prior art date
Legal status
Active
Application number
CN202210156085.7A
Other languages
Chinese (zh)
Other versions
CN114199259A (en)
Inventor
孙蕊
戴晔莹
朱永文
王均晖
王长春
林挺
孔洁
Current Assignee
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202210156085.7A
Publication of CN114199259A
Application granted
Publication of CN114199259B
Legal status: Active

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/28 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00 Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/38 Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system
    • G01S19/39 Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system the satellite radio beacon positioning system transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/42 Determining position
    • G01S19/45 Determining position by combining measurements of signals from the satellite radio beacon positioning system with a supplementary measurement
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00 Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/38 Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system
    • G01S19/39 Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system the satellite radio beacon positioning system transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/42 Determining position
    • G01S19/45 Determining position by combining measurements of signals from the satellite radio beacon positioning system with a supplementary measurement
    • G01S19/47 Determining position by combining measurements of signals from the satellite radio beacon positioning system with a supplementary measurement the supplementary measurement being an inertial measurement, e.g. tightly coupled inertial
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Theoretical Computer Science (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)
  • Navigation (AREA)

Abstract

The invention provides a multi-source fusion navigation positioning method based on motion state and environment perception, which comprises the following steps: step 1, collecting image frames with a camera, eliminating dynamic points in the image frames, and obtaining the static feature points of each image frame and their positions; step 2, obtaining the local camera pose based on the feature point positions of each image frame and a tightly coupled vision/IMU fusion positioning algorithm; step 3, obtaining the motion state of the vehicle from the IMU measurement data and switching the adaptive algorithm according to the motion state of the vehicle; step 4, obtaining the final positioning result based on the local camera pose, the switching result of the adaptive algorithm, Kalman filtering, and a loosely coupled GNSS/IMU/vision fusion positioning algorithm. The method effectively exploits the role of vision in integrated navigation and improves the accuracy and robustness of multi-source fusion positioning.

Description

Multi-source fusion navigation positioning method based on motion state and environment perception
Technical Field
The invention belongs to the technical field of satellite positioning and navigation, and particularly relates to a multi-source fusion navigation positioning method based on motion state and environment perception.
Background
Intelligent transportation is developing rapidly in urban environments, and automatic driving technology has brought great convenience to human life. Navigation and positioning technology is the key to high-accuracy, continuous and reliable vehicle positioning, and is also an important precondition for path planning and vehicle control. GNSS is an important sensor for navigation and positioning and can provide high-accuracy spatio-temporal information in open environments, but in complex urban environments such as urban canyons, viaducts and tunnels, blockage and shielding of GNSS signals reduce positioning accuracy or cause it to be lost entirely. It is difficult for a single sensor to maintain good positioning accuracy in complex and changing environments, so the respective advantages of multiple sensors must be used to make up for each other's shortcomings. At present, multi-sensor integrated navigation is the main research direction for solving the degradation of GNSS performance in urban environments. GNSS/INS integrated navigation is the most widely used integrated navigation system, but when the GNSS remains shielded for a long time and only the inertial navigation system is relied on, large errors are still produced, so other sensors are needed to reduce the short-term drift of inertial navigation. A camera can perform environment perception and visual information acquisition by capturing images of the surrounding environment, is resistant to interference and inexpensive, and is therefore an ideal vehicle-mounted sensor. Zhang Yiran et al. proposed a method that uses the geographical positions of landmarks obtained by binocular vision to establish a landmark database and locally improve GNSS positioning accuracy, but the camera only assists the GNSS and is not used as a main sensor of the integrated navigation; Scheiwarong et al. proposed a vision-assisted INS/GNSS navigation system based on a federated filter, in which the INS is loosely coupled with the GNSS and with vision separately, but when the quality of the observation information and data is poor the performance of the integrated navigation is affected. Therefore, research on a reliable and robust GNSS/IMU/vision integrated navigation system has great development prospects.
Interpretation of terms:
GNSS: Global Navigation Satellite System
IMU: Inertial Measurement Unit
INS: Inertial Navigation System
VIO: Visual-Inertial Odometry
SLAM: Simultaneous Localization and Mapping
VO: Visual Odometry
BN: Batch Normalization
ECEF: Earth-Centered, Earth-Fixed
KLT: Kanade-Lucas-Tomasi sparse optical flow tracking method
EKF: Extended Kalman Filter.
Disclosure of Invention
Purpose of the invention: the invention aims to solve the technical problems of the prior art and provides a multi-source fusion navigation positioning method based on motion state and environment perception.
In order to solve the technical problem, the invention discloses a multi-source fusion navigation positioning method based on motion state and environment perception, which comprises the following steps.
Step 1, a camera is used to collect image frames, dynamic points in the image frames are removed, and the static feature points of each image frame and their positions are obtained.
Step 2, the local camera pose is obtained based on the feature point positions of each image frame and a tightly coupled vision/IMU fusion positioning algorithm.
Step 3, the motion state of the vehicle is obtained based on the IMU measurement data, and the adaptive algorithm is switched according to the motion state of the vehicle.
Step 4, the final positioning result is obtained based on the local camera pose, the switching result of the adaptive algorithm, Kalman filtering, and a loosely coupled GNSS/IMU/vision fusion positioning algorithm.
Further, step 1 includes a step 1.1 of semantically segmenting the pedestrians and vehicles of each image frame: for the image frame set OPic collected by the camera, the feature points of each image frame are extracted and tracked with the KLT sparse optical flow method to obtain the feature point positions, namely the pixel coordinates of the feature points, and semantic segmentation of vehicles and pedestrians is performed on the basis of a convolutional neural network to obtain the feature points of each image frame that belong to the semantic categories "vehicle" or "pedestrian".
The KLT sparse optical flow method extracts feature points and tracks the position change of the same pixel point in pixel coordinates, which is the basis of the subsequent pose estimation. Through semantic segmentation and dynamic point elimination, the dynamic feature points of people and vehicles can be removed, which reduces the negative influence of dynamic points on the final pose estimation, reduces the number of points tracked by the sparse optical flow method, and improves the operating efficiency.
Step 1.2, the IMU measurement data are pre-integrated to obtain the IMU pre-integration result, namely the relative pose relationship between two adjacent image frames, including their relative position, velocity and attitude changes; the measurement data comprise the measurement values output by the accelerometer and the gyroscope.
Step 1.3, dynamic points are removed: based on the IMU pre-integration result, the feature points of each image frame that belong to the semantic categories "vehicle" or "pedestrian" are projected into the next image frame as their predicted positions in that frame; the pixel points whose tracked actual positions in the next frame are inconsistent with the predicted positions are the dynamic points and are eliminated, while the pixel points whose actual positions are consistent with the predicted positions are taken as static points, thereby obtaining the static feature points of the next image frame and their positions.
Step 1.1 segments the feature points of all vehicles and pedestrians, but not all vehicles and pedestrians are actually moving, so the moving portion of the vehicle and pedestrian feature points obtained in step 1.1 must be found through the IMU pre-integration, as illustrated by the sketch below. Segmenting first means that the pre-integration projection does not need to be performed on all feature points: objects such as trees and street lamps are static, the potentially movable vehicles and pedestrians are found first, and only that portion is projected, which reduces the amount of computation.
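Purely as an illustration of this step, the following Python sketch projects the semantically segmented "vehicle"/"pedestrian" feature points into the next frame using the IMU-predicted relative pose and rejects those whose KLT-tracked position disagrees with the prediction. The function name, argument layout and the pixel threshold are assumptions for the sketch, not values prescribed by the patent.

```python
import numpy as np

def reject_dynamic_points(pts_prev, pts_tracked, depths, K, R_rel, t_rel, thresh_px=3.0):
    """Keep only semantic ('vehicle'/'pedestrian') points that move consistently
    with the IMU-predicted camera motion; the rest are treated as dynamic.

    pts_prev    : (N, 2) pixel coordinates of segmented points in the previous frame
    pts_tracked : (N, 2) pixel coordinates of the same points tracked by KLT in the next frame
    depths      : (N,)   estimated depths of the points in the previous frame
    K           : (3, 3) camera intrinsic matrix
    R_rel, t_rel: IMU pre-integration result expressed as the relative camera pose
    thresh_px   : consistency threshold in pixels (assumed value)
    """
    # Back-project the previous-frame pixels to 3-D points in the camera frame
    pts_h = np.hstack([pts_prev, np.ones((len(pts_prev), 1))])   # (N, 3)
    p_cam = (np.linalg.inv(K) @ pts_h.T) * depths                # (3, N)
    # Predict where each point should appear in the next frame under the IMU motion
    p_pred = R_rel @ p_cam + t_rel.reshape(3, 1)
    uv_pred = K @ p_pred
    uv_pred = (uv_pred[:2] / uv_pred[2]).T                       # (N, 2)
    # A point is static if its tracked position matches the IMU-based prediction
    err = np.linalg.norm(uv_pred - pts_tracked, axis=1)
    static_mask = err < thresh_px
    return pts_tracked[static_mask], static_mask
```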
Further, the convolutional neural network in step 1.1 adopts an improved U-net convolutional neural network in which the convolutional layers in the encoder and the decoder are replaced with Ghost modules. A Ghost module has the same function as a conventional convolution module but is a lightweight operation: the linear transformation operations it uses cost less and require less computation while maintaining accuracy.
Semantic segmentation of vehicles and pedestrians based on the convolutional neural network includes step 1.1.1, training the improved U-net convolutional neural network model with a data set; the data set is built by acquiring road images in different scenes in advance with a vehicle-mounted camera to establish a vehicle and pedestrian data set, or by directly downloading an existing public road data set. The improved U-net convolutional neural network model is trained to obtain the trained improved U-net convolutional neural network.
Step 1.1.2, inputting the image frame set OPic collected by the camera into a trained encoder of the improved U-net convolutional neural network to obtain a high-level feature vector of the image frame set OPic collected by the camera.
And 1.1.3, inputting the high-level feature vector of the image frame set OPic collected by the camera into a decoder of the trained improved U-net convolutional neural network to obtain a feature image for semantic category segmentation.
Step 1.1.4, inputting the characteristic image for semantic category segmentation into a softmax layer, and obtaining the characteristic point of each frame of image belonging to the semantic category of the vehicle or the pedestrian in an image frame set OPic collected by a camera.
Further, step 2.1, VIO initialization is performed based on the feature point positions of each image frame and the IMU pre-integration result, obtaining initial values of the extrinsic parameters between the camera and the IMU, the feature point depths, and the tightly coupled position, velocity and attitude.
Step 2.2, based on a nonlinear optimization approach, the tightly coupled vision/IMU nonlinear optimization over the sliding window is performed starting from the initial values of the camera-IMU extrinsic parameters, the feature point depths, and the tightly coupled position, velocity and attitude, and the optimal camera pose within the sliding window, namely the local camera pose, is solved.
Further, in step 2.1, the pose transformation of the image frames is recovered through the structure-from-motion technique based on the feature point positions of each image frame; the pose transformation of the image frames is aligned with the IMU pre-integration result, the extrinsic parameters between the camera and the IMU are recovered, and the transformation relation between the IMU coordinate system and the world coordinate system is obtained.
Step 2.2 comprises: the state vector X of the tightly coupled vision/IMU sliding window is obtained, the tightly coupled vision/IMU nonlinear optimization problem is formulated with the minimum reprojection error algorithm, and the nonlinear optimization problem is solved with the Ceres Solver nonlinear optimization library, thereby solving the optimal camera pose within the sliding window.
The error function to be optimized in the tightly coupled vision/IMU nonlinear optimization problem is:

$$\min_{X}\left\{\left\|\gamma_{marg}\right\|^{2}+\sum_{k\in I}\left\|\gamma_{I}\left(\hat{z}_{b_{k}b_{k+1}},X\right)\right\|_{P_{b_{k}b_{k+1}}}^{2}+\sum_{(l,\theta)\in L}\left\|\gamma_{L}\left(\hat{z}_{l}^{c_{\theta}},X\right)\right\|_{P_{l}^{c_{\theta}}}^{2}\right\}$$

where $\gamma_{marg}$ is the prior information from marginalization, i.e. the oldest frame in the sliding window and its corresponding state quantities are marginalized as prior information; $\gamma_{I}$ and $\gamma_{L}$ are the IMU residual term and the visual residual term, respectively; $I$ is the set of all IMU measurements, and $P_{b_{k}b_{k+1}}$ is the covariance matrix of the IMU pre-integration noise term between the k-th and (k+1)-th image frames; $L$ is the set of feature points observed at least twice in the current sliding window, and $P_{l}^{c_{\theta}}$ is the covariance matrix of the visual observation noise of the l-th pixel coordinate in the θ-th image frame in the set L.
Further, in step 3, switching the adaptive algorithm means constraining and correcting the loosely coupled GNSS/IMU/vision fusion positioning algorithm and the tightly coupled vision/IMU fusion positioning algorithm with zero-velocity correction, a robust kernel function, and nonholonomic constraints, respectively, according to the motion state of the vehicle.
Step 3.1, based on the measurement values output by the accelerometer, a sliding window of size N is designed to compute an acceleration standard deviation factor, and whether the vehicle is in a stationary state is judged from this factor; if the vehicle is stationary, static zero-velocity correction is performed in the loosely coupled GNSS/IMU/vision fusion positioning algorithm.
Step 3.2, if the vehicle is detected to be in a non-stationary state, it is either turning or travelling straight; if the vehicle is in a turning state, a robust kernel function is added to the tightly coupled vision/IMU fusion positioning algorithm.
Step 3.3, if the vehicle is travelling straight, a nonholonomic constraint is added to the loosely coupled GNSS/IMU/vision fusion positioning algorithm to correct the accumulated error of inertial navigation.
The motion state of the vehicle has a large influence on positioning accuracy, and by distinguishing the motion states, different constraints and corrections can be added to the algorithm, thereby improving the positioning accuracy. The invention distinguishes three motion states (see the sketch below). When the vehicle is stationary, the IMU and the camera still produce non-zero outputs because of their own errors, which introduces error, so the vehicle is accurately judged to be stationary and the error is corrected by static zero-velocity correction. When the vehicle travels straight, lateral motion and vertical motion are impossible, which satisfies the conditions for using the nonholonomic constraint. When the vehicle turns, the images captured by the camera are blurred, feature point matching produces mismatches, and the accuracy of the camera drops sharply, so a robust kernel function is added to reduce the influence of these errors.
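The switching logic can be summarized by the following Python sketch. The threshold names, the returned strings and the function signature are illustrative assumptions; th_acc_std corresponds to the adaptive stationarity threshold and th_omega to the angular-rate threshold described below.

```python
import numpy as np

def select_motion_state(acc_window, gyro_z_window, th_acc_std, th_omega):
    """Classify the vehicle motion state from IMU data in a sliding window and
    return which correction the fusion algorithm should apply."""
    acc_norm = np.linalg.norm(acc_window, axis=1)        # |a_i| over the window
    acc_std = np.std(acc_norm, ddof=1)                   # acceleration standard deviation factor
    mean_omega = np.mean(np.abs(gyro_z_window))          # mean z-axis angular rate

    if acc_std <= th_acc_std:
        return "stationary", "zero-velocity correction in the loose GNSS/IMU/vision filter"
    if mean_omega > th_omega:
        return "turning", "Huber robust kernel on the visual residuals of the tight VIO"
    return "straight", "nonholonomic constraint in the loose GNSS/IMU/vision filter"
```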
Further, step 4 comprises a step 4.1 of solving for the GNSS position information.
Step 4.2, the VIO and the GNSS are initialized, and the local camera pose obtained in step 2 and the GNSS position information solved in step 4.1 are converted into the corresponding information in the Earth-centered, Earth-fixed coordinate system.
Step 4.3, the global estimated pose, namely the final positioning result, is obtained based on Kalman filtering and the loosely coupled GNSS/IMU/vision fusion positioning algorithm.
Further, step 3.1 includes the following. The acceleration standard deviation over a sliding window of size N is expressed as

$$\sigma_{a}(t)=\sqrt{\frac{1}{N-1}\sum_{i=t-N+1}^{t}\left(a_{i}-\mu\right)^{2}},\qquad a_{i}=\sqrt{a_{x,i}^{2}+a_{y,i}^{2}+a_{z,i}^{2}},\qquad \mu=\frac{1}{N}\sum_{i=t-N+1}^{t}a_{i}$$

where $a_{x,i}$, $a_{y,i}$, $a_{z,i}$ are the outputs of the three accelerometer axes x, y and z; i denotes the time index of the acceleration data within the sliding window, $i\in[t-N+1,t]$; and $\mu$ is the average of all accelerations within the sliding window, used as the mathematical expectation, an unbiased estimate being adopted here.

During the movement of the vehicle, a vehicle stationarity threshold $TH_{a}(t)$ is set to judge whether the vehicle is stationary. When the acceleration standard deviation of the vehicle satisfies $\sigma_{a}(t)\le TH_{a}(t)$, the vehicle is judged to be in a zero-velocity stationary state, and zero-velocity correction is applied to it: in the Kalman filtering process of step 4.3, the velocity solved by inertial navigation is used as the observation of the system velocity error for Kalman filter estimation, and the updated state estimate is fed back to correct the navigation parameter results in the INS subsystem, thereby completing the static zero-velocity correction.

The vehicle stationarity threshold $TH_{a}(t)$ is not a fixed value but a dynamic threshold: a sliding window of size M is set, and the threshold changes adaptively with the sliding of the window M according to the acceleration standard deviations at the different times within it, so that the zero-velocity state of the vehicle can be detected more flexibly:

$$TH_{a}(t)=\sum_{j=t-M+1}^{t}\eta_{j}\,\sigma_{a}(j)$$

where j denotes the time index of the acceleration standard deviation $\sigma_{a}(j)$ within the sliding window M, $j\in[t-M+1,t]$, and $\eta_{j}$ are the weights of the respective acceleration standard deviations: the farther an acceleration standard deviation is from the current time t, the smaller its weight, and the closer it is, the larger its weight.

The upper limit of the size of the sliding window M is defined as $M_{max}$ and the lower limit as $M_{min}$. First $M_{min}$ is taken as the sliding window length and the mean $\sigma$ of the current group of $\sigma_{a}(j)$ values is computed. When $\sigma$ is less than or equal to the acceleration stationarity threshold $\varepsilon$, the acceleration values over this period are relatively stable and a larger sliding window length M is selected; when $\sigma$ is greater than the acceleration stationarity threshold $\varepsilon$, the acceleration values are changing in stages and a smaller sliding window length M is selected to improve sensitivity. Here the window-length adjustment $M'\in[0,M_{max}-M_{min}]$ satisfies $M=M_{min}+M'$, and $\sigma$ is the mean of the group of $\sigma_{a}(j)$ values when $M_{min}$ is taken as the window length. The accelerations output by the IMUs of multiple groups of vehicles in different motion states are collected to compute multiple groups of $\sigma$, and the critical value at which the acceleration values cease to be stable is selected as the acceleration stationarity threshold $\varepsilon$.
When the vehicle is stationary, the accelerometer output is stable, whereas when it is not stationary the data fluctuate considerably, and the standard deviation reflects the stability of a group of data. In different stationary scenes the acceleration stationarity threshold should differ because of external factors such as vibration of the vehicle body, so the standard deviations of nearby time periods are a useful reference for determining the threshold, and a dynamic acceleration stationarity threshold is therefore set, as in the sketch below.
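The following sketch illustrates the adaptive stationarity threshold described above. The exponentially increasing weights toward the current epoch and the two-level window-length rule are assumptions made for the sketch; the patent only requires that closer standard deviations weigh more and that M adapt to the stationarity of the data. The numeric values 20, 50 and 0.05 follow the embodiment described later in the detailed description.

```python
import numpy as np

def acc_std(acc_xyz):
    """Unbiased standard deviation of the acceleration magnitude over a window of size N."""
    mag = np.linalg.norm(acc_xyz, axis=1)
    return np.std(mag, ddof=1)

def adaptive_threshold(std_history, m_min=20, m_max=50, eps=0.05):
    """Weighted threshold TH_a(t) built from recent standard deviations sigma_a(j)."""
    sigma = np.mean(std_history[-m_min:])                  # stationarity of the recent data
    M = m_max if sigma <= eps else m_min                   # larger window when data are stable
    recent = np.asarray(std_history[-M:])
    weights = np.exp(np.linspace(-1.0, 0.0, len(recent)))  # closer epochs weigh more (assumed form)
    weights /= weights.sum()
    return float(np.dot(weights, recent))

def is_zero_velocity(acc_window, std_history):
    """Zero-velocity decision: current std compared against the adaptive threshold."""
    return acc_std(acc_window) <= adaptive_threshold(std_history)
```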
Further, step 3.2 comprises: based on the measured angular velocity, a turn detection factor based on the mean angular velocity is designed, and the mean angular velocity $|\bar{\omega}|$ within the sliding window N is used as the basis for judging whether the vehicle is turning:

$$|\bar{\omega}|=\frac{1}{N}\sum_{i=t-N+1}^{t}\left|\omega_{z,i}\right|>TH_{\omega}$$

If this condition is satisfied, the vehicle is judged to be in a turning state; otherwise, it is judged to be travelling straight. Here $\omega_{z}$ is the output of the gyroscope in the z-axis direction and $TH_{\omega}$ is a set threshold on the mean angular velocity.
If the vehicle is in a turning state, a robust kernel function is added to the tightly coupled vision/IMU fusion positioning algorithm: a Huber kernel $\rho(\cdot)$ is applied to the visual measurement residual term in the error function to be optimized of the tightly coupled vision/IMU nonlinear optimization problem,

$$\rho(\gamma)=\begin{cases}\dfrac{1}{2}\gamma^{2}, & |\gamma|\le\delta\\[4pt] \delta\left(|\gamma|-\dfrac{1}{2}\delta\right), & |\gamma|>\delta\end{cases}$$

where $\delta$ is the threshold for judging whether the visual residual term is too large. Data with good visual quality while the vehicle travels straight and data with blurred images and mismatches while it turns are collected separately, and the resulting visual residual terms are compared to determine the visual residual threshold $\delta$ that distinguishes good visual data from poor visual data.
In step 3.3, if it is judged that the vehicle is travelling straight, the vehicle does not slip laterally or move in the direction perpendicular to the road surface during driving, and a nonholonomic constraint is added in the Kalman filtering process of step 4.3 to correct the accumulated error of inertial navigation, that is, the lateral and vertical velocities are set to zero.
When the vehicle travels straight, lateral motion and vertical motion are impossible, which satisfies the conditions for using the nonholonomic constraint. When the vehicle turns, the images captured by the camera are blurred, feature point matching produces mismatches, and the accuracy of the camera drops sharply, so a robust kernel function is used to reduce the influence of these errors.
Further, step 4.1 comprises: the satellites whose elevation angles are below a satellite elevation threshold are removed to obtain the screened satellite observation data, and the GNSS position information is solved from the screened satellite observation data with a pseudo-range single-point positioning algorithm.
Step 4.3 comprises: the corresponding information of the local camera pose and the GNSS position information in the Earth-centered, Earth-fixed coordinate system obtained in step 4.2, together with the switching result of the adaptive algorithm from step 3, are input into a Kalman filter to obtain error correction information; the error correction information is input into the sliding-window-based tightly coupled nonlinear optimization of step 2.2 to apply feedback correction to the navigation parameters; and the corrected pose information is finally output as the global estimated pose, namely the final positioning result.
If the GNSS positioning itself is problematic, its combination with the VIO is greatly affected. The signals of satellites with elevation angles below the satellite elevation threshold are easily reflected by obstacles such as buildings, which introduces large errors into the measurement information, so these low-elevation satellites must be rejected.
The GNSS can provide the absolute position of the vehicle; under good signal conditions GNSS positioning accuracy is very high and has no accumulated error, so combining the GNSS with the VIO for navigation yields a good global pose result, as sketched below.
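A conceptual Python sketch of the loose coupling, simplified to a position-and-velocity state (the actual filter in step 4.3 also carries attitude and sensor error states; all matrices and noise values below are illustrative assumptions):

```python
import numpy as np

class LooseFusionKF:
    """Toy loosely coupled filter: state x = [position(3), velocity(3)]."""
    def __init__(self):
        self.x = np.zeros(6)
        self.P = np.eye(6)

    def predict(self, F, Q):
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + Q

    def update(self, z, H, R):
        S = H @ self.P @ H.T + R
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - H @ self.x)
        self.P = (np.eye(len(self.x)) - K @ H) @ self.P

# GNSS position and VIO position (both converted to ECEF) observe x[0:3];
# zero-velocity correction and the nonholonomic constraint observe (part of) x[3:6].
H_pos = np.hstack([np.eye(3), np.zeros((3, 3))])
H_vel = np.hstack([np.zeros((3, 3)), np.eye(3)])

kf = LooseFusionKF()
dt = 0.1
F = np.eye(6); F[:3, 3:] = dt * np.eye(3)
kf.predict(F, Q=1e-3 * np.eye(6))
kf.update(z=np.array([10.0, 5.0, 1.0]), H=H_pos, R=4.0 * np.eye(3))   # GNSS position fix
kf.update(z=np.zeros(3), H=H_vel, R=1e-2 * np.eye(3))                 # zero-velocity correction
```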
Beneficial effects: 1. Aiming at the problem that GNSS/IMU integrated navigation drifts when the GNSS fails for a long time, the camera is combined with the GNSS/IMU for integrated navigation: the visual information and the IMU are first tightly coupled based on sliding-window optimization, and the output local pose and the GNSS positioning result are then input into a Kalman filter for loose coupling, which effectively exploits the role of vision in integrated navigation and improves the accuracy and robustness of multi-source fusion positioning.
2. Aiming at the problem of dynamic point elimination in the data preprocessing of integrated navigation, an improved U-net convolutional neural network is adopted, in which a Ghost module replaces the original conventional convolutional layers; the camera images are semantically segmented, the pedestrian and vehicle feature points are extracted, this portion of the feature points is projected into the next image frame based on the IMU pre-integration result, and the points whose positions differ are identified as dynamic points, which achieves efficient dynamic point elimination and optimizes computing efficiency.
3. Aiming at the problem of reduced accuracy caused by changes in the motion state of the vehicle, the invention constructs detection factors based on the IMU acceleration and angular velocity, detects whether the motion state of the vehicle is stationary, turning or straight, and for these three states respectively uses zero-velocity correction, a robust kernel function and nonholonomic constraints to constrain and correct the filtering process and the nonlinear optimization.
Drawings
The foregoing and/or other advantages of the invention will become further apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.
Fig. 1 is a schematic flow chart of a multi-source fusion navigation positioning method based on motion state and environmental awareness according to an embodiment of the present application.
Fig. 2 is a schematic diagram of the Ghost module.
Fig. 3 is a schematic diagram of a network structure of an improved U-net convolutional neural network.
Fig. 4 is an image frame acquired by a camera.
Fig. 5 is a diagram illustrating a dynamic point culling effect of fig. 4 according to the prior art.
Fig. 6 is a schematic diagram illustrating a dynamic point elimination effect in a multi-source fusion navigation positioning method combining a vehicle motion state and environment perception provided by an embodiment of the present application.
Fig. 7 is a schematic diagram of comparing three-dimensional position errors in a northeast coordinate system of navigation and positioning by a multi-source fusion navigation and positioning method combining vehicle motion state and environmental perception provided by the prior art and the embodiment of the present application.
Fig. 8 is a schematic diagram illustrating a comparison of attitude errors in navigation and positioning by a multi-source fusion navigation and positioning method combining vehicle motion state and environmental awareness according to the prior art and the embodiments of the present application.
Detailed Description
Embodiments of the present invention will be described below with reference to the accompanying drawings.
The prior art has the following problems.
1. The currently most widely used combination is the GPS/INS integrated navigation system; when the GNSS fails for a long time, the IMU accumulates errors and drifts, and a high-grade INS is expensive to manufacture and has a long production cycle. Meanwhile, most existing vision-based integrated navigation methods are VO or vision-assisted GNSS positioning, which do not fully exploit the positioning capability of vision.
2. Existing GNSS/IMU/vision integrated navigation lacks adaptation to dynamic environments: dynamic points are not removed accurately and efficiently in the visual data preprocessing, and both processing all feature points and removing the pedestrian and vehicle feature points in all states bring a large amount of computation.
3. When the motion state of the vehicle changes frequently, the positioning accuracy of existing GNSS/IMU/vision integrated navigation decreases, and the performance and reliability are insufficient.
In order to meet the requirements of high-accuracy, continuous and reliable navigation and positioning of vehicles under all-weather and all-road conditions, and to improve the adaptability of vehicles to complex environments and changing motion states, an embodiment of the invention provides a multi-source fusion navigation positioning method combining vehicle motion state and environment perception. As shown in fig. 1, the method comprises the following steps. Step 1, image frames are collected with a camera, dynamic points in the image frames are eliminated, and the static feature points of each image frame and their positions are obtained.
Step 1.1, semantic segmentation of pedestrians and vehicles based on an improved convolutional neural network.
For the image frame set OPic collected by the camera, the feature points of each image frame are first extracted and tracked with the KLT sparse optical flow method to obtain the feature point positions, namely the pixel coordinates of the feature points; the image frame set OPic collected by the camera is then input into the convolutional neural network for semantic segmentation of "vehicles" and "pedestrians", and dynamic point elimination is finally performed with the IMU pre-integration result.
For details of extracting and tracking the feature points of each image frame with the KLT sparse optical flow method, refer to: Peri-epi, Marie. ORB feature matching optimization based on sparse optical flow methods [J]. Applied Optics, 2019, 40(04): 583-.
In the step 1.1, the convolutional neural network adopts an improved U-net convolutional neural network, as shown in fig. 3, a conventional convolutional layer is replaced by a Ghost module for feature extraction, and then the image is subjected to semantic segmentation. The improved U-net convolutional neural network is of an encoder-decoder structure, a Ghost module is used for replacing a traditional convolutional layer in an encoder part to extract features, an up-sampling and convolution combination method is adopted in a decoder part to achieve deconvolution, and the Ghost module is used for replacing the traditional convolutional layer.
Step 1.1.1, training the improved U-net convolutional neural network model by using a data set. The data set acquires road images under different scenes in a mode of vehicle-mounted camera acquisition, establishes a vehicle and pedestrian data set, or directly downloads the existing public road data set, such as a KITTI data set. And training the improved U-net convolutional neural network model to obtain the trained improved U-net convolutional neural network.
Step 1.1.2, the image frame set OPic collected by the camera is input into the encoder of the trained improved U-net convolutional neural network to obtain the high-level feature vectors of the image frame set OPic collected by the camera.
The Ghost module increases the output of feature maps and is divided into two parts, as shown in fig. 2. The first part is a conventional convolutional layer that outputs a small number of feature maps, as shown in equation (1).
Y' = OPic * f'   (1)
where Y' is the set of output feature maps, f' is the convolution kernel of the convolution operation, OPic is the set of image frames captured by the camera, and * denotes the conventional convolution. The number of channels of the output feature maps of this convolution operation is smaller than that of a conventional convolutional layer.
The second part is a lightweight linear transformation layer used to generate the redundant feature maps, as shown in equation (2).

y_{ij} = Φ_{i,j}(y'_i),  i = 1, …, m,  j = 1, …, s   (2)

where y'_i is the i-th of the m feature maps in Y', and each feature map y'_i of Y' is processed by the lightweight linear operations Φ_{i,j} to obtain s feature images; here Φ_{i,j} may be a 3×3 or 5×5 convolution operation. Therefore, the m feature maps collectively yield m×s feature images through this linear operation. The number of Ghost modules is set to 5, and after each Ghost module the picture is down-sampled by max pooling to reduce the feature dimension. The high-level feature vectors of the image output after the 5 Ghost modules are passed to the decoder.
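Purely for illustration, the following PyTorch sketch implements a Ghost-style module consistent with equations (1) and (2): a small primary convolution produces the intrinsic feature maps Y', and a cheap depthwise convolution plays the role of the lightweight linear operations Φ. The channel ratio, kernel sizes and the use of BN/ReLU are assumptions, not values fixed by the patent, and out_ch is assumed to be even.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Primary convolution + cheap linear operation, concatenated channel-wise."""
    def __init__(self, in_ch, out_ch, ratio=2, dw_kernel=3):
        super().__init__()
        init_ch = out_ch // ratio                      # m intrinsic feature maps
        cheap_ch = out_ch - init_ch                    # remaining "ghost" feature maps
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, init_ch, 1, bias=False),  # Y' = OPic * f'   (equation (1))
            nn.BatchNorm2d(init_ch), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            nn.Conv2d(init_ch, cheap_ch, dw_kernel, padding=dw_kernel // 2,
                      groups=init_ch, bias=False),     # lightweight linear operation Phi (equation (2))
            nn.BatchNorm2d(cheap_ch), nn.ReLU(inplace=True))

    def forward(self, x):
        y_prime = self.primary(x)                      # intrinsic maps
        ghosts = self.cheap(y_prime)                   # redundant maps generated cheaply
        return torch.cat([y_prime, ghosts], dim=1)
```

In the encoder, five such modules would be stacked, each followed by max pooling, mirroring the description above.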
Step 1.1.3, decoding. The extracted features are decoded with 4 decoding modules, each consisting of a deconvolution module followed by a Ghost module, so that the convolution modules of the decoding part mirror the encoding modules of the same level. The output of the previous level and the feature map output by the corresponding encoder are taken together as the input of the decoding module, the lower-level features output by the encoder are fused in, and all features are fully used to obtain the feature image for semantic category segmentation.
Step 1.1.4, the feature image for semantic category segmentation is input into a softmax layer, so that the probability that each pixel of each image frame in the image frame set OPic collected by the camera belongs to the semantic category "vehicle" or "pedestrian" is obtained, as shown in equation (3).

Q_d(x) = exp(s_d(x)) / Σ_{d'=1}^{D} exp(s_{d'}(x))   (3)

where s_d(x) is the score of pixel x for semantic category d; D is the number of semantic categories, here the two categories "pedestrian" and "vehicle", so D = 2; and Q_d(x) is the classification result of semantic category d for pixel x, which maximizes the most likely result while suppressing the probabilities of the other categories. According to repeated tests in which multiple groups of pedestrian and vehicle images were collected, multiple groups of semantic-category probabilities are obtained through the improved U-net network; for example, if the probability of the pedestrian category obtained for pedestrian feature points in the images and the probability of the vehicle category obtained for vehicle feature points are both greater than 0.8, the empirical probability threshold Q_thr is set to 0.8. For a given feature point, two probabilities are found: the probability Q_1(x) that it belongs to a pedestrian and the probability Q_2(x) that it belongs to a vehicle. The obtained probabilities are compared with the determined empirical probability threshold Q_thr: if Q_1(x) is greater than Q_thr the pixel is judged to belong to the "pedestrian" category, and if Q_2(x) is greater than Q_thr it is judged to belong to the "vehicle" category.
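A small numpy sketch of equation (3) together with the Q_thr = 0.8 thresholding rule; the 0/1/2 label encoding is an assumed convention for the sketch.

```python
import numpy as np

def classify_pixels(scores, q_thr=0.8):
    """scores: (H, W, 2) network scores s_d(x) for the classes (pedestrian, vehicle).
    Returns an (H, W) label map: 0 = unassigned, 1 = pedestrian, 2 = vehicle."""
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))   # numerically stable softmax
    q = e / e.sum(axis=-1, keepdims=True)                      # Q_d(x), equation (3)
    labels = np.zeros(scores.shape[:2], dtype=np.uint8)
    labels[q[..., 0] > q_thr] = 1                               # pedestrian
    labels[q[..., 1] > q_thr] = 2                               # vehicle
    return labels
```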
Step 1.2, the IMU measurement data are pre-integrated to obtain the IMU pre-integration result, namely the relative pose relationship between two adjacent image frames, including their relative position, velocity and attitude changes; the measurement data comprise the measurement values output by the accelerometer and the gyroscope.
The measurement data of the accelerometer and gyroscope are shown in equation (4) and equation (5).
$$\hat{a}_{t}=a_{t}+b_{a}+R_{w}^{b}g^{w}+n_{a}\qquad(4)$$

$$\hat{\omega}_{t}=\omega_{t}+b_{g}+n_{g}\qquad(5)$$

where $\hat{a}_{t}$ is the accelerometer measurement output by the IMU at time t; $a_{t}$ is the true value of the acceleration; $b_{a}$ is the accelerometer bias; $n_{a}$ is the accelerometer noise term; w denotes the world coordinate system and $g^{w}$ is the gravitational acceleration in the world coordinate system; $R_{w}^{b}$ is the rotation matrix from the world coordinate system to the carrier coordinate system, which can be obtained through the inertial navigation mechanization; $\hat{\omega}_{t}$ is the gyroscope measurement; $\omega_{t}$ is the true value of the angular velocity; $b_{g}$ is the gyroscope bias; and $n_{g}$ is the gyroscope noise term.
For image frames k and k+1, the above two equations are integrated to obtain the relative position, velocity and attitude changes between times $t_{k}$ and $t_{k+1}$. The pre-integration equations are shown in equations (6) to (11).

$$p_{b_{k+1}}^{w}=p_{b_{k}}^{w}+v_{b_{k}}^{w}\Delta t_{k}+\iint_{t\in[t_{k},t_{k+1}]}\left(R_{t}^{w}\left(\hat{a}_{t}-b_{a}-n_{a}\right)-g^{w}\right)dt^{2}\qquad(6)$$

$$v_{b_{k+1}}^{w}=v_{b_{k}}^{w}+\int_{t_{k}}^{t_{k+1}}\left(R_{t}^{w}\left(\hat{a}_{t}-b_{a}-n_{a}\right)-g^{w}\right)dt\qquad(7)$$

$$q_{b_{k+1}}^{w}=q_{b_{k}}^{w}\otimes\int_{t_{k}}^{t_{k+1}}\frac{1}{2}\Omega\left(\hat{\omega}_{t}-b_{g}-n_{g}\right)q_{t}^{b_{k}}\,dt\qquad(8)$$

$$\Omega(\omega)=\begin{bmatrix}-\lfloor\omega\rfloor_{\times}&\omega\\-\omega^{T}&0\end{bmatrix}\qquad(9)$$

$$\lfloor\omega\rfloor_{\times}=\begin{bmatrix}0&-\omega_{z}&\omega_{y}\\ \omega_{z}&0&-\omega_{x}\\-\omega_{y}&\omega_{x}&0\end{bmatrix}\qquad(10)$$

$$\omega_{t}=\begin{bmatrix}\omega_{x}&\omega_{y}&\omega_{z}\end{bmatrix}^{T}\qquad(11)$$

where $\Delta t_{k}$ is the time interval between $t_{k}$ and $t_{k+1}$; B denotes the carrier coordinate system; $R_{t}^{w}$ is the rotation matrix from the carrier coordinate system to the world coordinate system at time t; $p_{b_{k}}^{w}$, $v_{b_{k}}^{w}$, $q_{b_{k}}^{w}$ and $p_{b_{k+1}}^{w}$, $v_{b_{k+1}}^{w}$, $q_{b_{k+1}}^{w}$ are the position, velocity and attitude of image frames k and k+1 in the world coordinate system, respectively; $q_{t}^{b_{k}}$ represents the attitude variation at time t relative to frame k; $\Omega$ is the matrix operator used for the quaternion attitude update; and $\omega_{x}$, $\omega_{y}$, $\omega_{z}$ are the components of the true angular velocity $\omega_{t}$ at time t on the x, y and z axes of the carrier coordinate system, i.e. $\omega_{t}=[\omega_{x},\omega_{y},\omega_{z}]^{T}$.
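For illustration, a simplified discrete (Euler) version of the propagation in equations (6) to (8), applied between consecutive IMU samples. Quaternions are in [w, x, y, z] order, g^w = [0, 0, 9.81] assumes a world frame whose z-axis points up, and the actual pre-integration additionally re-expresses these quantities relative to frame b_k; none of this is mandated by the patent.

```python
import numpy as np

def quat_mult(q, r):
    """Hamilton product of two quaternions [w, x, y, z]."""
    w1, x1, y1, z1 = q; w2, x2, y2, z2 = r
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def quat_to_rot(q):
    """Rotation matrix R_t^w (carrier -> world) for quaternion q."""
    w, x, y, z = q
    return np.array([[1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
                     [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
                     [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)]])

def propagate(p, v, q, acc_meas, gyro_meas, b_a, b_g, dt,
              g_w=np.array([0.0, 0.0, 9.81])):
    """One integration step inside the interval [t_k, t_k+1]."""
    R_wb = quat_to_rot(q)
    acc_w = R_wb @ (acc_meas - b_a) - g_w          # R_t^w(a_hat - b_a) - g^w, as in (6)/(7)
    p = p + v * dt + 0.5 * acc_w * dt**2           # equation (6), discretized
    v = v + acc_w * dt                             # equation (7)
    w = gyro_meas - b_g
    dq = np.concatenate([[1.0], 0.5 * w * dt])     # small-angle quaternion increment, cf. (8)
    q = quat_mult(q, dq)
    return p, v, q / np.linalg.norm(q)
```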
Step 1.3, dynamic points are removed. Based on the IMU pre-integration result, i.e. the relative pose relationship between two consecutive frames including their relative position, velocity and attitude changes, the feature points whose semantic categories are "vehicle" or "pedestrian" are projected into the next image frame according to this pose change, giving their predicted positions in the next frame. The pixel points whose tracked actual positions in the next frame are inconsistent with the predicted positions are the dynamic points and are eliminated, while the pixel points whose actual positions are consistent with the predicted positions are taken as static points, thereby obtaining the static feature points of the next image frame and their positions.
Fig. 4 is an image frame acquired by the camera, fig. 5 shows the dynamic point removal effect of the prior art on fig. 4, and fig. 6 shows the dynamic point removal effect of the embodiment of the present application. As can be seen from fig. 6, the dynamic point removal of the embodiment of the present application completely segments the pixels of the moving "person" and completely removes those pixel points, whereas the prior art cannot remove all dynamic points, and some dynamic feature points remain on the moving "person" in fig. 5.
Step 2, the local camera pose is obtained based on the feature point positions of each image frame and the tightly coupled vision/IMU fusion positioning algorithm.
Step 2.1, VIO initialization.
Using the feature point matching results after dynamic point removal, the pose transformation of the image frames is recovered through the structure-from-motion technique, and the initial values of the camera extrinsic parameters, the feature point depths, and the position, velocity and attitude are recovered; for details refer to the reference by Yang Z. and Shen S. The initialization process is then complete and all metric values are passed to the next step.
And 2.2, solving the optimal camera pose in the sliding window, namely the local camera pose, based on the nonlinear optimization of the sliding window.
The state vector of the tightly coupled sliding window is shown in equations (12) to (14).

$$X=\left[\chi_{0},\chi_{1},\cdots,\chi_{kf},\;\chi_{C}^{B},\;\lambda_{0},\lambda_{1},\cdots,\lambda_{fp}\right]\qquad(12)$$

$$\chi_{h}=\left[p_{b_{h}}^{w},\;v_{b_{h}}^{w},\;q_{b_{h}}^{w}\right],\qquad h\in[0,kf]\qquad(13)$$

$$\chi_{C}^{B}=\left[p_{C}^{B},\;q_{C}^{B}\right]\qquad(14)$$

where X is the state vector; $\chi_{h}$ is the state of the IMU at the h-th moment in the sliding window, comprising the position $p_{b_{h}}^{w}$, velocity $v_{b_{h}}^{w}$ and attitude $q_{b_{h}}^{w}$ of the IMU in the world coordinate system at the h-th image frame, where $h\in[0,kf]$ and kf is the total number of key frames extracted from the image frames acquired by the camera in step 1.1; fp is the total number of features in the sliding window; $\lambda_{l}$ is the inverse depth of the l-th feature point in the camera coordinate system when it is first observed, i.e. the reciprocal of the depth obtained in the initialization process, where $l\in[0,fp]$; $\chi_{C}^{B}$ is the extrinsic parameter of the camera, C is the camera coordinate system, and $p_{C}^{B}$ and $q_{C}^{B}$ are the origin translation vector and the rotation attitude vector between the camera coordinate system and the carrier coordinate system, respectively.
The optimal camera pose is solved with the minimum reprojection error method, and the error function to be optimized of the nonlinear optimization problem is shown in equation (15).

$$\min_{X}\left\{\left\|\gamma_{marg}\right\|^{2}+\sum_{k\in I}\left\|\gamma_{I}\left(\hat{z}_{b_{k}b_{k+1}},X\right)\right\|_{P_{b_{k}b_{k+1}}}^{2}+\sum_{(l,\theta)\in L}\left\|\gamma_{L}\left(\hat{z}_{l}^{c_{\theta}},X\right)\right\|_{P_{l}^{c_{\theta}}}^{2}\right\}\qquad(15)$$

where $\gamma_{marg}$ is the prior information from marginalization, i.e. the oldest frame in the sliding window and its corresponding state quantities are marginalized as prior information; $\gamma_{I}$ and $\gamma_{L}$ are the IMU residual term and the visual residual term, respectively; $I$ is the set of all IMU measurements, and $P_{b_{k}b_{k+1}}$ is the covariance matrix of the IMU pre-integration noise term between the k-th and (k+1)-th image frames; $L$ is the set of feature points observed at least twice in the current sliding window, and $P_{l}^{c_{\theta}}$ is the covariance matrix of the visual observation noise of the l-th pixel coordinate in the θ-th image frame in the set L.
The IMU residual term is shown in equation (16).

$$\gamma_{I}\left(\hat{z}_{b_{k}b_{k+1}},X\right)=\left[\delta\alpha_{b_{k+1}}^{b_{k}},\;\delta\beta_{b_{k+1}}^{b_{k}},\;\delta\theta_{b_{k+1}}^{b_{k}},\;\Delta b_{a},\;\Delta b_{g}\right]^{T}\qquad(16)$$

where $\delta\alpha_{b_{k+1}}^{b_{k}}$ and $\delta\beta_{b_{k+1}}^{b_{k}}$ are the three-dimensional error-state representations of the position and the velocity between the k-th and (k+1)-th image frames, respectively; $\delta\theta_{b_{k+1}}^{b_{k}}$ is the three-dimensional error-state representation of the attitude quaternion from the k-th image frame to the (k+1)-th image frame; and $\Delta b_{a}$ and $\Delta b_{g}$ are the error states of the accelerometer and gyroscope biases, respectively.
Assuming that the l-th feature point is first observed in the υ-th image frame, the visual residual term in the θ-th image frame (θ > υ) is as shown in equations (17) to (19).

$$\gamma_{L}\left(\hat{z}_{l}^{c_{\theta}},X\right)=\begin{bmatrix}b_{1}&b_{2}\end{bmatrix}^{T}\cdot\left(\bar{\mathcal{P}}_{l}^{c_{\theta}}-\frac{\mathcal{P}_{l}^{c_{\theta}}}{\left\|\mathcal{P}_{l}^{c_{\theta}}\right\|}\right)\qquad(17)$$

$$\bar{\mathcal{P}}_{l}^{c_{\theta}}=\pi_{c}^{-1}\left(\begin{bmatrix}\hat{u}_{l}^{c_{\theta}}\\\hat{v}_{l}^{c_{\theta}}\end{bmatrix}\right)\qquad(18)$$

$$\mathcal{P}_{l}^{c_{\theta}}=R_{b}^{c}\left(R_{w}^{b_{\theta}}\left(R_{b_{\upsilon}}^{w}\left(R_{c}^{b}\,\frac{1}{\lambda_{l}}\,\pi_{c}^{-1}\left(\begin{bmatrix}\hat{u}_{l}^{c_{\upsilon}}\\\hat{v}_{l}^{c_{\upsilon}}\end{bmatrix}\right)+p_{c}^{b}\right)+p_{b_{\upsilon}}^{w}-p_{b_{\theta}}^{w}\right)-p_{c}^{b}\right)\qquad(19)$$

where $\left[\hat{u}_{l}^{c_{\upsilon}},\hat{v}_{l}^{c_{\upsilon}}\right]^{T}$ are the coordinates of the l-th feature pixel point when it first appears in the υ-th image frame; $\left[\hat{u}_{l}^{c_{\theta}},\hat{v}_{l}^{c_{\theta}}\right]^{T}$ are the pixel coordinates of the observation of the same feature point in the θ-th image frame; $\mathcal{P}_{l}^{c_{\theta}}$ are the coordinates before back-projection; $\pi_{c}^{-1}$ is the back-projection function, which projects the residual vector onto a unit sphere, divides by the inverse depth to obtain the coordinates of the pixel point in the camera coordinate system, and projects the coordinates onto a normalized tangent plane; $b_{1}$ and $b_{2}$ are two arbitrarily chosen orthogonal bases in the tangent plane; $R_{c}^{b}$ is the rotation matrix from the camera coordinate system to the carrier coordinate system, known from the camera extrinsic parameters; $R_{b_{\upsilon}}^{w}$ and $p_{b_{\upsilon}}^{w}$ are the rotation matrix and origin translation vector from the carrier coordinate system to the world coordinate system at the υ-th image frame; and $R_{w}^{b_{\theta}}$ and $p_{b_{\theta}}^{w}$ are the rotation matrix from the world coordinate system to the carrier coordinate system and the origin translation vector of the carrier coordinate system at the θ-th image frame, obtained from the camera attitude at that moment.
The nonlinear optimization problem is solved with the Ceres Solver nonlinear optimization library so that the error function (15) to be optimized reaches its minimum value, giving the optimal carrier pose estimate p_vio and q_vio, which is input to step 4 to be loosely coupled with the GNSS.
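The patent solves equation (15) with the Ceres Solver (a C++ library). Purely as a conceptual analogue, the following Python toy uses scipy's robust least squares on a miniature window with simplified 2-D "poses", stacked IMU-like and vision-like residuals, and a Huber loss; the state layout, measurements and scale values are all invented for the illustration.

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(x, imu_rel, vis_pos):
    poses = x.reshape(-1, 2)                       # window of 2-D "poses"
    r_imu = (poses[1:] - poses[:-1]) - imu_rel     # IMU residual term (relative motion)
    r_vis = poses - vis_pos                        # visual residual term (absolute position)
    return np.concatenate([r_imu.ravel(), r_vis.ravel()])

imu_rel = np.array([[1.0, 0.0]] * 4)               # hypothetical pre-integrated relative motions
vis_pos = np.cumsum(np.vstack([[0, 0], imu_rel]), axis=0) + 0.05 * np.random.randn(5, 2)
x0 = np.zeros(10)
sol = least_squares(residuals, x0, args=(imu_rel, vis_pos), loss="huber", f_scale=0.5)
window_poses = sol.x.reshape(-1, 2)                # local (VIO) poses passed on to step 4
```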
Step 3, the motion state of the vehicle is obtained based on the IMU measurement data, and the adaptive algorithm is switched according to the motion state of the vehicle.
The acceleration and angular velocity measured by the IMU are input into step 3 to perform the switching of the adaptive algorithm.
Step 3.1, based on the measured acceleration, an acceleration standard deviation factor $\sigma_{a}(t)$ over a sliding window of size N is designed, and whether the vehicle is in a stationary state is judged from this factor; if the vehicle is stationary, static zero-velocity correction is performed in the loosely coupled GNSS/IMU/vision fusion positioning algorithm.

The acceleration standard deviation over the sliding window of size N is shown in equations (20) and (21).

$$\sigma_{a}(t)=\sqrt{\frac{1}{N-1}\sum_{i=t-N+1}^{t}\left(a_{i}-\mu\right)^{2}}\qquad(20)$$

$$\mu=\frac{1}{N}\sum_{i=t-N+1}^{t}a_{i},\qquad a_{i}=\sqrt{a_{x,i}^{2}+a_{y,i}^{2}+a_{z,i}^{2}}\qquad(21)$$

where $a_{x,i}$, $a_{y,i}$, $a_{z,i}$ are the outputs of the three accelerometer axes x, y and z; N is the length of the sliding window, here set to 100 data points, i.e. N = 100; i denotes the time index of the acceleration data within the sliding window, $i\in[t-N+1,t]$; $\mu$ is the average of all accelerations within the sliding window, used as the mathematical expectation, an unbiased estimate being adopted here; and $\sigma_{a}(t)$ is the acceleration standard deviation calculated over the sliding window at time t.
During the movement of the vehicle, a vehicle stationarity threshold $TH_{a}(t)$ can be set to determine whether the vehicle is stationary:

$$\sigma_{a}(t)\le TH_{a}(t)\qquad(22)$$
The vehicle stationarity threshold $TH_{a}(t)$ is not set as a fixed value; instead, a window of size M is set, and the threshold adapts to the acceleration standard deviations at the different times within the window as the window slides, so that the zero-velocity state of the vehicle can be detected more flexibly.

$$TH_{a}(t)=\sum_{j=t-M+1}^{t}\eta_{j}\,\sigma_{a}(j)\qquad(23)$$

$$\sum_{j=t-M+1}^{t}\eta_{j}=1\qquad(24)$$

where j denotes the time index of the acceleration standard deviation within the sliding window M, $j\in[t-M+1,t]$, and $\eta_{j}$ are the weights of the respective standard deviations: the farther a standard deviation is from the current time, the smaller its weight, and the closer it is, the larger its weight.
It is important to select a suitable size M, so a window-size adaptation method is designed to balance the sensitivity and the accuracy of the sliding window. The upper limit of the sliding window size is defined as $M_{max}$ and the lower limit as $M_{min}$. First, $M_{min}$ is taken as the window length and the mean $\sigma$ of the current group of $\sigma_{a}(j)$ values is computed, as in equation (25). When $\sigma$ is less than or equal to the acceleration stationarity threshold $\varepsilon$, the acceleration values over this period are relatively stable and a larger M value is selected; when $\sigma$ is greater than the threshold, the acceleration values are changing in stages and a smaller window length is selected to improve sensitivity, as in equation (26).

$$\sigma=\frac{1}{M_{min}}\sum_{j=t-M_{min}+1}^{t}\sigma_{a}(j)\qquad(25)$$

$$M=M_{min}+M',\qquad M'=\begin{cases}M_{max}-M_{min}, & \sigma\le\varepsilon\\ 0, & \sigma>\varepsilon\end{cases}\qquad(26)$$

where $M'\in[0,M_{max}-M_{min}]$ and $\sigma$ is the mean of the group of $\sigma_{a}(j)$ values when $M_{min}$ is taken as the window length. The accelerations output by the IMUs of multiple groups of vehicles in different motion states are collected to compute multiple groups of $\sigma$, and the critical value at which the acceleration values cease to be stable is selected as the acceleration stationarity threshold $\varepsilon$. Test experiments show little difference in the results for window lengths between 20 and 50, so $M_{max}$ is taken as 50 and $M_{min}$ as 20; the acceleration stationarity threshold $\varepsilon$ is set to 0.05.
When the acceleration of the vehicle satisfies equation (22), the vehicle is judged to be in a zero-velocity stationary state, and zero-velocity correction is applied to it: in the Kalman filtering process of step 4.3, the velocity solved by inertial navigation is used as the observation of the system velocity error for Kalman filter estimation, and the updated state estimate is fed back to correct the navigation parameter results in the INS subsystem, completing the static zero-velocity correction.
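A minimal sketch of this zero-velocity correction as a Kalman measurement update, assuming an error-state vector in which the velocity error occupies indices 3 to 5 and an illustrative measurement noise; the actual state arrangement and tuning of step 4.3 are not fixed here.

```python
import numpy as np

def zero_velocity_update(x_err, P, v_ins, R_zupt=1e-3 * np.eye(3)):
    """x_err : error state (length >= 6), velocity error assumed at indices 3:6.
    v_ins : velocity currently solved by inertial navigation; when the vehicle is
            detected as stationary, v_ins itself is the observation of the velocity error."""
    n = len(x_err)
    H = np.zeros((3, n)); H[:, 3:6] = np.eye(3)
    z = v_ins                                    # observed velocity error (true velocity is zero)
    S = H @ P @ H.T + R_zupt
    K = P @ H.T @ np.linalg.inv(S)
    x_err = x_err + K @ (z - H @ x_err)
    P = (np.eye(n) - K @ H) @ P
    return x_err, P                              # fed back to correct the INS navigation solution
```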
Step 3.2, if the vehicle is detected to be in a non-stationary state, the linear motion detection part is entered: based on the measured angular velocity, a turn detection factor based on the mean angular velocity is designed, and the mean angular velocity |ω̄| within the sliding window N is used as the basis for judging whether the vehicle is turning.
$$|\bar{\omega}|=\frac{1}{N}\sum_{i=t-N+1}^{t}\left|\omega_{z,i}\right|>TH_{\omega}\qquad(27)$$

where $|\bar{\omega}|$ is the mean angular velocity and N is the amount of data in the sliding window; $\omega_{z}$ is the output of the gyroscope in the z-axis direction; and $TH_{\omega}$ is the set threshold on the mean angular rate, here set to 2°/s. When the z-axis angular rate exceeds $TH_{\omega}$, i.e. when equation (27) is satisfied, the vehicle is judged to be in a turning state; otherwise it is judged to be travelling straight.
When a vehicle turns, vision produces larger errors because the images are blurred and feature point matching is error-prone, so a robust kernel function is added in the tightly coupled vision/IMU process to improve the tolerance of visual navigation: a Huber kernel is applied to the visual residual term in equation (15), and when the residual term is large the original quadratic function becomes a linear function, reducing the influence of erroneous data.
[Equations (28)–(29): the Huber kernel ρ(·) applied to the visual residual term of equation (15); the cost is quadratic for residuals not exceeding the threshold δ and grows linearly beyond it.]
where δ is the threshold for judging whether the visual residual term is too large. Data collected while the vehicle drives straight with good image quality and data collected during turns with image blur and mismatches are compared, and the resulting visual residual values are used to determine the threshold δ that separates good visual data from poor. For example, if the residual values obtained from good-quality data are all below 0.5 and the residual values under image blur are all above 0.5, δ is set to 0.5.
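For illustration only, the sketch below applies one common parameterization of the Huber kernel with the threshold δ = 0.5 from the text; the constant factors and the exact form used in equations (28)–(29) are assumptions.

DELTA = 0.5   # visual-residual threshold delta from the text

def huber(residual, delta=DELTA):
    """Quadratic cost for small residuals, linear growth for large ones (assumed form)."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r                 # ordinary least-squares region
    return delta * (r - 0.5 * delta)       # linear region limits the influence of outliers

# A blurred-image mismatch with residual 3.0 is down-weighted relative to plain least squares:
print(huber(0.2), 0.5 * 0.2 ** 2)   # identical below the threshold
print(huber(3.0), 0.5 * 3.0 ** 2)   # 1.375 vs 4.5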
Step 3.3, if the algorithm judges that the vehicle is in a straight-ahead state, the vehicle does not slip laterally or move perpendicular to the road surface while driving, so a non-holonomic constraint is added in the Kalman filtering process of step 4.3 to correct the accumulated error of inertial navigation, i.e. the lateral and vertical velocities are set to zero.
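The non-holonomic constraint of step 3.3 can be illustrated as a pseudo-measurement in which the lateral and vertical body-frame velocities are observed as zero; the frame conventions, matrix shapes and function name below are assumptions.

import numpy as np

def nhc_measurement(v_nav, R_nav_to_body):
    """Build the zero-lateral / zero-vertical velocity pseudo-measurement.

    v_nav         : (3,) velocity from inertial navigation in the navigation frame
    R_nav_to_body : (3,3) rotation from the navigation frame to the carrier (body) frame
    Returns the residual the filter should drive to zero, and the selection matrix H.
    """
    v_body = R_nav_to_body @ v_nav
    # When driving straight, the y (lateral) and z (vertical) body-frame velocities are taken as zero.
    H = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
    z = np.zeros(2)                    # pseudo-observation: no sideslip, no vertical motion
    residual = z - H @ v_body
    return residual, H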
Step 4, obtaining the final positioning result based on the local camera pose, the switching result of the adaptive algorithm, Kalman filtering, and the GNSS, IMU and vision loose-combination fusion positioning algorithm.
Step 4.1, GNSS position information solution
Satellites with elevation angles below the elevation threshold are removed. The elevation threshold is empirically set to α_thre = 15°. The elevation angles α_sat (sat = 1, 2, …, num_vis), where num_vis is the number of visible satellites in the current epoch, are traversed; if α_sat ≥ α_thre the satellite is retained, otherwise the observation data of satellite sat are deleted. The screened observations are then solved by a pseudorange single-point positioning algorithm to obtain the GNSS position information.
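A sketch of the elevation-mask screening in step 4.1; the per-satellite record layout (a dictionary with an elev_deg field) is an assumption, and the pseudorange single-point solution that follows is not shown.

ALPHA_THRE = 15.0   # satellite elevation threshold in degrees, as in the text

def screen_satellites(observations):
    """Keep only satellites whose elevation angle is at or above the threshold.

    observations: iterable of dicts such as {"prn": 12, "elev_deg": 8.4, "pseudorange": ...}
    """
    return [obs for obs in observations if obs["elev_deg"] >= ALPHA_THRE]

# Example: two of three visible satellites survive the 15-degree mask.
sats = [{"prn": 1, "elev_deg": 42.0}, {"prn": 7, "elev_deg": 9.5}, {"prn": 30, "elev_deg": 16.2}]
print([s["prn"] for s in screen_satellites(sats)])   # [1, 30]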
Step 4.2, VIO/GNSS initialization: the local camera pose obtained in step 2 and the GNSS position information solved in step 4.1 are converted into the corresponding information in the Earth-centred Earth-fixed (ECEF) coordinate system.
The pose estimate of the VIO is expressed in the local coordinate system, so it must be converted to the Earth-centred Earth-fixed (ECEF) coordinate system. The GNSS positions from the first minute at the start point in step 4.1 are recorded, giving the position p_0^e, in the ECEF frame, of the common origin of the east-north-up (ENU) coordinate system and the carrier coordinate system (the two origins coincide).
The heading-angle offset ψ between the carrier coordinate system and the ENU coordinate system is corrected using low-noise Doppler measurements. The rotation matrix from the ENU frame to the ECEF frame is given in equation (30).
\[
R_n^e=\begin{bmatrix}
-\sin\beta & -\sin\alpha\cos\beta & \cos\alpha\cos\beta\\
\cos\beta & -\sin\alpha\sin\beta & \cos\alpha\sin\beta\\
0 & \cos\alpha & \sin\alpha
\end{bmatrix}\tag{30}
\]
where n denotes the ENU coordinate system, e denotes the ECEF coordinate system, and α and β are respectively the latitude and longitude of the origin of the ENU coordinate system. The rotation matrix R_b^n from the carrier coordinate system to the ENU coordinate system is given in equation (31):

\[
R_b^n=\begin{bmatrix}
\cos\psi & -\sin\psi & 0\\
\sin\psi & \cos\psi & 0\\
0 & 0 & 1
\end{bmatrix}\tag{31}
\]
Therefore, the transformation from the carrier coordinate system to ECEF is given in equation (32):

\[
p_{GNSS}^{e}=p_{0}^{e}+R_n^e\,R_b^n\,\Delta p^{b}\tag{32}
\]
where p_GNSS^e denotes the coordinates of the receiver (GNSS) in the ECEF frame, and Δp^b is the difference between the receiver position and the origin of the carrier coordinate system, expressed in the carrier frame. In this way the GNSS position information p_GNSS^e and the navigation parameters p_vio and q_vio output by the VIO are unified in the ECEF coordinate system.
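The frame alignment of step 4.2 can be sketched with the standard ENU-to-ECEF rotation built from the origin latitude α and longitude β and a heading rotation by ψ, as in equations (30)–(32); the function names and the lever-arm argument are illustrative.

import numpy as np

def R_enu_to_ecef(lat_rad, lon_rad):
    """Rotation matrix from the local east-north-up frame to ECEF (equation (30))."""
    sa, ca = np.sin(lat_rad), np.cos(lat_rad)
    sb, cb = np.sin(lon_rad), np.cos(lon_rad)
    return np.array([[-sb, -sa * cb, ca * cb],
                     [ cb, -sa * sb, ca * sb],
                     [0.0,       ca,      sa]])

def R_body_to_enu(psi_rad):
    """Heading rotation by the offset psi between the carrier and ENU frames (equation (31))."""
    c, s = np.cos(psi_rad), np.sin(psi_rad)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def body_to_ecef(p_origin_ecef, lat_rad, lon_rad, psi_rad, dp_body):
    """Express a point given in the carrier frame in ECEF coordinates (equation (32))."""
    return p_origin_ecef + R_enu_to_ecef(lat_rad, lon_rad) @ R_body_to_enu(psi_rad) @ dp_body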
Step 4.3, GNSS/IMU/visual loose integrated navigation based on Kalman filtering
The local camera pose and the GNSS position information expressed in the ECEF coordinate system obtained in step 4.2, together with the switching result of the adaptive algorithm from step 3, are input to a Kalman filter, which outputs the corresponding error-correction information. This error-correction information is fed into the sliding-window tightly coupled nonlinear optimization of the IMU/vision tight-coupling process in step 2.2 to apply feedback correction to the navigation parameters, and the corrected pose information is output as the final positioning result.
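As a rough, non-authoritative illustration of the loosely coupled fusion in step 4.3, the sketch below performs a single Kalman measurement update in which the GNSS position in ECEF observes the position-error block of an error state; the state layout, noise handling and feedback interface are assumptions.

import numpy as np

def loose_kf_update(x, P, p_vio_ecef, p_gnss_ecef, R_gnss):
    """One measurement update of a loosely coupled error-state filter.

    x : (n,) error state whose first three entries are the position error (assumed layout)
    P : (n,n) error covariance
    p_vio_ecef, p_gnss_ecef : (3,) positions from the VIO and the GNSS solution in ECEF
    R_gnss : (3,3) GNSS position measurement noise covariance
    """
    n = x.size
    H = np.zeros((3, n))
    H[:, :3] = np.eye(3)                     # GNSS observes the position-error block
    z = p_gnss_ecef - p_vio_ecef             # innovation: GNSS minus VIO position
    S = H @ P @ H.T + R_gnss
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(n) - K @ H) @ P
    return x, P                              # x[:3] would be fed back to correct the VIO pose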
Fig. 7, Fig. 8, Table 1 and Table 2 compare, for the multi-source fusion navigation and positioning method combining vehicle motion state and environment perception provided by the embodiments of the present invention and for the prior art, the three-dimensional position errors and attitude errors in the ENU coordinate system and the corresponding root-mean-square errors.
Table 1: Three-dimensional position root-mean-square errors of the present application and the prior art
Table 2: Attitude root-mean-square errors of the present application and the prior art
The invention provides a multi-source fusion navigation and positioning method based on motion state and environment perception. There are many ways to implement this technical scheme, and the above description is only one specific embodiment of the invention. It should be noted that a person skilled in the art can make various improvements and refinements without departing from the principle of the invention, and such improvements and refinements also fall within the scope of protection of the invention. Any component not specified in this embodiment can be realized with the prior art.

Claims (7)

1. A multi-source fusion navigation positioning method based on motion state and environment perception is characterized by comprising the following steps:
step 1, collecting image frames by using a camera, eliminating dynamic points in the image frames, and obtaining static characteristic points and positions of the static characteristic points of each image frame;
step 2, obtaining the local camera pose based on the feature point positions of each frame of image and a vision and IMU tight combination fusion positioning algorithm;
step 3, obtaining the motion state of the vehicle based on the IMU measurement data; switching the self-adaptive algorithm according to the motion state of the vehicle;
step 4, obtaining a final positioning result based on the local camera pose, the switching result of the self-adaptive algorithm, Kalman filtering and a GNSS, IMU and vision loose combination fusion positioning algorithm;
the step 1 comprises the following steps:
step 1.1, semantically segmenting pedestrians and vehicles of each frame of image in an image frame, for an image frame set OPic collected by a camera, respectively extracting and tracking feature points of each frame of image by using a KLT sparse optical flow method to obtain feature point positions, and semantically segmenting the vehicles and the pedestrians on the basis of a convolutional neural network to obtain feature points of each frame of image belonging to semantic categories of the vehicles or the pedestrians;
step 1.2, performing pre-integration on IMU measurement data to obtain an IMU pre-integration result, namely a relative pose relationship of two adjacent frames of images, including relative position, speed and attitude change relationships of the two adjacent frames of images, wherein the measurement data comprises measurement values output by an accelerometer and a gyroscope;
step 1.3, removing dynamic points: based on the IMU pre-integration result, projecting the feature points of each frame of image belonging to the vehicle or pedestrian semantic categories into the next frame of image as the predicted positions of those feature points in the next frame; pixel points whose tracked actual positions in the next frame are inconsistent with their predicted positions are dynamic points and are removed, while pixel points whose actual positions are consistent with their predicted positions are static points, thereby obtaining the static feature points of the next frame of image and their positions;
in the step 1.1, the convolutional neural network adopts an improved U-net convolutional neural network whose encoder and decoder include Ghost modules, and performing semantic segmentation of vehicles and pedestrians based on the convolutional neural network comprises:
step 1.1.1, training an improved U-net convolutional neural network model by using a data set; the data set acquires road images under different scenes in advance in a mode of vehicle-mounted camera acquisition, establishes a vehicle and pedestrian data set, or directly downloads the existing public road data set; training the improved U-net convolutional neural network model to obtain a trained improved U-net convolutional neural network;
step 1.1.2, inputting an image frame set OPic collected by a camera into a trained encoder of an improved U-net convolutional neural network to obtain a high-level feature vector of the image frame set OPic collected by the camera;
step 1.1.3, inputting high-level feature vectors of an image frame set OPic collected by the camera into a decoder of a trained improved U-net convolution neural network to obtain feature images for semantic category segmentation;
step 1.1.4, inputting a characteristic image for semantic category segmentation into a softmax layer to obtain a characteristic point of each frame of image belonging to a vehicle or pedestrian semantic category in an image frame set OPic acquired by a camera;
in step 3, the switching of the self-adaptive algorithm refers to constraining and correcting the GNSS, IMU and vision loose combination fusion positioning algorithm and the vision and IMU tight combination fusion positioning algorithm respectively by using zero-velocity correction, a robust kernel function and a non-holonomic constraint according to the motion state of the vehicle;
the step 3 comprises the following steps:
step 3.1, designing a sliding window with the size of N to solve an acceleration standard deviation factor based on a measured value output by an accelerometer, judging whether the vehicle state is in a static state or not according to the acceleration standard deviation factor, and if the vehicle state is in the static state, performing static zero-speed correction in a GNSS, IMU and visual loose combined fusion positioning algorithm;
step 3.2, if the vehicle is detected to be in a non-static state, the vehicle is in a turning state or a straight-going state; if the vehicle is in a turning state, adding a robust kernel function into a visual and IMU (inertial measurement unit) close-combination fusion positioning algorithm;
and step 3.3, if the vehicle is judged to be in a straight-going state, adding a non-holonomic constraint in the GNSS, IMU and vision loose combination fusion positioning algorithm to correct the accumulated error of inertial navigation.
2. The multi-source fusion navigation positioning method based on motion state and environment perception according to claim 1, wherein the step 2 comprises:
step 2.1, performing VIO initialization based on the feature point position of each frame of image and the IMU pre-integration result to obtain initial values of external parameters, the depth of the feature point, the tightly coupled position, the speed and the posture between the camera and the IMU;
and 2.2, performing the vision and IMU tight combination nonlinear optimization of the sliding window according to initial values of external parameters, depths of characteristic points, tight coupling positions, speeds and postures between the camera and the IMU based on a nonlinear optimization mode, and solving the optimal camera pose, namely the local camera pose, in the sliding window.
3. The multi-source fusion navigation positioning method based on motion state and environmental perception according to claim 2, characterized in that in step 2.1, based on the feature point position of each frame of image, the pose transformation of the image frame is restored through a motion restoration structure technology, and then the pose transformation of the image frame is aligned with the IMU pre-integration result, so as to restore the external reference between the camera and the IMU and obtain the transformation relation between the IMU coordinate system and the world coordinate system;
the step 2.2 comprises: acquiring a state vector X of a sliding window with a combination of vision and IMU, solving a nonlinear optimization problem with the combination of vision and IMU by using a minimum reprojection error algorithm, and solving the nonlinear optimization problem by using a nonlinear optimization library Ceres solution library so as to solve the optimal camera pose in the sliding window;
the error function to be optimized in the vision and IMU tightly coupled nonlinear optimization problem is:

\[
\min_{X}\left\{\gamma_{marg}+\sum_{k\in I}\left\|\gamma_{I}^{k}\right\|_{P_{k}}^{2}+\sum_{(i,\theta)\in L}\left\|\gamma_{L}^{i,\theta}\right\|_{P_{i,\theta}}^{2}\right\}
\]

in the formula, γ_marg is the prior information from marginalization, i.e. the oldest frame in the sliding window and its corresponding state quantities are marginalized into prior information; γ_I and γ_L are respectively the IMU residual term and the visual residual term; I is the set of all IMU measurements, and P_k is the covariance matrix of the IMU pre-integration noise term between the k-th frame image and the (k+1)-th frame image; L is the set of feature points observed at least twice in the current sliding window, and P_{i,θ} is the covariance matrix of the visual observation noise of the i-th pixel coordinate in the θ-th frame image in the set L.
4. The multi-source fusion navigation positioning method based on motion state and environment perception according to claim 3, wherein the step 4 includes:
step 4.1, resolving GNSS position information;
step 4.2, initializing the VIO and the GNSS, and converting the local camera pose obtained in the step 2 and the GNSS position information solved in the step 4.1 into corresponding information under the Earth-centred Earth-fixed (ECEF) coordinate system;
and step 4.3, obtaining a global estimated pose, namely the final positioning result, based on Kalman filtering and the GNSS, IMU and vision loose combination fusion positioning algorithm.
5. The multi-source fusion navigation positioning method based on motion state and environment perception according to claim 4, wherein the step 3.1 comprises:
the acceleration standard deviation σ_a over a sliding window of size N is expressed as the following equations:

\[
a_{i}=\sqrt{a_{x,i}^{2}+a_{y,i}^{2}+a_{z,i}^{2}}
\]

\[
\sigma_{a}=\sqrt{\frac{1}{N-1}\sum_{i=t-N+1}^{t}\left(a_{i}-\mu\right)^{2}}
\]

wherein a_{x,t}, a_{y,t} and a_{z,t} are respectively the three-axis outputs of the accelerometer along x, y and z at time t; i denotes the time index of the acceleration data within the sliding window, i ∈ [t−N+1, t]; μ is the mean of all accelerations within the sliding window, taken as the mathematical expectation, where an unbiased estimate is used;
a vehicle stationarity threshold TH_a is set for the vehicle during its motion, and the vehicle is judged to be stationary when its acceleration standard deviation satisfies σ_a ≤ TH_a; when this condition holds, the vehicle is determined to be in a zero-velocity stationary state and a zero-velocity update is applied to the vehicle, namely, in the Kalman filtering process of step 4.3, the velocity solved by inertial navigation is used as the observation of the system velocity error for Kalman filter estimation, and the updated state estimate is fed back to correct the navigation parameters inside the INS subsystem, completing the static zero-velocity correction;

wherein the vehicle stationarity threshold TH_a is set as a dynamic threshold: a sliding window of size M is set, and the threshold is formed from the acceleration standard deviations σ_a,j at different times within the sliding window, changing adaptively as the sliding window M slides so that the zero-velocity state of the vehicle can be detected more flexibly:

\[
TH_{a}=\sum_{j=t-M+1}^{t}\eta_{j}\,\sigma_{a,j}
\]

wherein j denotes the time index of the acceleration standard deviation σ_a,j within the sliding window M, j ∈ [t−M+1, t]; η_j is the weight of each acceleration standard deviation: the farther an acceleration standard deviation lies from the current time t, the smaller its weight, and the closer it lies, the larger its weight;
the upper limit of the size of the sliding window M is defined as M_max and the lower limit as M_min; first, M_min is taken as the length of the sliding window, and the current set of acceleration standard deviations {σ_a,j}, j ∈ [t−M_min+1, t], is formed; when the mean value σ of this set is less than or equal to the acceleration stationarity threshold ε, the acceleration over this period is relatively stable and the sliding-window length M is increased; when σ is greater than the acceleration stationarity threshold ε, the acceleration is changing, and the sliding-window length M is reduced, by an amount M′, to improve sensitivity; wherein M′ ∈ [0, M_max − M_min] and σ is the mean of the set {σ_a,j} computed with M_min as the window length; accelerations output by the IMUs of several groups of vehicles in different motion states are collected to compute multiple values of σ, and the critical value separating stable from unstable acceleration is chosen as the acceleration stationarity threshold ε.
6. The multi-source fusion navigation positioning method based on motion state and environment perception according to claim 5, wherein the step 3.2 includes:
designing a turning detection factor based on the angular-rate mean from the measured angular rate, and using the mean angular rate |ω̄| over the sliding window of N samples as the criterion for vehicle turning: when

\[
|\bar{\omega}|=\frac{1}{N}\sum_{i=t-N+1}^{t}\left|\omega_{z,i}\right|\ge TH_{\omega}
\]

is satisfied, the vehicle is judged to be in a turning state; otherwise the vehicle is judged to be in a straight-going state;

in the formula, ω_z is the output of the gyroscope in the z-axis direction, and TH_ω is the preset angular-rate mean threshold;
if the vehicle is judged to be in a turning state, a robust kernel function is added to the vision and IMU tight combination fusion positioning algorithm: a Huber kernel ρ(·) is applied to the visual-measurement residual term in the error function to be optimized of the vision and IMU tightly coupled nonlinear optimization problem, so that the cost is quadratic when the visual residual does not exceed the threshold δ and grows linearly when it does;

wherein δ is the threshold for judging whether the visual residual term is too large; data collected while the vehicle drives straight with good image quality and data with image blur and mismatches collected during turns are respectively acquired, and the resulting visual residual terms are compared, thereby determining the visual-residual threshold δ that distinguishes good visual data from poor;
in the step 3.3, if the vehicle is judged to be in a straight-ahead state, the vehicle does not slip laterally or move perpendicular to the road surface while driving, and a non-holonomic constraint is added in the Kalman filtering process of the step 4.3 to correct the accumulated error of inertial navigation, namely, the lateral and vertical velocities are set to zero.
7. The multi-source fusion navigation positioning method based on motion state and environment perception according to claim 6, wherein the step 4.1 includes: removing satellites with satellite elevation angles lower than a satellite elevation angle threshold value to obtain satellite observation data after screening; resolving to obtain GNSS position information through a pseudo-range single-point positioning algorithm according to the screened satellite observation data;
step 4.3 comprises: inputting the corresponding information of the local camera pose and the GNSS position information obtained in the step 4.2 in the Earth-centred Earth-fixed coordinate system, together with the switching result of the adaptive algorithm in the step 3, into a Kalman filter to obtain error-correction information; inputting the error-correction information into the sliding-window-based tight-combination nonlinear optimization of the step 2.2 to apply feedback correction to the navigation parameters; and finally outputting the corrected pose information as the global estimated pose, namely the final positioning result.
CN202210156085.7A 2022-02-21 2022-02-21 Multi-source fusion navigation positioning method based on motion state and environment perception Active CN114199259B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210156085.7A CN114199259B (en) 2022-02-21 2022-02-21 Multi-source fusion navigation positioning method based on motion state and environment perception

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210156085.7A CN114199259B (en) 2022-02-21 2022-02-21 Multi-source fusion navigation positioning method based on motion state and environment perception

Publications (2)

Publication Number Publication Date
CN114199259A CN114199259A (en) 2022-03-18
CN114199259B true CN114199259B (en) 2022-06-17

Family

ID=80645739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210156085.7A Active CN114199259B (en) 2022-02-21 2022-02-21 Multi-source fusion navigation positioning method based on motion state and environment perception

Country Status (1)

Country Link
CN (1) CN114199259B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114777797B (en) * 2022-06-13 2022-09-30 长沙金维信息技术有限公司 High-precision map visual positioning method for automatic driving and automatic driving method
CN115128655B (en) * 2022-08-31 2022-12-02 智道网联科技(北京)有限公司 Positioning method and device for automatic driving vehicle, electronic equipment and storage medium
CN115847446B (en) * 2023-01-16 2023-05-05 泉州通维科技有限责任公司 Inspection robot in bridge compartment beam
CN116147618B (en) * 2023-01-17 2023-10-13 中国科学院国家空间科学中心 Real-time state sensing method and system suitable for dynamic environment
CN116026316B (en) * 2023-03-30 2023-08-29 山东科技大学 Unmanned ship dead reckoning method coupling visual inertial odometer and GNSS
CN116442248A (en) * 2023-06-19 2023-07-18 山东工程职业技术大学 Robot vision positioning module and method based on target detection
CN117310773B (en) * 2023-11-30 2024-02-02 山东省科学院海洋仪器仪表研究所 Autonomous positioning method and system for underwater robot based on binocular stereoscopic vision
CN117473455B (en) * 2023-12-27 2024-03-29 合众新能源汽车股份有限公司 Fusion method and device of multi-source positioning data and electronic equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109405824A (en) * 2018-09-05 2019-03-01 武汉契友科技股份有限公司 A kind of multi-source perceptual positioning system suitable for intelligent network connection automobile
CN109900265A (en) * 2019-03-15 2019-06-18 武汉大学 A kind of robot localization algorithm of camera/mems auxiliary Beidou
CN109993113B (en) * 2019-03-29 2023-05-02 东北大学 Pose estimation method based on RGB-D and IMU information fusion
CN110412635B (en) * 2019-07-22 2023-11-24 武汉大学 GNSS/SINS/visual tight combination method under environment beacon support

Also Published As

Publication number Publication date
CN114199259A (en) 2022-03-18

Similar Documents

Publication Publication Date Title
CN114199259B (en) Multi-source fusion navigation positioning method based on motion state and environment perception
Wen et al. Tightly coupled GNSS/INS integration via factor graph and aided by fish-eye camera
CN110243358B (en) Multi-source fusion unmanned vehicle indoor and outdoor positioning method and system
CN110412635B (en) GNSS/SINS/visual tight combination method under environment beacon support
CN110411462B (en) GNSS/inertial navigation/lane line constraint/milemeter multi-source fusion method
CN111024066A (en) Unmanned aerial vehicle vision-inertia fusion indoor positioning method
CN111880207B (en) Visual inertial satellite tight coupling positioning method based on wavelet neural network
CN107796391A (en) A kind of strapdown inertial navigation system/visual odometry Combinated navigation method
CN111426320B (en) Vehicle autonomous navigation method based on image matching/inertial navigation/milemeter
CN112556719B (en) Visual inertial odometer implementation method based on CNN-EKF
CN114693754B (en) Unmanned aerial vehicle autonomous positioning method and system based on monocular vision inertial navigation fusion
CN114526745A (en) Drawing establishing method and system for tightly-coupled laser radar and inertial odometer
CN114019552A (en) Bayesian multi-sensor error constraint-based location reliability optimization method
CN115479602A (en) Visual inertial odometer method fusing event and distance
CN115407357A (en) Low-beam laser radar-IMU-RTK positioning mapping algorithm based on large scene
CN112781582A (en) Multi-sensor fusion high-precision pose estimation algorithm under satellite weak observation condition
CN115435779A (en) Intelligent body pose estimation method based on GNSS/IMU/optical flow information fusion
CN114690229A (en) GPS-fused mobile robot visual inertial navigation method
CN109387198A (en) A kind of inertia based on sequential detection/visual odometry Combinated navigation method
CN112945233B (en) Global drift-free autonomous robot simultaneous positioning and map construction method
CN114234967A (en) Hexapod robot positioning method based on multi-sensor fusion
Gao et al. Vido: A robust and consistent monocular visual-inertial-depth odometry
CN112923934A (en) Laser SLAM technology suitable for combining inertial navigation in unstructured scene
CN116182855A (en) Combined navigation method of compound eye-simulated polarized vision unmanned aerial vehicle under weak light and strong environment
CN112837374B (en) Space positioning method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant