CN115344033A - Monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method - Google Patents

Monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method

Info

Publication number
CN115344033A
CN115344033A (application CN202210230594.XA)
Authority
CN
China
Prior art keywords
imu
dvl
monocular camera
information
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210230594.XA
Other languages
Chinese (zh)
Inventor
张福斌
刘昱君
潘光
宋保维
高剑
彭星光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202210230594.XA
Publication of CN115344033A
Legal status: Pending

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05D SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02 Control of position or course in two dimensions
    • G05D1/0206 Control of position or course in two dimensions specially adapted to water vehicles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/269 Analysis of motion using gradient-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Navigation (AREA)

Abstract

The invention relates to an unmanned ship navigation and positioning method based on tight coupling of a monocular camera, an IMU and a DVL. The three sensors are fused organically in a tightly coupled manner: IMU/DVL pre-integration and joint initialization effectively ensure the consistency of information among the sensors and recover the scale of the monocular camera, and sliding-window-based nonlinear optimization yields more accurate pose estimation. Navigation errors are thereby effectively reduced, smooth autonomous near-shore operation of the unmanned ship is supported, and navigation and positioning accuracy is improved.

Description

Monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method
Technical Field
The invention relates to the technical field of navigation and positioning for unmanned ships operating near the shore, and in particular to an unmanned ship navigation and positioning method based on monocular camera/IMU/DVL tight coupling.
Background
An unmanned ship is an intelligent water-surface platform integrating technologies such as automatic control, navigation and positioning, and multi-sensor fusion; it can complete operations in a target water area under remote control or autonomous navigation. Autonomous operation on the water surface usually demands high positioning accuracy, so a complete high-precision navigation and positioning system is indispensable. Such a system provides accurate navigation information, guarantees positioning accuracy and enables precise control, thereby increasing the reliability of water-surface operations. Improving the accuracy of navigation and positioning systems has therefore become a research focus, and current work at home and abroad centers mainly on visual navigation and inertial navigation, both of which are core technologies for autonomous navigation.
Visual navigation uses a camera as the core sensor to acquire environmental information and generally comprises five modules: camera data reading, visual odometry, back-end optimization, loop closure detection, and mapping. According to the camera type, it is divided into monocular, binocular (stereo) and depth (RGB-D) visual navigation. Observing the surroundings with a camera offers drift-free measurements, good stability, high resolution, low cost and easy installation, but each camera type brings its own problems: a monocular camera provides no depth information, the synchronization of a binocular camera is difficult to guarantee and its processing requires heavy computation, and a depth camera is easily affected by illumination and has a small field of view. These problems reduce the positioning accuracy of a visual navigation system when the camera moves fast or the scene has weak texture.
An inertial navigation system is an autonomous navigation system whose core sensor is an Inertial Measurement Unit (IMU); it depends on no external information and is immune to external interference. The IMU consists of a three-axis accelerometer and a three-axis gyroscope, and the pose of the carrier is obtained by measuring linear acceleration and angular velocity and integrating them. The pose estimate, however, diverges over time, introducing cumulative errors into the system: an inertial navigation system delivers high-precision estimates over short periods but cannot be used alone for long-duration navigation. IMUs span a wide range of precision: high-precision, navigation-grade devices are generally expensive and mostly military, while low-precision inertial devices, thanks to their low cost, small size and low power consumption, are common in consumer products and also appear in the electronics, aerospace and aviation industries. Accurate long-duration navigation and positioning is therefore difficult with an inertial navigation system alone and must be completed by fusing other sensors.
A Doppler Velocity Log (DVL) is an acoustic navigation device for measuring carrier velocity; its working principle is the Doppler effect, and it mainly consists of a transducer, a transmitting unit and a receiving unit. When the transmitting source and the receiver move relative to each other and the sound wave is reflected from the water bottom back to the receiver, the received frequency deviates from the transmitted frequency; this deviation is called the Doppler shift, and the velocity of the carrier relative to the water bottom can be obtained from its magnitude. The DVL offers high velocity-measurement accuracy, strong real-time performance, good stability and fast data output, and is widely used in underwater navigation.
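As a numeric illustration of the Doppler relation just described, the sketch below converts a measured two-way Doppler shift on a single inclined beam into a carrier velocity. The sound speed, carrier frequency and beam angle are illustrative assumptions, not parameters taken from the patent.

```python
import math

C_SOUND = 1500.0    # nominal speed of sound in water, m/s (illustrative)
F_CARRIER = 600e3   # DVL transmit frequency, Hz (illustrative)
BEAM_ANGLE = 30.0   # beam inclination from the vertical, degrees (illustrative)

def velocity_from_doppler(f_shift_hz: float) -> float:
    """Carrier velocity from the two-way Doppler shift of one beam:
    f_d = 2 * f0 * v * sin(theta) / c  =>  v = c * f_d / (2 * f0 * sin(theta))."""
    return C_SOUND * f_shift_hz / (2.0 * F_CARRIER * math.sin(math.radians(BEAM_ANGLE)))

print(velocity_from_doppler(400.0))  # a 400 Hz shift corresponds to ~1.0 m/s here
```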
Disclosure of Invention
Technical problem to be solved
To improve the navigation and positioning accuracy of unmanned ships operating near the shore, the invention provides a navigation and positioning method based on monocular camera/IMU/DVL tight coupling. The method combines a monocular camera, an IMU and a DVL in a tightly coupled manner, obtains more accurate pose estimates through joint initialization of the multi-sensor information and sliding-window-based nonlinear optimization, and improves navigation and positioning accuracy while effectively reducing the navigation errors that hinder smooth autonomous near-shore operation of the unmanned ship.
Technical scheme
A monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method is characterized by comprising the following steps:
Step 1: when the unmanned ship operates near the shore, start the monocular camera, the IMU (inertial measurement unit) and the DVL (Doppler velocity log) installed on it; the monocular camera acquires monocular visual images, the IMU acquires gyroscope and accelerometer data, and the DVL acquires velocity information; perform measurement preprocessing on the raw information of the three sensors, namely feature extraction and matching for the visual part and IMU/DVL pre-integration for the inertial part, so that joint initialization can be carried out properly;
Step 2: after preprocessing of the raw information of the monocular camera, the IMU and the DVL is finished, perform joint initialization among the sensors; align the IMU/DVL pre-integrated data with the visual initialization result obtained from a pure visual SFM process, and perform monocular visual/inertial/DVL joint initialization to recover the initial values of the optimization variables of the first frames, including the scale, gyroscope zero bias, velocity and gravity vector, providing these initial values to the subsequent nonlinear optimization;
Step 3: after joint initialization is finished, enter the tightly coupled nonlinear optimization process, which saves space by using a sliding window; as the window slides, tightly fuse the IMU/DVL measurements with the visual feature observations, construct a joint optimization model, perform high-precision pose estimation on a fixed number of state variables, and adopt a marginalization scheme to reduce the complexity of the optimization; then use loop closure detection to establish a feature relation between the current frame and a loop candidate frame and, taking the nonlinear optimization into account, correct and update the landmark positions with minimal computation, achieving relocalization and more accurate pose information;
Step 4: after relocalization is finished, perform global pose-graph optimization over four degrees of freedom, namely the relative displacements along the x, y and z axes and the yaw angle; update the pose information of all keyframes in the database using the sequential edges and loop edges, eliminating the accumulated errors introduced by nonlinear optimization, loop closure detection and relocalization, and obtaining landmark pose information closer to the actual values.
In a further technical scheme of the invention, in step 1, for the visual part, feature points are extracted from the consecutive frame images acquired by the monocular camera, and the feature points are tracked by a sparse optical-flow method that computes the instantaneous velocity of pixels between two adjacent frames, yielding successfully matched feature-point pairs.
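A minimal sketch of this tracking step using OpenCV's pyramidal Lucas-Kanade implementation; the file names and parameter values are illustrative, and the patent does not mandate a particular library.

```python
import cv2
import numpy as np

def track_features(prev_img, cur_img, prev_pts):
    """Track feature points from the previous frame to the current frame with
    sparse pyramidal Lucas-Kanade optical flow; keep only the pairs whose
    tracking status flag is set."""
    cur_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_img, cur_img, prev_pts, None, winSize=(21, 21), maxLevel=3)
    good = status.reshape(-1) == 1
    return prev_pts[good], cur_pts[good]

# Illustrative usage on two consecutive grayscale frames.
prev_img = cv2.imread("frame_k.png", cv2.IMREAD_GRAYSCALE)
cur_img = cv2.imread("frame_k1.png", cv2.IMREAD_GRAYSCALE)
prev_pts = cv2.goodFeaturesToTrack(prev_img, maxCorners=150,
                                   qualityLevel=0.01, minDistance=30)
matched_prev, matched_cur = track_features(prev_img, cur_img,
                                           np.float32(prev_pts))
```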
In a further technical scheme of the invention, in step 1, for the inertial part, a pre-integration operation is performed on the IMU and DVL measurements between two adjacent frames, and the inertial and DVL information is aligned in time to obtain data inputs over the same time interval.
In a further technical scheme of the invention, in step 2, the sampling frequency of the monocular camera is 10 Hz, that of the IMU is 100 Hz, and that of the DVL is 10 Hz; after measurement preprocessing is finished, the information of the monocular camera, the IMU and the DVL is aligned using the sampling frequency of the monocular camera as the reference, giving data inputs over the same time interval.
In a further technical scheme of the invention, in step 2, the pure visual SFM process obtains the relative pose between the image frames acquired by the camera from all the collected visual information according to the PnP principle, recovers the three-dimensional spatial positions of the feature points between two frames by triangulation, and defines the reference coordinate system accordingly.
In a further technical scheme of the invention, in step 2, the visual-inertial alignment process aligns the pure visual information with the IMU/DVL pre-integration data, so that the initial values of the gyroscope zero bias, velocity and gravity vector can be estimated and the scale problem of the monocular camera solved. The specific steps are: first, calibrate the gyroscope zero bias and re-propagate all IMU/DVL pre-integration terms with the obtained bias; second, initialize the velocity, the gravity vector and the scale of the monocular camera, and perform a secondary optimization of the gravity vector; then compute the depths of the feature points and scale the positions, velocities and feature depths by the recovered scale; finally, determine the world coordinate system from the refined gravity vector and convert all variables into it.
In a further technical scheme of the invention, the specific process of step 3 is as follows:
Step 3.1: set the state variables in the sliding window, including the camera poses, the IMU states corresponding to the images (position, attitude, velocity), the zero biases of the accelerometer and gyroscope in the IMU, the zero biases of the DVL position and velocity, the inverse depths of the feature points, the camera-IMU extrinsic parameters and the IMU-DVL extrinsic parameters;
Step 3.2: judge whether the penultimate frame in the sliding window is a keyframe, select the specific marginalization strategy, and obtain the marginalization residual term from the relevant information in the state variables;
Step 3.3: obtain the measurement residual terms of the IMU and the DVL from the IMU states in the state variables, the zero biases of the IMU accelerometer and gyroscope, the zero biases of the DVL position and velocity, and the IMU-DVL extrinsic parameters;
Step 3.4: obtain the visual reprojection residual term from the IMU poses corresponding to all images in the state variables, the camera-IMU extrinsic parameters, and the inverse depths of all feature points in the sliding window;
Step 3.5: with the state variables known, construct the tightly coupled nonlinear-optimization cost function from the three residual terms obtained in steps 3.2 to 3.4;
Step 3.6: iteratively optimize the cost function obtained in step 3.5; when a new frame arrives, the current frame becomes the previous frame, the sliding window moves forward, the relevant parameters are passed on, invalid feature points in the latest sliding window are removed, and the feature-point information is updated;
Step 3.7: when a keyframe arrives, first extract the corner points of that image, compute their descriptors and build the keyframe database; during loop closure detection, measure the similarity between the current keyframe and all historical keyframes in the database with a bag-of-words model to judge whether the carrier has passed through the same environment before, and consider a loop to exist if the similarity reaches a preset threshold; however, since two images may simply be too similar, verification conditions must be set to prevent mismatches; when the verification conditions are met, i.e. a correct loop is confirmed, the landmark poses can be corrected and updated, relocalization is completed, and more accurate pose information is obtained.
Advantageous effects
The invention provides an unmanned ship navigation and positioning method based on monocular camera/IMU/DVL tight coupling. The method fuses a monocular camera, an IMU (inertial measurement unit) and a DVL (Doppler velocity log) organically in a tightly coupled manner. First, IMU/DVL pre-integration effectively unifies the data inputs of the IMU and the DVL; second, joint initialization solves the scale problem of the monocular camera and recovers the initial values of the state variables; then, sliding-window-based tightly coupled nonlinear optimization saves space and reduces the amount of optimization computation, yielding more accurate pose estimates; finally, loop closure detection and pose-graph optimization correct the errors accumulated in the preceding stages and update the landmark pose information, obtaining poses closer to the actual values and thereby improving the navigation and positioning accuracy of the unmanned ship.
By introducing a DVL, which is well suited to underwater navigation, and performing nonlinear optimization with a tightly coupled method, the invention offers the following advantages:
1) IMU pre-integration is extended to IMU/DVL pre-integration: pre-integrating the high-sampling-rate IMU together with the low-sampling-rate DVL yields information inputs over the same time interval, and the pre-integration operation guarantees the consistency of information among the sensors.
2) Compared with single-sensor navigation algorithms and visual/IMU navigation algorithms, the algorithm fully exploits the coupling among multi-sensor information: the DVL constraint recovers the scale of the monocular camera more reliably and suppresses, to a certain extent, the velocity divergence caused by IMU accelerometer drift, so a more accurate pose is obtained while navigation errors are effectively corrected, further improving navigation and positioning accuracy.
Drawings
The drawings, in which like reference numerals refer to like parts throughout, are for the purpose of illustrating particular embodiments only and are not to be considered limiting of the invention.
FIG. 1 is the framework of the monocular camera/IMU/DVL tightly coupled navigation and positioning method.
FIG. 2 is a diagram of IMU pre-integration.
Fig. 3 is a schematic view of a visual inertial alignment process.
FIG. 4 is a schematic view of the marginalization strategy.
FIG. 5 is a schematic diagram of the sequential edge and the loop edge.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the respective embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
As shown in FIG. 1, the method comprises the following specific steps:
Step 1: when the unmanned ship operates near the shore, the monocular camera, the IMU (inertial measurement unit) and the DVL (Doppler velocity log) are started simultaneously. The monocular camera acquires monocular visual images at a sampling frequency of 10 Hz; the IMU acquires gyroscope and accelerometer data at 100 Hz; the DVL acquires velocity information at 10 Hz. Measurement preprocessing is performed on the raw information of the three sensors: feature extraction and matching for the visual part, and IMU/DVL pre-integration for the inertial part, yielding information inputs over the same time interval.
Step 2: after preprocessing of the raw information of the monocular camera, the IMU and the DVL is finished, the information of the three sensors is aligned by linear interpolation, with the sampling frequency of the monocular camera as the reference, to obtain data inputs over the same time interval (see the sketch below); joint initialization is then performed, comprising a pure visual SFM process and a visual-inertial alignment process. The IMU/DVL pre-integrated data are aligned with the visual initialization result obtained from the pure visual SFM process, and monocular visual/inertial/DVL joint initialization recovers the initial values of the optimization variables of the first frames, including the scale, gyroscope zero bias, velocity and gravity vector, which are provided to the subsequent nonlinear optimization.
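A small sketch of the linear-interpolation alignment described above, resampling a higher-rate sensor stream onto the 10 Hz camera timestamps with NumPy; the variable names and the random stand-in data are illustrative.

```python
import numpy as np

def align_to_camera(cam_t, sensor_t, sensor_vals):
    """Linearly interpolate a sensor stream (e.g. the 100 Hz IMU, or the
    10 Hz DVL running on its own clock) onto the camera timestamps, channel
    by channel, so that all three sensors share the same time base."""
    sensor_vals = np.asarray(sensor_vals)
    return np.column_stack([np.interp(cam_t, sensor_t, sensor_vals[:, i])
                            for i in range(sensor_vals.shape[1])])

cam_t = np.arange(0.0, 1.0, 0.1)               # 10 Hz camera clock
imu_t = np.arange(0.0, 1.0, 0.01)              # 100 Hz IMU clock
gyro = np.random.randn(len(imu_t), 3) * 0.01   # stand-in gyroscope samples
gyro_at_cam = align_to_camera(cam_t, imu_t, gyro)
```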
Step 2.1: the pure visual SFM process obtains the relative pose between the image frames acquired by the camera from all the collected visual information according to the PnP principle, recovers the three-dimensional spatial positions of the feature points between two frames by triangulation, and defines the reference coordinate system accordingly.
Step 2.2: the visual-inertial alignment process aligns the pure visual information with the IMU/DVL pre-integration data. First, the gyroscope zero bias is calibrated and all IMU/DVL pre-integration terms are re-propagated with the obtained bias; second, the velocity, the gravity vector and the scale of the monocular camera are initialized, and the gravity vector is refined in a secondary optimization; then the depths of the feature points are computed, and the positions, velocities and feature depths are scaled by the recovered scale; finally, the world coordinate system is determined from the refined gravity vector and all variables are converted into it.
Step 3: after joint initialization is finished, the tightly coupled nonlinear optimization process is entered, which saves space by using a sliding window. As the window slides, the IMU/DVL measurements are tightly fused with the visual feature observations, a joint optimization model is constructed, high-precision pose estimation is performed on a fixed number of state variables, and a marginalization scheme is adopted to reduce the complexity of the optimization. Loop closure detection is then used to establish a feature relation between the current frame and a loop candidate frame; taking the nonlinear optimization into account, the landmark positions are corrected and updated with minimal computation, achieving relocalization and more accurate pose information.
Step 3.1: the state variables in the sliding window are set, including the camera poses, the IMU states (position, attitude, velocity) corresponding to the images, the zero biases of the accelerometer and gyroscope in the IMU, the zero biases of the DVL position and velocity, the inverse depths of the feature points, the camera-IMU extrinsic parameters and the IMU-DVL extrinsic parameters.
Step 3.2: whether the penultimate frame in the sliding window is a keyframe is judged, the specific marginalization strategy is selected, and the marginalization residual term is obtained from the relevant information in the state variables.
Step 3.3: the measurement residual terms of the IMU and the DVL are obtained from the IMU states in the state variables, the zero biases of the IMU accelerometer and gyroscope, the zero biases of the DVL position and velocity, and the IMU-DVL extrinsic parameters.
Step 3.4: the visual reprojection residual term is obtained from the IMU poses corresponding to all images in the state variables, the camera-IMU extrinsic parameters, and the inverse depths of all feature points in the sliding window.
Step 3.5: with the state variables known, the tightly coupled nonlinear-optimization cost function is constructed from the three residual terms obtained in steps 3.2 to 3.4.
Step 3.6: the cost function obtained in step 3.5 is iteratively optimized; when a new frame arrives, the current frame becomes the previous frame, the sliding window moves forward, the relevant parameters are passed on, invalid feature points in the latest sliding window are removed, and the feature-point information is updated.
Step 3.7: loop closure detection is performed only on keyframes, and the keyframes of the first 50 frames after the system starts are not checked for loops but only added to the keyframe database. When a keyframe arrives, FAST corner points are first extracted from it and BRIEF descriptors are computed. During loop closure detection, the similarity between the current frame and all historical keyframes is measured with a bag-of-words model; if the similarity reaches a preset threshold, a number of loop candidate frames are screened out and a loop is provisionally considered to exist. ORB feature matching is then performed between the current frame and each loop candidate frame, mismatched points are removed by random sample consensus (RANSAC), and the relative pose between the loop candidate keyframe and the current frame is solved with the PnP method. If the yaw-angle difference between the loop candidate keyframe and the current frame is less than 30 degrees and the relative displacement is less than 20 meters, the candidate is accepted as the loop keyframe. When a loop keyframe is detected, the current frame is relocalized according to the pose of the loop keyframe and their relative pose.
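A sketch of the geometric verification part of this step with OpenCV: ORB (FAST corners with rotated BRIEF descriptors) for matching, and solvePnPRansac for RANSAC outlier rejection and the relative pose. The landmark bookkeeping (cand_pts3d_by_idx) and the camera matrix K are assumed to come from the keyframe database; all names are illustrative, not the patent's implementation.

```python
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=500)   # FAST detector + rotated BRIEF descriptors
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def verify_loop(cur_img, cand_img, cand_pts3d_by_idx, K):
    """Match ORB features between the loop candidate and the current frame,
    then solve the candidate-landmarks-to-current-frame pose with PnP inside
    a RANSAC loop; return None when the geometric check fails."""
    kp_cand, des_cand = orb.detectAndCompute(cand_img, None)
    kp_cur, des_cur = orb.detectAndCompute(cur_img, None)
    if des_cand is None or des_cur is None:
        return None
    matches = matcher.match(des_cand, des_cur)
    obj_pts, img_pts = [], []
    for m in matches:
        if m.queryIdx in cand_pts3d_by_idx:      # candidate keypoints with a 3-D landmark
            obj_pts.append(cand_pts3d_by_idx[m.queryIdx])
            img_pts.append(kp_cur[m.trainIdx].pt)
    if len(obj_pts) < 12:                        # too few correspondences
        return None
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.float32(obj_pts), np.float32(img_pts), K, None)
    return (rvec, tvec) if ok else None
```

The yaw and translation gating described above (less than 30 degrees and 20 meters) would then be applied to the recovered relative pose before accepting the loop.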
Step 4: after relocalization is finished, global pose-graph optimization is performed over four degrees of freedom (the relative displacements along the x, y and z axes and the yaw angle); the pose information of all keyframes in the database is updated using the sequential edges and loop edges, correcting the accumulated errors introduced by nonlinear optimization, loop closure detection and relocalization, and yielding landmark pose information closer to the actual values.
The related models are given below:
1. IMU/DVL pre-integration model
Considering the error model of the DVL (here only a velocity bias error), the measurement models of the IMU and the DVL are:
$$\hat a_t = a_t + b_a + R^{b_t}_w g^w + n_a, \qquad \hat\omega_t = \omega_t + b_\omega + n_\omega, \qquad \hat v_t = v_t + b_d$$

In the formula, the subscripts a and ω denote the accelerometer and the gyroscope, and the superscripts w and b denote the world coordinate system and the IMU body coordinate system; $\hat a_t$ and $a_t$ are the measured and actual values of the accelerometer, $\hat\omega_t$ and $\omega_t$ the measured and actual values of the gyroscope, and $\hat v_t$ and $v_t$ the measured and actual values of the DVL velocity; $b_a, b_\omega$ are the zero biases of the accelerometer and the gyroscope, and $b_d$ is the velocity zero bias of the DVL; $n_a, n_\omega$ are the white noises of the accelerometer and the gyroscope, generally assumed to be Gaussian; $g^w$ is the gravity vector in the world coordinate system; $R^{b_t}_w$ is the rotation matrix from the world coordinate system to the IMU body coordinate system at time t.
The kinematic model of the IMU/DVL fusion at frame k+1 can thus be derived as:

$$p^w_{b_{k+1}} = p^w_{b_k} + v^w_{b_k}\,\Delta t + \iint_{t\in[k,k+1]} \left(R^w_t\left(\hat a_t - b_a\right) - g^w\right)dt^2$$

$$v^w_{b_{k+1}} = v^w_{b_k} + \int_{t\in[k,k+1]} \left(R^w_t\left(\hat a_t - b_a\right) - g^w\right)dt$$

$$q^w_{b_{k+1}} = q^w_{b_k} \otimes \int_{t\in[k,k+1]} \frac12\,\Omega\!\left(\hat\omega_t - b_\omega\right) q^{b_k}_t\,dt$$

$$p^w_{d_{k+1}} = p^w_{d_k} + \int_{t\in[k,k+1]} R^w_t\,R^b_d\left(\hat v_t - b_d\right)dt$$

In the formula, $p^w_b$, $v^w_b$ and $q^w_b$ denote the position, velocity and rotation of the IMU body in the reference frame, the rotation here being in quaternion form, and $\Omega(\cdot)$ is the quaternion multiplication matrix of the angular velocity; $R^w_t$ is the rotation of the IMU in the world frame at time t, and $q^{b_k}_t$ is the rotation change of the IMU at time t relative to frame k; $\Delta t$ is the time difference between frames k and k+1; $p^w_d$ is the position of the DVL body in the reference frame, and $R^b_d$ (equivalently the quaternion $q^b_d$) is the rotation extrinsic parameter between the IMU and the DVL.
Separating the optimization variables out of the IMU/DVL fused kinematic model and multiplying both sides by $R^{b_k}_w$ converts the world coordinate system in the model to the carrier coordinate system of frame k. After rearrangement and simplification, the random-walk zero biases of the IMU and the DVL are assumed to be linearly related to the pre-integration terms, and a first-order Taylor expansion of the pre-integration terms at frame k gives the IMU/DVL pre-integration formulas:

$$\hat\alpha^{b_k}_{b_{k+1}} \approx \alpha^{b_k}_{b_{k+1}} + J^\alpha_{b_a}\,\delta b_a + J^\alpha_{b_\omega}\,\delta b_\omega$$

$$\hat\beta^{b_k}_{b_{k+1}} \approx \beta^{b_k}_{b_{k+1}} + J^\beta_{b_a}\,\delta b_a + J^\beta_{b_\omega}\,\delta b_\omega$$

$$\hat\gamma^{b_k}_{b_{k+1}} \approx \gamma^{b_k}_{b_{k+1}} \otimes \begin{bmatrix} 1 \\ \frac12 J^\gamma_{b_\omega}\,\delta b_\omega \end{bmatrix}$$

$$\hat\eta^{b_k}_{b_{k+1}} \approx \eta^{b_k}_{b_{k+1}} + J^\eta_{b_d}\,\delta b_d + J^\eta_{b_\omega}\,\delta b_\omega$$

In the formula, $\hat\alpha^{b_k}_{b_{k+1}}, \hat\beta^{b_k}_{b_{k+1}}, \hat\gamma^{b_k}_{b_{k+1}}$ are the calculated values of the IMU position, velocity and rotation pre-integration terms, and $\hat\eta^{b_k}_{b_{k+1}}$ is the calculated value of the DVL position pre-integration term; $\delta b_a, \delta b_\omega$ are the zero-bias error values of the accelerometer and the gyroscope, and $\delta b_d$ is the velocity zero-bias error value of the DVL; $J^\alpha_{b_a}, J^\alpha_{b_\omega}$ are the partial derivatives of the position pre-integration term with respect to the IMU accelerometer and gyroscope zero biases at frame k, $J^\beta_{b_a}, J^\beta_{b_\omega}$ are those of the velocity pre-integration term, $J^\gamma_{b_\omega}$ is that of the rotation pre-integration term with respect to the gyroscope zero bias, and $J^\eta_{b_d}, J^\eta_{b_\omega}$ are those of the DVL position pre-integration term with respect to the DVL velocity zero bias and the IMU gyroscope zero bias; all of them are error-state Jacobian matrices.
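To make the accumulation of these terms concrete, the sketch below Euler-integrates raw samples between two camera frames with the current bias estimates removed. It is a simplified first-order scheme (the patent does not state the integration order), and it assumes the DVL velocities have already been interpolated to the IMU rate and rotated into the IMU body frame; all names are illustrative.

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def so3_exp(phi):
    """Rodrigues formula: rotation matrix of a small rotation vector phi."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return np.eye(3) + skew(phi)
    K = skew(phi / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def preintegrate(acc, gyr, dvl_vel, dt, ba, bg, bd):
    """Accumulate alpha (position), beta (velocity), gamma (rotation) and the
    DVL position term eta between frames k and k+1, expressed in the body
    frame of frame k; gamma is kept as a rotation matrix for simplicity."""
    alpha = np.zeros(3); beta = np.zeros(3); eta = np.zeros(3)
    gamma = np.eye(3)
    for a_m, w_m, v_m in zip(acc, gyr, dvl_vel):
        a = gamma @ (a_m - ba)               # bias-free specific force in frame k
        alpha += beta * dt + 0.5 * a * dt * dt
        beta += a * dt
        eta += gamma @ (v_m - bd) * dt       # DVL velocity integrated to position
        gamma = gamma @ so3_exp((w_m - bg) * dt)
    return alpha, beta, gamma, eta
```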
2. Joint initialization
The purpose of the joint initialization is to recover the initial values of the optimization variables of the first few frames of the system, including scale, gyroscope zero bias, velocity, gravity vector, etc., and provide these initial values to the subsequent nonlinear optimization.
1) Gyroscope zero bias $b_\omega$
Constructing a rotation least-squares model from the IMU rotation pre-integration term gives the cost function:

$$\min_{\delta b_\omega}\ \sum_{k\in B}\left\| \left(q^{c_0}_{b_{k+1}}\right)^{-1} \otimes q^{c_0}_{b_k} \otimes \gamma^{b_k}_{b_{k+1}} \right\|^2$$

where B is the set of all frame images, $q^{c_0}_{b_k}$ and $q^{c_0}_{b_{k+1}}$ are the body rotations at frames k and k+1 obtained from the camera, and $\gamma^{b_k}_{b_{k+1}}$ is the rotation pre-integration term of the IMU, whose expression is:

$$\gamma^{b_k}_{b_{k+1}} = \hat\gamma^{b_k}_{b_{k+1}} \otimes \begin{bmatrix} 1 \\ \frac12 J^\gamma_{b_\omega}\,\delta b_\omega \end{bmatrix}$$

Rewriting the above equation, keeping only the imaginary part, and converting to a positive-definite system gives:

$$\left(J^\gamma_{b_\omega}\right)^T J^\gamma_{b_\omega}\,\delta b_\omega = 2\left(J^\gamma_{b_\omega}\right)^T \left[\left(\hat\gamma^{b_k}_{b_{k+1}}\right)^{-1} \otimes \left(q^{c_0}_{b_k}\right)^{-1} \otimes q^{c_0}_{b_{k+1}}\right]_{xyz}$$

The least-squares solution obtained by Cholesky decomposition of this system is the estimated gyroscope zero bias $b_\omega$; once the zero bias is obtained, all IMU pre-integration terms are re-propagated.
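A minimal numeric sketch of this solve, assuming the 3x3 Jacobians and the 3-vector rotation residuals have already been formed for each frame pair as above (function and variable names are illustrative):

```python
import numpy as np

def solve_gyro_bias(jacobians, residuals):
    """Stack the per-frame-pair constraints J * db = r, form the normal
    equations A db = b, and solve them with a Cholesky factorization; the
    result is the gyroscope zero-bias correction delta b_omega."""
    A = np.zeros((3, 3)); b = np.zeros(3)
    for J, r in zip(jacobians, residuals):
        A += J.T @ J
        b += J.T @ r
    L = np.linalg.cholesky(A)        # A is symmetric positive definite
    y = np.linalg.solve(L, b)        # forward substitution
    return np.linalg.solve(L.T, y)   # back substitution
```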
2) Velocity, gravity and scale parameters
Taking the first camera frame $c_0$ as the reference coordinate system, the transformations of the IMU body frame and the DVL body frame relative to $c_0$, together with the definition of the monocular scale s, are:

$$q^{c_0}_{b_k} = q^{c_0}_{c_k} \otimes \left(q^b_c\right)^{-1}$$

$$s\,\bar p^{c_0}_{b_k} = s\,\bar p^{c_0}_{c_k} - R^{c_0}_{b_k}\,p^b_c$$

$$s\,\bar p^{c_0}_{d_k} = s\,\bar p^{c_0}_{c_k} - R^{c_0}_{d_k}\,p^d_c$$

In the formula, $p^b_c$ is the camera-IMU extrinsic translation and $p^d_c$ is the camera-DVL extrinsic translation. After the gyroscope zero bias has been successfully recovered, the velocities of the system, the gravity vector $g^{c_0}$ in the $c_0$ coordinate system and the scale s need to be initialized; the optimization variables are collected in the vector $\chi_I$:

$$\chi_I = \left[ v^{b_0}_{b_0},\ v^{b_1}_{b_1},\ \dots,\ v^{b_n}_{b_n},\ g^{c_0},\ s \right]^T$$

Converting the definitions of the IMU position and velocity pre-integration terms $\hat\alpha^{b_k}_{b_{k+1}}, \hat\beta^{b_k}_{b_{k+1}}$ and the DVL position pre-integration term $\hat\eta^{b_k}_{b_{k+1}}$ from the world coordinate system w to the $c_0$ coordinate system yields a new expression in matrix form:

$$\hat z^{b_k}_{b_{k+1}} = \begin{bmatrix} \hat\alpha^{b_k}_{b_{k+1}} \\ \hat\beta^{b_k}_{b_{k+1}} \\ \hat\eta^{b_k}_{b_{k+1}} \end{bmatrix} = H^{b_k}_{b_{k+1}}\,\chi_I + n^{b_k}_{b_{k+1}}$$

To simplify, the constraints of all frame pairs are stacked and the system is converted to a positive-definite one:

$$\left(\sum_{k} \left(H^{b_k}_{b_{k+1}}\right)^T H^{b_k}_{b_{k+1}}\right)\chi_I = \sum_{k}\left(H^{b_k}_{b_{k+1}}\right)^T \hat z^{b_k}_{b_{k+1}}$$

The least-squares solution obtained by Cholesky decomposition of this system is the estimated optimization variable $\chi_I$.
When the magnitude of the three-dimensional gravity vector $g^{c_0}$ obtained above is constrained to its known value, the gravity vector retains only two degrees of freedom. A secondary optimization of the gravity vector therefore uses the parameterized expression:

$$\hat g = \|g\| \cdot \hat{\bar g} + w_1 b_1 + w_2 b_2$$

In the formula, $b_1, b_2$ are two unit tangent vectors in the tangent plane of gravity, $\hat{\bar g}$ is the unit direction of the current gravity estimate, and $w_1, w_2$ are the variables to be estimated in the gravity vector expression.

Substituting this expression into the matrix-form definition above to estimate $w = [w_1, w_2]^T$, and ignoring the influence of the DVL position pre-integration term $\hat\eta^{b_k}_{b_{k+1}}$, gives:

$$\hat z^{b_k}_{b_{k+1}} = H^{b_k}_{b_{k+1}} \left[ v^{b_0}_{b_0},\ \dots,\ v^{b_n}_{b_n},\ w_1,\ w_2,\ s \right]^T + n^{b_k}_{b_{k+1}}$$

The gravity vector $\hat g$ is solved from this system and the above steps are repeated until $\hat g$ converges, completing the secondary optimization of the gravity vector. Rotating the converged gravity vector from the $c_0$ coordinate system onto the z-axis of the world coordinate system determines the rotation matrix $R^w_{c_0}$ from the camera reference frame to the world frame. All initialization steps are then complete, and all optimization variables are converted into the world coordinate system and passed into the tightly coupled nonlinear optimization.
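The two tangent vectors $b_1, b_2$ used in the parameterization can be built from any helper axis that is not parallel to the current gravity estimate; a small sketch (the helper-axis choice and the numeric values are illustrative):

```python
import numpy as np

def tangent_basis(g):
    """Two orthonormal vectors spanning the tangent plane of the current
    gravity estimate, giving the two remaining degrees of freedom w1, w2."""
    a = g / np.linalg.norm(g)
    helper = np.array([1.0, 0.0, 0.0])
    if abs(a @ helper) > 0.9:            # avoid a near-parallel helper axis
        helper = np.array([0.0, 0.0, 1.0])
    b1 = helper - a * (a @ helper)       # project the helper onto the tangent plane
    b1 /= np.linalg.norm(b1)
    b2 = np.cross(a, b1)
    return b1, b2

g_est = np.array([0.1, -0.2, -9.7])      # stand-in current gravity estimate
b1, b2 = tangent_basis(g_est)
# One refinement iteration then sets g <- 9.81 * g_est/||g_est|| + w1*b1 + w2*b2,
# with w1, w2 from the least-squares system above, repeated until convergence.
```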
3. Non-linear optimization
The sliding window contains the camera/IMU states, the IMU and DVL zero biases, the inverse depths of the feature points, and the camera-IMU and IMU-DVL extrinsic parameters, so the state variables take the form:

$$\mathcal X = \left[ x_0,\ x_1,\ \dots,\ x_n,\ x^b_c,\ x^b_d,\ \lambda_0,\ \lambda_1,\ \dots,\ \lambda_m \right]$$

$$x_k = \left[ p^w_{b_k},\ v^w_{b_k},\ q^w_{b_k},\ b_a,\ b_g,\ b_d \right],\qquad k\in[0,n]$$

$$x^b_c = \left[ p^b_c,\ q^b_c \right],\qquad x^b_d = \left[ p^b_d,\ q^b_d \right]$$

In the formula, the number of images is n+1 and the number of feature points is m+1; the extrinsic parameters comprise the camera-IMU extrinsic $x^b_c$ and the IMU-DVL extrinsic $x^b_d$; $x_k$ is the IMU state corresponding to the k-th image acquired by the camera, including the position $p^w_{b_k}$, velocity $v^w_{b_k}$, attitude $q^w_{b_k}$, the zero bias $b_a$ of the IMU accelerometer and $b_g$ of the gyroscope in the present system, and the DVL velocity zero bias $b_d$; $\lambda_l$ is the inverse depth of the l-th feature point.
With the state variables known, the tightly coupled nonlinear-optimization cost function of the algorithm can be constructed as:

$$\min_{\mathcal X}\left\{ \left\| r_p - H_p\,\mathcal X \right\|^2 + \sum_{k\in BD} \left\| r_{BD}\left(\hat z^{b_k}_{b_{k+1}},\,\mathcal X\right) \right\|^2_{P^{b_k}_{b_{k+1}}} + \sum_{(l,j)\in C} \left\| r_C\left(\hat z^{c_j}_l,\,\mathcal X\right) \right\|^2_{P^{c_j}_l} \right\}$$

The cost function comprises three residual terms. The first term is the residual of the marginalization prior information; the second term is the measurement residual of the IMU and the DVL, where $P^{b_k}_{b_{k+1}}$ is its covariance and BD is the set of all measurement values of the IMU and the DVL; the third term is the visual reprojection residual, where $P^{c_j}_l$ is the noise covariance of the visual observation and C is the set of all frame images in the sliding window. Minimizing the cost function amounts to minimizing the sum of the Mahalanobis distances of the three residual terms.
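The Mahalanobis norms in this cost can be evaluated by whitening each residual with the Cholesky factor of its covariance; a small sketch of assembling the three groups (the data structures and names are illustrative, not the patent's implementation):

```python
import numpy as np

def whiten(r, P):
    """Whiten residual r by its covariance P = L L^T, so that the squared
    2-norm of the result equals the Mahalanobis distance r^T P^-1 r."""
    L = np.linalg.cholesky(np.asarray(P, dtype=float))
    return np.linalg.solve(L, np.asarray(r, dtype=float))

def total_cost(r_marg, imu_dvl_terms, visual_terms):
    """Sum of the three residual groups of the tightly coupled cost function;
    imu_dvl_terms and visual_terms are lists of (residual, covariance) pairs."""
    r_marg = np.asarray(r_marg, dtype=float)
    cost = float(np.dot(r_marg, r_marg))          # marginalization prior term
    for r, P in list(imu_dvl_terms) + list(visual_terms):
        w = whiten(r, P)
        cost += float(np.dot(w, w))
    return cost
```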
1) Measurement residual of IMU and DVL
Suppose there are two consecutive frames $b_k$ and $b_{k+1}$; the measurement residual of the IMU and the DVL can then be expressed as:

$$r_{BD}\left(\hat z^{b_k}_{b_{k+1}},\,\mathcal X\right) = \begin{bmatrix} \delta\alpha^{b_k}_{b_{k+1}} \\ \delta\beta^{b_k}_{b_{k+1}} \\ \delta\theta^{b_k}_{b_{k+1}} \\ \delta\eta^{b_k}_{b_{k+1}} \\ \delta b_a \\ \delta b_\omega \\ \delta b_d \end{bmatrix} = \begin{bmatrix} R^{b_k}_w\left(p^w_{b_{k+1}} - p^w_{b_k} - v^w_{b_k}\Delta t + \frac12 g^w \Delta t^2\right) - \hat\alpha^{b_k}_{b_{k+1}} \\ R^{b_k}_w\left(v^w_{b_{k+1}} + g^w\Delta t - v^w_{b_k}\right) - \hat\beta^{b_k}_{b_{k+1}} \\ 2\left[\left(q^w_{b_k}\right)^{-1}\otimes q^w_{b_{k+1}}\otimes\left(\hat\gamma^{b_k}_{b_{k+1}}\right)^{-1}\right]_{xyz} \\ R^{b_k}_w\left(p^w_{d_{k+1}} - p^w_{d_k}\right) - \hat\eta^{b_k}_{b_{k+1}} \\ b_{a_{k+1}} - b_{a_k} \\ b_{\omega_{k+1}} - b_{\omega_k} \\ b_{d_{k+1}} - b_{d_k} \end{bmatrix}$$

In the formula, $\delta\alpha, \delta\beta, \delta\theta$ are the position, velocity and rotation pre-integration residuals of the IMU, $\delta\eta$ is the position pre-integration residual of the DVL, $\delta b_a, \delta b_\omega$ are the zero-bias residuals of the IMU accelerometer and gyroscope, and $\delta b_d$ is the DVL velocity zero-bias residual. The rotation pre-integration residual of the IMU is still expressed in quaternion form, $[\,\cdot\,]_{xyz}$ keeps only the imaginary part of the quaternion, and $\hat\alpha^{b_k}_{b_{k+1}}, \hat\beta^{b_k}_{b_{k+1}}, \hat\gamma^{b_k}_{b_{k+1}}, \hat\eta^{b_k}_{b_{k+1}}$ are the values of the pre-integration terms between the two adjacent frames.

The measurement residual terms of the IMU and the DVL are derived from the poses, velocities and zero biases of the two consecutive frames, i.e. from the variable set:

$$\left[ p^w_{b_k},\ q^w_{b_k},\ v^w_{b_k},\ b_{a_k},\ b_{\omega_k},\ b_{d_k},\ p^w_{b_{k+1}},\ q^w_{b_{k+1}},\ v^w_{b_{k+1}},\ b_{a_{k+1}},\ b_{\omega_{k+1}},\ b_{d_{k+1}} \right]$$
2) Visual reprojection residual
Assuming the l-th feature point is first observed in the i-th frame, its transformation to the pixel coordinates of the j-th frame defines the visual reprojection residual:

$$r_C\left(\hat z^{c_j}_l,\,\mathcal X\right) = \left[ b_1\ \ b_2 \right]^T \left( \hat{\bar{\mathcal P}}^{c_j}_l - \frac{\mathcal P^{c_j}_l}{\left\|\mathcal P^{c_j}_l\right\|} \right)$$

In the formula, $\hat{\bar{\mathcal P}}^{c_j}_l$ is the observed coordinate of the l-th feature point in the normalized plane coordinate system of the j-th frame, and $\mathcal P^{c_j}_l$ is its estimated coordinate in the same coordinate system:

$$\hat{\bar{\mathcal P}}^{c_j}_l = \pi_c^{-1}\begin{bmatrix} \hat u^{c_j}_l \\ \hat v^{c_j}_l \end{bmatrix}$$

$$\mathcal P^{c_j}_l = R^c_b\left( R^{b_j}_w\left( R^w_{b_i}\left( R^b_c\,\frac{1}{\lambda_l}\,\pi_c^{-1}\begin{bmatrix} \hat u^{c_i}_l \\ \hat v^{c_i}_l \end{bmatrix} + p^b_c \right) + p^w_{b_i} - p^w_{b_j} \right) - p^b_c \right)$$

In the formula, $\pi_c^{-1}$ is the back-projection matrix, determined by the camera intrinsics, that maps pixel coordinates to the unit sphere of the normalized camera coordinate system; $[\hat u^{c_i}_l,\ \hat v^{c_i}_l]^T$ is the coordinate at which the l-th feature point was first observed in the i-th frame; $b_1, b_2$ are two orthogonal basis vectors of the tangent plane of the unit sphere.

The visual reprojection residual term is derived from the IMU poses of the two camera frames i and j, the camera-IMU extrinsic parameters, and the inverse depth of feature point l, that is, seven variables in total:

$$\left[ p^w_{b_i},\ q^w_{b_i},\ p^w_{b_j},\ q^w_{b_j},\ p^b_c,\ q^b_c,\ \lambda_l \right]$$
3) Marginalization
The matrix form of the Gauss-Newton normal equation is:

$$\begin{bmatrix} H_1 & H_2 \\ H_3 & H_4 \end{bmatrix} \begin{bmatrix} \Delta x_1 \\ \Delta x_2 \end{bmatrix} = \begin{bmatrix} g_1 \\ g_2 \end{bmatrix}$$

where $\Delta x_1$ is the state quantity of the old frame to be marginalized and $\Delta x_2$ is the state quantity of the keyframes to be solved in the sliding window. Applying Schur-complement elimination to the above formula gives:

$$\begin{bmatrix} H_1 & H_2 \\ 0 & H_4 - H_3 H_1^{-1} H_2 \end{bmatrix} \begin{bmatrix} \Delta x_1 \\ \Delta x_2 \end{bmatrix} = \begin{bmatrix} g_1 \\ g_2 - H_3 H_1^{-1} g_1 \end{bmatrix}$$

Here $H_4 - H_3 H_1^{-1} H_2$ is the Schur complement of $H_1$ in H, and after rearrangement:

$$\left(H_4 - H_3 H_1^{-1} H_2\right)\Delta x_2 = g_2 - H_3 H_1^{-1} g_1$$

In this formula the constraints associated with $\Delta x_1$ are still present, so solving only the reduced system avoids solving for $\Delta x_1$ while retaining its constraints.
4. Four-degree-of-freedom pose-graph optimization
Assuming a sequential edge is established between the i-th and j-th frames, the relative displacement and the yaw angle are defined as:

$$\hat p^{\,i}_{ij} = \left(R^w_i\right)^{-1}\left(p^w_j - p^w_i\right),\qquad \hat\psi_{ij} = \psi_j - \psi_i$$

In the formula, $R^w_i$ is the rotation of the keyframe at time i in the reference frame; $p^w_i$ and $p^w_j$ are the translations of the keyframes at times i and j in the reference frame; $\psi_i, \psi_j$ are the yaw angles of the keyframes at times i and j in the reference frame.

The residual $r_{i,j}$ between the i-th and j-th frames can then be expressed as:

$$r_{i,j} = \begin{bmatrix} R\left(\hat\phi_i,\,\hat\theta_i,\,\psi_i\right)^{-1}\left(p^w_j - p^w_i\right) - \hat p^{\,i}_{ij} \\ \psi_j - \psi_i - \hat\psi_{ij} \end{bmatrix}$$

In the formula, $\hat\phi_i$ and $\hat\theta_i$ are the roll and pitch estimates of the keyframe at time i, which are held fixed. The overall cost function considering all sequential edges and loop edges is:

$$\min_{p,\psi}\left\{ \sum_{(i,j)\in S} \left\| r_{i,j} \right\|^2 + \sum_{(i,j)\in L} \left\| r_{i,j} \right\|^2 \right\}$$

where S is the set of all sequential edges and L is the set of all loop edges. The state variables kept when saving the pose graph are the relative displacement between the i-th and j-th frames and the corresponding yaw angle:

$$\left[ \hat p^{\,i}_{ij},\ \hat\psi_{ij} \right]$$
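A sketch of the per-edge residual above, with roll and pitch held at their estimated values and only the positions and yaw angles treated as free variables (names are illustrative):

```python
import numpy as np

def rot_zyx(roll, pitch, yaw):
    """World-from-keyframe rotation built from fixed roll/pitch and free yaw."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    return Rz @ Ry @ Rx

def edge_residual(p_i, p_j, yaw_i, yaw_j, roll_i, pitch_i,
                  p_ij_meas, dyaw_meas):
    """Residual of one sequential or loop edge: the relative translation is
    expressed in frame i and compared with the stored measurement, and the
    yaw difference is compared with the stored relative yaw."""
    R_wi = rot_zyx(roll_i, pitch_i, yaw_i)
    r_t = R_wi.T @ (p_j - p_i) - p_ij_meas   # 3-DOF translation residual
    r_y = (yaw_j - yaw_i) - dyaw_meas        # 1-DOF yaw residual
    return np.hstack([r_t, r_y])
```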
while the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications or substitutions can be easily made by those skilled in the art within the technical scope of the present disclosure.

Claims (7)

1. A monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method is characterized by comprising the following steps:
step 1: when the unmanned ship operates near the shore, starting the monocular camera, the IMU (inertial measurement unit) and the DVL (Doppler velocity log) installed on the unmanned ship; the monocular camera is responsible for acquiring monocular visual images, the IMU for acquiring gyroscope and accelerometer information, and the DVL for acquiring velocity information; performing measurement preprocessing on the raw information of the three sensors, namely feature extraction and matching for the visual part and IMU/DVL pre-integration for the inertial part, so that joint initialization can be performed properly;
step 2: after preprocessing of the raw information of the monocular camera, the IMU and the DVL is finished, performing joint initialization among the sensors; aligning the IMU/DVL pre-integrated data with the visual initialization result obtained from a pure visual SFM process, and performing monocular visual/inertial/DVL joint initialization to recover the initial values of the optimization variables of the first frames, including the scale, gyroscope zero bias, velocity and gravity vector, and providing these initial values to the subsequent nonlinear optimization;
step 3: after joint initialization is finished, entering a tightly coupled nonlinear optimization process, which saves space by using a sliding window; during window sliding, tightly fusing the IMU/DVL measurements with the visual feature observations, constructing a joint optimization model, performing high-precision pose estimation on a fixed number of state variables, and adopting a marginalization scheme to reduce the complexity of the optimization; then using loop closure detection to establish a feature relation between the current frame and a loop candidate frame and, taking the nonlinear optimization into account, correcting and updating the landmark positions with minimal computation, achieving relocalization and more accurate pose information;
step 4: after relocalization is finished, performing global pose-graph optimization over four degrees of freedom, namely the relative displacements along the x, y and z axes and the yaw angle; updating the pose information of all keyframes in the database using the sequential edges and loop edges, eliminating the accumulated errors introduced by nonlinear optimization, loop closure detection and relocalization, and obtaining landmark pose information closer to the actual values.
2. The monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method of claim 1, wherein in step 1, for the visual part, feature points are extracted from the consecutive frame images acquired by the monocular camera, and the feature points are tracked by a sparse optical-flow method that computes the instantaneous velocity of pixels between two adjacent frames, yielding successfully matched feature-point pairs.
3. The monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method of claim 1, wherein in step 1, for the inertial part, a pre-integration operation is performed on the IMU and DVL measurements between two adjacent frames, and the inertial and DVL information is aligned in time to obtain data inputs over the same time interval.
4. The monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method of claim 1, wherein in step 2, the sampling frequency of the monocular camera is 10 Hz, that of the IMU is 100 Hz, and that of the DVL is 10 Hz; after measurement preprocessing is finished, the information of the monocular camera, the IMU and the DVL is aligned using the sampling frequency of the monocular camera as the reference, giving data inputs over the same time interval.
5. The monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method of claim 1, wherein in step 2, the pure visual SFM process obtains the relative pose between the image frames acquired by the camera from all the collected visual information according to the PnP principle, recovers the three-dimensional spatial positions of the feature points between two frames by triangulation, and defines the reference coordinate system accordingly.
6. The monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method of claim 1, wherein in step 2, the visual-inertial alignment process aligns the pure visual information with the IMU/DVL pre-integration data, so that the initial values of the gyroscope zero bias, velocity and gravity vector can be estimated and the scale problem of the monocular camera solved; the specific steps are: first, calibrating the gyroscope zero bias and re-propagating all IMU/DVL pre-integration terms with the obtained bias; second, initializing the velocity, the gravity vector and the scale of the monocular camera, and performing a secondary optimization of the gravity vector; then computing the depths of the feature points and scaling the positions, velocities and feature depths by the recovered scale; and finally, determining the world coordinate system from the refined gravity vector and converting all variables into it.
7. The monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method of claim 1, wherein the specific process of step 3 is as follows:
step 3.1: setting the state variables in the sliding window, including the camera poses, the IMU states corresponding to the images (position, attitude, velocity), the zero biases of the accelerometer and gyroscope in the IMU, the zero biases of the DVL position and velocity, the inverse depths of the feature points, the camera-IMU extrinsic parameters and the IMU-DVL extrinsic parameters;
step 3.2: judging whether the penultimate frame in the sliding window is a keyframe, selecting the specific marginalization strategy, and obtaining the marginalization residual term from the relevant information in the state variables;
step 3.3: obtaining the measurement residual terms of the IMU and the DVL from the IMU states in the state variables, the zero biases of the IMU accelerometer and gyroscope, the zero biases of the DVL position and velocity, and the IMU-DVL extrinsic parameters;
step 3.4: obtaining the visual reprojection residual term from the IMU poses corresponding to all images in the state variables, the camera-IMU extrinsic parameters, and the inverse depths of all feature points in the sliding window;
step 3.5: with the state variables known, constructing the tightly coupled nonlinear-optimization cost function from the three residual terms obtained in steps 3.2 to 3.4;
step 3.6: iteratively optimizing the cost function obtained in step 3.5; when a new frame arrives, the current frame becomes the previous frame, the sliding window moves forward, the relevant parameters are passed on, invalid feature points in the latest sliding window are removed, and the feature-point information is updated;
step 3.7: when a keyframe arrives, first extracting the corner points of that image, computing their descriptors and building the keyframe database; during loop closure detection, measuring the similarity between the current keyframe and all historical keyframes in the database with a bag-of-words model to judge whether the carrier has passed through the same environment before, and considering a loop to exist if the similarity reaches a preset threshold; however, since two images may simply be too similar, setting verification conditions to prevent mismatches; when the verification conditions are met, i.e. a correct loop is confirmed, correcting and updating the landmark poses, completing relocalization, and obtaining more accurate pose information.
CN202210230594.XA 2022-03-10 2022-03-10 Monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method Pending CN115344033A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210230594.XA CN115344033A (en) 2022-03-10 2022-03-10 Monocular camera/IMU/DVL tight coupling-based unmanned ship navigation and positioning method

Publications (1)

Publication Number Publication Date
CN115344033A 2022-11-15

Family

ID=83948155

Country Status (1)

Country Link
CN (1) CN115344033A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116578030A (en) * 2023-05-25 2023-08-11 广州市番高领航科技有限公司 Intelligent control method and system for water inflatable unmanned ship
CN116578030B (en) * 2023-05-25 2023-11-24 广州市番高领航科技有限公司 Intelligent control method and system for water inflatable unmanned ship


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination