CN113052908B - Mobile robot pose estimation algorithm based on multi-sensor data fusion - Google Patents

Mobile robot pose estimation algorithm based on multi-sensor data fusion

Info

Publication number
CN113052908B
CN113052908B (application CN202110422808.9A)
Authority
CN
China
Prior art keywords
laser
data
time
axis
coordinate system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110422808.9A
Other languages
Chinese (zh)
Other versions
CN113052908A (en)
Inventor
程明
杨慧蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Tech University
Original Assignee
Nanjing Tech University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Tech University filed Critical Nanjing Tech University
Priority to CN202110422808.9A priority Critical patent/CN113052908B/en
Publication of CN113052908A publication Critical patent/CN113052908A/en
Application granted granted Critical
Publication of CN113052908B publication Critical patent/CN113052908B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/70 - Determining position or orientation of objects or cameras
    • G06T7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 - Complex mathematical operations
    • G06F17/16 - Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/80 - Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30244 - Camera pose
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention provides a mobile robot pose estimation algorithm based on multi-sensor data fusion, which comprises the following steps. step1: obtain laser data, image data and IMU data. step2: mark the laser on a black calibration board, jointly calibrate the laser data and the image data, and obtain the rotation matrix and the translation matrix of the laser data in the camera coordinate system. step3: acquire the image information and laser point cloud data between two adjacent key frames, project the three-dimensional laser data onto the image through the rotation matrix and the translation matrix, and fuse the two so that the image data obtain depth information, forming depth image data. step4: perform feature tracking and motion estimation on the image containing depth information. step5: perform pre-integration processing on the IMU information between two adjacent key frames. step6: jointly optimize the IMU information and the depth image data to obtain the pose estimation result. Compared with the traditional calibration method, the method is simple to operate and high in calibration precision.

Description

Mobile robot pose estimation algorithm based on multi-sensor data fusion
Technical Field
The invention relates to the technical field of autonomous navigation, in particular to a mobile robot pose estimation algorithm based on multi-sensor fusion.
Background
In order to achieve accurate positioning and navigation of mobile robots in complex environments and to avoid the influence of factors such as ambient-light changes, rapid scene motion and dynamic objects on positioning and mapping, SLAM schemes based on multi-sensor fusion have become a main research direction of future mobile-robot SLAM technology. Vision sensors, laser sensors and inertial navigation sensors have been widely used in intelligent robot navigation; by fusing their data, a more accurate pose and three-dimensional map can be provided to the robot.
Disclosure of Invention
The invention provides a mobile robot pose estimation algorithm based on multi-sensor data fusion. The laser and the monocular camera are first jointly calibrated; compared with traditional calibration methods, the operation is simple and the calibration precision is higher. The visual information and the depth information are fused, and after feature extraction and matching they are jointly optimized with the IMU information to estimate the pose.
In order to solve the technical problems, the invention adopts the following technical scheme:
a mobile robot pose estimation algorithm based on multi-sensor data fusion comprises the following steps:
step1: obtaining laser data fed back by a laser radar of the mobile robot, image data fed back by a monocular camera and IMU data fed back by an inertial navigation positioning device;
step2: mark the laser on a black calibration board, jointly calibrate the laser data and the image data, and obtain the rotation matrix and the translation matrix of the laser data in the camera coordinate system;
step3: acquiring image information and laser point cloud data between two adjacent key frames, projecting three-dimensional laser data onto an image through a rotation matrix and a translation matrix, and fusing the two to enable the image data to obtain depth information to form depth image data;
step4: performing feature tracking and motion estimation on the image containing the depth information;
step5: performing pre-integration processing on IMU information between two adjacent key frames;
step6: jointly optimize the IMU information and the depth image data to obtain the pose estimation result.
Preferably, the laser radar, the monocular camera and the inertial navigation positioning device in step1 are all arranged on the mobile robot,
the working frequency of the laser radar is 5-20Hz, and the detection angle in the horizontal direction is 360 degrees; the angle which can be detected in the vertical direction is 30 degrees, 15 degrees above and below respectively, and the scanning effective radius is 1 meter to 100 meters;
the inertial navigation system is composed of a magnetometer, a gyroscope and an accelerometer and is used for detecting and obtaining a triaxial rotation angle and triaxial acceleration, carrying out initial calibration on a direction angle and correcting drift on a course angle;
the resolution of the monocular camera is 2080×1552, and the highest operating frequency is 60Hz.
Further, step2 performs laser abnormal data rejection in advance, and the laser abnormal data rejection formula is as follows:
where t_1, t_2, t_3 and t_4 denote different times with t_1 < t_2 < t_3 < t_4; l_1, l_2, l_3 and l_4 denote the distance between the laser emission point and the imaging point at times t_1, t_2, t_3 and t_4 respectively; c_1 denotes the distance between the two imaging points at times t_1 and t_2, c_2 the distance between the two imaging points at times t_2 and t_3, and c_3 the distance between the two imaging points at times t_3 and t_4; arccos is the inverse cosine in the inverse trigonometric functions, and v denotes the angular velocity of the laser rotation,
(1) when formula (1) is satisfied and formula (2) is not satisfied, the imaging point of the laser ray at time t_2 is rejected;
(2) when formula (2) is satisfied and formula (3) is not satisfied, the imaging point of the laser ray at time t_3 is rejected;
(3) when formulas (1) and (3) are satisfied simultaneously, the imaging points of the laser ray at times t_1 and t_4 are rejected.
Further, the joint calibration in Step2 includes the following steps:
step2.1: set the coordinate P_C of the laser data in the camera coordinate system; the calculation formula is as follows:
P_C = R P_L + t (4)
where P_C denotes the coordinates of the laser data in the camera coordinate system, composed of the three-dimensional vector (x_C, y_C, z_C)^T, with x_C, y_C and z_C being the coordinates of the laser data along the x, y and z axes of the camera coordinate system; R denotes the rotation matrix, parameterized by the three-dimensional vector (θ_x, θ_y, θ_z), where θ_x, θ_y and θ_z are the rotation angles about the x, y and z axes of the coordinate system; t denotes the translation matrix, composed of the three-dimensional vector (t_x, t_y, t_z), where t_x, t_y and t_z are the translation distances along the x, y and z axes of the coordinate system; P_L is the coordinate of the laser data in the laser's own coordinate system, composed of the three-dimensional vector (x_L, y_L, z_L)^T, with x_L, y_L and z_L denoting the coordinates of the laser data along the x, y and z axes of the laser coordinate system;
step2.2: introduce the camera intrinsic parameter matrix I to establish the relation between the image data and the laser data:
P = I P_C (5)
where P is the coordinate of the image data in the camera coordinate system, composed of the three-dimensional vector (u, v, l)^T, with u, v and l denoting the coordinates of the image data along the x, y and z axes of the camera coordinate system; the camera intrinsic parameter matrix I is:
where I is the camera intrinsic parameter matrix and f_x, f_y, f_z, u_x and u_y are all transformation parameters,
step2.3: solve for the optimal transformation parameters based on the Gauss-Newton iteration method to obtain the rotation matrix R and the translation matrix t:
P = f(P_C, x) (7)
where x is the transformation parameter vector, composed of (f_x, f_y, u_x, u_y, θ_x, θ_y, θ_z, t_x, t_y, t_z); f(P_C, x) denotes the function of the coordinates P_C of the laser data in the camera coordinate system and the parameters x; P_i denotes the coordinates of the i-th image data point in the camera coordinate system, P_{C,i} the coordinates of the i-th laser data point in the camera coordinate system, and x_i the transformation parameter vector corresponding to the i-th laser data point, with x_k denoting the parameter estimate at iteration k. At iteration k a first-order Taylor expansion is performed in the neighborhood of x_k; the Jacobian is the first derivative of g(P_C, x) with respect to x, Δx_k denotes the descent vector, the objective is the sum of the squared distances between the vectors P_i and the vectors f(P_{C,i}, x_k + Δx_k), and arg min denotes taking the parameter value that minimizes it.
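As an aid to reading equations (4) and (5), the following is a minimal Python sketch of how a calibrated rotation matrix, translation and intrinsic matrix would map a laser point to pixel coordinates and a depth value; the numeric calibration values and the function name are placeholders for illustration, not values taken from the patent.

```python
# A minimal sketch (not the patent's implementation) of equations (4) and (5):
# mapping a laser point P_L into the image with extrinsics (R, t) and the camera
# intrinsic matrix (I in the text, K below). All numeric values are placeholders.
import numpy as np

def project_laser_point(P_L, R, t, K):
    """Return pixel coordinates (u, v) and the depth of a 3D laser point."""
    P_C = R @ P_L + t                    # equation (4): laser frame -> camera frame
    p = K @ P_C                          # equation (5): camera frame -> image plane
    return p[0] / p[2], p[1] / p[2], P_C[2]   # P_C[2] is the depth fused in step3

# Placeholder calibration values (assumed for illustration only)
K = np.array([[600.0,   0.0, 320.0],
              [  0.0, 600.0, 240.0],
              [  0.0,   0.0,   1.0]])    # f_x, f_y, u_x, u_y
R = np.eye(3)                            # rotation from an assumed calibration
t = np.array([0.05, 0.0, 0.10])          # translation in metres (assumed)
u, v, depth = project_laser_point(np.array([1.0, 0.2, 4.0]), R, t, K)
```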
Further, the projection in step3 from time t_{j-1} to time t_j uses the following formula:
ΔS_{j-1,j} = R_x R_y R_z Δx (11)
where
ΔR_{j-1,j} is the triaxial rotation of the motion estimate between image frames j-1 and j computed by the visual odometry, ΔS_{j-1,j} is the triaxial distance of the motion estimate between frames j-1 and j computed by the visual odometry, R_x, R_y and R_z are the rotations about the x, y and z axes computed by the visual odometry, roll, pitch and yaw are the attitude angles about the x, y and z axes at the time of frame j, cos(roll) and sin(roll) are the cosine and sine of the x-axis attitude angle at the time of frame j, cos(pitch) and sin(pitch) are the cosine and sine of the y-axis attitude angle at the time of frame j, cos(yaw) and sin(yaw) are the cosine and sine of the z-axis attitude angle at the time of frame j, x_j, y_j and z_j are the displacements along the x, y and z axes at the time of frame j computed by the visual odometry, Δx denotes the displacement matrix, T denotes the time matrix, and t_j and t_{j-1} denote times t_j and t_{j-1} respectively.
Further, step4 includes the steps of:
step4.1: extracting feature key points from the image containing depth information, calculating descriptors, and finishing a feature extraction process;
step4.2: matching the extracted features to obtain feature matching results with fewer mismatching numbers;
step4.3: and calculating a motion transformation matrix by using the matched characteristic pairs.
Preferably, the descriptor is calculated in step4.1, and the hamming distance is used to represent the similarity between two feature points, and the smaller the distance is, the greater the similarity is.
Preferably, step4.3 adopts a random sample consistency algorithm to remove mismatching points, and purifies matched feature pairs.
Furthermore, step5 pre-integration processing comprises IMU modeling, a kinematic model, IMU pre-integration, a noise propagation model and a drift model, and IMU observation residual errors and covariances required in the optimization process are obtained through an IMU pre-integration algorithm.
Further, the step6 joint optimization includes the following steps:
step5.1, supposing that the visual observation value and the IMU observation value are mutually independent;
step5.2: applying a Bayes formula, and estimating the state quantity of the system by adopting the maximum posterior probability;
step5.3: under the assumption of zero-mean Gaussian noise, converting the maximum a posteriori estimation into an optimization problem;
step5.4: the optimized objective function residual is composed of three parts, namely a sliding window initial value residual, an IMU observation residual and a visual observation residual.
The invention has the beneficial effects that:
the utility model provides a mobile robot pose estimation algorithm based on multisensor data fusion, this document has designed the pose estimation algorithm based on multisensor data fusion, and laser and monocular camera jointly mark at first, compare in traditional calibration method, easy operation and calibration precision are higher. The visual information and the depth information are fused, and the pose estimation is carried out by carrying out joint optimization on the visual information and the depth information and the IMU information after feature extraction and matching.
(1) The invention is based on multi-sensor data fusion, and the fusion of the data can provide more accurate pose and three-dimensional map for the robot;
(2) The invention creatively provides a laser abnormal data eliminating formula, and the eliminated data is more accurate.
(3) According to the invention, black calibration paper is used, and the combined calibration parameters of the laser and the camera are solved, so that the accurate calibration parameters of the three-dimensional laser and the monocular camera are obtained.
(4) The invention solves the optimal transformation parameters based on the Gaussian Newton iteration method, and obtains the parameters of the rotation matrix R and the translation matrix t.
(5) Based on the problem that the image feature points need to be fused with scale information, the three-dimensional laser data is projected onto the image by a joint calibration method, so that the actual three-dimensional coordinates of part of the image data are obtained.
(6) In the joint optimization stage, the visual information and the IMU information are fused, and the camera pose is obtained through optimization; a sliding-window method is adopted to optimize the keyframe states during camera motion.
(7) Through measurement and computation on real data, the invention is compared with the visual-inertial odometry system OKVIS for pose tracking; the algorithm is superior to the OKVIS system in positioning accuracy and can be applied in real time.
Drawings
Fig. 1 is a flowchart of an algorithm provided by the present invention.
Detailed Description
The mobile robot pose estimation algorithm based on multi-sensor data fusion is further described in detail below with reference to the accompanying drawings and a specific implementation method.
Example 1
As shown in fig. 1, a flow chart of a mobile robot pose estimation algorithm for multi-sensor data fusion mainly comprises the following steps:
step1: obtaining laser data fed back by a laser radar of the mobile robot, image data fed back by a monocular camera and IMU data fed back by an inertial navigation positioning device;
step2: mark the laser on a black calibration board, jointly calibrate the laser data and the image data, and obtain the rotation matrix and the translation matrix of the laser data in the camera coordinate system;
step3: acquiring image information and laser point cloud data between two adjacent key frames, projecting three-dimensional laser data onto an image through a rotation matrix and a translation matrix, and fusing the two to enable the image data to obtain depth information to form depth image data; step4: performing feature tracking and motion estimation on the image containing the depth information;
step5: performing pre-integration processing on IMU information between two adjacent key frames;
step6: jointly optimize the IMU information and the depth image data to obtain the pose estimation result.
Example 2
Example 2 differs from example 1 only in that: step2 is further specified.
Specifically, in step2, firstly, a monocular camera, a laser radar and an inertial navigation positioning device are mounted on a mobile robot platform, wherein the working frequency of the laser radar is 5-20Hz, and the angle which can be detected in the horizontal direction is 360 degrees; the angle which can be detected in the vertical direction is 30 degrees, 15 degrees above and below each, and the scanning effective radius is 1 meter to 100 meters. The inertial navigation system consists of a magnetometer, a gyroscope and an accelerometer, and can detect and obtain a triaxial rotation angle and triaxial acceleration, and perform initial calibration on a direction angle and correction drift on a course angle. The resolution of the monocular camera used was 2080×1552, with a maximum operating frequency of 60Hz.
To obtain accurate three-dimensional laser and monocular camera calibration parameters, accurate laser and visual data matching pairs are required, black calibration paper is used here, 54cm long and 39cm wide.
To compute the joint calibration parameters of the laser and the camera, a rotation matrix R[θ_x, θ_y, θ_z] and a translation matrix t[t_x, t_y, t_z] must be found such that the formulas below hold, where P_L = (x_L, y_L, z_L)^T is the coordinate of the three-dimensional laser data in the laser coordinate system, P_C = (x_C, y_C, z_C)^T is the coordinate of the laser data in the camera coordinate system, and P = (u, v, l)^T is the coordinate of the three-dimensional laser point in the camera coordinate system, as in equations (14) and (15). The monocular camera used here obtains the camera intrinsic matrix I by the Zhang Zhengyou calibration method.
P_C = R P_L + t (14)
P = I P_C = I(R P_L + t) (15)
From the obtained corner pairs formed by the laser corner points and the image corner points, the rotation-translation matrix between the two coordinate systems is solved with the Gauss-Newton method. For convenience of calculation, the laser and image corner points are written in homogeneous form as P_i = (x_i, y_i, z_i, 1) and P_i = (u_i, v_i, 1) respectively. The parameter vector of the transformation of the laser data from the laser coordinate system to the image coordinate system is x = (f_x, f_y, u_x, u_y, θ_x, θ_y, θ_z, t_x, t_y, t_z), where f_x, f_y, u_x, u_y are the camera intrinsics, θ_x, θ_y, θ_z are the rotation angles about the three axes, and t_x, t_y, t_z are the translation distances along the three axes of the coordinate system. The transformation of the laser data from the laser coordinate system to the camera coordinate system can be represented by equation (17):
P = f(P_C, x) (17)
where x is the transformation parameter vector composed of (f_x, f_y, u_x, u_y, θ_x, θ_y, θ_z, t_x, t_y, t_z); f(P_C, x) denotes the function of the coordinates P_C of the laser data in the camera coordinate system and the parameters x; P_i denotes the coordinates of the i-th image data point, P_{C,i} the coordinates of the i-th laser data point in the camera coordinate system, x_i the transformation parameter vector corresponding to the i-th laser data point, and x_k the parameter estimate at iteration k.
The optimization is solved with the Gauss-Newton iteration method: at iteration k a first-order Taylor expansion is performed in the neighborhood of the current estimate x_k, giving
where the Jacobian, i.e. the first derivative of g(P_C, x) with respect to x, is computed. The goal is to find the descent vector Δx_k that minimizes the following expression:
Δx_k denotes the descent vector, the expression is the sum of the squared distances between the vectors P_i and the vectors f(P_{C,i}, x_k + Δx_k), and arg min denotes taking the parameter value that minimizes it.
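For illustration only, the following is a minimal numerical sketch of such a Gauss-Newton refinement of the transformation parameters x = (f_x, f_y, u_x, u_y, θ_x, θ_y, θ_z, t_x, t_y, t_z). It uses a finite-difference Jacobian instead of the analytic one implied above, and the residual model, step sizes and function names are the sketch's own assumptions, not the patent's implementation.

```python
# Hedged Gauss-Newton sketch for refining the laser-camera calibration parameters.
# Residual: pixel reprojection error of each laser corner against its matched image corner.
import numpy as np

def residuals(x, laser_pts, image_pts):
    fx, fy, ux, uy, rx, ry, rz, tx, ty, tz = x
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    R, t = Rz @ Ry @ Rx, np.array([tx, ty, tz])
    res = []
    for P_L, (u_obs, v_obs) in zip(laser_pts, image_pts):
        P_C = R @ P_L + t                          # laser frame -> camera frame
        res += [fx * P_C[0] / P_C[2] + ux - u_obs, # reprojection error in u
                fy * P_C[1] / P_C[2] + uy - v_obs] # reprojection error in v
    return np.array(res)

def gauss_newton(x0, laser_pts, image_pts, iters=20, eps=1e-6):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residuals(x, laser_pts, image_pts)
        # Finite-difference Jacobian of the residual vector with respect to x
        J = np.column_stack([
            (residuals(x + np.eye(len(x))[i] * eps, laser_pts, image_pts) - r) / eps
            for i in range(len(x))])
        dx = np.linalg.solve(J.T @ J, -J.T @ r)    # normal equations: J^T J dx = -J^T r
        x += dx
        if np.linalg.norm(dx) < 1e-8:
            break
    return x
```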
Example 3
Example 3 differs from example 2 only in that: step3 is specifically defined.
Specifically, in step3, image information and laser point cloud data between two adjacent key frames are acquired, three-dimensional laser data are projected onto an image, and the two are fused, so that the image data obtain depth information;
based on the problem that the image feature points need to be fused with scale information, three-dimensional laser data are projected onto an image through the joint calibration method, so that part of image data can obtain actual three-dimensional coordinates.
Since the camera operates at 60 Hz and the laser at 10 Hz, the laser data timestamps are not necessarily fully aligned with the image data timestamps. The laser data are obtained at time t and are supposed to be projected onto the image at time t_{j-1}; because motion has already occurred between time t and time t_{j-1}, a certain rotation-translation transformation of the laser data is required.
Suppose the motion between frame k-1 and frame k needs to be solved and the visual odometry motion before frame k-1 has been computed. At this point the scale information of the feature points on frame k-1 is needed, so the laser data at the latest time t must be projected onto frame k-1, which requires a certain rotation-translation transformation of the laser data. The transformation process is as follows, assuming a linear motion model from time t_{j-1} to time t_j.
ΔS_{j-1,j} = R_x R_y R_z Δx (22)
where
ΔR_{j-1,j} is the triaxial rotation (θ_x, θ_y, θ_z) of the motion estimate between image frames j-1 and j computed by the visual odometry, roll, pitch and yaw are the triaxial attitude angles at the time of frame j, and x_j, y_j and z_j are the triaxial displacements computed by the visual odometry at that time.
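A hedged sketch of this linear-motion correction is given below: the visual-odometry rotation (roll, pitch, yaw) and displacement are scaled by the fraction of the frame interval that has elapsed before being applied to the laser point. The constant-velocity interpolation and all variable names are illustrative assumptions, not the patent's code.

```python
# Re-project a laser point measured at time t into the keyframe at time t_{j-1},
# assuming linear (constant-velocity) motion between t_{j-1} and t_j.
import numpy as np

def axis_rotations(roll, pitch, yaw):
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rx, Ry, Rz

def laser_to_keyframe(p_laser, roll, pitch, yaw, disp, t, t_prev, t_curr):
    """Transform a laser point taken at time t into the t_{j-1} (= t_prev) frame."""
    alpha = (t - t_prev) / (t_curr - t_prev)              # elapsed fraction of the interval
    Rx, Ry, Rz = axis_rotations(alpha * roll, alpha * pitch, alpha * yaw)
    return Rx @ Ry @ Rz @ p_laser + alpha * np.asarray(disp)   # scaled rotation-translation
```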
With the intrinsics of the grayscale camera known, the joint calibration parameters of the three-dimensional laser and the monocular camera are then obtained by the joint calibration method above, and the three-dimensional laser points are projected onto the two-dimensional grayscale image.
Example 4
Example 4 differs from example 3 only in that a specific implementation is given for step4.
Specifically, in Step4, since feature extraction and matching are only one link in the visual odometry, taking too long on this link would inevitably affect the efficiency of the whole system, so the ORB algorithm, which performs well and has a very clear speed advantage, is selected. Feature points are extracted from the image with the FAST algorithm, and to achieve rotation invariance ORB uses the gray centroid method:
assuming that an image block with a feature point as a geometric center is B, the moment of the image block is defined as:
m_{pq} = Σ_{x,y∈B} x^p y^q I(x,y),  p, q = {0, 1} (25)
wherein I (x, y) is the gray value of the point (x, y). The centroid of the image block is easily obtained by the moment of the image block:
connecting the feature point (geometric center O of the image block) with the centroid C to obtain a direction vectorThe direction of the feature points can be defined as follows:
the FAST corner has a direction by a gray centroid method. After the key points are obtained, the descriptors of the key points are calculated. The function of the descriptor is to match whether two feature points are the same feature point. The general descriptor is constructed by drawing a plurality of circles around the key points, and calculating the pixel points in the circles to obtain a vector. This vector can reflect the characteristics of the pixels surrounding this keypoint. I.e. it may describe the surrounding characteristics of this feature point at present. Because the same feature points are certainly similar around, a descriptor can be used to determine whether two feature points are identical.
For binary BRIEF descriptors, the Hamming distance is used to represent the degree of similarity between two feature points: the smaller the distance, the greater the similarity. Let A = a_1 a_2 ... a_n and B = b_1 b_2 ... b_n be two binary strings, where a_n and b_n denote individual bits; the Hamming distance between A and B is the number of bit positions in which the two strings differ.
For each feature point, only the Hamming distances to all feature points of the matching image need to be measured, and the nearest neighbor is then selected as its initial corresponding match. A threshold is set; when the distance between descriptors is greater than this threshold, the match is deleted as a false match. Although thresholding initially filters out some mismatches, some inaccurate matches still remain, and the matching result needs to be purified with the RANSAC (random sample consensus) algorithm. After RANSAC further eliminates the mismatches, the number of inaccurate matches is greatly reduced, the matching accuracy is improved, and a more reliable basis is provided for the subsequent pose estimation.
Example 5
Example 5 differs from example 1 only in that: one specific implementation is given for step5.
In Step5, pre-integration processing is performed on the IMU information. First, the IMU sensor data are modeled mathematically; taking the measurement white noise and the random-walk noise of the measurement process into account, an IMU measurement model is constructed.
W denotes the world coordinate system and B denotes the IMU (body) coordinate system. The white noise follows a Gaussian distribution, while the random-walk noise follows Brownian motion, whose derivative is Gaussian-distributed noise.
A kinematic differential equation of the system is established:
In a visual odometry system, the original state quantity x_i of the system is constantly optimized and changed, so the integration process would need to be carried out again each time; by separating out the part that does not depend on the previous system state, the following result is obtained:
where:
The IMU pre-integration variable is defined as Y_IMU = (ΔR_ij, Δv_ij, Δp_ij). IMU pre-integration is carried out first to obtain Y_IMU; when the estimated state R_i, v_i, p_i of the system at time i changes, the estimated state R_j, v_j, p_j of the system at time j can be computed quickly through the pre-integration.
The pre-integration variable Y_IMU = (ΔR_ij, Δv_ij, Δp_ij) defined here can be separated into a noise-free part and its corresponding noise. The noise-free part is defined as the IMU observation, and the covariance of its corresponding noise is the covariance Σ_ij in the optimization objective function. Next, the residual and the covariance Σ_ij corresponding to the IMU observation are analyzed.
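A minimal IMU pre-integration sketch under the assumptions above (white measurement noise plus random-walk bias) is given below: it accumulates ΔR_ij, Δv_ij, Δp_ij from the raw gyroscope and accelerometer samples between two keyframes so that they need not be re-integrated when the keyframe states are re-optimized. Bias updates and the noise-propagation/covariance computation are omitted; function and variable names are the sketch's own.

```python
# Accumulate the pre-integration variable Y_IMU = (dR_ij, dv_ij, dp_ij) between keyframes.
import numpy as np

def skew(w):
    return np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])

def so3_exp(phi):
    """Rodrigues formula: map a rotation vector to a rotation matrix."""
    angle = np.linalg.norm(phi)
    if angle < 1e-9:
        return np.eye(3) + skew(phi)
    axis = phi / angle
    K = skew(axis)
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

def preintegrate(gyro, accel, dt, gyro_bias, accel_bias):
    """gyro, accel: (N, 3) raw samples between keyframes i and j; dt: sample period."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        a_unb = a - accel_bias
        dp += dv * dt + 0.5 * (dR @ a_unb) * dt ** 2   # position increment
        dv += (dR @ a_unb) * dt                        # velocity increment
        dR = dR @ so3_exp((w - gyro_bias) * dt)        # rotation increment
    return dR, dv, dp
```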
Example 6
Example 6 differs from example 1 only in that a specific implementation is given for step 6.
Specifically, in Step6, in the joint optimization stage, the visual information and the IMU information are fused and the camera pose is obtained by optimization. A sliding-window method is adopted to optimize the keyframe states during camera motion, and the observations of the system are divided into two parts: the visual measurements and the IMU measurements. The visual measurements of the system at keyframe time i are defined as the set C_i; the 3-dimensional feature point l observed by the camera in C_i is denoted z_il. All IMU observations between keyframe i and keyframe j are defined as the set I_ij. During optimization, the observations Z_k of the system are known and the state quantity X_k of the system is solved. Assuming that the visual observations and the IMU observations are mutually independent and that the visual measurement at time i depends only on the state at time i, the Bayes formula is applied and the state quantities of the system are estimated by maximum a posteriori probability; under the assumption of zero-mean Gaussian noise, the maximum a posteriori problem is converted into an optimization problem, and the optimization objective function is defined as:
In the optimization objective function, the residual consists of three parts: the sliding-window initial value residual r_0, the IMU observation residual and the visual observation residual, with corresponding covariances Σ_0, Σ_ij and Σ_C respectively. Based on the assumption that the visual and IMU observations are mutually independent, the visual observation residual and the IMU observation residual are mutually independent, and their corresponding covariances are mutually independent.
Example 7
Example 7 differs from example 1 only in that: the step2 adds a culling formula for abnormal data.
Specifically, step2 performs laser abnormal data rejection in advance, and the laser abnormal data rejection formula is:
where t_1, t_2, t_3 and t_4 denote different times with t_1 < t_2 < t_3 < t_4; l_1, l_2, l_3 and l_4 denote the distance between the laser emission point and the imaging point at times t_1, t_2, t_3 and t_4 respectively; c_1 denotes the distance between the two imaging points at times t_1 and t_2, c_2 the distance between the two imaging points at times t_2 and t_3, and c_3 the distance between the two imaging points at times t_3 and t_4; arccos is the inverse cosine in the inverse trigonometric functions, and v denotes the angular velocity of the laser rotation,
(1) when formula (1) is satisfied and formula (2) is not satisfied, the imaging point of the laser ray at time t_2 is rejected;
(2) when formula (2) is satisfied and formula (3) is not satisfied, the imaging point of the laser ray at time t_3 is rejected;
(3) when formulas (1) and (3) are satisfied simultaneously, the imaging points of the laser ray at times t_1 and t_4 are rejected.
The pose estimation algorithm based on multi-sensor data fusion is designed, and firstly, laser and monocular camera are calibrated in a combined mode, and compared with a traditional calibration method, the pose estimation algorithm is simple to operate and high in calibration accuracy. The visual information and the depth information are fused, and the pose estimation is carried out by carrying out joint optimization on the visual information and the depth information and the IMU information after feature extraction and matching.
The algorithm presented here was compared with the well-known visual-inertial odometry system OKVIS for pose tracking, and experiments were performed using the MH_05_difficult sequence of the EuRoC dataset.
Table 1. MH_05_difficult sequence experiment results
where ATE denotes the absolute trajectory error, describing the difference between the ground-truth points and the estimated trajectory points, and RPE denotes the relative pose error, describing the error of the relative motion between frames. From the comparison table, the ATE, RPE and average rotation error of the improved algorithm are all lower than those of the OKVIS system; its average time consumption is higher than that of OKVIS but remains below 30 ms, far below the 50 ms interval between two images of the dataset, which indicates that the algorithm is superior to the OKVIS system in positioning accuracy and can be applied in real time.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of the above description, will appreciate that other embodiments are contemplated within the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit thereof. The disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention.
The foregoing is only a preferred embodiment of the invention, it being noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the invention.

Claims (7)

1. The mobile robot pose estimation algorithm based on multi-sensor data fusion is characterized by comprising the following steps of:
step1: obtaining laser data fed back by a laser radar of the mobile robot, image data fed back by a monocular camera and IMU data fed back by an inertial navigation positioning device;
step2: mark the laser on a black calibration board, jointly calibrate the laser data and the image data, and obtain the rotation matrix and the translation matrix of the laser data in the camera coordinate system;
step3: acquiring image information and laser point cloud data between two adjacent key frames, projecting three-dimensional laser data onto an image through a rotation matrix and a translation matrix, and fusing the two to enable the image data to obtain depth information to form depth image data;
step4: performing feature tracking and motion estimation on the image containing the depth information;
step5: performing pre-integration processing on IMU information between two adjacent key frames;
step6: performing joint optimization on the IMU information and the depth image data to obtain a pose estimation result;
wherein, joint calibration in Step2 includes the following steps:
step2.1: set the coordinate P_C of the laser data in the camera coordinate system; the calculation formula is as follows:
P_C = R P_L + t (4)
where P_C denotes the coordinates of the laser data in the camera coordinate system, composed of the three-dimensional vector (x_C, y_C, z_C)^T, with x_C, y_C and z_C being the coordinates of the laser data along the x, y and z axes of the camera coordinate system; R denotes the rotation matrix, parameterized by the three-dimensional vector (θ_x, θ_y, θ_z), where θ_x, θ_y and θ_z are the rotation angles about the x, y and z axes of the coordinate system; t denotes the translation matrix, composed of the three-dimensional vector (t_x, t_y, t_z), where t_x, t_y and t_z are the translation distances along the x, y and z axes of the coordinate system; P_L is the coordinate of the laser data in the laser's own coordinate system, composed of the three-dimensional vector (x_L, y_L, z_L)^T, with x_L, y_L and z_L denoting the coordinates of the laser data along the x, y and z axes of the laser coordinate system;
step2.2: introduce the camera intrinsic parameter matrix I to establish the relation between the image data and the laser data:
P = I P_C (5)
where P is the coordinate of the image data in the camera coordinate system, composed of the three-dimensional vector (u, v, l)^T, with u, v and l denoting the coordinates of the image data along the x, y and z axes of the camera coordinate system; the camera intrinsic parameter matrix I is:
where I is the camera intrinsic parameter matrix and f_x, f_y, f_z, u_x and u_y are all transformation parameters,
step2.3: solve for the optimal transformation parameters based on the Gauss-Newton iteration method to obtain the rotation matrix R and the translation matrix t:
P = f(P_C, x) (7)
where x is the transformation parameter vector, composed of (f_x, f_y, u_x, u_y, θ_x, θ_y, θ_z, t_x, t_y, t_z); f(P_C, x) denotes the function of the coordinates P_C of the laser data in the camera coordinate system and the parameters x; P_i denotes the coordinates of the i-th image data point in the camera coordinate system, P_{C,i} the coordinates of the i-th laser data point in the camera coordinate system, and x_i the transformation parameter vector corresponding to the i-th laser data point, with x_k denoting the parameter estimate at iteration k; at iteration k a first-order Taylor expansion is performed in the neighborhood of x_k, the Jacobian is the first derivative of g(P_C, x) with respect to x, Δx_k denotes the descent vector, the objective is the sum of the squared distances between the vectors P_i and the vectors f(P_{C,i}, x_k + Δx_k), and arg min denotes taking the parameter value that minimizes it;
the projection in step3 from time t_{j-1} to time t_j uses the following formula:
ΔS_{j-1,j} = R_x R_y R_z Δx (11)
where
ΔR_{j-1,j} is the triaxial rotation of the motion estimate between image frames j-1 and j computed by the visual odometry, ΔS_{j-1,j} is the triaxial distance of the motion estimate between frames j-1 and j computed by the visual odometry, R_x, R_y and R_z are the rotations about the x, y and z axes computed by the visual odometry, roll, pitch and yaw are the attitude angles about the x, y and z axes at the time of frame j, cos(roll) and sin(roll) are the cosine and sine of the x-axis attitude angle at the time of frame j, cos(pitch) and sin(pitch) are the cosine and sine of the y-axis attitude angle at the time of frame j, cos(yaw) and sin(yaw) are the cosine and sine of the z-axis attitude angle at the time of frame j, x_j, y_j and z_j are the displacements along the x, y and z axes at the time of frame j computed by the visual odometry, Δx denotes the displacement matrix, T denotes the time matrix, and t_j and t_{j-1} denote times t_j and t_{j-1} respectively;
joint optimization in step6 includes the steps of:
step5.1, supposing that the visual observation value and the IMU observation value are mutually independent;
step5.2: applying a Bayes formula, and estimating the state quantity of the system by adopting the maximum posterior probability;
step5.3: under the assumption of zero-mean Gaussian noise, converting the maximum a posteriori estimation into an optimization problem;
step5.4: the optimized objective function residual is composed of three parts, namely a sliding window initial value residual, an IMU observation residual and a visual observation residual.
2. The mobile robot pose estimation algorithm based on multi-sensor data fusion according to claim 1, wherein the laser radar, the monocular camera and the inertial navigation positioning device in step1 are all installed on the mobile robot,
the working frequency of the laser radar is 5-20Hz, and the detection angle in the horizontal direction is 360 degrees; the angle which can be detected in the vertical direction is 30 degrees, 15 degrees above and below respectively, and the scanning effective radius is 1 meter to 100 meters;
the inertial navigation system is composed of a magnetometer, a gyroscope and an accelerometer and is used for detecting and obtaining a triaxial rotation angle and triaxial acceleration, carrying out initial calibration on a direction angle and correcting drift on a course angle;
the resolution of the monocular camera is 2080×1552, and the highest operating frequency is 60Hz.
3. The mobile robot pose estimation algorithm based on multi-sensor data fusion according to claim 1, wherein Step2 performs laser abnormal data rejection in advance, and a laser abnormal data rejection formula is as follows:
where t_1, t_2, t_3 and t_4 denote different times with t_1 < t_2 < t_3 < t_4; l_1, l_2, l_3 and l_4 denote the distance between the laser emission point and the imaging point at times t_1, t_2, t_3 and t_4 respectively; c_1 denotes the distance between the two imaging points at times t_1 and t_2, c_2 the distance between the two imaging points at times t_2 and t_3, and c_3 the distance between the two imaging points at times t_3 and t_4; arccos is the inverse cosine in the inverse trigonometric functions, and v denotes the angular velocity of the laser rotation,
(1) when formula (1) is satisfied and formula (2) is not satisfied, the imaging point of the laser ray at time t_2 is rejected;
(2) when formula (2) is satisfied and formula (3) is not satisfied, the imaging point of the laser ray at time t_3 is rejected;
(3) when formulas (1) and (3) are satisfied simultaneously, the imaging points of the laser ray at times t_1 and t_4 are rejected.
4. The mobile robot pose estimation algorithm based on multi-sensor data fusion according to claim 1, wherein Step4 comprises the steps of:
step4.1: extracting feature key points from the image containing depth information, calculating descriptors, and finishing a feature extraction process;
step4.2: matching the extracted features to obtain feature matching results with fewer mismatching numbers;
step4.3: and calculating a motion transformation matrix by using the matched characteristic pairs.
5. The multi-sensor data fusion-based mobile robot pose estimation algorithm according to claim 4, wherein the descriptor is calculated in step4.1, and a hamming distance is used to represent the similarity between two feature points, and the smaller the distance is, the greater the similarity is.
6. The mobile robot pose estimation algorithm based on multi-sensor data fusion according to claim 4, wherein step4.3 adopts a random sample consistency algorithm to reject mismatching points, and the matched feature pairs are purified.
7. The mobile robot pose estimation algorithm based on multi-sensor data fusion according to claim 1, wherein step5 pre-integration processing comprises an IMU modeling part, a kinematic model part, an IMU pre-integration part, a noise propagation model part and a drift model part, and IMU observation residual errors and covariances required in an optimization process are obtained through the IMU pre-integration algorithm.
CN202110422808.9A 2021-04-16 2021-04-16 Mobile robot pose estimation algorithm based on multi-sensor data fusion Active CN113052908B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110422808.9A CN113052908B (en) 2021-04-16 2021-04-16 Mobile robot pose estimation algorithm based on multi-sensor data fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110422808.9A CN113052908B (en) 2021-04-16 2021-04-16 Mobile robot pose estimation algorithm based on multi-sensor data fusion

Publications (2)

Publication Number Publication Date
CN113052908A CN113052908A (en) 2021-06-29
CN113052908B true CN113052908B (en) 2023-08-04

Family

ID=76519539

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110422808.9A Active CN113052908B (en) 2021-04-16 2021-04-16 Mobile robot pose estimation algorithm based on multi-sensor data fusion

Country Status (1)

Country Link
CN (1) CN113052908B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113587904B (en) * 2021-07-29 2022-05-20 中国科学院西安光学精密机械研究所 Target attitude and position measurement method integrating machine vision and laser reference point information
CN113610149B (en) * 2021-08-05 2024-03-26 上海氢枫能源技术有限公司 Method and system for displaying pose of hydrogen compressor in real time
CN113721260B (en) * 2021-08-26 2023-12-12 南京邮电大学 Online combined calibration method for laser radar, binocular camera and inertial navigation
CN113902847B (en) * 2021-10-11 2024-04-16 岱悟智能科技(上海)有限公司 Monocular depth image pose optimization method based on three-dimensional feature constraint
CN114047766B (en) * 2021-11-22 2023-11-21 上海交通大学 Mobile robot data acquisition system and method for long-term application of indoor and outdoor scenes
CN114111681B (en) * 2021-11-24 2023-06-06 福建汉特云智能科技有限公司 Wheelbase calibration method and system for robot chassis
CN114998389A (en) * 2022-06-20 2022-09-02 珠海格力电器股份有限公司 Indoor positioning method
CN115655264A (en) * 2022-09-23 2023-01-31 智己汽车科技有限公司 Pose estimation method and device
CN115468560B (en) * 2022-11-03 2023-03-24 国网浙江省电力有限公司宁波供电公司 Quality inspection method, robot, device and medium based on multi-sensing information fusion

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045263A (en) * 2015-07-06 2015-11-11 杭州南江机器人股份有限公司 Kinect-based robot self-positioning method
CN107564061A (en) * 2017-08-11 2018-01-09 浙江大学 A kind of binocular vision speedometer based on image gradient combined optimization calculates method
CN109141396A (en) * 2018-07-16 2019-01-04 南京航空航天大学 The UAV position and orientation estimation method that auxiliary information is merged with random sampling unification algorism
CN111047620A (en) * 2019-11-15 2020-04-21 广东工业大学 Unmanned aerial vehicle visual odometer method based on depth point-line characteristics
WO2020155616A1 (en) * 2019-01-29 2020-08-06 浙江省北大信息技术高等研究院 Digital retina-based photographing device positioning method
CN111595333A (en) * 2020-04-26 2020-08-28 武汉理工大学 Modularized unmanned vehicle positioning method and system based on visual inertial laser data fusion
CN111982114A (en) * 2020-07-30 2020-11-24 广东工业大学 Rescue robot for estimating three-dimensional pose by adopting IMU data fusion

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107747941B (en) * 2017-09-29 2020-05-15 歌尔股份有限公司 Binocular vision positioning method, device and system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045263A (en) * 2015-07-06 2015-11-11 杭州南江机器人股份有限公司 Kinect-based robot self-positioning method
CN107564061A (en) * 2017-08-11 2018-01-09 浙江大学 A kind of binocular vision speedometer based on image gradient combined optimization calculates method
CN109141396A (en) * 2018-07-16 2019-01-04 南京航空航天大学 The UAV position and orientation estimation method that auxiliary information is merged with random sampling unification algorism
WO2020155616A1 (en) * 2019-01-29 2020-08-06 浙江省北大信息技术高等研究院 Digital retina-based photographing device positioning method
CN111047620A (en) * 2019-11-15 2020-04-21 广东工业大学 Unmanned aerial vehicle visual odometer method based on depth point-line characteristics
CN111595333A (en) * 2020-04-26 2020-08-28 武汉理工大学 Modularized unmanned vehicle positioning method and system based on visual inertial laser data fusion
CN111982114A (en) * 2020-07-30 2020-11-24 广东工业大学 Rescue robot for estimating three-dimensional pose by adopting IMU data fusion

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Research on SLAM Technology Based on Fusion of Inertial Navigation and Binocular Vision; Yang Mengjia; China Master's Theses Full-text Database, Information Science and Technology; full text *
Research on Visual-Inertial Odometry for Mobile Robots with Point-Line Feature Fusion; Wang Jie; China Master's Theses Full-text Database, Information Science and Technology; full text *
Research on Binocular Vision SLAM Fused with IMU Information; Xu Kuan; China Master's Theses Full-text Database, Information Science and Technology; full text *

Also Published As

Publication number Publication date
CN113052908A (en) 2021-06-29

Similar Documents

Publication Publication Date Title
CN113052908B (en) Mobile robot pose estimation algorithm based on multi-sensor data fusion
CN111258313B (en) Multi-sensor fusion SLAM system and robot
US9990726B2 (en) Method of determining a position and orientation of a device associated with a capturing device for capturing at least one image
CN110044354A (en) A kind of binocular vision indoor positioning and build drawing method and device
CN113781582A (en) Synchronous positioning and map creating method based on laser radar and inertial navigation combined calibration
CN112815939B (en) Pose estimation method of mobile robot and computer readable storage medium
CN110081881A (en) It is a kind of based on unmanned plane multi-sensor information fusion technology warship bootstrap technique
CN104281148A (en) Mobile robot autonomous navigation method based on binocular stereoscopic vision
CN115371665B (en) Mobile robot positioning method based on depth camera and inertial fusion
CN110032201A (en) A method of the airborne visual gesture fusion of IMU based on Kalman filtering
CN111998862A (en) Dense binocular SLAM method based on BNN
Aufderheide et al. Towards real-time camera egomotion estimation and three-dimensional scene acquisition from monocular image streams
CN114608554A (en) Handheld SLAM equipment and robot instant positioning and mapping method
CN112179373A (en) Measuring method of visual odometer and visual odometer
CN116772844A (en) Navigation method of visual inertial indoor robot based on dynamic environment
CN115218889A (en) Multi-sensor indoor positioning method based on dotted line feature fusion
CN109785428A (en) A kind of handheld three-dimensional method for reconstructing based on polymorphic constrained Kalman filter
Bikmaev et al. Improving the accuracy of supporting mobile objects with the use of the algorithm of complex processing of signals with a monocular camera and LiDAR
Wu et al. AFLI-Calib: Robust LiDAR-IMU extrinsic self-calibration based on adaptive frame length LiDAR odometry
CN112731503A (en) Pose estimation method and system based on front-end tight coupling
De Marco et al. Position, velocity, attitude and accelerometer-bias estimation from imu and bearing measurements
CN116380079A (en) Underwater SLAM method for fusing front-view sonar and ORB-SLAM3
CN115930948A (en) Orchard robot fusion positioning method
CN115574816A (en) Bionic vision multi-source information intelligent perception unmanned platform
Hu et al. Efficient Visual-Inertial navigation with point-plane map

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant