CN111076733B - Robot indoor map building method and system based on vision and laser slam - Google Patents

Robot indoor map building method and system based on vision and laser slam

Info

Publication number
CN111076733B
CN111076733B (application number CN201911257094.XA)
Authority
CN
China
Prior art keywords
pose
robot
laser
grid
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911257094.XA
Other languages
Chinese (zh)
Other versions
CN111076733A (en)
Inventor
林欢
程敏
许春山
毛成林
王�锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yijiahe Technology Co Ltd
Original Assignee
Yijiahe Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yijiahe Technology Co Ltd filed Critical Yijiahe Technology Co Ltd
Priority to CN201911257094.XA priority Critical patent/CN111076733B/en
Publication of CN111076733A publication Critical patent/CN111076733A/en
Application granted granted Critical
Publication of CN111076733B publication Critical patent/CN111076733B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/26 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network
    • G01C21/28 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 specially adapted for navigation in a road network with correlation of data from several navigational instruments
    • G01C21/30 Map- or contour-matching
    • G01C21/32 Structuring or formatting of map data
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00

Abstract

The invention discloses a robot indoor map building method based on vision and laser slam, which comprises the following steps: a vision sensor acquires a left view and a right view and generates a feature point cloud; a vision-based pose transformation estimate of the robot is calculated; laser data are obtained, and the confidence of the laser-based pose transformation estimate of the robot and the map grid update are calculated; subgraphs are generated from continuous laser point cloud data with a tile-type mapping method; the obtained feature points are matched against each subgraph; and the robot poses in each subgraph are updated through graph optimization to form the final mapping result. By combining vision and laser mapping, the invention on the one hand mitigates the sensitivity of the visual algorithm to illumination, and on the other hand detects closed loops with the descriptor attributes of the feature points, which addresses the low success rate of laser closed-loop detection and improves the efficiency of closed-loop detection.

Description

Robot indoor map building method and system based on vision and laser slam
Technical Field
The invention relates to the field of inspection robots, in particular to a robot indoor map building method and system based on vision and laser slam.
Background
The quality of the indoor navigation environment map directly affects the robot's ability to complete its various inspection tasks. To create a map, information about the surroundings must first be obtained, which is done with sensors. Current research on robot mapping concentrates mainly on two kinds of sensors. The first is the laser sensor, which offers high resolution, high ranging accuracy, strong resistance to active interference, good detection performance, and insensitivity to lighting; its drawbacks are that the acquired data carry no semantic information, registration accuracy is limited, and closed-loop detection often fails during mapping. The second is the vision sensor, which captures rich texture information and matches accurately, but is strongly affected by illumination changes.
To overcome the respective shortcomings of the laser sensor and the vision sensor, the invention provides a robot indoor map building method and system based on vision and laser slam.
Disclosure of Invention
In order to solve the above problems, the present invention provides a robot indoor map building method based on vision and laser slam, comprising the following steps:
the visual sensor acquires a left view and a right view, and generates a feature point cloud, which specifically comprises the following steps:
extracting feature points of the left view and the right view,
the obtained feature points are matched with each other,
calculating the space coordinates of the matched characteristic points;
calculating a vision-based pose transformation estimate of the robot, namely performing secondary matching between the feature points matched across the left and right views at the current moment and the feature points matched across the left and right views at the previous moment, obtaining the pose transformation relation of the robot from the coordinate changes and the principal-direction changes of the secondarily matched feature points, projecting the descriptors of the feature points onto the corresponding map grids, and defining the feature point descriptor attribute of those grids;
obtaining laser data, and calculating the confidence of the laser-based pose transformation estimate of the robot and the map grid update;
generating subgraphs from continuous laser point cloud data using a tile-type mapping method;
closed-loop detection, namely matching the descriptors of the obtained feature points against the feature point descriptor attribute of each grid of each subgraph;
graph optimization, which specifically comprises: representing the poses of the robot by the nodes of a graph and the transformation relations between the poses by the edges connecting the nodes, calculating the error between each robot pose and the estimated pose obtained through the transformation matrix, calculating the error objective function of the graph with several edges, obtaining through multiple iterations the robot poses that make the objective function converge, and using them to update the robot poses in each subgraph, thereby forming the final mapping result.
Further, the extracting feature points of the left view and the right view comprises:
performing FAST corner detection on the left view and the right view by using an ORB algorithm, and extracting a characteristic point P;
establishing a coordinate system with the feature point P as the origin, calculating the centroid position C within the neighborhood S, and constructing the vector PC with the feature point P as the start point and the centroid C as the end point;
the moments of the neighborhood S can be expressed as:
m_pq = Σ_{x,y∈S} x^p · y^q · Y(x,y)
wherein Y(x, y) is the gray value of the image, x, y ∈ [−r, r], r is the radius of the neighborhood S, and p, q take the value 0 or 1; the centroid position of the neighborhood is then:
C = (m_10/m_00, m_01/m_00)
the principal direction θ of the feature point P is:
θ = arctan(m_01/m_10);
generating a characteristic point descriptor:
within the neighborhood S of the feature point P, a random set of n pairs of test points (i_k, j_k), k = 1, 2, …, n, is generated and represented by a 2×n matrix:
D = (i_1 i_2 … i_n; j_1 j_2 … j_n)
according to the principal direction θ of the feature point and the corresponding rotation matrix R_θ, the rotated 2×n matrix is calculated:
D_θ = R_θ·D
wherein
R_θ = (cos θ, −sin θ; sin θ, cos θ)
the binary test criterion in the neighborhood S of the feature point P may be defined as:
τ(Y; i_θ, j_θ) = 1 if Y(i_θ) < Y(j_θ), and 0 otherwise
wherein (i_θ, j_θ) are the rotated point pairs taken from D_θ, τ denotes the binary test criterion, Y(i_θ) represents the gray value of point i_θ, and Y(j_θ) represents the gray value of point j_θ;
the rotation-aware descriptor of the feature point P is then obtained:
f(P) = Σ_{k=1}^{k'} 2^(k−1)·τ(Y; i_θ,k, j_θ,k)
k' is 128, 256 or 512, and the size of the corresponding descriptor is 16 bytes, 32 bytes or 64 bytes, respectively, which is selected according to the real-time performance, storage space and recognition efficiency of the method.
Further, the matching of the obtained feature points specifically includes:
calculating the Hamming distance D between the descriptors of the feature points of the left view and the right view for matching:
D = Σ_{k=1}^{k'} f_L(k) ⊕ f_R(k)
wherein f_L, f_R are the descriptors of a feature point in the left view and the right view respectively and ⊕ denotes the bitwise exclusive-or; when two feature points in the two frames of images have a Hamming distance smaller than the threshold, the two feature points are matched successfully.
Further, the calculating the spatial coordinates of the matched feature points specifically includes:
x = u_L·d/(u_L − u_R)
y = v_L·d/(u_L − u_R)
z = f·d/(u_L − u_R)
wherein (x, y, z) are the coordinates of the feature point P in space, (u_L, v_L) are the image point coordinates of the feature point P in the left camera, (u_R, v_R) are the image point coordinates in the right camera, f is the focal length of the binocular camera, and d is the distance between the two cameras.
Further, the tile-type map building method, which generates sub-maps by using continuous laser point cloud data, specifically includes:
and generating subgraphs from continuous laser point cloud data, wherein every consecutive T s of data generates one subgraph, and each subgraph after the first also contains the data from the last t s of the previous subgraph, where t < T.
Further, the graph optimization specifically includes:
the pose of the robot is represented as a node in the graph; the observation information, after processing, is converted into constraint relations between robot poses, represented by the edges connecting the nodes; each node of the graph corresponds to a description of a robot pose, and the state equation of all nodes is
X = (x_1, x_2, …, x_N)^T
wherein x_1, x_2, …, x_N are the poses of the robot in the global coordinate system, and the edges describe the transformation relations between these poses; the observation equation between node P_i and node P_j can be expressed as:
z_j = x_i·T_i,j
wherein x_i is the pose of the robot corresponding to node P_i in the global coordinate system, x_j is the pose of the robot corresponding to node P_j in the global coordinate system, z_j is the observed value of node P_j, i.e. the estimated pose obtained by transforming x_i through the transformation matrix, and T_i,j is the transformation relation between node P_i and node P_j;
between the robot pose x_j and the estimated pose z_j obtained by transforming x_i through the transformation matrix there is an error e(z_j, x_j), calculated as:
e(z_j, x_j) = x_i·T_i,j − x_j
the error objective function E(X) of the graph with several edges is:
E(X) = Σ_{k=1}^{K} e(z_k, x_k)^T·Ω_k·e(z_k, x_k)
wherein the information matrix Ω_k is the inverse of the covariance matrix and is a symmetric matrix; each of its elements Ω_i,j, acting as an error coefficient, weights the correlation between the errors e(z_i, x_i) and e(z_j, x_j);
an increment Δx is added to x_k; a convergence value satisfying the objective equation E(X) is finally obtained through multiple iterations, the state variable X is updated with the obtained Δx, and the poses of the nodes in X form the final mapping result.
A robot indoor mapping system based on vision and laser slam comprises a front end and a back end. The front end of the system acquires the left and right views of the environment with a binocular vision sensor, extracts feature points from the left and right views with the ORB algorithm, matches the feature points of the left and right views, and then calculates the coordinates of each feature point in space; it matches the feature points obtained at adjacent moments, solves the pose transformation relation of the robot between those moments, projects the feature points onto the map grid, and defines the descriptor attribute for the grid map cells onto which feature points are projected. The front end of the system also obtains laser point cloud data of the environment through a laser sensor, denoises the laser point cloud data, obtains the pose transformation estimate of the robot by computing an initial pose value of the robot, defining a scanning window and computing the confidence, projects the laser point cloud onto the grid map, and updates the confidence of each map grid. The back end of the system performs closed-loop detection and graph optimization on the obtained map.
Compared with the prior art, the invention has the following beneficial effects:
by adopting the mapping method combining vision and laser, the defect that a vision algorithm is greatly influenced by illumination is overcome, the closed loop is detected by using the descriptor attributes of the characteristic points, the problem of low success rate of laser closed loop detection is solved, and the closed loop detection efficiency is improved.
Drawings
Fig. 1 is a robot indoor positioning and mapping process based on vision and laser slam.
FIG. 2 is a schematic diagram of the M16 template.
Detailed Description
The following describes a mobile robot indoor positioning and mapping method based on vision and laser slam in detail with reference to the accompanying drawings.
A mobile robot indoor positioning and mapping method based on vision and laser slam is disclosed; the flow is shown in Fig. 1. Specifically, the indoor positioning and mapping system of the mobile robot based on vision and laser slam is divided into a front end and a back end. The front end of the system acquires the left and right views of the environment through a binocular vision sensor, extracts feature points from the left and right views with the ORB algorithm, matches the feature points of the left and right views, and then obtains the coordinates of the feature points in space; it then matches the feature points obtained at adjacent moments, solves the pose transformation relation of the robot between those moments, projects the feature points onto the map grid, and defines the descriptor attribute for the grids onto which feature points are projected. Meanwhile, the front end of the system obtains laser point cloud data of the environment through a two-dimensional laser sensor, denoises the laser point cloud data, obtains the pose transformation estimate of the robot by computing an initial pose value, defining a scanning window, computing the confidence, and so on, and on this basis projects the laser point cloud onto the grid map and updates the confidence of each map grid. Finally, to reduce the accumulated sensor error, the back end of the system performs closed-loop detection and graph optimization on the obtained map.
First, system front end
1. ORB-based feature point cloud generation
(1) Extracting feature points: the binocular vision sensor simultaneously obtains a left view I_L and a right view I_R of the environment. The ORB (Oriented FAST and Rotated BRIEF) algorithm is used to perform FAST (Features from Accelerated Segment Test) corner detection on the left and right views, extract the feature points, and calculate the principal direction of each feature point to generate oFAST features. The specific flow is as follows:
firstly, carrying out gray processing on the obtained left view and right view: y is 0.39 × R +0.5 × G +0.11 × B, Y is the gray level of a certain grayed pixel, and R, G, B is the three color components of red, green, and blue of the pixel.
Secondly, in the left view, the circle of radius 3 centered on pixel point P passes through 16 pixels, the M16 template, as shown in Fig. 2. If there are N consecutive points (9 ≤ N ≤ 12) in M16 whose gray values are all greater than Y_P + t or all less than Y_P − t (where Y_P is the gray value of point P and t is a threshold), point P is judged to be a feature point. (Equivalently: draw a circle of radius 3 pixels around P; if n consecutive pixels on the circle are all brighter or all darker than P, then P is considered a feature point, with n generally set to 12.)
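For illustration, a minimal Python sketch of the segment test just described; the 16 circle offsets are the standard Bresenham circle of radius 3, while the threshold value and array layout are assumptions of the example.

```python
import numpy as np

# Offsets of the 16 pixels on a Bresenham circle of radius 3 (the M16 template).
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(gray, x, y, t=20, n_required=12):
    """Segment test: P is a corner if n_required consecutive circle pixels are
    all brighter than Y_P + t or all darker than Y_P - t."""
    yp = int(gray[y, x])
    ring = [int(gray[y + dy, x + dx]) for dx, dy in CIRCLE]
    brighter = [v > yp + t for v in ring]
    darker = [v < yp - t for v in ring]
    # Duplicate the ring so that runs wrapping around the circle are counted.
    for flags in (brighter, darker):
        run, best = 0, 0
        for f in flags + flags:
            run = run + 1 if f else 0
            best = max(best, run)
        if best >= n_required:
            return True
    return False

if __name__ == "__main__":
    img = np.full((9, 9), 50, dtype=np.uint8)
    img[4, 4] = 200                      # a bright isolated pixel at the center
    print(is_fast_corner(img, 4, 4))     # True: all 16 ring pixels are darker
```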
Thirdly, a coordinate system is established with the feature point P as the origin, the centroid position C within the neighborhood S is calculated, and the vector PC with the feature point P as the start point and the centroid C as the end point is constructed. The moments of the neighborhood S can be expressed as:
m_pq = Σ_{x,y∈S} x^p · y^q · Y(x,y)
wherein Y(x, y) is the gray value of the image, x, y ∈ [−r, r], r is the radius of the neighborhood S, and p, q take the value 0 or 1; the centroid position of the neighborhood is then:
C = (m_10/m_00, m_01/m_00)
The principal direction θ of the feature point P is:
θ = arctan(m_01/m_10)
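The intensity-centroid orientation above can be sketched as follows; the circular neighborhood radius and patch handling are assumptions of the example.

```python
import numpy as np

def orientation(gray, x, y, r=15):
    """Principal direction of keypoint (x, y) from the intensity centroid:
    m_pq = sum of x^p * y^q * Y(x, y) over the neighborhood, theta = atan2(m01, m10)."""
    m00 = m10 = m01 = 0.0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dx * dx + dy * dy > r * r:      # keep a circular neighborhood
                continue
            v = float(gray[y + dy, x + dx])
            m00 += v
            m10 += dx * v
            m01 += dy * v
    centroid = (m10 / m00, m01 / m00)          # centroid C relative to P
    theta = np.arctan2(m01, m10)               # robust form of arctan(m01/m10)
    return theta, centroid

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.integers(0, 255, size=(64, 64)).astype(np.uint8)
    print(orientation(patch, 32, 32))
```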
(2) Generating the feature point descriptor: within the neighborhood S of the feature point P, a random set of n pairs of test points (i_k, j_k), k = 1, 2, …, n, is generated and represented by a 2×n matrix:
D = (i_1 i_2 … i_n; j_1 j_2 … j_n)
According to the principal direction θ of the feature point obtained in the previous step and the corresponding rotation matrix R_θ, the rotated 2×n matrix is calculated:
D_θ = R_θ·D
wherein
R_θ = (cos θ, −sin θ; sin θ, cos θ)
The binary test criterion in the neighborhood S of the feature point P may be defined as:
τ(Y; i_θ, j_θ) = 1 if Y(i_θ) < Y(j_θ), and 0 otherwise
wherein (i_θ, j_θ) are the rotated point pairs taken from D_θ, τ denotes the binary test criterion, Y(i_θ) represents the gray value of point i_θ, and Y(j_θ) represents the gray value of point j_θ.
The rotation-aware descriptor of the feature point P is then obtained:
f(P) = Σ_{k=1}^{k'} 2^(k−1)·τ(Y; i_θ,k, j_θ,k)
k' may be 128, 256, or 512, and the corresponding descriptors may have sizes of 16 bytes, 32 bytes, and 64 bytes, respectively, selected according to the real-time performance, storage space, and recognition efficiency of the algorithm.
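A hedged sketch of the steered-BRIEF step described above: n random test-point pairs are rotated by the keypoint orientation θ and each pair contributes one bit through the binary test τ. The uniform sampling of test points, the fixed seed, and k' = 256 bits (32 bytes) are assumptions of this example.

```python
import numpy as np

def rbrief_descriptor(gray, x, y, theta, n_bits=256, patch=15, seed=42):
    """Rotation-aware BRIEF: rotate the random test-point pairs by the keypoint
    orientation theta, then compare gray values to produce one bit per pair.
    The keypoint must lie at least ~22 px from the image border (15 * sqrt(2))."""
    rng = np.random.default_rng(seed)    # fixed seed: the same pattern for every keypoint
    pairs = rng.integers(-patch, patch + 1, size=(n_bits, 2, 2))   # offsets (i_k, j_k)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])    # rotation matrix R_theta
    bits = np.zeros(n_bits, dtype=np.uint8)
    for k, (i_pt, j_pt) in enumerate(pairs):
        ix, iy = np.rint(rot @ i_pt).astype(int)   # rotated test point i_theta
        jx, jy = np.rint(rot @ j_pt).astype(int)   # rotated test point j_theta
        # binary test tau: 1 if Y(i_theta) < Y(j_theta), else 0
        bits[k] = 1 if gray[y + iy, x + ix] < gray[y + jy, x + jx] else 0
    return np.packbits(bits)             # k' bits -> k'/8 bytes (256 bits -> 32 bytes)
```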
(3) Matching the feature points: after the descriptors of the feature points of the two frames of images (the left view and the right view) are obtained as above, their Hamming distance D is calculated for matching:
D = Σ_{k=1}^{k'} f_L(k) ⊕ f_R(k)
wherein f_L, f_R are the descriptors of a feature point in the left view and the right view respectively and ⊕ denotes the bitwise exclusive-or. When two feature points in the two frames of images have a Hamming distance smaller than the threshold, the two feature points are matched successfully.
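A minimal sketch of the Hamming-distance matching just described, assuming the descriptors are packed byte arrays such as those produced by the previous sketch; the distance threshold is an assumed value.

```python
import numpy as np

def hamming(desc_a, desc_b):
    """Hamming distance D: number of differing bits between two descriptors."""
    return int(np.unpackbits(np.bitwise_xor(desc_a, desc_b)).sum())

def match(descs_left, descs_right, threshold=40):
    """Brute-force match: for every left descriptor take the closest right one,
    and accept the pair only if the Hamming distance is below the threshold."""
    matches = []
    for i, dl in enumerate(descs_left):
        dists = [hamming(dl, dr) for dr in descs_right]
        j = int(np.argmin(dists))
        if dists[j] < threshold:
            matches.append((i, j, dists[j]))
    return matches
```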
(4) Calculating the spatial position of the feature points: suppose the image point coordinates of the feature point P in the left camera are (u_L, v_L) and the image point coordinates in the right camera are (u_R, v_R); the coordinates (x, y, z) of the feature point P in space are then calculated as:
x = u_L·d/(u_L − u_R)
y = v_L·d/(u_L − u_R)
z = f·d/(u_L − u_R)
wherein f is the focal length of the binocular camera and d is the distance between the two cameras. The spatial coordinates of all matched feature points are calculated in this way; each feature point has a unique spatial coordinate and a unique descriptor.
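The triangulation above can be illustrated with the standard parallel-stereo model; since the patent's formula is reproduced only as an image, the use of image coordinates measured from the principal point is an assumption of this sketch.

```python
def triangulate(u_l, v_l, u_r, f, d):
    """Parallel binocular stereo: disparity = u_L - u_R, depth z = f*d/disparity,
    x and y follow from the left-camera pinhole model.  Image coordinates are
    assumed to be measured relative to the principal point."""
    disparity = u_l - u_r
    if disparity <= 0:
        raise ValueError("non-positive disparity: point at or behind infinity")
    z = f * d / disparity
    x = u_l * z / f          # equivalently u_L * d / disparity
    y = v_l * z / f
    return x, y, z

if __name__ == "__main__":
    # f = 700 px focal length, d = 0.12 m baseline, 20 px disparity -> z = 4.2 m
    print(triangulate(u_l=35.0, v_l=-10.0, u_r=15.0, f=700.0, d=0.12))
```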
2. Robot pose transform estimation
The laser data are processed to obtain laser point cloud data, which are then filtered: the noise points among the laser reflection points (points at excessively short or excessively long range) are removed, and the remaining points are kept as valid points. The poses of the valid laser point cloud in the laser sensor coordinate system are calculated.
(1) Confidence of the robot pose transformation estimate and map grid update when the observed values are laser data
Robot pose initial value estimation
The time point at which each frame of laser data is obtained is set as the time point for calculating the pose. Suppose the pose of the robot at time t is P(t), the time of the previous pose calculation is t−1 with corresponding pose P(t−1), and the time of the next pose calculation is t+1 with corresponding pose P(t+1). The linear velocity V(x, y) and the angular velocity W of the robot motion are calculated from the pose difference (displacement and rotation angle) between times t and t−1:
linear velocity V(x, y) = (displacement between t and t−1) / time difference
angular velocity W = (rotation angle between t and t−1) / time difference
The initial predicted pose at t+1 is obtained by extrapolating P(t) with V and W, and this initial predicted pose is then corrected with the most recently received odometer data and inertial navigation data to obtain the pose estimate (pose_estimated);
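A minimal sketch of the initial pose prediction step: the linear and angular velocities estimated between t−1 and t extrapolate P(t) forward; the subsequent correction with odometer and inertial data is omitted, and the 2-D pose representation is an assumption of the example.

```python
import math
from dataclasses import dataclass

@dataclass
class Pose2D:
    x: float
    y: float
    theta: float  # heading in radians

def predict_pose(p_prev: Pose2D, p_curr: Pose2D, dt_prev: float, dt_next: float) -> Pose2D:
    """Constant-velocity prediction: estimate V and W from the pose change
    between t-1 and t, then extrapolate P(t) over the next interval."""
    vx = (p_curr.x - p_prev.x) / dt_prev          # linear velocity along x
    vy = (p_curr.y - p_prev.y) / dt_prev          # linear velocity along y
    w = (p_curr.theta - p_prev.theta) / dt_prev   # angular velocity
    return Pose2D(
        x=p_curr.x + vx * dt_next,
        y=p_curr.y + vy * dt_next,
        theta=(p_curr.theta + w * dt_next) % (2 * math.pi),
    )

if __name__ == "__main__":
    p0 = Pose2D(0.0, 0.0, 0.0)
    p1 = Pose2D(0.2, 0.0, 0.05)      # moved 0.2 m and turned 0.05 rad in 0.1 s
    print(predict_pose(p0, p1, dt_prev=0.1, dt_next=0.1))  # ~ (0.4, 0.0, 0.1)
```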
② Positioning scan window
The displacement scan parameters limit the displacement range of the positioning scan window: a square extending l cm up, down, left and right, centered on the pose estimate (pose_estimated). The angle scan parameters limit the angular range of the positioning scan window: angles within ±w degrees of the pose estimate, taken as the central angle.
According to the size of the positioning scan window, a pose is determined for every scan angle on every map grid inside the window; these poses are taken as all possible candidate poses (pose_candidates).
④ Confidence estimation
According to the confidence of each map grid covered by each candidate pose (the confidence value of a map grid is determined during mapping and is a fixed value during localization), the confidence candidate_probability of each candidate pose is calculated by the formula:
[formula given as image GDA0003570351060000071 in the original]
wherein m is the total number of map grids covered by the discrete scan data at the scan angle of the candidate pose, the map coordinate of the nth grid is (x_n, y_n), and its grid confidence is P(x_n, y_n), with values ranging from 0.1 to 0.9.
According to the pose difference between each candidate pose (pose_candidate) and the pose estimate pose_estimated, the confidence weight candidate_weight of each candidate pose is calculated by the formula:
[formula given as image GDA0003570351060000072 in the original]
wherein x_offset is the displacement along the x axis between the candidate pose and the pose estimate pose_estimated, y_offset is the displacement along the y axis, trans_weight is the displacement weight, rotation is the rotation angle between the candidate pose and the pose estimate pose_estimated, and rot_weight is the rotation angle weight.
The product of the confidence candidate_probability and the confidence weight candidate_weight of each candidate pose is taken as the confidence score of that pose:
score = candidate_probability * candidate_weight
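Because the candidate-probability and candidate-weight formulas are reproduced only as images in the original, the sketch below uses assumed stand-ins (the mean confidence of the grids hit by the scan for the probability, an exponential penalty on the offset for the weight) purely to illustrate the flow: enumerate the candidate poses inside the scan window, score each one, and keep the best.

```python
import math
import numpy as np

def score_candidates(grid, scan_xy, estimate, l=0.1, w_deg=10.0, step=0.05, astep=2.0,
                     trans_weight=1.0, rot_weight=1.0, resolution=0.05):
    """Enumerate candidate poses in a (2l x 2l metres, +/- w_deg) window around the
    estimate, score each one, and return the best.  grid holds per-cell confidences
    in [0.1, 0.9]; the probability/weight formulas here are illustrative stand-ins."""
    ex, ey, eth = estimate
    best = (None, -1.0)
    for dth in np.arange(-math.radians(w_deg), math.radians(w_deg) + 1e-9, math.radians(astep)):
        c, s = math.cos(eth + dth), math.sin(eth + dth)
        rotated = [(c * x - s * y, s * x + c * y) for x, y in scan_xy]
        for dx in np.arange(-l, l + 1e-9, step):
            for dy in np.arange(-l, l + 1e-9, step):
                hits = []
                for px, py in rotated:
                    gx = int((ex + dx + px) / resolution)
                    gy = int((ey + dy + py) / resolution)
                    if 0 <= gx < grid.shape[0] and 0 <= gy < grid.shape[1]:
                        hits.append(grid[gx, gy])
                prob = float(np.mean(hits)) if hits else 0.0           # assumed stand-in
                weight = math.exp(-(math.hypot(dx, dy) * trans_weight  # assumed stand-in
                                    + abs(dth) * rot_weight))
                score = prob * weight   # score = candidate_probability * candidate_weight
                if score > best[1]:
                    best = ((ex + dx, ey + dy, eth + dth), score)
    return best

if __name__ == "__main__":
    grid = np.full((200, 200), 0.1)
    grid[100, 80:100] = 0.9                         # a wall of occupied cells
    scan = [(1.0, 0.05 * i) for i in range(20)]     # scan points ~1 m ahead
    print(score_candidates(grid, scan, estimate=(4.0, 4.0, 0.0)))
```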
the estimated value of the pose update pose with the highest confidence score, pos _ estimated, is selected as the optimal pose estimate P (t + 1). And (3) according to the obtained optimal pose transformation estimation P (T +1) and the current pose P (T) of the robot, solving a robot pose transformation C (comprising a rotation matrix and a displacement matrix), moving the robot to the pose of P (T +1) according to the transformation relation T, and projecting the current point cloud data into a map grid of a corresponding position on the basis. If a plurality of points repeatedly falling on the same map grid position exist, only the coordinates of the map grid corresponding to one point in the map coordinate system are taken, and the probability of the grid is updated according to the projection condition of the laser point cloud. After the first frame of laser points are projected on the map grids, the confidence coefficient after each grid is updated is as follows:
[formula given as image GDA0003570351060000081 in the original]
When the second frame of laser data falls onto the two-dimensional grid plane, the confidence of each grid is updated according to:
[formula given as image GDA0003570351060000082 in the original]
The state update coefficient of each grid is:
[formula given as image GDA0003570351060000083 in the original]
When the nth frame of laser point cloud data falls into the two-dimensional plane grid, the confidence of each grid is:
[formula given as image GDA0003570351060000084 in the original]
wherein P_{n−1}(x_i, y_i) is the confidence of the grid with coordinate (x_i, y_i) when the (n−1)th frame of laser point cloud data falls into the two-dimensional plane grid.
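The per-grid confidence update formulas above are likewise reproduced only as images; the sketch below therefore shows a generic odds-style update clamped to the stated 0.1 to 0.9 range, as an assumed illustration of how successive laser frames could refine the grid confidence.

```python
def odds(p):
    return p / (1.0 - p)

def from_odds(o):
    return o / (1.0 + o)

def update_cell(prev_conf, hit, p_hit=0.7, p_miss=0.4, lo=0.1, hi=0.9):
    """Odds-style confidence update for one grid cell (assumed model):
    multiply the cell's odds by the hit/miss odds and clamp to [0.1, 0.9]."""
    new = from_odds(odds(prev_conf) * odds(p_hit if hit else p_miss))
    return min(max(new, lo), hi)

if __name__ == "__main__":
    conf = 0.5                     # unknown cell
    for frame in range(5):         # five frames in a row observe the cell as occupied
        conf = update_cell(conf, hit=True)
        print(f"frame {frame + 1}: confidence = {conf:.3f}")
```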
(2) Robot pose transformation estimation when the observed values are feature points
Suppose the matched feature point sequence obtained from the left and right views at time t−1 is S = [s_1, s_2, …, s_g], and the matched feature point sequence obtained from the left and right views at time t is P = [p_1, p_2, …, p_h]. The feature point sequences S and P of times t−1 and t are matched, and the pose transformation relation T' of the robot is then obtained from the coordinate changes of the matched feature points and the changes of their principal directions; on this basis the descriptors of the feature points are projected onto the corresponding map grids, and a second attribute of the grids, namely the feature point descriptor attribute, is defined. At this point the confidence of the grid map carries the dual attributes of laser probability and feature points, and the two attributes are independent of each other.
4. Generating subgraphs
Subgraphs are generated from continuous laser point cloud data with the tile-type mapping method: every consecutive T s of data generates one subgraph, and each subgraph after the first also contains the data from the last t s (t < T) of the previous subgraph. For example, the data from 1 s to 40 s are registered to form subgraph A1, the data from 21 s to 60 s form subgraph A2, and the data from 41 s to 80 s form subgraph A3.
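The tile-style grouping in this example (T = 40 s with a t = 20 s overlap) can be sketched as a simple windowing of timestamped scans; the data layout is an assumption of the example.

```python
def split_into_subgraphs(scans, T=40.0, t=20.0):
    """Group timestamped scans into subgraphs of length T seconds, where each
    subgraph after the first also contains the last t seconds of the previous
    one (t < T), e.g. [1, 40], [21, 60], [41, 80] for T=40, t=20."""
    if not scans:
        return []
    subgraphs, start = [], scans[0][0]
    while start <= scans[-1][0]:
        end = start + T
        window = [s for s in scans if start <= s[0] < end]
        if window:
            subgraphs.append(window)
        start = end - t          # the next window overlaps the previous by t seconds
    return subgraphs

if __name__ == "__main__":
    scans = [(float(sec), f"scan_{sec}") for sec in range(1, 81)]   # 80 s of fake scans
    for i, sub in enumerate(split_into_subgraphs(scans)):
        print(f"A{i + 1}: {sub[0][0]:.0f}s .. {sub[-1][0]:.0f}s")
```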
Second, the back end of the system
1. Closed loop detection
Feature points are extracted from the currently obtained left and right views with the ORB algorithm, and the descriptors of the feature points are calculated. The obtained feature point descriptors are stored in a set Q = {Q_1, Q_2, …, Q_N}. Assuming that M subgraphs have been generated so far, each descriptor in Q is matched against the second-layer attribute, i.e. the grid feature point descriptors, of each grid in each subgraph. When the matching success rate between the descriptors in the set Q and the grid feature point descriptors of some subgraph reaches 90% or more, closed-loop detection succeeds.
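A minimal sketch of the loop-closure test described above: each current descriptor is compared against the descriptors stored as the grid attribute of a subgraph, and the subgraph is accepted as a closed loop when at least 90% of the current descriptors find a match. The Hamming threshold and the flat per-subgraph descriptor list are assumptions of the example.

```python
import numpy as np

def hamming(a, b):
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

def detect_loop(current_descs, subgraphs, dist_threshold=40, success_rate=0.9):
    """Return the index of the first subgraph whose stored grid descriptors
    match at least `success_rate` of the current descriptors, else None."""
    for idx, grid_descs in enumerate(subgraphs):   # descriptors attached to the subgraph's grids
        if not grid_descs or not current_descs:
            continue
        matched = 0
        for q in current_descs:
            if min(hamming(q, g) for g in grid_descs) < dist_threshold:
                matched += 1
        if matched / len(current_descs) >= success_rate:
            return idx
    return None
```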
2. Graph optimization
Graph optimization reduces trajectory distortion by reducing the drift in the trajectory. In the graph optimization process, the pose of the robot is represented as a node in the graph; the observation information, after processing, is converted into constraint relations between robot poses, represented by the edges connecting the nodes. Each node of the graph corresponds to a description of a robot pose, and the state equation of all nodes is
X = (x_1, x_2, …, x_N)^T
wherein x_1, x_2, …, x_N are the poses of the robot in the global coordinate system, and the edges describe the transformation relations between these poses.
The observation equation between node P_i and node P_j can be expressed as z_j = x_i·T_i,j, i.e. the robot moves from P_i to P_j through the motion T_i,j, where x_i is the pose of the robot at node P_i in the global coordinate system, x_j is the pose of the robot at node P_j in the global coordinate system, z_j is the observed value of node P_j, i.e. the estimated pose obtained by transforming x_i through the transformation matrix, and T_i,j is the transformation relation between node P_i and node P_j.
In the optimization process each node has an estimated value, and the two vertices connected by an edge each have an estimated value; there is an error between these estimates and the constraint imposed by the edge. That is, ideally we have:
z_j = x_i·T_i,j
Between the robot pose x_j and the estimated pose z_j obtained by transforming x_i through the transformation matrix there is an error e(z_j, x_j):
e(z_j, x_j) = x_i·T_i,j − x_j
Since every edge contributes a small error, for a graph with K edges the objective function can be written as:
E(X) = Σ_{k=1}^{K} e(z_k, x_k)^T·Ω_k·e(z_k, x_k)
wherein the information matrix Ω_k is the inverse of the covariance matrix and is a symmetric matrix. Each of its elements Ω_i,j, acting as an error coefficient, weights the correlation between the errors e(z_i, x_i) and e(z_j, x_j). The simplest choice is to set Ω_k to a diagonal matrix, whose diagonal elements indicate how much attention is paid to the corresponding errors; other covariance matrices may also be used.
When x_k takes an increment Δx, the error value changes from e(z_k, x_k) to e(z_k, x_k + Δx); the first-order expansion of the error term is:
e(z_k, x_k + Δx) ≈ e(z_k, x_k) + J_k·Δx
where J_k is the derivative of e(z_k, x_k) with respect to x_k, in the form of a Jacobian matrix. The objective function term of the kth edge is further expanded as:
E_k(x_k + Δx) = e(z_k, x_k + Δx)^T·Ω_k·e(z_k, x_k + Δx)
≈ [e(z_k, x_k) + J_k·Δx]^T·Ω_k·[e(z_k, x_k) + J_k·Δx]
= e(z_k, x_k)^T·Ω_k·e(z_k, x_k) + 2·e(z_k, x_k)^T·Ω_k·J_k·Δx + Δx^T·J_k^T·Ω_k·J_k·Δx
= C_k + 2·b_k·Δx + Δx^T·H_k·Δx
The terms independent of Δx are collected into the constant term C_k, the first-order coefficient of Δx is written as 2·b_k, and the quadratic coefficient is written as H_k.
Rewriting the objective function:
E(X + Δx) = Σ_{k=1}^{K} E_k(x_k + Δx) = C + 2·b·Δx + Δx^T·H·Δx
wherein C = Σ_k C_k, b = Σ_k b_k and H = Σ_k H_k.
After x_k takes the increment, the change of the objective function E(X) is:
ΔE(X) = 2·b·Δx + Δx^T·H·Δx
To minimize ΔE(X), the derivative of ΔE(X) with respect to Δx is set to zero, giving:
H·Δx = −b
The state variable X is updated with the obtained Δx; through multiple iterations a convergence value of the state variable X satisfying the objective equation is finally obtained, the poses of the robot over the whole motion process are optimized, and the poses of the nodes in X form the final mapping result.
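The iteration just derived (accumulate H and b over all edges, solve H·Δx = −b, update X) can be sketched on a toy one-dimensional pose graph; the 1-D residual model and the numbers are assumptions chosen only to show the linear-algebra flow.

```python
import numpy as np

def optimize_pose_graph(x, edges, anchor=0, iterations=10):
    """Gauss-Newton over a 1-D pose graph.  Each edge (i, j, t_ij, omega) encodes
    the constraint x_j = x_i + t_ij; the error is e = x_i + t_ij - x_j.
    Per iteration: accumulate H and b over all edges, solve H dx = -b, update x."""
    x = np.asarray(x, dtype=float).copy()
    n = len(x)
    for _ in range(iterations):
        H = np.zeros((n, n))
        b = np.zeros(n)
        for i, j, t_ij, omega in edges:
            e = x[i] + t_ij - x[j]
            J = np.zeros(n); J[i], J[j] = 1.0, -1.0     # Jacobian of e w.r.t. x
            H += omega * np.outer(J, J)
            b += omega * e * J
        H[anchor, anchor] += 1e6                        # fix the first pose (gauge freedom)
        dx = np.linalg.solve(H, -b)
        x += dx
        if np.linalg.norm(dx) < 1e-9:
            break
    return x

if __name__ == "__main__":
    # Three poses: odometry says each step is 1.0, a loop-closure edge says
    # pose 2 is 1.8 ahead of pose 0; the optimum spreads the inconsistency.
    x0 = [0.0, 1.0, 2.0]
    edges = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0), (0, 2, 1.8, 1.0)]
    print(optimize_pose_graph(x0, edges))   # roughly [0.0, 0.93, 1.87]
```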
The invention adopts a mapping method combining vision and laser: on the one hand it mitigates the sensitivity of the visual algorithm to illumination, and on the other hand it detects closed loops with the descriptor attributes of the feature points, which addresses the low success rate of laser closed-loop detection and improves the efficiency of closed-loop detection.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (6)

1. A robot indoor map building method based on vision and laser slam is characterized by comprising the following steps:
the visual sensor acquires a left view and a right view, and generates a feature point cloud, which specifically comprises the following steps:
extracting feature points of the left view and the right view,
the obtained feature points are matched with each other,
calculating the space coordinates of the matched characteristic points;
calculating a vision-based pose transformation estimate of the robot, namely performing secondary matching between the feature points matched across the left and right views at the current moment and the feature points matched across the left and right views at the previous moment, obtaining the pose transformation relation of the robot from the coordinate changes and the principal-direction changes of the secondarily matched feature points, projecting the descriptors of the feature points onto the corresponding map grids, and defining the feature point descriptor attribute of those grids;
obtaining laser data, and calculating the confidence of the laser-based pose transformation estimate of the robot and the map grid update;
generating subgraphs from continuous laser point cloud data using a tile-type mapping method;
closed-loop detection, namely matching the descriptors of the obtained feature points against the feature point descriptor attribute of each grid of each subgraph;
graph optimization, which specifically comprises: representing the poses of the robot by the nodes of a graph and the transformation relations between the poses by the edges connecting the nodes, calculating the error between each robot pose and the estimated pose obtained through the transformation matrix, calculating the error objective function of the graph with several edges, obtaining through multiple iterations the robot poses that make the objective function converge, and using them to update the robot poses in each subgraph, thereby forming the final mapping result;
the extracting feature points of the left view and the right view comprises:
performing FAST corner detection on the left view and the right view by using an ORB algorithm, and extracting a characteristic point P;
establishing a coordinate system with the feature point P as the origin, calculating the centroid position C within the neighborhood S, and constructing the vector PC with the feature point P as the start point and the centroid C as the end point;
the moments of the neighborhood S can be expressed as:
m_pq = Σ_{x,y∈S} x^p · y^q · Y(x,y)
wherein Y(x, y) is the gray value of the image, x, y ∈ [−r, r], r is the radius of the neighborhood S, and p, q take the value 0 or 1; the centroid position of the neighborhood is then:
C = (m_10/m_00, m_01/m_00)
the principal direction θ of the feature point P is:
θ = arctan(m_01/m_10);
generating a characteristic point descriptor:
within the neighborhood S of the feature point P, a random set of n pairs of test points (i_k, j_k), k = 1, 2, …, n, is generated and represented by a 2×n matrix:
D = (i_1 i_2 … i_n; j_1 j_2 … j_n)
according to the principal direction θ of the feature point and the corresponding rotation matrix R_θ, the rotated 2×n matrix is calculated:
D_θ = R_θ·D
wherein
R_θ = (cos θ, −sin θ; sin θ, cos θ)
the binary test criterion in the neighborhood S of the feature point P may be defined as:
τ(Y; i_θ, j_θ) = 1 if Y(i_θ) < Y(j_θ), and 0 otherwise
wherein (i_θ, j_θ) are the rotated point pairs taken from D_θ, τ denotes the binary test criterion, Y(i_θ) represents the gray value of point i_θ, and Y(j_θ) represents the gray value of point j_θ;
the rotation-aware descriptor of the feature point P is then obtained:
f(P) = Σ_{k=1}^{k'} 2^(k−1)·τ(Y; i_θ,k, j_θ,k)
k' is 128, 256 or 512, the size of the corresponding descriptor is 16 bytes, 32 bytes or 64 bytes respectively, and the selection is carried out according to the real-time performance, the storage space and the identification efficiency of the method;
the obtaining of laser data and the calculating of the confidence of the pose transformation estimate and the map grid update of the laser-based robot specifically comprise the following steps:
correcting the initial predicted pose value, which is calculated from the laser data of the robot, according to the most recently received odometer data and inertial navigation data to obtain the pose estimate (pose_estimated),
determining, according to the size of the positioning scan window, a pose for every scan angle on every map grid inside the window, as all possible candidate poses,
calculating the confidence candidate_probability of each candidate pose according to the confidence of each map grid covered by that candidate pose, by the formula:
[formula given as image FDA0003570351050000027 in the original]
wherein m is the total number of map grids covered by the discrete scan data at the scan angle of the candidate pose, the map coordinate of the nth grid is (x_n, y_n), and its grid confidence is P(x_n, y_n), with values ranging from 0.1 to 0.9,
calculating the confidence weight candidate_weight of each candidate pose according to the pose difference between each candidate pose (pose_candidate) and the pose estimate pose_estimated, by the formula:
[formula given as image FDA0003570351050000028 in the original]
wherein x_offset is the displacement along the x axis between the candidate pose and the pose estimate pose_estimated, y_offset is the displacement along the y axis, trans_weight is the displacement weight, rotation is the rotation angle between the candidate pose and the pose estimate pose_estimated, and rot_weight is the rotation angle weight,
taking the product of the confidence candidate_probability and the confidence weight candidate_weight of each candidate pose as the confidence score of that pose, by the formula
score = candidate_probability * candidate_weight
selecting the candidate pose with the highest confidence score to update the pose estimate (pose_estimated) as the optimal pose estimate P(t+1), where t denotes the time point at which the robot pose is calculated,
obtaining the robot pose transformation C from the obtained optimal pose estimate P(t+1) and the current pose P(t) of the robot, moving the robot to the pose P(t+1) according to this transformation relation, and on that basis projecting the current point cloud data into the map grids at the corresponding positions; if several points fall repeatedly onto the same map grid position, only the coordinate of the map grid corresponding to one of the points in the map coordinate system is taken, and the probability of that grid is updated according to the projection of the laser point cloud; after the first frame of laser points is projected onto the map grids, the updated confidence of each grid is:
[formula given as image FDA0003570351050000031 in the original]
when the second frame of laser data falls onto the two-dimensional grid plane, the confidence of each grid is updated according to:
[formula given as image FDA0003570351050000032 in the original]
the state update coefficient of each grid is:
[formula given as image FDA0003570351050000033 in the original]
when the nth frame of laser point cloud data falls into the two-dimensional plane grid, the confidence of each grid is:
[formula given as image FDA0003570351050000034 in the original]
wherein P_{n−1}(x_i, y_i) is the confidence of the grid with coordinate (x_i, y_i) when the (n−1)th frame of laser point cloud data falls into the two-dimensional plane grid,
and estimating the robot pose transformation when the observed values are feature points: suppose the matched feature point sequence obtained from the left and right views at time t−1 is S = [s_1, s_2, …, s_g] and the matched feature point sequence obtained from the left and right views at time t is P = [p_1, p_2, …, p_h]; the feature point sequences S and P of times t−1 and t are matched, and the pose transformation relation T' of the robot is then obtained from the coordinate changes of the matched feature points and the changes of their principal directions, so that on this basis the descriptors of the feature points are projected onto the corresponding map grids and a second attribute of the grids, namely the feature point descriptor attribute, is defined; at this point the confidence of the grid map carries the dual attributes of laser probability and feature points, and the two attributes are independent of each other.
2. The robot indoor mapping method based on vision and laser slam according to claim 1, wherein the matching of the obtained feature points is specifically:
calculating the Hamming distance D between the descriptors of the feature points of the left view and the right view for matching:
D = Σ_{k=1}^{k'} f_L(k) ⊕ f_R(k)
wherein f_L, f_R are the descriptors of a feature point in the left view and the right view respectively and ⊕ denotes the bitwise exclusive-or; when two feature points in the two frames of images have a Hamming distance smaller than the threshold, the two feature points are matched successfully.
3. The robot indoor mapping method based on vision and laser slam according to claim 2, characterized in that the calculating of the spatial coordinates of the matched feature points is specifically:
x = u_L·d/(u_L − u_R)
y = v_L·d/(u_L − u_R)
z = f·d/(u_L − u_R)
wherein (x, y, z) are the coordinates of the feature point P in space, (u_L, v_L) are the image point coordinates of the feature point P in the left camera, (u_R, v_R) are the image point coordinates in the right camera, f is the focal length of the binocular camera, and d is the distance between the two cameras.
4. The robot indoor mapping method based on vision and laser slam according to claim 3, wherein the tile-type mapping method is adopted, and the generation of subgraphs by using continuous laser point cloud data specifically comprises:
and generating subgraphs from continuous laser point cloud data, wherein every consecutive T s of data generates one subgraph, and each subgraph after the first also contains the data from the last t s of the previous subgraph, where t < T.
5. The robot indoor mapping method based on vision and laser slam according to claim 4, wherein the mapping optimization specifically comprises:
the pose of the robot is represented as a node in the graph; the observation information, after processing, is converted into constraint relations between robot poses, represented by the edges connecting the nodes; each node of the graph corresponds to a description of a robot pose, and the state equation of all nodes is
X = (x_1, x_2, …, x_N)^T
wherein x_1, x_2, …, x_N are the poses of the robot in the global coordinate system, and the edges describe the transformation relations between these poses; the observation equation between node P_i and node P_j can be expressed as:
z_j = x_i·T_i,j
wherein x_i is the pose of the robot corresponding to node P_i in the global coordinate system, x_j is the pose of the robot corresponding to node P_j in the global coordinate system, z_j is the observed value of node P_j, i.e. the estimated pose obtained by transforming x_i through the transformation matrix, and T_i,j is the transformation relation between node P_i and node P_j;
between the robot pose x_j and the estimated pose z_j obtained by transforming x_i through the transformation matrix there is an error e(z_j, x_j), calculated as:
e(z_j, x_j) = x_i·T_i,j − x_j
the error objective function E(X) of the graph with several edges is:
E(X) = Σ_{k=1}^{K} e(z_k, x_k)^T·Ω_k·e(z_k, x_k)
wherein the information matrix Ω_k is the inverse of the covariance matrix and is a symmetric matrix; each of its elements Ω_i,j, acting as an error coefficient, weights the correlation between the errors e(z_i, x_i) and e(z_j, x_j);
an increment Δx is added to x_k; a convergence value satisfying the objective equation E(X) is finally obtained through multiple iterations, the state variable X is updated with the obtained Δx, and the poses of the nodes in X form the final mapping result.
6. A robot indoor mapping system based on vision and laser slam, characterized in that the system is realized based on the method of any one of claims 1 to 5 and is divided into a front end and a back end; the front end of the system acquires the left and right views of the environment with a binocular vision sensor, extracts feature points from the left and right views with the ORB algorithm, matches the feature points of the left and right views, and then finds the coordinates of the feature points in space; it matches the feature points obtained at adjacent moments, solves the pose transformation relation of the robot between those moments, projects the feature points onto the map grid, and defines the descriptor attribute for the grid map cells onto which feature points are projected; the front end of the system obtains laser point cloud data of the environment through a laser sensor, denoises the laser point cloud data, obtains the pose transformation estimate of the robot by computing an initial pose value of the robot, defining a scanning window and computing the confidence, projects the laser point cloud onto the grid map, and updates the confidence of each map grid; and the back end of the system performs closed-loop detection and graph optimization on the obtained map.
CN201911257094.XA 2019-12-10 2019-12-10 Robot indoor map building method and system based on vision and laser slam Active CN111076733B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911257094.XA CN111076733B (en) 2019-12-10 2019-12-10 Robot indoor map building method and system based on vision and laser slam

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911257094.XA CN111076733B (en) 2019-12-10 2019-12-10 Robot indoor map building method and system based on vision and laser slam

Publications (2)

Publication Number Publication Date
CN111076733A CN111076733A (en) 2020-04-28
CN111076733B true CN111076733B (en) 2022-06-14

Family

ID=70313690

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911257094.XA Active CN111076733B (en) 2019-12-10 2019-12-10 Robot indoor map building method and system based on vision and laser slam

Country Status (1)

Country Link
CN (1) CN111076733B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111664843A (en) * 2020-05-22 2020-09-15 杭州电子科技大学 SLAM-based intelligent storage checking method
CN113835422B (en) * 2020-06-08 2023-09-29 杭州海康机器人股份有限公司 Visual map construction method and mobile robot
CN111795687B (en) * 2020-06-29 2022-08-05 深圳市优必选科技股份有限公司 Robot map updating method and device, readable storage medium and robot
CN111862162B (en) * 2020-07-31 2021-06-11 湖北亿咖通科技有限公司 Loop detection method and system, readable storage medium and electronic device
CN112116656A (en) * 2020-08-03 2020-12-22 歌尔股份有限公司 Incremental mapping method and device in synchronous positioning and mapping slam
CN112102400B (en) * 2020-09-15 2022-08-02 上海云绅智能科技有限公司 Distance-based closed loop detection method and device, electronic equipment and storage medium
CN112197773B (en) * 2020-09-30 2022-11-11 江苏集萃未来城市应用技术研究所有限公司 Visual and laser positioning mapping method based on plane information
CN113126621A (en) * 2020-10-14 2021-07-16 中国安全生产科学研究院 Automatic navigation method of subway carriage disinfection robot
CN112325872B (en) * 2020-10-27 2022-09-30 上海懒书智能科技有限公司 Positioning method of mobile equipment based on multi-sensor coupling
CN112665575B (en) * 2020-11-27 2023-12-29 重庆大学 SLAM loop detection method based on mobile robot
CN112650255B (en) * 2020-12-29 2022-12-02 杭州电子科技大学 Robot positioning navigation method based on visual and laser radar information fusion
CN113189613B (en) * 2021-01-25 2023-01-10 广东工业大学 Robot positioning method based on particle filtering
CN113112478B (en) * 2021-04-15 2023-12-15 深圳市优必选科技股份有限公司 Pose recognition method and terminal equipment
CN113624221B (en) * 2021-06-30 2023-11-28 同济人工智能研究院(苏州)有限公司 2.5D map construction method integrating vision and laser
CN115880364B (en) * 2023-02-09 2023-05-16 广东技术师范大学 Robot pose estimation method based on laser point cloud and visual SLAM

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106793086A (en) * 2017-03-15 2017-05-31 河北工业大学 A kind of indoor orientation method
CN109857123A (en) * 2019-03-21 2019-06-07 郑州大学 A kind of fusion method of view-based access control model perception and the indoor SLAM map of laser acquisition
CN109974712A (en) * 2019-04-22 2019-07-05 广东亿嘉和科技有限公司 It is a kind of that drawing method is built based on the Intelligent Mobile Robot for scheming optimization
CN110084272A (en) * 2019-03-26 2019-08-02 哈尔滨工业大学(深圳) A kind of cluster map creating method and based on cluster map and the matched method for relocating of location expression
CN110196044A (en) * 2019-05-28 2019-09-03 广东亿嘉和科技有限公司 It is a kind of based on GPS closed loop detection Intelligent Mobile Robot build drawing method
CN110361027A (en) * 2019-06-25 2019-10-22 马鞍山天邦开物智能商务管理有限公司 Robot path planning method based on single line laser radar Yu binocular camera data fusion
CN110533722A (en) * 2019-08-30 2019-12-03 的卢技术有限公司 A kind of the robot fast relocation method and system of view-based access control model dictionary

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009021068A1 (en) * 2007-08-06 2009-02-12 Trx Systems, Inc. Locating, tracking, and/or monitoring personnel and/or assets both indoors and outdoors
WO2017076928A1 (en) * 2015-11-02 2017-05-11 Starship Technologies Oü Method, device and assembly for map generation
CN105953796A (en) * 2016-05-23 2016-09-21 北京暴风魔镜科技有限公司 Stable motion tracking method and stable motion tracking device based on integration of simple camera and IMU (inertial measurement unit) of smart cellphone
CN106909877B (en) * 2016-12-13 2020-04-14 浙江大学 Visual simultaneous mapping and positioning method based on dotted line comprehensive characteristics
SG10201700299QA (en) * 2017-01-13 2018-08-30 Otsaw Digital Pte Ltd Three-dimensional mapping of an environment
CN107356252B (en) * 2017-06-02 2020-06-16 青岛克路德机器人有限公司 Indoor robot positioning method integrating visual odometer and physical odometer
EP3707530A4 (en) * 2017-09-04 2021-09-22 Commonwealth Scientific and Industrial Research Organisation Method and system for use in performing localisation
CN109489660A (en) * 2018-10-09 2019-03-19 上海岚豹智能科技有限公司 Robot localization method and apparatus

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106793086A (en) * 2017-03-15 2017-05-31 河北工业大学 A kind of indoor orientation method
CN109857123A (en) * 2019-03-21 2019-06-07 郑州大学 A kind of fusion method of view-based access control model perception and the indoor SLAM map of laser acquisition
CN110084272A (en) * 2019-03-26 2019-08-02 哈尔滨工业大学(深圳) A kind of cluster map creating method and based on cluster map and the matched method for relocating of location expression
CN109974712A (en) * 2019-04-22 2019-07-05 广东亿嘉和科技有限公司 It is a kind of that drawing method is built based on the Intelligent Mobile Robot for scheming optimization
CN110196044A (en) * 2019-05-28 2019-09-03 广东亿嘉和科技有限公司 It is a kind of based on GPS closed loop detection Intelligent Mobile Robot build drawing method
CN110361027A (en) * 2019-06-25 2019-10-22 马鞍山天邦开物智能商务管理有限公司 Robot path planning method based on single line laser radar Yu binocular camera data fusion
CN110533722A (en) * 2019-08-30 2019-12-03 的卢技术有限公司 A kind of the robot fast relocation method and system of view-based access control model dictionary

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Initial pose optimization of inspection robots based on bicubic interpolation" (基于双三次插值的巡检机器人初始位姿优化); Lin Huan et al.; Machine Design and Manufacturing Engineering (机械设计与制造工程); May 2018; Vol. 47, No. 5; pp. 56-60 *
"Research on visual SLAM of mobile robots based on depth sensors" (基于深度传感器的移动机器人视觉SLAM研究); Li Yangyu; China Master's Theses Full-text Database, Information Science and Technology (中国优秀硕士学位论文全文数据库 信息科技辑); 15 June 2018; No. 6; I140-264 *

Also Published As

Publication number Publication date
CN111076733A (en) 2020-04-28

Similar Documents

Publication Publication Date Title
CN111076733B (en) Robot indoor map building method and system based on vision and laser slam
Park et al. Elastic lidar fusion: Dense map-centric continuous-time slam
WO2021196294A1 (en) Cross-video person location tracking method and system, and device
CN108986037B (en) Monocular vision odometer positioning method and positioning system based on semi-direct method
CN112985416B (en) Robust positioning and mapping method and system based on laser and visual information fusion
CN113436260B (en) Mobile robot pose estimation method and system based on multi-sensor tight coupling
CN112258600A (en) Simultaneous positioning and map construction method based on vision and laser radar
CN110146099B (en) Synchronous positioning and map construction method based on deep learning
CN108332752B (en) Indoor robot positioning method and device
WO2019136613A1 (en) Indoor locating method and device for robot
Mu et al. Research on SLAM algorithm of mobile robot based on the fusion of 2D LiDAR and depth camera
CN114419147A (en) Rescue robot intelligent remote human-computer interaction control method and system
WO2023273169A1 (en) Vision and laser-fused 2.5d map construction method
CN111161334B (en) Semantic map construction method based on deep learning
CN112767546B (en) Binocular image-based visual map generation method for mobile robot
CN112752028A (en) Pose determination method, device and equipment of mobile platform and storage medium
CN111998862A (en) Dense binocular SLAM method based on BNN
CN112484746A (en) Monocular vision-assisted laser radar odometer method based on ground plane
CN112270698A (en) Non-rigid geometric registration method based on nearest curved surface
CN113781525B (en) Three-dimensional target tracking method based on original CAD model
Zhang LILO: A Novel Lidar–IMU SLAM System With Loop Optimization
CN116681733B (en) Near-distance real-time pose tracking method for space non-cooperative target
CN113674412A (en) Pose fusion optimization-based indoor map construction method and system and storage medium
CN115950414A (en) Adaptive multi-fusion SLAM method for different sensor data
CN115830070A (en) Infrared laser fusion positioning method for inspection robot of traction substation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant