CN115457132A

CN115457132A - Method for positioning grid map based on binocular stereo camera

Info

Publication number: CN115457132A
Application number: CN202211127974.7A
Authority: CN
Inventors: 章雨昂; 仲维; 刘晋源; 王维民; 樊鑫; 刘日升; 罗钟铉
Original assignee: Dalian University of Technology
Current assignee: Dalian University of Technology
Priority date: 2022-09-16
Filing date: 2022-09-16
Publication date: 2022-12-09

Abstract

The invention discloses a method for positioning a grid map based on a binocular stereo camera, belonging to the field of image processing and computer vision. The method comprises the steps of obtaining images by using a visible light binocular stereo camera and an infrared binocular stereo camera together, obtaining a disparity map by using the binocular stereo camera to obtain real-time three-dimensional point cloud information, and constructing a high-quality grid mapping algorithm to realize the mapping from 3d point cloud to 2d grid map, wherein the grid height and grid threshold screening algorithm is realized according to prior information; adding a blocking counting algorithm to process the blocking problem in the barrier by combining with the space geometric knowledge; removing possible noise interference in the occupied grids through the posterior probability of the grids; using a target detection algorithm to assist in clustering the target objects and estimating the actual distance of the target objects; optimizing a dynamic target object by combining particle filtering and multi-frame information; and fusing the grid map results obtained by the visible light camera and the infrared camera and outputting the fused grid map results.

Description

Method for positioning grid map based on binocular stereo camera

Technical Field

The invention belongs to the field of image processing and computer vision, and relates to a method for positioning a grid map based on a binocular stereo camera.

Background

Stereoscopic vision has been one of the most popular research directions in the field of computer vision since the middle of the 20 th century. The stereoscopic vision obtains a parallax image by simulating human eyes to observe objects and by a stereoscopic matching technology, thereby obtaining three-dimensional information of a real scene, namely a depth map, and obtaining a three-dimensional point cloud model. At present, the binocular stereo vision technology is rapidly developed in the fields of automatic driving, robot navigation and path planning, vision measurement and the like. One of the most important problems in the field of robot navigation and path planning is the construction of a navigation map. In the practical application process of navigation and path planning, common navigation maps are generally divided into three categories, namely, topological maps, semantic maps and scale maps. The topological map generally has no actual distance information, only represents the connectivity and the path relation of two positions, and is most common in large-scale path navigation and planning; the scale map generally has actual distance information, including a point cloud map, a grid map, a feature map and the like, and is common in navigation map construction, positioning and small-scale path planning; the semantic map is added with labels on the basis of the scale map, and any place on the map can be represented by the labels and is commonly used for man-machine interaction. Among them, a scale map is of great interest because it can display real environmental information around the robot in real time. The most widely applied grid map in the scale map is the grid map, because the real scene around the carrier can be visually presented on a 2-dimensional plane, and the state of each grid represents whether an obstacle exists in the real scene, the grid map is more suitable for map construction and obstacle positioning.

Most conventional grid maps are generated by a laser radar (LIDAR) construction. The sensor of the laser radar can emit a laser signal towards the surrounding environment, when the sensor encounters an obstacle, the laser can be reflected back to the sensor, so that the distance from the sensor to the obstacle can be calculated through the round-trip time difference, and the laser radar aims to provide a complete 360-degree panoramic view, and the visualization of the surrounding environment of the vehicle is realized in a three-dimensional point cloud mode by using laser pulses. However, the disadvantages of the method are obvious, and for civil vehicles, compared with other devices, the laser radar is an expensive choice, so that most of the laser radar can only be used on high-grade vehicles, mass production and popularization cannot be achieved, and due to the fact that a radar sensor generates noise, the image resolution is generally low, and a real scene cannot be presented; for military use, although the laser radar can feed back surrounding environment information in real time, a large number of signals are transmitted to the outside while emitting laser, and the signals are likely to be utilized by enemies to expose the real position of the laser radar, so that great hidden danger is caused.

So far, an algorithm for navigation and positioning based on binocular stereo vision gradually enters the sight of people. The binocular stereo vision technology simulates stereo perception of human eyes to a three-dimensional space through a group of binocular stereo cameras, the binocular cameras are used for shooting a left image and a right image at the same time, then characteristic points in the left image and the right image are extracted for stereo matching, the parallax in the images is calculated by using a triangular distance measurement principle to obtain a parallax image, a depth image is further calculated through the parallax image, and therefore three-dimensional point information of each object in the images is obtained, a point cloud image is generated, and real three-dimensional information in a scene is restored. The advantages of navigation and positioning based on binocular stereo vision are: compared with a laser radar sensor and a structured light sensor, the cost of the binocular camera is much lower; the problem of target identification is reduced, whether the distance is calculated after the obstacle is identified is not needed, and the depth map is directly calculated on the disparity map; compared with the problem that the laser radar sensor is noisy, the binocular stereo vision estimates the depth by calculating the parallax, and the error is smaller. The binocular stereo vision has the defects that the calculation amount is large, the processing speed of a CPU is required to be fast enough, and therefore a special chip or an integrated calculation method on a circuit board is required to realize the calculation of the characteristic points and the stereo matching. And special FPGA has been researched at home and abroad to complete the calculation of binocular camera image stereo matching.

At present, most grid map construction methods construct two-dimensional maps, lack height information and cannot meet the requirements of equipment with three-dimensional mobility, such as unmanned aerial vehicles and the like. At present, a method for constructing a grid map by using a binocular stereo camera is urgently needed, so that the grid map can contain information of a three-dimensional real scene and can be used as civil vehicle-mounted navigation map equipment or military application. Compared with a laser radar sensor, the general binocular stereo camera has a poor effect in dark days or extremely severe weather.

Disclosure of Invention

In order to solve the problems, the invention provides a method for positioning a grid map by using a binocular stereo vision camera. The invention combines the results of a visible light binocular camera and an infrared binocular camera, so that the visible light binocular camera and the infrared binocular camera can show good imaging effect in various unknown scenes, and aims to abandon the expensive method for generating the grid map by using a laser radar sensor, obtain a disparity map by using a binocular stereo camera, estimate a depth map through the disparity map, and construct the grid map after converting the disparity map into three-dimensional point cloud. And constructing a high-performance operation platform by using a binocular stereo camera and a GPU, and constructing a high-performance solving algorithm to obtain a high-quality grid map containing three-dimensional information. The method can complete the acquisition of input data only by using a binocular stereo camera, and then realizes a grid height and grid threshold value screening algorithm by using spatial prior information and statistics; adding a blocking counting algorithm to process the blocking problem in the barrier by combining with the space geometric knowledge; removing noise interference possibly existing in the occupancy grid through the grid posterior probability; using a target detection algorithm to assist in clustering the target objects and estimating the actual distance of the target objects; and optimizing the dynamic target object by combining particle filtering and multi-frame information. The invention realizes an algorithm for constructing the two-dimensional grid map containing three-dimensional information by using the binocular stereo camera, and adds an efficient post-processing mechanism on the basis of the traditional mapping method to ensure that the grid information is more accurate.

The specific content comprises the following steps: the method comprises the steps of using a visible light binocular stereo camera and an infrared binocular stereo camera to jointly obtain images, respectively calculating a disparity map through the visible light images and the infrared images, fusing the disparity maps obtained by the visible light camera and the infrared camera to generate a fused disparity map, obtaining real-time three-dimensional point cloud information according to the fused disparity map, and constructing a high-quality grid mapping algorithm to realize mapping from 3d point cloud to 2d grid map, wherein the grid height and grid threshold screening algorithm is realized according to prior information; adding a blocking counting algorithm to process the blocking problem in the barrier by combining with the space geometric knowledge; removing possible noise interference in the occupied grid through the posterior probability of the grid; using a target detection algorithm to assist in clustering the target objects and estimating the actual distance of the target objects; and optimizing the dynamic target object by combining particle filtering and multi-frame information.

The technical scheme of the invention is as follows:

the method for positioning the grid map based on the binocular stereo camera comprises the following steps:

1) Acquiring three-dimensional point cloud information:

1-1) respectively acquiring RGB images of a real scene and corresponding parallax images thereof through a binocular visible light camera and an infrared camera;

1-2) fusing the visible light parallax image and the infrared parallax image, wherein the fusing method comprises the following steps: comparing the confidence coefficient of each parallax in the visible light parallax image and the infrared parallax image, keeping the parallax value with large confidence coefficient in the visible light parallax image and the infrared parallax image, and finally obtaining a fusion parallax image;

1-3) generating a three-dimensional point cloud according to the fusion disparity map, and converting the three-dimensional point cloud into a world coordinate system.

2) Initializing a grid map:

2-1) determining the size of the grids, wherein the side length of each grid represents actual distance information;

2-2) determining the spatial position of the grid corresponding to the real scene: the range in the z direction (closest and farthest distances) of the perceived obstacle, and the range in the y direction (highest height h) of the obstacle are set _max And a minimum height h _min ) And the calculated range (field angle) of the obstacle in the x direction;

2-3) initializing grid state: idling;

3) Mapping the three-dimensional point cloud to a two-dimensional grid:

3-1) carrying out noise point filtering on the generated three-dimensional point cloud through bilateral filtering, and mapping the filtered three-dimensional point cloud to a two-dimensional grid map plane;

3-2) counting the number of three-dimensional points mapped in the grid and the height information of the three-dimensional points;

4) Grid height and threshold screening:

4-1) obtaining the height of the three-dimensional points mapped into each grid, sorting the height of the three-dimensional points in each grid, removing the height of the three-dimensional points of the first 10% and the second 10% after sorting, and carrying out weighted average on the heights of the three-dimensional points after screening to obtain the final grid height h _i ；

4-2) obtaining the number of three-dimensional points num mapped into each grid _i Calculating a mapping threshold T for each grid _i The formula is as follows:

wherein img _width Is the width of the image, h _i Height of the current grid, Z _k The grid depth of the k-th row is theta, the field angle size of the initialization setting is theta, and alpha is a preset smoothing coefficient; formula (1) represents the number of pixels represented by a plane formed by the grid of the k-th row and the height of the grid as the threshold value of the number of grids;

4-3) comparing the number of three-dimensional points num in each grid _i And a mapping threshold T for each grid _i If num _i ＞T _i If the grid is in the occupied state, otherwise, the grid is in the idle state;

4-4) if the grid is in the occupied state, the height h of the grid is set _i Comparison with the y-direction Range of the initialization grid settings (h) _max And h _min ) If h is _min ＜h _i ＜h _max Then h is given _i Setting the height value of the current grid, otherwise setting the current grid state to be idle;

5) Screening sheltered obstacles: processing occlusion by a occlusion counter;

5-1) judging whether the grid belongs to a blocking state:

there are three cases of a blocking grid:

a. out of a set distance range or too close to the binocular camera;

b. the grid and the origin are spaced apart by one or more obstacles;

c. the height of a background obstacle in the same obstacle is lower than that of a foreground obstacle;

5-2) find the initial occupied grid, its height is noted gridh _i ；

5-3) using polar coordinates, the occupied-state grid height gridh of the same column after the first occupied-state grid _i-1 And gridh _i Comparison when gridh _i ＞gridh _i-1 When the blocking value is greater than the blocking threshold β, the grid is set to the blocking state, i.e., the barrier is blocked.

6) Noise point elimination: removal of noise interference possibly present in an occupied grid by a grid posterior probability

6-1) calculating the z-direction distance error of the current three-dimensional point under a world coordinate system, wherein the formula is as follows:

wherein b is a base line of the binocular stereo camera, f is the focal length of the camera, z is a z-direction coordinate of the three-dimensional point in a world coordinate system, and delta _d Is a third scale factor in the disparity calculation;

6-2) calculating the x-direction distance error of the current three-dimensional point in a world coordinate system, wherein the formula is as follows:

d is the distance from the origin of the camera to the three-dimensional point in the world coordinate system, x is the x-direction coordinate of the three-dimensional point in the world coordinate system, and δ z is the z-direction distance error of the three-dimensional point in the world coordinate system;

6-3) converting the distance error of the current three-dimensional point in the z and x directions under a world coordinate system into the relative row and column error of the two-dimensional grid map, wherein the formula is as follows:

wherein delta _z And delta _x Distance errors of the three-dimensional points in the z direction and the x direction under a world coordinate system are obtained, and dz and dx are the resolutions of grids relative to the world coordinate system;

6-4) calculating the probability density of the occupation of the grid within the error range of the row and the column by taking the current grid as the center, wherein the formula is as follows:

wherein P is _occupied Representing the occupancy probability density of the current grid:

P _free free probability density representing the grid:

P _free ＝1-P _occupied

δ _row and delta _col gridNum, the relative row-column error of the grid _occupied Representing the number of cells occupying the state gridNum _total Is delta _row And delta _col RangeTotal number of inner grids. The occupancy probability density of the current grid is δ with the current grid as the center _row And delta _col The ratio of the number of all occupied grids in the range to the number of all grids in the range; similarly, the free probability density of the current grid is the complement of the occupied probability density.

6-5) calculating the distance probability density of the current grid to the nearest occupied grid and the free grid, wherein the formula is as follows: wherein P is _distance (anchored) represents the distance probability density of the current grid to the nearest occupied grid:

P _distance (free) represents the distance probability density of the current grid to the nearest free grid:

wherein

And

representing the row-column difference of the current grid to the most recently occupied grid,

and

representing the row-column difference from the current grid to the nearest free grid;

6-6) calculating the occupation weight coefficient and the idle weight coefficient of the current grid, wherein the formula is as follows:

wherein ω is _occupied Occupancy weight coefficient for current grid:

ω _occupied ＝P _occupied *P _distance (occupied)

ω _free for the idle weight coefficient of the current grid:

ω _free ＝P _free *P _distance (free)

wherein P is _occupied And P _free Respectively, the occupation and idle probability densities, P, of the current grid _distance (occupied) and P _distance (free) distance probability densities of the current grid to the nearest occupied and free grids, respectively;

6-7) calculating the posterior occupancy probability of the current grid, wherein the formula is as follows:

wherein gridNum _occupied Is delta _row And delta _col Number of occupied grids in the range gridNum _free Is delta _row And delta _col The number of free grids in the range.

6-8) according to P _posterior If P is a noise point, judging whether the occupied grid is a noise point _posterior Less than the threshold, the occupied grid is considered likely to be noisy, and the occupied grid is changed to a free grid.

7) Clustering the target objects: target object clustering is assisted through target detection algorithm and actual distance of target objects is estimated

7-1) obtaining a target detection result by using yoloV5 algorithm, and obtaining the number Num of target objects in a three-dimensional scene _object Dis, distance estimate per object _i And the Center of each target object in the world coordinate system _i (x，y，z)；

7-2) clustering the targets using a modified K-Means clustering algorithm: center of each target _i (x, y, z) as the initialized clustering center point, and the number Num of the target objects _object As a K value, the K-Means clustering algorithm K unknown defect is made up, the operation speed is increased, and the formula is as follows:

wherein theta is _n (x, y, z) is a three-dimensional point data set, center _k For the cluster centers, k represents the number of cluster centers, α _nk For the binary variable 0,1 represents that the three-dimensional point belongs to class k, 0 represents not,

when the value of j is represented, the value of theta is minimized _n (x，y，z)-Cente _j || ² And in fact, distributing the three-dimensional point data set to the nearest clustering center, continuously performing iterative computation until the algorithm converges to minimize loss, and finishing clustering.

8) Multi-frame fusion: optimization of dynamic objects by particle filtering in combination with multi-frame information

8-1) obtaining the screened grid map model, and adding particles with different speeds into the grids in the occupied state. The particles are marked as S, S = { S = } _i |s _i ＝(x _i ，y _i ，vx _i ，vy _i ，p _i )，i＝1，...N _S In which x _i Lines representing the grid in which the particles are located, y _i Columns representing the grid in which the particles are located, vx _i And vy _i Respectively representing the row and column velocities of the particles, p _i Representing the period of the particle.

8-2) predicting the motion state of each particle according to the information of the next frame; constructing a prediction model, wherein the formula is as follows:

the angle and distance between two adjacent frames are:

wherein

Is the angular velocity, v is the velocity, Δ t is the time interval between two frames; the relative movement between grids is:

DX and Dy are resolution ratios of the grid lines and the columns respectively; the grid position in the current frame may be denoted as x _n ，y _n ：

The prediction model is:

the delta y, the delta x, the delta vy and the delta vx are random disturbance and are derived from a Gaussian distribution with zero mean and a state transition covariance matrix of a Kalman filter.

8-3) reassign the particles in the grid by weighting and resampling:

a. calculating an occupancy probability density and a vacancy probability density for each grid

Wherein p is _density (m (x, y) | occupied) represents the occupancy probability density of the grid:

p _density (m (x, y) | free) represents the free probability density of the grid:

P _density (m(x，y)|free)＝1-P _density (m(x，y)|occupied)

δ _x and delta _y Represents the error of the transverse and longitudinal distances projected by the three-dimensional points to the grid, grid (actual) represents the occupied grid, and the meaning of the formula is that the current grid delta _x ，δ _y Taking the ratio of the number of all occupied state grids in the range to the number of all grids in the range as the occupation probability density of the current grid;

b. calculating the occupation distance probability and the idle distance probability of each grid:

the occupied distances are represented in rows and columns, respectively, as:

row and Col represent the closest Row and column occupying the grid to the current grid, respectively

The free distance is represented in rows and columns, respectively, as:

the distance probability is expressed using a multidimensional gaussian function:

c. calculating an occupancy weight and an idle weight for each grid:

ω _occupied (x，y)＝p _density (m(x，y)|occupied)*p _distance (m(x，y|occupied)

ω _free (x，y)＝p _density (m(x，y)|free)*p _distance (m(x，y)|free)

d. the occupancy posterior probability for each grid is calculated:

wherein N is _i To occupy the actual number of particles of the grid, N _max Is the maximum number of particles allowed in a grid;

e. calculating the number of resamples N per grid _Ri ＝P _i *N _max And calculating the ratio of the number of resamples to the actual number

If f _i If > 1, add f to the grid _i Particles in whole number multiples of f _i If < 1, deleting the particle;

8-4) judging the grid state again according to the number of particles and the particle speed:

the occupancy probability of a grid is the ratio of the number of particles of the current grid to the maximum number of particles of the grid:

the current grid velocity is the vector sum of the particle velocities in the current grid:

if P of the grid _occupied If the grid speed is less than the speed of all particles in the grid, the grid is in an occupied stateThe standard deviation of the degrees indicates that the barrier in the grid is in a static state;

9) And counting the grids in the occupied state in the two-dimensional grids to generate a two-dimensional grid map.

The beneficial effects of the invention are:

compared with a common grid map algorithm, the method combines spatial geometric knowledge to add a blocking counting algorithm to process the problem of shielding in the barrier, so that the output result is more accurate; according to the invention, the grid height and grid threshold screening algorithm is realized according to the prior information, and the generalization capability and robustness of the algorithm are improved; noise interference possibly existing in the occupied grid is removed through the posterior probability of the grid, and the accuracy of the grid map is improved; the target detection algorithm is used for assisting in clustering the target objects and estimating the actual distance of the target objects, so that the problem that the target objects of a common grid map are ambiguous is solved; the result of the grid map is optimized by combining particle filtering and multi-frame information, so that the result optimization between continuous frames is realized; the parallax image results obtained by the visible light camera and the infrared camera are fused to generate a fused parallax image, the grid map is constructed according to the fused parallax image, and the good grid map result can be displayed in a complex scene.

Drawings

Fig. 1 is a left image captured by a binocular camera in the embodiment.

Fig. 2 is a disparity map in the embodiment.

Fig. 3 is an overall algorithm flowchart.

Fig. 4 is a flow chart of altitude and threshold and occlusion policies.

Fig. 5 is a flowchart of the target detection clustering strategy.

Fig. 6 is a flow chart of implementing multi-frame dynamic optimization by particle filtering.

Fig. 7 shows the grid map result without any optimization strategy.

Fig. 8 is the grid map results of fig. 7 after adding height and threshold screening and after blocking strategy.

Fig. 9 is a grid map result after the target detection clustering target object strategy and the particle filtering strategy are added to fig. 8.

Detailed Description

The following further describes a specific embodiment of the present invention with reference to the drawings and technical solutions.

The basic flow of the method for creating the two-dimensional grid map based on the binocular stereo camera is shown in fig. 3, and the method comprises the following specific steps:

1) Acquiring three-dimensional point cloud information:

1-1) respectively acquiring RGB (red, green and blue) images of a real scene and corresponding parallax images thereof through a binocular visible light camera and an infrared camera; the left image and the disparity map are shown in fig. 1 and fig. 2, respectively.

1-2) fusing the visible light parallax image and the infrared parallax image, wherein the fusing method is to compare the confidence coefficient of each parallax in the visible light parallax image and the infrared parallax image, retain the parallax value with high confidence coefficient in the visible light parallax image and the infrared parallax image, and finally obtain a fused parallax image;

2) Initializing a grid map:

2-2) determining the spatial position of the grid corresponding to the real scene: setting a range (closest and farthest distances) in a z direction of a perceived obstacle, as well as a range (highest height and lowest height) of the obstacle in a y direction and a calculation range (angle of view) of the obstacle in an x direction;

2-3) initializing grid states:

initializing grid states including idle, occupied and blocked;

3) The 3d point cloud maps to a 2d grid:

3-1) carrying out noise point filtering on the generated three-dimensional point cloud through bilateral filtering, and mapping the filtered three-dimensional point cloud to a 2-dimensional grid map plane;

4) Grid height and threshold screening (flow shown in fig. 4):

4-1) obtaining three-dimensional point heights mapped into each grid, for each gridSorting the three-dimensional point heights in the grids, removing the three-dimensional point heights of the front 10% and the rear 10% after sorting, and performing weighted average on the three-dimensional point heights after screening to obtain the final grid height h _i ；

wherein img _width Is the width of the image, h _i Height of the current grid, Z _k The grid depth of the k-th row, theta, the field angle size of the initialization setting, and alpha, a preset smoothing coefficient, wherein the formula is expressed as the number of pixels represented by a plane formed by the grid of the k-th row and the height of the grid as the threshold value of the number of the grids;

4-4) if the grid is in the occupied state, the height h of the grid is set _i Comparison with the y-direction Range of the initialization grid settings (h) _max And h _min ) If h is _min ＜h _i ＜h _max Then h will be _i Setting the height value of the current grid, otherwise setting the current grid state to be idle;

5) Screening sheltered obstacles: handling occlusion by blocking counters

5-1) judging whether the grid belongs to a blocking state:

there are three cases of a blocking grid:

a. out of a set distance range or too close to the binocular camera;

b. the grid and the origin are spaced apart by one or more obstacles;

5-2) find the initial occupied grid, its height is noted gridh _i ；

5-3) using polar coordinates, the occupied-state grid height gridh of the same column after the first occupied-state grid _i-1 And gridh _i Comparison, when gridh _i ＞gridh _i-1 When the blocking value is larger than the blocking threshold value beta, setting the grid to be in a blocking state;

wherein b is a base line of the binocular stereo camera, f is the focal length of the camera, z is a z-direction coordinate of the three-dimensional point in a world coordinate system, and delta _d A third scale factor in the disparity calculation;

wherein d is the distance from the origin of the camera to the three-dimensional point in the world coordinate system, x is the x-direction coordinate of the three-dimensional point in the world coordinate system, and delta _z The z-direction distance error of the three-dimensional point under a world coordinate system is obtained;

6-3) converting the distance error of the current three-dimensional point in the z and x directions under a world coordinate system into the relative row error of the two-dimensional grid map, wherein the formula is as follows:

wherein delta _z And delta _x For the z-direction (x-direction) distance error of a three-dimensional point in the world coordinate system, dz and dx are the resolutions of the grid with respect to the world coordinate system. 100mm in this invention;

P _free free probability density representing the grid:

P _free ＝1-P _occupied

δ _row and delta _col gridNum, the relative row-column error of the grid _occupied Representing the number of cells occupying the state gridNum _total Is delta _row And delta _col The total number of grids within the range. The occupancy probability density of the current grid is δ with the current grid as the center _row And delta _col The ratio of the number of all occupied grids in the range to the number of all grids in the range; similarly, the free probability density of the current grid is the complement of the occupied probability density.

6-5) calculating the distance probability density of the current grid to the nearest occupied (free) grid, and the formula is as follows: wherein P is _distance (occupied) represents the distance probability density of the current grid to the nearest occupied grid:

wherein

And

and

6-6) calculating an occupation weight coefficient and an idle weight coefficient of the current grid, wherein the formula is as follows:

wherein ω is _occupied Occupancy weight coefficient for current grid:

ω _occupied ＝P _occupied *P _distance (occupied)

ω _free for the idle weight coefficient of the current grid:

ω _free ＝P _free *P _distance (free)

wherein P is _occupied And P _free Respectively, the occupation (free) probability density, P, of the current grid _distance (occupied) and P _distance (free) distance probability densities of the respective current grid to the nearest occupied (free) grid;

wherein gridNum _occupied Is delta _row And delta _col Number of occupied grids in the range, gridNum _free Is delta _row And delta _col The number of free grids in the range.

6-8) according to P _posterior If P is a noise point, judging whether the occupied grid is a noise point _posterior If < 0.3, the occupied grid is considered possibly noisy, and the occupied grid is changed to a free grid.

7) Clustering the target objects: clustering the target objects and estimating their actual distance assisted by a target detection algorithm (as shown in FIG. 5)

7-1) obtaining a target detection result by using yoloV5 algorithm, and obtaining the number Num of target objects in a three-dimensional scene _object Dis, distance estimate per object _i And a Center of each object in a world coordinate system _i (x，y，z)；

7-2) clustering objects using a modified K-Means clustering algorithm, the Center of each object having been obtained by an object detection algorithm _i (x, y, z) and a k value (Num) _object ) The Center of each object is measured _i (x, y, z) as the initialized clustering center point, and the number Num of the target objects _objec t is used as a K value, so that the defect that the K-Means clustering algorithm K is unknown is overcome, the operation speed is increased, and the formula is as follows:

wherein theta is _n (x, y, z) is a three-dimensional point data set, center _k Is a clustering center, k represents the number of clustering centers, α _nk For the binary variable 0,1 represents that the three-dimensional point belongs to class k, 0 represents not,

when the value of j is represented, the value of theta is minimized _n (x，y，z)-Cente _j || ² The three-dimensional point data set is distributed to the nearest clustering center, iterative calculation is continuously carried out until the algorithm converges to minimize loss, and clustering is finished;

8) Multi-frame fusion: optimization of dynamic objects by particle filtering in combination with multiframe information (as shown in FIG. 6)

8-1) obtaining the screened grid map model, and adding particles with different speeds into the grids in the occupied state. The particles are denoted as S, S = { S = { (S) _i |s _i ＝(x _i ，y _i ，vx _i ，vy _i ，p _i )，i＝1，...N _S In which x _i Lines representing the grid in which the particles are located, y _i Columns representing the grid in which the particles are located, vx _i And vy _i Respectively representing the row and column velocities of the particles, p _i Representing the period of the particles.

8-2) predicting the motion state of each particle according to the information of the next frame, and constructing a prediction model, wherein the formula is as follows:

the angle and distance between two adjacent frames are:

wherein

Is the angular velocity, v is the velocity, Δ t is the time interval between two frames; the relative movement between the grids is:

DX and Dy are resolution of grid array and row respectively, and the resolution is 0.1m in the invention; the grid position in the current frame may be denoted as x _n ，y _n ：

The prediction model is:

8-3) reassign the particles in the grid by weighting and resampling:

P _density (m(x，y)|free)＝1-P _density (m(x，y)|occupied)

δ _x and delta _y Representing the lateral and longitudinal distance error of the projection of the three-dimensional point to the grid, grid (registered) representing the occupied grid, the meaning of the formula is that the current grid delta _x ，δ _y The ratio of the number of all occupied state grids in the range to the number of all the grids in the range is used as the occupation probability density of the current grid;

the occupied distances are represented in rows and columns, respectively, as:

The free distance is represented in rows and columns, respectively, as:

the distance probability is expressed by using a multidimensional Gaussian function:

c. calculating an occupancy weight and an idle weight for each grid:

ω _free (x，y)＝p _density (m(x，y)|free)*p _distance (m(x，y)|free)

d. the occupancy posterior probability for each grid is calculated:

a. calculating the number of resamples N per grid _Ri ＝P _i *N _max And calculating the ratio of the resample amount to the actual amount

If f _i If > 1, add f to the grid _i Particles in whole number multiples of f _i If the number is less than 1, deleting the particles;

if P of the grid _occupied If the velocity of the grid is less than the standard deviation of the velocities of all particles in the grid, the obstacle in the grid is in a static state;

9) Counting the grids in an occupied state in the two-dimensional grids, and generating a two-dimensional grid map: setting the resolution of the grid map, and visually outputting the grid map result; FIG. 7 shows the result of the grid map without any optimization strategy, and it can be seen that there are many noisy points and blocked obstacles in the graph; as shown in fig. 8, the grid map results after adding height and threshold screening and after blocking strategy, noise is reduced a little, and the number of blocked obstacles is reduced; as shown in fig. 9, the grid map result after the target detection clustering target strategy and the particle filtering strategy are added can be seen that noise is significantly less and the removal of the blocking obstacle is significant.

Claims

1. The method for positioning the grid map based on the binocular stereo camera is characterized by comprising the following steps of:

1) Acquiring three-dimensional point cloud information:

1-1) respectively acquiring RGB (red, green and blue) images of a real scene and corresponding parallax images thereof through a binocular visible light camera and an infrared camera;

1-3) generating a three-dimensional point cloud according to the fusion disparity map, and converting the three-dimensional point cloud into a world coordinate system;

2) Initializing a grid map:

2-2) determining the spatial position of the grid corresponding to the real scene: setting the range of the perceived obstacle in the z direction to include the nearest distance and the farthest distance, and the range of the obstacle in the y direction to include the highest height h _max And a minimum height h _min And the field angle of the obstacle in the calculation range in the x direction;

2-3) initializing grid states: idling;

3) Mapping the three-dimensional point cloud to a two-dimensional grid:

3-1) carrying out noise point filtering on the generated three-dimensional point cloud through bilateral filtering, and mapping the filtered three-dimensional point cloud onto a two-dimensional grid map plane;

4) Grid height and threshold screening:

4-1) obtaining the height of the three-dimensional points mapped into each grid, sorting the heights of the three-dimensional points in each grid, removing the heights of the three-dimensional points of the first 10% and the second 10% after sorting, and carrying out weighted average on the heights of the three-dimensional points after screening to obtain the final grid height h _i ；

wherein img _width Is the width of the image, h _i Height of the current grid, Z _k The grid depth of the k-th row is theta, the field angle size of the initialization setting is theta, and alpha is a preset smoothing coefficient; formula (1) shows that the number of pixels represented by a plane formed by the grid of the k-th row and the height of the grid is used as the threshold value of the number of grids;

4-4) if the grid is in the occupied state, the height h of the grid is set _i If h is compared to the y-direction range of the initialization grid setting _min <h _i <h _max Then h will be _i Setting the height value of the current grid, otherwise setting the state of the current grid to be idle;

5) Screening blocking obstacles: processing occlusion by a occlusion counter;

6) Noise point elimination: removing noise interference possibly existing in the occupancy grid through the grid posterior probability;

7) Clustering the target objects: clustering target objects and estimating the actual distance of the target objects by the aid of a target detection algorithm;

8) Multi-frame fusion: optimizing a dynamic target object by combining particle filtering and multi-frame information;

2. The method for positioning a grid map based on binocular stereo cameras of claim 1, wherein the grid state including idle, occupied and blocked is initialized in the step 2-3).

3. The method for positioning the grid map based on the binocular stereo camera according to claim 1, wherein the screening of the blocking obstacles in the step 5) comprises the steps of:

5-1) judging whether the grid belongs to a blocking state:

there are three cases of a blocking grid:

a. out of a set distance range or too close to the binocular camera;

b. the grid and the origin are spaced apart by one or more obstacles;

c. the height of the background obstacle in the same obstacle is lower than that of the foreground obstacle;

5-2) find the initial occupied grid, its height is noted gridh _i ；

4. The method for positioning a grid map based on binocular stereo cameras of claim 1, wherein the noise interference possibly existing in the occupancy grid is removed by the grid posterior probability in the step 6), comprising the steps of:

6-1) calculating the z-direction distance error of the current three-dimensional point in a world coordinate system, wherein the formula is as follows:

wherein d is the distance from the origin of the camera to the three-dimensional point in the world coordinate system, x is the x-direction coordinate of the three-dimensional point in the world coordinate system, and delta _z The distance error of the three-dimensional point in the z direction under a world coordinate system is obtained;

wherein delta _z And delta _x Distance errors of the three-dimensional points in the z direction and the x direction under a world coordinate system are shown, and dz and dx are the resolutions of grids relative to the world coordinate system; 100mm in this invention;

6-4) calculating the probability density of the grid occupation within the error range of the row and the column by taking the current grid as the center, wherein the formula is as follows:

P _free free probability density representing the grid:

P _free ＝1-P _occupied

δ _row and delta _col gridNum, the relative row-column error of the grid _occupied Representing the number of grids occupying the state gridNum _total Is composed of _row And delta _col A total number of grids within the range; the occupation probability density of the current grid is centered on the current grid _row And delta _col The ratio of the number of all occupied grids in the range to the number of all grids in the range; similarly, the idle probability density of the current grid is the complement of the occupancy probability density;

6-5) calculating the distance probability density from the current grid to the nearest occupied grid and the idle grid, wherein the formula is as follows:

wherein P is _distance (occupied) represents the distance probability density of the current grid to the nearest occupied grid:

wherein

And

and

wherein ω is _occupied Occupancy weight coefficient for the current grid:

ω _occupied ＝P _occupied *P _distance (occupied)

ω _free for the idle weight coefficient of the current grid:

ω _free ＝P _free *P _distance (free)

6-7) calculating the posterior occupancy probability of the current grid, and the formula is as follows:

wherein gridNum _occupied Is delta _row And delta _col Number of occupied grids in the range, gridNum _free Is delta _row And delta _col The number of free grids within the range;

6-8) according to P _posterior If P is a noise point, whether the occupied grid is a noise point is judged _posterior Less than the threshold, the occupied grid is considered likely to be noisy, and the occupied grid is changed to a free grid.

5. The method for positioning the grid map based on the binocular stereo camera according to claim 1, wherein the step 7) of clustering the objects and estimating the actual distance thereof by the aid of an object detection algorithm comprises the steps of:

7-1) obtaining a target detection result by using a yoloV5 algorithm to obtain the number Num of target objects in a three-dimensional scene _object Dis, distance estimate per object _i And the Center of each target object in the world coordinate system _i (x，y，z)；

7-2) use of improved K-Means polymerizationClustering the target object by using a class algorithm: center of each target _i (x, y, z) as the initialized clustering center point, and the number Num of the target objects _object As the k value, the formula is as follows:

when the value of j is taken, the value of theta is minimized _n (x，y，z)-Cente _j || ² And actually, distributing the three-dimensional point data set to the nearest clustering center, continuously iterating and calculating until the algorithm converges to minimize loss, and finishing clustering.

6. The method for positioning the grid map based on the binocular stereo camera according to claim 1, wherein the dynamic object is optimized by combining multi-frame information through particle filtering in step 8), comprising the steps of:

8-1) obtaining the screened grid map model, and adding particles with different speeds into the grid in the occupied state; the particles are denoted as S, S = { S = { (S) _i |s _i ＝(x _i ，y _i ，vx _i ，vy _i ，p _i )，i＝1，...N _S In which x is _i Lines representing the grid in which the particles are located, y _i The columns of the grid in which the particles are located,vx _i and vy _i Representing the row and column velocities, p, of the particles, respectively _i Represents the period of the particle;

the angle and distance between two adjacent frames are:

wherein

DX and Dy are resolution ratios of the grid lines and the columns respectively; the grid position in the current frame is denoted x _n ，y _n ：

The prediction model is:

delta y, delta x, delta vy and delta vx are random disturbance and are derived from the Gaussian distribution of zero mean and a state transition covariance matrix of a Kalman filter;

8-3) reassign the particles in the grid by weighting and resampling:

P _density (m(x，y)|free)＝1-P _density (m(x，y)|occupied)

b. calculating the occupation distance probability and the free distance probability of each grid:

the occupied distances are represented in rows and columns, respectively, as:

The free distance is represented in rows and columns, respectively, as:

c. calculating an occupancy weight and an idle weight for each grid:

ω _free (x，y)＝p _density (m(x，y)|free)*p _distance (m(x，y)| free)

d. calculate the occupancy posterior probability for each grid:

If f is _i If > 1, add f to the grid _i Particles in integral multiples of f _i If the number is less than 1, deleting the particles;

8-4) judging the grid state again through the particle number and the particle speed:

if P of the grid _occupied If the velocity of the grid is less than the standard deviation of the velocities of all the particles in the grid, the obstacle in the grid is in a static state.