CN116246033A - Rapid semantic map construction method for unstructured road - Google Patents

Rapid semantic map construction method for unstructured road Download PDF

Info

Publication number
CN116246033A
CN116246033A (publication) · CN202310012841.3A (application)
Authority
CN
China
Prior art keywords
point
semantic
point cloud
map
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310012841.3A
Other languages
Chinese (zh)
Inventor
张尊君
陆云超
席鹏
刘意琮
梁启君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tage Idriver Technology Co Ltd
Original Assignee
Beijing Tage Idriver Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tage Idriver Technology Co Ltd filed Critical Beijing Tage Idriver Technology Co Ltd
Priority to CN202310012841.3A priority Critical patent/CN116246033A/en
Publication of CN116246033A publication Critical patent/CN116246033A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05Geographic models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/003Navigation within 3D models or images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T7/85Stereo camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Computer Hardware Design (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a rapid semantic map construction method for unstructured roads, belongs to the technical field of automatic driving, and solves problems such as the low precision and low speed of existing high-precision map construction for mining area scenes. The semantic map model of the invention rasterizes the point cloud into cylindrical grids and extracts grid features from the cylindrical features of the rasterized points; an RGB branch is added alongside the downsampling and upsampling processes. During sampling, a multi-modal focus convolution based on RGB information is used: the RGB color information output by the RGB branch drives a point cloud importance analysis around each point cloud grid to determine the shape of the multi-modal focus convolution. Finally, the output features are again passed through an MLP and fused with the initial point features to refine the points within each grid, which improves the accuracy of point cloud recognition and yields a map with high accuracy and a fast construction speed.

Description

Rapid semantic map construction method for unstructured road
Technical Field
The invention belongs to the technical field of automatic driving, and relates to a rapid semantic map construction method for unstructured roads.
Background
In recent years, with the rapid development of artificial intelligence and new generation information technology, the key technology of automatic driving is promoted to be further perfected. The open-pit mining area is one of the scenes of automatic driving and quick landing due to the characteristics of road closure, relatively simple environment and the like.
The automatic driving key technology is divided into three technologies of environment sensing, planning decision and tracking control, and the participation of a high-precision map is not separated in the three technologies. In terms of perception, the high-precision map plays a role in the aspects of automatic driving beyond visual range perception, effective ROI area reduction, perception precision improvement and the like. Meanwhile, the high-precision map can provide lane-level information, and is a basic guarantee for planning decision-making and tracking control.
Chinese patent publication No. CN113665500A, entitled "Method for constructing a semantic high-precision map based on point-line feature fusion laser", mainly provides a method that fuses the semantic segmentation information of two-dimensional images, maps it onto the three-dimensional space, and endows the lidar point cloud with semantic and point-line features. The method extracts the label information of each pixel of the two-dimensional image through image semantic segmentation, projects the obtained pixel semantic information onto the three-dimensional point cloud, performs depth fitting to obtain a depth map, and finally carries out map construction, dynamic obstacle filtering and other operations.
Another prior-art patent provides a method for constructing and collecting a high-precision map of a parking lot, which mainly constructs a road network from vehicle chassis information and a position map, collects panoramic images, aligns the panoramic images for semantic segmentation, and vectorizes the result into the map coordinate system.
However, methods that map two-dimensional image semantic segmentation information onto the three-dimensional space to give the lidar point cloud semantic and point-line features are limited: the accuracy of the semantic segmentation is constrained by the pixel resolution, and precision loss is unavoidable during the two-dimensional to three-dimensional conversion.
In addition, the method that constructs a map from multidimensional information such as the vehicle chassis, a position map, images and ultrasonic data only targets structured roads, where the obstacles and road edges are fixed, and it is difficult to adapt to mine scenes, in which the unstructured road appearance is monotonous and the loading and unloading areas change frequently.
Disclosure of Invention
In view of the above analysis, the invention provides a rapid semantic map construction method for unstructured roads, which is used to solve problems such as the low construction precision and low speed of existing high-precision maps for mining area scenes.
The invention discloses a rapid semantic map construction method for unstructured roads, which comprises the following specific steps:
step 1, constructing a semantic segmentation model;
step 11, obtaining pictures and point cloud data at the same moment;
step 12, rasterizing the point cloud data by a cylinder to obtain point cloud grid units and cylinder characteristics of points, and obtaining tensors of the characteristics of each point cloud grid unit based on the cylinder characteristics of the points and the MLP (multi-layer neural network);
step 13, downsampling the points of the point cloud grid unit;
step 131, inputting tensors of the characteristics of each point cloud grid unit into an asymmetric residual block; outputting tensors of the sparse convolution characteristics;
step 132, multi-modal focus convolution downsampling:
carrying out sub-manifold cavity convolution on each cylindrical grid to obtain an output shape of each cylindrical grid position;
projecting RGB pixel point information of the picture onto point cloud data in each point cloud grid unit, and assigning RGB pixel point information to each point in each point cloud grid unit to obtain an RGB point cloud grid unit;
sparse convolution feature extraction is carried out on the points of the RGB point cloud grid units, importance probability calculation is carried out through a sigmoid function, and the cubic importance map I_p^k at position k in the convolution kernel and its center I_p^0 are obtained;
selecting the important grids based on the cubic importance map I_p^k at position k in the convolution kernel and its center I_p^0;
generating the dynamic convolution output positions p_out of the convolution kernel based on the output shape of each cylindrical grid position and the convolution kernel positions of the important grids;
performing the multi-modal focus convolution at the dynamic convolution output positions p_out to obtain the tensor of downsampled output features;
step 14, up-sampling the points of the point cloud grid unit;
step 141, multi-modal focus convolution upsampling:
inputting the tensor of the downsampled output features, and performing multi-modal focus deconvolution based on the RGB point cloud grid units to obtain the tensor of the deconvoluted features;
splicing the tensor of the downsampled output features with the tensor of the deconvoluted features to obtain the tensor of the spliced features; inputting tensors of the splicing characteristics into an asymmetric residual block; obtaining up-sampling output grid characteristics;
step 142, up-sampling the obtained up-sampling output grid characteristics to obtain tensors of down-sampling and up-sampling superposition characteristics of each point;
step 15, refining tensors of the downsampled and upsampled superposition characteristics of each point by using the MLP point characteristics of each point obtained in the step 12; acquiring the category probability of each refined point by using a loss function, and taking the highest category probability of each point as the point label of the point;
step 2, constructing a real-time semantic map;
acquiring real-time GPS-IMU data and real-time lidar point cloud data of the unmanned vehicle;
using the semantic segmentation model constructed in the step 1 to form a key frame semantic point cloud for the label of each point in the obtained real-time point cloud data of each frame;
and superposing all the key frame semantic point clouds to obtain a semantic map, and downsampling the semantic map to generate a semantic RGB map.
Optionally, the point tags include obstacle tags, ground tags, and retaining wall tags.
Optionally, the method further comprises a step 3 of correcting the semantic RGB map, and specifically comprises the following steps:
step 31, for each point labeled as an obstacle in the semantic RGB map, searching the labels of all points within a radius R around it through a KD-tree, and resetting those points to the obstacle label, except for points labeled as ground; then, for each point reset to the obstacle label, searching the N original labels of its surrounding points through KNN, and counting the most frequent original label category as the updated label of that reset point; forming all the points after label updating into an obstacle semantic map;
step 32, filtering dynamic obstacles to obtain an output semantic map;
inputting a current frame obstacle semantic map and a previous frame obstacle semantic map;
if the labels of the pixel points at the same position in the two adjacent frames of obstacle semantic maps are different, the two adjacent frames of obstacle semantic maps are assumed to have dynamic obstacles;
for recursive Bayes filteringStability term addition penalty term odds (p) penalty ) Obtaining the stability probability l of the dynamic obstacle at the current moment s (t)
Figure BDA0004038299810000031
Wherein l s (t-1) The stability probability of the dynamic obstacle at the previous moment; p (P) stable And p prior The known stable map surface element probability and the known map surface element prior probability are respectively; p is p penalty Punishment items for point cloud noise; alpha is an included angle between the normal line of the distance image and the normal line of the depth image, and d is a difference value between the obstacle semantic map and the distance image; sigma (sigma) α 2 Noise α; sigma (sigma) d 2 Noise of d;
according to the stability probability l of the dynamic obstacle at the current moment s (t) Judging whether dynamic barriers exist in the current frame barrier semantic map, and l τ To set the stability threshold, if
Figure BDA0004038299810000041
Then there is no dynamic obstacle and the obstacle semantic map of the dynamic obstacle frame is not present to construct the output semantic map.
Optionally, when the real-time semantic map is constructed, the calibrated GNSS/IMU integrated navigation positioning system and the laser radar of the unmanned vehicle are used for acquiring the real-time GPS-IMU data and the real-time lidar point cloud data of the unmanned vehicle.
Optionally, the calibration method of the GNSS/IMU integrated navigation positioning system and the laser radar of the unmanned vehicle comprises the following steps:
acquiring an initial rotation translation matrix of a GNSS/IMU combined navigation positioning system and a laser radar of the unmanned vehicle; the rotation translation matrix is the initial deviation and the initial attitude angle of the laser radar coordinate system and the GPS vehicle body coordinate system in the XYZ direction, wherein the attitude angle comprises a yaw angle, a roll angle and a pitch angle;
the laser radar is arranged at the front and/or rear of the automatic driving vehicle, and the GNSS/IMU integrated navigation positioning system is mounted on the vehicle body; the measurement deviations in the XYZ directions between the laser radar position and the position of the vehicle body center are measured. A checkerboard calibration plate is placed in front of the vehicle body on the center line of its direction of travel and perpendicular to that center line; the point features of the calibration plate corner points are extracted, the plane of the calibration plate is fitted from the extracted corner-point features, the measured attitude angles between the GPS vehicle-body coordinate system and the laser radar coordinate system are obtained, and the rotation-translation conversion relation between the GPS vehicle-body coordinate system and the laser radar coordinate system of the unmanned vehicle is determined from the measured XYZ position deviations and the measured attitude angles;
and calibrating an initial rotation translation matrix of the GNSS/IMU integrated navigation positioning system and the laser radar of the unmanned vehicle based on the rotation translation conversion relation.
Optionally, edge features and plane features of each frame of input real-time lidar point cloud data are extracted; real-time position and pose information of the current unmanned vehicle is acquired from the real-time GPS-IMU data, and the semantic point clouds of all key frames are converted into a world coordinate system using the initial rotation-translation matrix; if the NDT matching score between two frames of semantic point clouds is smaller than a threshold value, matching among a plurality of key-frame semantic point clouds is carried out using the edge features and plane features, and the parameters of the initial rotation-translation matrix are optimized.
Compared with the prior art, the invention has at least one of the following beneficial effects:
(1) The method adopts the combination of cylindrical grid sampling and point cloud to extract the original data characteristics, overcomes the sparsity and density non-uniformity of the point cloud, and balances the precision and speed of a characteristic extraction network;
(2) In the method, the three-dimensional cylindrical features ρ, θ and z' are input into the backbone network for grid point cloud feature extraction. Through four downsampling and three upsampling stages, an RGB branch is added in parallel, and the features of each downsampling stage are fused with the corresponding upsampling features through a semantic understanding module based on multidimensional decomposition. During sampling, a multi-modal focus convolution based on RGB information is used, and the RGB color information output by the RGB branch drives a point cloud importance analysis around each point cloud grid to determine the shape of the multi-modal focus convolution. Finally, point features are again extracted from the output features with an MLP and fused with the initial point features to refine the points within each grid, improving the accuracy of point cloud recognition;
(3) According to the method, abnormal points are processed based on a boundary erosion algorithm of semantic segmentation, single-frame point cloud data are classified through a model extracted by the semantic segmentation algorithm, the boundary erosion is adopted to process the classified abnormal points, and a semantic segmentation result is optimized.
Drawings
The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention.
FIG. 1 is a flow chart of a quick semantic map construction method of the present invention;
FIG. 2 is a schematic view of a cylindrical grid point cloud of the present invention;
FIG. 3 is a flow chart of downsampling of the present invention;
FIG. 4 is an upsampling flowchart of the present invention;
FIG. 5 is an asymmetric residual flow chart of the present invention;
FIG. 6 is a flow chart of a multi-modal focus convolution of the present invention.
Detailed Description
The invention will now be described in detail with reference to the drawings and to specific embodiments thereof.
1-6, a rapid semantic map construction method for unstructured roads is disclosed, which comprises the following specific steps:
step 1, constructing a semantic segmentation model;
Step 11, the output frequencies of the images and the point cloud data are made consistent, and the image and the point cloud data at the same moment are acquired, so that the image and the point cloud data are synchronized in time;
preferably, the picture is obtained by a camera and the point cloud data is obtained by a lidar.
Step 12, rasterizing the point cloud data by a cylinder to obtain point cloud grid units and cylinder characteristics of points, and obtaining characteristic tensors of each point cloud grid unit based on the cylinder characteristics of the points and the MLP (multi-layer neural network);
as shown in fig. 2, the point cloud data is divided according to a cylindrical grid from a cartesian coordinate system to obtain a cylindrical grid; converting the points in the cylindrical grid point cloud from the coordinates of a Cartesian coordinate system to cylindrical coordinates;
optionally, the point-wise feature of the point cloud in the cylindrical grid is extracted through the MLP multilayer neural network based on the point-wise target detection method, and the specific steps are as follows:
The point cloud data are divided from the Cartesian coordinate system according to the cylindrical grid to obtain the point cloud grid units, and the cylindrical features of the points in each point cloud grid unit are acquired with the following expressions:

ρ = √(x² + y²)

θ = arctan(y / x)

z' = z

wherein x is the X-axis coordinate of the point in the Cartesian coordinate system; y is the Y-axis coordinate of the point in the Cartesian coordinate system; z is the Z-axis coordinate of the point in the Cartesian coordinate system; ρ is the horizontal distance from the point to the Z-axis of the Cartesian coordinate system; θ is the azimuth angle of the point in the X-Y plane; z' is the height difference between the point and the origin of the Cartesian coordinate system. All points are thus mapped into the cylindrical grid.
Meanwhile, the cylindrical features ρ, θ and z' of all points are input into the MLP (multi-layer neural network) for point-based MLP feature extraction, raising the 3-dimensional point feature of the point cloud to C_in dimensions while maintaining per-point features; the points in each point cloud grid unit are then max-pooled through a max-pooling layer to obtain C_in-dimensional MLP point features and an output feature tensor of size C_in × H × W × L, where H, W and L are respectively the radius, azimuth and height dimensions of the point cloud grid units and C_in is the feature dimension of a point;
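As an illustrative aid (not part of the patent), the following Python sketch shows the cylindrical coordinate conversion, the per-point MLP lifting to C_in dimensions, and the per-grid max pooling described above; the MLP widths, C_in = 64 and the use of PyTorch are assumptions made only for this example.

```python
import torch
import torch.nn as nn

def cartesian_to_cylinder(xyz: torch.Tensor) -> torch.Tensor:
    """Convert N x 3 Cartesian points (x, y, z) to cylindrical features (rho, theta, z')."""
    x, y, z = xyz[:, 0], xyz[:, 1], xyz[:, 2]
    rho = torch.sqrt(x ** 2 + y ** 2)   # horizontal distance to the Z axis
    theta = torch.atan2(y, x)           # azimuth angle in the X-Y plane
    return torch.stack([rho, theta, z], dim=1)

class PointMLP(nn.Module):
    """Lift 3-D cylindrical point features to C_in dimensions (widths are illustrative)."""
    def __init__(self, c_in: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 32), nn.BatchNorm1d(32), nn.ReLU(),
            nn.Linear(32, c_in),
        )

    def forward(self, cyl_feats: torch.Tensor) -> torch.Tensor:
        return self.mlp(cyl_feats)      # N x C_in point features

def max_pool_per_grid(point_feats: torch.Tensor, grid_idx: torch.Tensor, num_grids: int):
    """Max-pool point features into their cylindrical grid cells.
    grid_idx holds, for every point, the flattened index of its grid cell."""
    c = point_feats.shape[1]
    pooled = point_feats.new_full((num_grids, c), float('-inf'))
    pooled = pooled.scatter_reduce(0, grid_idx[:, None].expand(-1, c),
                                   point_feats, reduce='amax', include_self=True)
    pooled[pooled == float('-inf')] = 0.0   # empty grid cells get zero features
    return pooled                            # num_grids x C_in grid features
```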
step 13, downsampling the points of the point cloud grid unit;
Step 131, the feature tensor C_in × H × W × L is input into an asymmetric residual block, where a horizontal three-dimensional convolution and a vertical three-dimensional convolution are applied to the C_in × H × W × L feature tensor: the horizontal branch performs feature extraction with a 3×1×3 convolution kernel followed by a 1×3×3 convolution kernel, and the vertical branch performs feature extraction with a 1×3×3 convolution kernel followed by a 3×1×3 convolution kernel; at the same time the input dimension C_in is doubled, outputting the sparse convolution feature tensor 2C_in × H × W × L; every point in every point cloud grid unit is traversed.
The asymmetric residual block is adopted to extract the features of the points in the cylindrical grid. It enhances the expressive power of the points in the grid and thus the robustness of the network, and, compared with conventional 3D convolution, the asymmetric residual feature extraction reduces the computational cost, as sketched below.
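For illustration only, here is a dense-convolution sketch of such an asymmetric residual block in PyTorch; the real network operates on sparse tensors (e.g. via a sparse convolution library), which is omitted here, and the channel doubling from C_in to 2C_in follows the description above.

```python
import torch.nn as nn

class AsymmetricResidualBlock(nn.Module):
    """Horizontal (3x1x3 then 1x3x3) and vertical (1x3x3 then 3x1x3) branches,
    summed as a residual, doubling the channel count (dense stand-in for sparse conv)."""
    def __init__(self, c_in: int):
        super().__init__()
        c_out = 2 * c_in
        self.horizontal = nn.Sequential(
            nn.Conv3d(c_in, c_out, (3, 1, 3), padding=(1, 0, 1)), nn.ReLU(),
            nn.Conv3d(c_out, c_out, (1, 3, 3), padding=(0, 1, 1)),
        )
        self.vertical = nn.Sequential(
            nn.Conv3d(c_in, c_out, (1, 3, 3), padding=(0, 1, 1)), nn.ReLU(),
            nn.Conv3d(c_out, c_out, (3, 1, 3), padding=(1, 0, 1)),
        )

    def forward(self, x):   # x: B x C_in x H x W x L
        return nn.functional.relu(self.horizontal(x) + self.vertical(x))
```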
Step 132, multi-modal focus convolution downsampling:
A sub-manifold dilated (hole) convolution is performed on each cylindrical grid: the convolution is evaluated at a kernel center position p only when p coincides with a non-empty input position in P_in of the cylindrical grid. The sub-manifold dilated convolution of each cylindrical grid, Y_p, is expressed as

Y_p = Σ_{k ∈ K^d} w_k · x_{p+k},

wherein x_p is the input sparse convolution feature at the convolution kernel center position p, x_p = (2C_in, H_p, W_p, L_p); w_k is the weight at position k in the convolution kernel; d is the spatial dimension, preferably 3; K^d is the set of kernel offsets of the sub-manifold dilated convolution, preferably with kernel size K = 3 so that K^d contains 3³ offsets;
Performing the sub-manifold dilated convolution at each cylindrical grid position, the output shape of each cylindrical grid position is

P_out = P_in,

i.e. the output positions coincide with the non-empty input positions, whereas a regular sparse convolution would expand the output shape to

∪_{p ∈ P_in} P(p, K^d),

wherein P(p, K^d) = {p + k | k ∈ K^d}.
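To make the difference between these output shapes concrete, the following pure-Python sketch (illustrative only) enumerates the active output positions of a regular sparse convolution versus a sub-manifold convolution over a set of non-empty grid indices.

```python
from itertools import product

def kernel_offsets(k=3, dilation=1, dims=3):
    """All K^d offsets of a (dilated) k x k x k kernel centred at the origin."""
    r = [(i - k // 2) * dilation for i in range(k)]
    return list(product(r, repeat=dims))

def sparse_conv_out_positions(p_in, k=3, dilation=1):
    """Regular sparse conv: P_out = union over p in P_in of {p + k | k in K^d}."""
    offsets = kernel_offsets(k, dilation)
    return {tuple(pi + ki for pi, ki in zip(p, off)) for p in p_in for off in offsets}

def submanifold_out_positions(p_in):
    """Sub-manifold conv: output positions coincide with the non-empty inputs."""
    return set(p_in)

p_in = {(0, 0, 0), (1, 0, 0), (5, 5, 5)}
print(len(sparse_conv_out_positions(p_in)), len(submanifold_out_positions(p_in)))
```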
Selecting importance of adjacent points:
converting the picture into a coordinate system of point cloud data through a conversion matrix, projecting RGB pixel point information of the picture onto the point cloud data in each point cloud grid unit, and assigning RGB pixel point information to each point in each point cloud grid unit to obtain an RGB point cloud grid unit; thereby spatially synchronizing the image data and the point cloud data;
Importance map I_p: sparse convolution feature extraction is carried out on the points of the RGB point cloud grid units, importance probability calculation is carried out through a sigmoid function, and the cubic importance map I_p^k at position k in the convolution kernel and its center I_p^0 are obtained; the sparse convolution kernel offsets are the same K^d as in the sub-manifold dilated convolution.

The importance map I_p combines the input sparse convolution feature at the convolution kernel center position p (a 2C_in × H × W × L tensor) with the RGB image information, and therefore measures the importance of the surrounding candidate grid outputs. In this way, regular sparse convolution and sub-manifold sparse (dilated) convolution can be balanced, reducing the amount of computation while ensuring that connectivity information is not lost.
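A minimal sketch of how such an importance map could be predicted is given below (PyTorch, dense stand-in, illustrative channel counts); the 3×3×3 kernel volume of 27 offsets and the threshold τ = 0.5 are assumptions taken from the description above, not the patent's exact implementation.

```python
import torch
import torch.nn as nn

class ImportancePredictor(nn.Module):
    """Predict a cubic importance map: one probability per kernel offset (27 for a
    3x3x3 kernel) for every grid cell, from point features concatenated with RGB."""
    def __init__(self, c_in: int, kernel_volume: int = 27):
        super().__init__()
        # point feature channels (2*C_in) plus 3 RGB channels per grid cell
        self.conv = nn.Conv3d(2 * c_in + 3, kernel_volume, kernel_size=3, padding=1)

    def forward(self, grid_feats, grid_rgb):
        # grid_feats: B x 2C_in x H x W x L, grid_rgb: B x 3 x H x W x L
        x = torch.cat([grid_feats, grid_rgb], dim=1)
        imp = torch.sigmoid(self.conv(x))   # B x 27 x H x W x L, values in (0, 1)
        imp_center = imp[:, 13]             # offset (0, 0, 0) is index 13 of the 27 offsets
        return imp, imp_center

# important grids: cells whose centre importance exceeds the threshold tau (assumed 0.5)
# tau = 0.5; important_mask = imp_center >= tau
```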
Selecting an important grid:

When the center I_p^0 of the cubic importance map output by an RGB point cloud grid unit is greater than or equal to the importance threshold τ, that RGB point cloud grid unit is an important grid, and the convolution kernel positions of the important grids are obtained as

P_im = {p | I_p^0 ≥ τ, p ∈ P_in},

wherein P_im is the set of convolution kernel positions of the important grids, P̃_out^p denotes the output shape of the convolution kernel of an important grid, and the corresponding convolution kernel weights are generated for the important grids.
Grid importance includes important, relatively important, and unimportant grids: an unimportant grid is a point cloud grid unit containing no point cloud data; a relatively important grid contains point cloud data whose extracted image RGB features correspond to background point cloud; an important grid contains point cloud data whose extracted RGB features correspond to target point cloud.
The dynamic output shape P̃_out^p generated by the convolution kernel of a selected important grid is determined by the cubic importance map I_p^k:

P̃_out^p = {p + k | I_p^k ≥ τ, k ∈ K^d}.

Finally, the dynamic convolution output positions p_out of the convolution kernel are generated based on the output shape of each cylindrical grid position and the convolution kernel positions of the important grids:

p_out = P_out ∪ (∪_{p ∈ P_im} P̃_out^p).

A convolution with a step length of 2 is performed at the dynamic convolution output positions p_out, obtaining the output features at p_out as a 2C_in × H/2 × W/2 × L/2 tensor. Four downsampling operations are performed in sequence, giving a first downsampled output feature tensor of 2C_in × H/2 × W/2 × L/2, a second of 4C_in × H/4 × W/4 × L/4, a third of 8C_in × H/8 × W/8 × L/8, and a fourth of 16C_in × H/16 × W/16 × L/16.
Step 14, up-sampling the points of the point cloud grid unit;
Step 141, multi-modal focus convolution upsampling:

The fourth downsampled output feature tensor 16C_in × H/16 × W/16 × L/16 is input, multi-modal focus convolution upsampling is performed with the same sub-manifold dilated convolution as in step 132, and the final deconvolution shape P_out is obtained based on the cubic importance maps produced by the RGB point cloud grid units; a deconvolution with a step length of 2 is then performed over the deconvolution shape P_out, outputting the deconvoluted feature tensor 8C_in' × H/8 × W/8 × L/8;

Step 142, the third downsampled output feature tensor 8C_in × H/8 × W/8 × L/8 and the deconvoluted feature tensor 8C_in' × H/8 × W/8 × L/8 are spliced to obtain the spliced feature tensor 16C_in × H/8 × W/8 × L/8; the spliced feature tensor is input into an asymmetric residual block, where a horizontal three-dimensional convolution and a vertical three-dimensional convolution are applied: the horizontal branch uses a 3×1×3 convolution kernel followed by a 1×3×3 convolution kernel, and the vertical branch uses a 1×3×3 convolution kernel followed by a 3×1×3 convolution kernel; at the same time the input dimension is reduced, outputting the upsampled feature tensor 8C_in × H/8 × W/8 × L/8, which constitutes the upsampled output grid features;

The upsampled output grid features are upsampled three times to obtain the downsampling-and-upsampling superposition feature tensor of each point, of size C_in × H × W × L.
By stacking the downsampling blocks and the upsampling blocks, an asymmetric three-dimensional multi-modal focus convolution network is established.
The invention adopts asymmetric residual blocks to strengthen the horizontal and vertical kernels, thereby enhancing the robustness of the cylindrical grid.
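For illustration, a dense stand-in for one upsampling stage (stride-2 deconvolution, concatenation with the matching downsampling feature, and a channel-reducing convolution in place of the asymmetric residual block) might look as follows; all channel counts and the PyTorch API are assumptions for this sketch only.

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """Dense stand-in for one multi-modal focus upsampling stage: stride-2 deconvolution,
    concatenation with the matching downsampling feature, then channel reduction."""
    def __init__(self, c_deep: int, c_skip: int):
        super().__init__()
        self.deconv = nn.ConvTranspose3d(c_deep, c_skip, kernel_size=2, stride=2)
        # a plain convolution here replaces the asymmetric residual block for brevity
        self.reduce = nn.Conv3d(2 * c_skip, c_skip, kernel_size=3, padding=1)

    def forward(self, deep_feat, skip_feat):
        up = self.deconv(deep_feat)                 # e.g. 16C_in @ H/16 -> 8C_in @ H/8
        fused = torch.cat([skip_feat, up], dim=1)   # spliced feature, 16C_in channels
        return torch.relu(self.reduce(fused))       # back to 8C_in upsampled grid features

# usage sketch with illustrative sizes
c_in = 4
up = UpBlock(16 * c_in, 8 * c_in)
x4 = torch.randn(1, 16 * c_in, 2, 2, 2)   # fourth downsampled feature
x3 = torch.randn(1, 8 * c_in, 4, 4, 4)    # third downsampled feature
print(up(x4, x3).shape)                    # torch.Size([1, 32, 4, 4, 4])
```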
Step 15, refining the downsampled and upsampled superposition characteristics of each point by using the MLP point characteristics of each point obtained in the step 12; acquiring the category probability of each refined point by using a loss function, and taking the highest category probability of each point as the point label of the point; the Point-wise refinement module is used for reclassifying the Point labels, so that the problem of information loss caused by label division errors when the labels are divided into cylindrical grids is solved.
Specifically, the spot labels include obstacle labels, ground labels, and retaining wall labels.
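Following the point-wise refinement described in step 15, a minimal sketch (PyTorch, illustrative dimensions) is given below: the per-point feature gathered from its grid cell is fused with the initial MLP point feature and passed through another MLP to produce per-point class scores; the three classes correspond to the obstacle, ground and retaining wall labels above.

```python
import torch
import torch.nn as nn

class PointRefinement(nn.Module):
    """Fuse grid features gathered back to points with the initial point features,
    then classify each point (3 classes assumed: obstacle, ground, retaining wall)."""
    def __init__(self, c_in: int = 64, num_classes: int = 3):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Linear(2 * c_in, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, grid_feats_flat, grid_idx, point_feats):
        # grid_feats_flat: num_grids x C_in (down/up-sampling superposition features)
        # grid_idx: N point-to-grid indices, point_feats: N x C_in initial MLP features
        per_point_grid = grid_feats_flat[grid_idx]             # gather grid feature per point
        fused = torch.cat([per_point_grid, point_feats], dim=1)
        logits = self.refine(fused)                             # N x num_classes
        return logits.argmax(dim=1)                             # point label = most probable class
```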
Step 2, calibrating a GNSS/IMU integrated navigation positioning system and a laser radar of the unmanned vehicle;
acquiring an initial rotation translation matrix of a GNSS/IMU combined navigation positioning system and a laser radar of the unmanned vehicle; the rotation translation matrix is the initial deviation and the initial attitude angle of the laser radar coordinate system and the GPS vehicle body coordinate system in the XYZ direction, wherein the attitude angle comprises a yaw angle, a roll angle and a pitch angle;
the laser radar is arranged at the front and/or rear of the automatic driving vehicle, and the GNSS/IMU integrated navigation positioning system is mounted on the vehicle body; the measurement deviations in the XYZ directions between the laser radar position and the position of the vehicle body center are measured. A checkerboard calibration plate is placed 10 meters in front of the vehicle body on the center line of its direction of travel and perpendicular to that center line; the point features of the checkerboard corner points are extracted, the plane of the checkerboard is fitted from the extracted corner-point features, the measured attitude angles between the GPS vehicle-body coordinate system and the laser radar coordinate system are obtained, and the rotation-translation conversion relation between the GPS vehicle-body coordinate system and the laser radar coordinate system of the unmanned vehicle is determined from the measured XYZ position deviations and the measured attitude angles;
and unifying a coordinate system of the GNSS/IMU integrated navigation positioning system of the unmanned vehicle and a laser radar based on the rotation translation conversion relation obtained through calibration.
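An illustrative numpy sketch of the calibration-board step is given below (not the patent's calibration code): the plane of the board is fitted to the extracted corner points by SVD and the board normal is used to estimate relative attitude angles; the corner coordinates and the assumption that +X is the vehicle's forward axis are hypothetical.

```python
import numpy as np

def fit_plane(points: np.ndarray):
    """Least-squares plane fit to N x 3 corner points; returns (normal, centroid)."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                   # singular vector of the smallest singular value
    if normal[0] < 0:                 # orient the normal towards the sensor (assumed +X forward)
        normal = -normal
    return normal, centroid

def board_yaw_pitch(normal: np.ndarray):
    """Yaw and pitch of the board normal relative to the forward (+X) axis."""
    yaw = np.arctan2(normal[1], normal[0])
    pitch = np.arctan2(-normal[2], np.hypot(normal[0], normal[1]))
    return np.degrees(yaw), np.degrees(pitch)

# hypothetical checkerboard corner points extracted in the lidar frame (N x 3)
corners_lidar = np.array([[10.0, -0.5, 1.2], [10.0, 0.5, 1.2],
                          [10.0, -0.5, 0.2], [10.0, 0.5, 0.2]])
n, c = fit_plane(corners_lidar)
print(board_yaw_pitch(n))   # angular deviation of the board normal seen from the lidar
```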
Step 3, constructing a real-time semantic map
The calibrated GNSS/IMU integrated navigation positioning system and the laser radar of the unmanned vehicle are used to respectively acquire the real-time GPS-IMU data and the real-time lidar point cloud data of the unmanned vehicle;

Edge features and plane features of each frame of input real-time lidar point cloud data are extracted according to roughness, and the edge features and plane features of each frame of real-time lidar point cloud data are saved.

Optionally, feature extraction is performed on each frame of real-time lidar point cloud data using a lidar odometry module.

Using the semantic segmentation model constructed in step 1, the labels of the points in each frame of acquired real-time point cloud data form a key-frame semantic point cloud; other data between the key-frame semantic point clouds are deleted, and estimation is carried out using the key-frame semantic point clouds;

Real-time position and pose information of the current unmanned vehicle is acquired through the real-time GPS-IMU data, and the semantic point clouds of all key frames are converted into a world coordinate system by using the initial rotation-translation matrix; the NDT feature-matching effect is used to judge whether to optimize the parameters of the initial rotation-translation matrix: if the NDT matching score between two frames of semantic point clouds is smaller than a threshold value, matching among a plurality of key-frame semantic point clouds is carried out by using the edge features and plane features, so as to optimize the parameters of the initial rotation-translation matrix.
All the key-frame semantic point clouds are superposed to obtain the semantic map, and the semantic map is downsampled to generate the semantic RGB map, as sketched below.
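A sketch of this map assembly step is shown below using numpy and the Open3D library as a stand-in for the downsampling (an assumption, not the patent's implementation): each key-frame semantic point cloud is transformed into the world frame with its GPS-IMU pose, the clouds are stacked, and the result is voxel-downsampled into the semantic RGB map.

```python
import numpy as np
import open3d as o3d

def to_world(points_xyz: np.ndarray, T_world_lidar: np.ndarray) -> np.ndarray:
    """Apply a 4x4 rotation-translation matrix to N x 3 lidar points."""
    homog = np.hstack([points_xyz, np.ones((points_xyz.shape[0], 1))])
    return (T_world_lidar @ homog.T).T[:, :3]

def build_semantic_map(keyframes, voxel_size=0.2):
    """keyframes: list of (points N x 3, colors N x 3 in [0, 1], pose 4x4)."""
    all_pts, all_rgb = [], []
    for pts, rgb, pose in keyframes:
        all_pts.append(to_world(pts, pose))
        all_rgb.append(rgb)                      # per-point semantic color (label encoded as RGB)
    cloud = o3d.geometry.PointCloud()
    cloud.points = o3d.utility.Vector3dVector(np.vstack(all_pts))
    cloud.colors = o3d.utility.Vector3dVector(np.vstack(all_rgb))
    return cloud.voxel_down_sample(voxel_size)   # downsampled semantic RGB map
```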
Step 4, correcting the semantic RGB map:
When the semantic RGB map obtained by the above method contains falsely detected dynamic obstacles or the mIoU is low, the dynamic obstacles in the optimized single-frame semantic RGB map are restored by a boundary erosion algorithm, with the following specific steps:
step 41, optimizing obstacle labels in a semantic RGB map:
For each point labeled as an obstacle in the semantic RGB map, the labels of all points within a radius R around it are searched through a KD-tree, and those points are set to the obstacle label, except for points labeled as ground; after this reset, the N original labels of the points surrounding each reset point are searched through KNN, and the most frequent original label category is counted and assigned as the new label of that reset point; all the points after label updating are then assembled into the obstacle semantic map, as sketched below.
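The sketch below illustrates the two searches with scipy's cKDTree (an assumed stand-in for the KD-tree/KNN implementation); the radius R = 0.5 m and neighbor count N = 8 are illustrative values only.

```python
import numpy as np
from scipy.spatial import cKDTree

OBSTACLE, GROUND, WALL = 0, 1, 2

def boundary_erosion(points, labels, radius=0.5, k=8):
    """points: N x 3, labels: N original semantic labels. Returns updated labels."""
    labels = labels.copy()
    original = labels.copy()
    tree = cKDTree(points)

    # Step 1: every non-ground point within radius R of an obstacle point becomes an obstacle
    obstacle_idx = np.flatnonzero(original == OBSTACLE)
    for nbrs in tree.query_ball_point(points[obstacle_idx], r=radius):
        nbrs = np.asarray(nbrs)
        labels[nbrs[original[nbrs] != GROUND]] = OBSTACLE

    # Step 2: for each point reset to obstacle, vote among the k original labels around it
    reset_idx = np.flatnonzero((labels == OBSTACLE) & (original != OBSTACLE))
    if reset_idx.size:
        _, knn = tree.query(points[reset_idx], k=k)
        for i, nbrs in zip(reset_idx, knn):
            vals, counts = np.unique(original[nbrs], return_counts=True)
            labels[i] = vals[np.argmax(counts)]   # majority original label wins
    return labels
```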
Step 42, filtering dynamic obstacle to obtain output semantic map
Inputting a current frame obstacle semantic map and a previous frame obstacle semantic map;
if the labels of the pixel points at the same position in the two adjacent frames of obstacle semantic maps are different, the two adjacent frames of obstacle semantic maps are assumed to have dynamic obstacles;
the probability of dynamic barriers in the assumed adjacent two-frame barrier semantic map is judged by a recursive Bayesian filter, and the specific judging method comprises the following steps:
adding the penalty term odds(p_penalty), the stability probability l_s^(t) of the dynamic obstacle at the current moment is obtained:

l_s^(t) = l_s^(t-1) + odds(p_stable · exp(−α²/σ_α²) · exp(−d²/σ_d²)) − odds(p_prior) + odds(p_penalty), with odds(p) = log(p / (1 − p)),

wherein l_s^(t-1) is the stability probability of the dynamic obstacle at the previous moment; p_stable and p_prior are the known stable map surface element probability and the known map surface element prior probability, respectively; p_penalty is the penalty term for point cloud noise; α is the included angle between the normal of the distance image and the normal of the depth image, and d is the difference between the obstacle semantic map and the distance image; σ_α² is the noise of α; σ_d² is the noise of d.
According to the stability probability l_s^(t) of the dynamic obstacle at the current moment, whether a dynamic obstacle exists in the current-frame obstacle semantic map is judged against the set stability threshold l_τ: if

l_s^(t) < l_τ,

then a dynamic obstacle exists, and the corresponding obstacle points are deleted; if

l_s^(t) ≥ l_τ,

then there is no dynamic obstacle, and the obstacle semantic map of the frame, containing no dynamic obstacle, is used to construct the output semantic map.
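A numpy sketch of this recursive update is given below. Because the original equation images are not available, the exact form of the update (in particular the sign of the penalty term) is an assumption reconstructed from the variables listed above, and all parameter values are illustrative.

```python
import numpy as np

def odds(p):
    """Log-odds of a probability."""
    return np.log(p / (1.0 - p))

def update_stability(l_prev, alpha, d,
                     p_stable=0.6, p_prior=0.5, p_penalty=0.45,
                     sigma_alpha=np.radians(30), sigma_d=0.3):
    """One recursive-Bayes step for the map-element stability.
    alpha: angle between distance-image and depth-image normals, d: map-vs-image residual."""
    measurement = p_stable * np.exp(-alpha**2 / sigma_alpha**2) * np.exp(-d**2 / sigma_d**2)
    # odds(p_penalty) is negative for p_penalty < 0.5, so adding it penalizes stability (assumed form)
    return l_prev + odds(measurement) - odds(p_prior) + odds(p_penalty)

l_tau = 0.5          # assumed stability threshold
l = 0.0
for alpha, d in [(0.05, 0.02), (0.6, 0.5), (0.7, 0.6)]:
    l = update_stability(l, alpha, d)
is_dynamic = l < l_tau   # below threshold: treat the points as a dynamic obstacle and delete them
print(l, is_dynamic)
```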
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.

Claims (6)

1. The rapid semantic map construction method for unstructured roads is characterized by comprising the following specific steps of:
step 1, constructing a semantic segmentation model;
step 11, obtaining pictures and point cloud data at the same moment;
step 12, rasterizing the point cloud data by a cylinder to obtain point cloud grid units and cylinder characteristics of points, and obtaining tensors of the characteristics of each point cloud grid unit based on the cylinder characteristics of the points and the MLP (multi-layer neural network);
step 13, downsampling the points of the point cloud grid unit;
step 131, inputting tensors of the characteristics of each point cloud grid unit into an asymmetric residual block; outputting tensors of the sparse convolution characteristics;
step 132, multi-modal focus convolution downsampling:
carrying out sub-manifold cavity convolution on each cylindrical grid to obtain an output shape of each cylindrical grid position;
projecting RGB pixel point information of the picture onto point cloud data in each point cloud grid unit, and assigning RGB pixel point information to each point in each point cloud grid unit to obtain an RGB point cloud grid unit;
sparse convolution feature extraction is carried out on the points of the RGB point cloud grid units, importance probability calculation is carried out through a sigmoid function, and the cubic importance map I_p^k at position k in the convolution kernel and its center I_p^0 are obtained;
selecting the important grids based on the cubic importance map I_p^k at position k in the convolution kernel and its center I_p^0;
generating the dynamic convolution output positions p_out of the convolution kernel based on the output shape of each cylindrical grid position and the convolution kernel positions of the important grids;
performing the multi-modal focus convolution at the dynamic convolution output positions p_out to obtain the tensor of downsampled output features;
step 14, up-sampling the points of the point cloud grid unit;
step 141, multi-modal focus convolution upsampling:
inputting the tensor of the downsampled output features, and performing multi-modal focus deconvolution based on the RGB point cloud grid units to obtain the tensor of the deconvoluted features;
splicing the tensor of the downsampled output features with the tensor of the deconvoluted features to obtain the tensor of the spliced features; inputting tensors of the splicing characteristics into an asymmetric residual block; obtaining up-sampling output grid characteristics;
step 142, up-sampling the obtained up-sampling output grid characteristics to obtain tensors of down-sampling and up-sampling superposition characteristics of each point;
step 15, refining tensors of the downsampled and upsampled superposition characteristics of each point by using the MLP point characteristics of each point obtained in the step 12; acquiring the category probability of each refined point by using a loss function, and taking the highest category probability of each point as the point label of the point;
step 2, constructing a real-time semantic map;
acquiring real-time GPS-IMU data and real-time lidar point cloud data of the unmanned vehicle;
using the semantic segmentation model constructed in the step 1 to form a key frame semantic point cloud for the label of each point in the obtained real-time point cloud data of each frame;
and superposing all the key frame semantic point clouds to obtain a semantic map, and downsampling the semantic map to generate a semantic RGB map.
2. The rapid semantic map construction method of claim 1, wherein the point tags comprise obstacle tags, ground tags, and retaining wall tags.
3. The method for constructing a quick semantic map according to claim 2, further comprising a step 3 of correcting the semantic RGB map, specifically comprising the following steps:
step 31, for each point labeled as an obstacle in the semantic RGB map, searching the labels of all points within a radius R around it through a KD-tree, and resetting those points to the obstacle label, except for points labeled as ground; then, for each point reset to the obstacle label, searching the N original labels of its surrounding points through KNN, and counting the most frequent original label category as the updated label of that reset point; forming all the points after label updating into an obstacle semantic map;
step 32, filtering dynamic obstacles to obtain an output semantic map;
inputting a current frame obstacle semantic map and a previous frame obstacle semantic map;
if the labels of the pixel points at the same position in the two adjacent frames of obstacle semantic maps are different, the two adjacent frames of obstacle semantic maps are assumed to have dynamic obstacles;
adding the penalty term odds(p_penalty) to the stability term in a recursive Bayesian filter, the stability probability l_s^(t) of the dynamic obstacle at the current moment is obtained:

l_s^(t) = l_s^(t-1) + odds(p_stable · exp(−α²/σ_α²) · exp(−d²/σ_d²)) − odds(p_prior) + odds(p_penalty), with odds(p) = log(p / (1 − p)),

wherein l_s^(t-1) is the stability probability of the dynamic obstacle at the previous moment; p_stable and p_prior are the known stable map surface element probability and the known map surface element prior probability, respectively; p_penalty is the penalty term for point cloud noise; α is the included angle between the normal of the distance image and the normal of the depth image, and d is the difference between the obstacle semantic map and the distance image; σ_α² is the noise of α; σ_d² is the noise of d;

according to the stability probability l_s^(t) of the dynamic obstacle at the current moment, whether a dynamic obstacle exists in the current-frame obstacle semantic map is judged against the set stability threshold l_τ: if

l_s^(t) ≥ l_τ,

then there is no dynamic obstacle, and the obstacle semantic map of the frame, containing no dynamic obstacle, is used to construct the output semantic map.
4. A rapid semantic map construction method according to any one of claims 1-3, wherein the real-time GPS-IMU data and the real-time lidar point cloud data of the unmanned vehicle are acquired using the calibrated GNSS/IMU integrated navigation positioning system and lidar of the unmanned vehicle when constructing the real-time semantic map.
5. The method for constructing the quick semantic map according to claim 4, wherein the calibration method of the GNSS/IMU combined navigation positioning system and the laser radar of the unmanned vehicle is as follows:
acquiring an initial rotation translation matrix of a GNSS/IMU combined navigation positioning system and a laser radar of the unmanned vehicle; the rotation translation matrix is the initial deviation and the initial attitude angle of the laser radar coordinate system and the GPS vehicle body coordinate system in the XYZ direction, wherein the attitude angle comprises a yaw angle, a roll angle and a pitch angle;
the laser radar is arranged at the front and/or rear of the automatic driving vehicle, and the GNSS/IMU integrated navigation positioning system is mounted on the vehicle body; the measurement deviations in the XYZ directions between the laser radar position and the position of the vehicle body center are measured. A checkerboard calibration plate is placed in front of the vehicle body on the center line of its direction of travel and perpendicular to that center line; the point features of the calibration plate corner points are extracted, the plane of the calibration plate is fitted from the extracted corner-point features, the measured attitude angles between the GPS vehicle-body coordinate system and the laser radar coordinate system are obtained, and the rotation-translation conversion relation between the GPS vehicle-body coordinate system and the laser radar coordinate system of the unmanned vehicle is determined from the measured XYZ position deviations and the measured attitude angles;
and calibrating an initial rotation translation matrix of the GNSS/IMU integrated navigation positioning system and the laser radar of the unmanned vehicle based on the rotation translation conversion relation.
6. The rapid semantic map construction method according to claim 5, wherein edge features and plane features of each frame of input real-time lidar point cloud data are extracted; real-time position and pose information of the current unmanned vehicle is acquired through the real-time GPS-IMU data, and the semantic point clouds of all key frames are converted into a world coordinate system by using the initial rotation-translation matrix; if the NDT matching score between two frames of semantic point clouds is smaller than a threshold value, matching among a plurality of key-frame semantic point clouds is carried out by using the edge features and plane features, and the parameters of the initial rotation-translation matrix are optimized.
CN202310012841.3A 2023-01-05 2023-01-05 Rapid semantic map construction method for unstructured road Pending CN116246033A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310012841.3A CN116246033A (en) 2023-01-05 2023-01-05 Rapid semantic map construction method for unstructured road

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310012841.3A CN116246033A (en) 2023-01-05 2023-01-05 Rapid semantic map construction method for unstructured road

Publications (1)

Publication Number Publication Date
CN116246033A true CN116246033A (en) 2023-06-09

Family

ID=86632257

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310012841.3A Pending CN116246033A (en) 2023-01-05 2023-01-05 Rapid semantic map construction method for unstructured road

Country Status (1)

Country Link
CN (1) CN116246033A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117685954A (en) * 2024-02-01 2024-03-12 中国科学院自动化研究所 Multi-mode semantic map construction system and method for mining area
CN117685954B (en) * 2024-02-01 2024-05-24 中国科学院自动化研究所 Multi-mode semantic map construction system and method for mining area

Similar Documents

Publication Publication Date Title
CN109766878B (en) A kind of method and apparatus of lane detection
CN111798475B (en) Indoor environment 3D semantic map construction method based on point cloud deep learning
CN110163930B (en) Lane line generation method, device, equipment, system and readable storage medium
CN109740604B (en) A kind of method and apparatus of running region detection
CN112613378B (en) 3D target detection method, system, medium and terminal
CN113819890A (en) Distance measuring method, distance measuring device, electronic equipment and storage medium
CN113761999B (en) Target detection method and device, electronic equipment and storage medium
US20230005278A1 (en) Lane extraction method using projection transformation of three-dimensional point cloud map
WO2021056516A1 (en) Method and device for target detection, and movable platform
WO2024012211A1 (en) Autonomous-driving environmental perception method, medium and vehicle
CN114325634A (en) Method for extracting passable area in high-robustness field environment based on laser radar
CN116258826A (en) Semantic map construction and boundary real-time extraction method for open-air mining area
CN115100741B (en) Point cloud pedestrian distance risk detection method, system, equipment and medium
CN116597264A (en) Three-dimensional point cloud target detection method integrating two-dimensional image semantics
CN113744315A (en) Semi-direct vision odometer based on binocular vision
CN116643291A (en) SLAM method for removing dynamic targets by combining vision and laser radar
CN116246033A (en) Rapid semantic map construction method for unstructured road
CN114118247A (en) Anchor-frame-free 3D target detection method based on multi-sensor fusion
CN115147798A (en) Method, model and device for predicting travelable area and vehicle
CN116778262B (en) Three-dimensional target detection method and system based on virtual point cloud
Berrio et al. Fusing lidar and semantic image information in octree maps
CN117173399A (en) Traffic target detection method and system of cross-modal cross-attention mechanism
KR102540624B1 (en) Method for create map using aviation lidar and computer program recorded on record-medium for executing method therefor
Huang et al. A coarse-to-fine LiDar-based SLAM with dynamic object removal in dense urban areas
Endo et al. High definition map aided object detection for autonomous driving in urban areas

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: Room 303, Zone D, Main Building of Beihang Hefei Science City Innovation Research Institute, No. 999 Weiwu Road, Xinzhan District, Hefei City, Anhui Province, 230012

Applicant after: Tage Zhixing Technology Co.,Ltd.

Address before: 100176 901, 9th floor, building 2, yard 10, KEGU 1st Street, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Applicant before: BEIJING TAGE IDRIVER TECHNOLOGY CO.,LTD.

Country or region before: China

CB02 Change of applicant information