CN116385452A - LiDAR point cloud panorama segmentation method based on polar coordinate BEV graph - Google Patents
LiDAR point cloud panorama segmentation method based on polar coordinate BEV graph
- Publication number
- CN116385452A CN116385452A CN202310273933.7A CN202310273933A CN116385452A CN 116385452 A CN116385452 A CN 116385452A CN 202310273933 A CN202310273933 A CN 202310273933A CN 116385452 A CN116385452 A CN 116385452A
- Authority
- CN
- China
- Prior art keywords
- bev
- segmentation
- point cloud
- instance
- semantic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T7/00—Image analysis; G06T7/10—Segmentation; Edge detection
- G06N3/02—Neural networks; G06N3/045—Combinations of networks; G06N3/0455—Auto-encoder networks; Encoder-decoder networks; G06N3/0464—Convolutional networks [CNN, ConvNet]; G06N3/048—Activation functions
- G06T5/50—Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
- G06V10/26—Segmentation of patterns in the image field; G06V10/82—Image or video recognition or understanding using neural networks
- G06T2207/10028—Range image; Depth image; 3D point clouds; G06T2207/20084—Artificial neural networks [ANN]; G06T2207/20221—Image fusion; Image merging
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change
Abstract
The invention provides a LiDAR point cloud panorama segmentation method based on a polar coordinate BEV map, comprising the steps of polar coordinate BEV encoding, semantic/instance segmentation prediction, and point cloud panorama segmentation fusion. The polar coordinate BEV encoding encodes the original point cloud data into a fixed-size 2D BEV representation under polar coordinates; the semantic/instance segmentation prediction feeds the encoded point cloud feature matrix through a reference depth network to generate an independent 3D semantic prediction, a 2D BEV center heat map, and a 2D instance center offset; the point cloud panorama segmentation fusion first generates a 2D BEV things mask from the 3D semantic segmentation result, combines it with the 2D BEV center heat map and the instance center offsets to form class-agnostic instance clusters, and merges these with the 3D semantic segmentation prediction to form the final panorama segmentation result.
Description
Technical Field
The invention relates to the technical field of digital image processing, in particular to a LiDAR point cloud panorama segmentation method based on a polar coordinate BEV graph.
Background
Image segmentation for video analysis plays an important role in research fields such as smart cities, medical care, computer vision, and remote sensing. Panorama (panoptic) segmentation fuses semantic segmentation and instance segmentation; it helps obtain finer knowledge of image scenes in applications such as video surveillance, crowd counting, autonomous driving, and medical image analysis, and a deeper understanding of general scenes. With the introduction of LiDAR point cloud datasets, the nature of 3D data, real-time processing requirements, and the level of accuracy required for safety and security (e.g., in an autonomous car) present new challenges to panorama segmentation. The goal is to effectively resolve panorama segmentation with minimal prediction conflicts (instances and classes) and achieve real-time or near real-time speeds without sacrificing accuracy.
Some researchers have explored indoor point cloud panorama segmentation methods that combine instance segmentation and semantic segmentation. Liu et al, in the paper "Self-prediction for joint instance and semantic segmentation of point clouds" (In ECCV, 2020), proposed using a discriminative loss to learn an embedded feature space in which to cluster instances; Zhou et al, in the paper "Joint 3d instance segmentation and object detection for autonomous driving" (In CVPR, 2020), proposed extracting instance partitions from region proposals for semantic partition clusters; Hurtado et al, in the paper "MOPT: Multi-object panoptic tracking" (In CVPR Workshops, 2020), proposed the MOPT model, appending a semantic head to Mask R-CNN to generate panoramic segmentations on range images; Milioto et al, in the paper "Lidar panoptic segmentation for autonomous driving" (In IROS, 2020), proposed first resolving LiDAR point cloud panorama segmentation on the range image and then restoring it to the point cloud level by trilinear upsampling; Zhou et al, in the paper "Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation" (In CVPR, 2021), proposed a fast, robust LiDAR point cloud panorama segmentation framework (Panoptic-PolarNet) using a polar Bird's Eye View (BEV) representation that learns semantic segmentation and class-agnostic instance clustering in a single inference network, which can circumvent the occlusion problem between instances in urban street scenes, together with an instance augmentation technique and a novel adversarial point cloud pruning method to improve the network's learning ability.
The invention patent application with the application number CN113379748A discloses a point cloud panorama segmentation method and device, comprising the following steps: a point cloud mapping step, projecting the acquired point cloud to a world coordinate system to obtain a mapping point cloud; a video frame association step, projecting each point of the mapping point cloud into a projectable video frame; and a panorama segmentation step, performing panorama segmentation on the projectable video frames so as to uniformly number the semantic identification probability of each point. The disadvantage of this method is that, although panorama segmentation is possible, the segmentation speed is relatively slow and the accuracy is not high.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a LiDAR point cloud panorama segmentation method based on a polar coordinate BEV map, comprising the steps of polar coordinate BEV encoding, semantic/instance segmentation prediction, and point cloud panorama segmentation fusion. The polar coordinate BEV encoding encodes the original point cloud data into a fixed-size 2D BEV representation under polar coordinates; the semantic/instance segmentation prediction feeds the encoded point cloud feature matrix through a reference depth network to generate an independent 3D semantic prediction, a 2D BEV center heat map, and a 2D instance center offset; the point cloud panorama segmentation fusion first generates a 2D BEV things mask from the 3D semantic segmentation result, combines it with the 2D BEV center heat map and the instance center offsets to form class-agnostic instance clusters, and merges these with the 3D semantic segmentation prediction to form the final panorama segmentation result. The invention can improve the accuracy and robustness of panorama segmentation and realize a real-time or near real-time segmentation speed.
The invention provides a LiDAR point cloud panorama segmentation method based on a polar coordinate BEV graph, which comprises the steps of obtaining original point cloud data containing points with random sizes, and further comprises the following steps:
step 1: performing polar coordinate BEV coding on the original point cloud data;
step 2: given a LiDAR point cloud space, carrying out semantic/instance segmentation prediction on the BEV codes with fixed sizes;
step 3: and carrying out panorama segmentation fusion on the semantic/instance segmentation prediction result to form a 3D panorama segmentation result.
Preferably, the polar BEV encoding means that the original point cloud data is processed by creating a fixed-size representation through projection and quantization, with the polar representation balancing the distribution of points across different ranges.
In any of the above schemes, preferably, the step 1 includes the following substeps:
step 11: grouping the original point cloud data according to the position of the BEV graph in polar coordinates;
step 12: performing block coding on the point cloud using PolarNet;
step 13: loading a max-pooling layer on each BEV grid, creating a fixed-size BEV code F ∈ ℝ^{H×W×C}, where ℝ denotes real space, H and W are the grid sizes of the BEV map, and C is the feature channel.
In any of the above aspects, preferably, the step 11 includes grouping the point cloud data P ∈ ℝ^{N×D} into G ∈ ℝ^{H×W×N*×D}, where D is the input feature dimension, N is the number of points in the cloud, and N* is the number of points in each BEV grid.
In any of the above embodiments, preferably, the step 12 includes encoding the point cloud groups G with a shared multi-layer perceptron (MLP) using the PolarNet network.
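The grouping and max-pooling encoding of steps 11 to 13 can be sketched as follows. This is a minimal NumPy illustration: the grid sizes, the range limit r_max, and the per-point features (standing in for the shared-MLP output) are illustrative assumptions, not the patent's exact PolarNet configuration.

```python
import numpy as np

def polar_bev_encode(points, feats, H=480, W=360, r_max=50.0):
    """Group points into a polar BEV grid and max-pool per cell.

    points: (N, 3) xyz coordinates; feats: (N, C) per-point features
    (e.g. the output of a shared MLP). H (radial bins), W (angular
    bins) and r_max are illustrative choices.
    """
    r = np.hypot(points[:, 0], points[:, 1])            # radial distance
    theta = np.arctan2(points[:, 1], points[:, 0])      # azimuth in [-pi, pi]
    ri = np.clip((r / r_max * H).astype(int), 0, H - 1)
    ti = np.clip(((theta + np.pi) / (2 * np.pi) * W).astype(int), 0, W - 1)

    C = feats.shape[1]
    bev = np.full((H, W, C), -np.inf)                   # fixed-size BEV code
    cell = ri * W + ti                                  # flat cell index per point
    for idx in np.unique(cell):                         # max-pool each occupied cell
        mask = cell == idx
        bev[idx // W, idx % W] = feats[mask].max(axis=0)
    bev[bev == -np.inf] = 0.0                           # empty cells -> 0
    return bev
```

The max-pool makes the output independent of the (random) number of points per cell, which is what yields the fixed-size BEV code F ∈ ℝ^{H×W×C}.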
In any of the above schemes, preferably, the step 2 includes the following substeps:
step 21: traversing all points in the LiDAR point cloud and calculating the visibility of the whole 3D space, i.e., under the polar coordinate system, for each point (x, y, z), all points α·(x, y, z) along the same direction satisfying 0 < α < 1 are included in the visibility space;
step 22: constructing a reference depth network by taking U-Net as the basic framework, comprising a depth network model with 4 encoding layers and 4 decoding layers;
step 23: connecting the visibility feature with a feature representation generated by a polar BEV encoder, inputting the visibility feature into the reference depth network, and generating a 2D instance header and a 3D semantic header;
step 24: processing the 2D instance header;
step 25: and processing the 3D semantic header.
In any of the above-described aspects, it is preferred that each layer of the encoding portion of the reference depth network consists of a 3×3 convolution, batch normalization, a rectified linear unit, and a max-pooling operation; each layer of the decoding portion consists of an up-sampling convolution, attention-gate-based feature concatenation, and a 3×3 convolution; the last layer in FCN-1 uses the sigmoid function as the activation function to normalize the output to a probability map in [0, 1].
Preferably in any of the above schemes, the step 24 further comprises predicting, with the 2D instance head, a center heat map for each BEV pixel and an offset to the center of the object; grouping pixels having the same nearest center into the same group; providing class-independent instance groupings using a bottom-up approach; and, without marking bounding boxes, encoding the ground-truth center map for training with a two-dimensional Gaussian distribution centered on each instance centroid.
In any of the above schemes, preferably, each pixel in the BEV map is denoted p, and its center heat map H_p is expressed as:
H_p = max_i exp(−‖p − C_i‖² / (2σ²))
where C_i is the centroid of the i-th instance in the polar BEV and σ is the standard deviation of the Gaussian.
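The Gaussian center-heat-map encoding can be sketched as follows; it assumes the max-merged Gaussian form H_p = max_i exp(−‖p − C_i‖² / (2σ²)), with σ an illustrative choice:

```python
import numpy as np

def render_center_heatmap(centroids, H, W, sigma=2.0):
    """Encode instance centroids as a 2D Gaussian center heat map.

    For each BEV pixel p, H_p = max_i exp(-||p - C_i||^2 / (2*sigma^2)),
    so every value lies in [0, 1] and equals 1 exactly at a centroid.
    """
    ys, xs = np.mgrid[0:H, 0:W]
    heat = np.zeros((H, W))
    for cy, cx in centroids:
        g = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
        heat = np.maximum(heat, g)   # overlapping Gaussians: keep the max
    return heat
```

Taking the per-pixel maximum (rather than the sum) keeps the target bounded in [0, 1], which matches training the heat map against a sigmoid output.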
In any of the above schemes, preferably, the step 25 further includes sharing the first 3 decoding layers with the instance segmentation, generating multiple predictions at each pixel and recombining them into 3D voxels to separate labels at different heights along the Z-axis, computing a voxel-level loss over the points within the same voxel using a voting algorithm, and generating the 3D semantic segmentation prediction.
In any of the above schemes, preferably, the step 3 includes the following substeps:
step 31: selecting the first k centers from the 2D BEV center heat map by a non-maximum suppression operation;
step 32: creating a 2D BEV foreground mask using the 3D semantic segmentation prediction, while ensuring that at least one things class can be detected for each BEV pixel;
step 33: calculating, for each foreground pixel p, the minimum distance d(p, c_i) to the k instance centroids c_i (i = 1, 2, …, k), and grouping the pixels accordingly;
step 34: predicting the things class in the semantic segmentation head using majority voting based on the semantic segmentation probabilities, designating a unique instance label L for each group G_i in the BEV;
step 35: and fusing the generated class agnostic instance cluster with 3D semantic segmentation prediction, and finally outputting a 3D panoramic segmentation result through a majority voting mechanism.
In any of the above embodiments, it is preferable that the minimum distance d(p, c_i) is expressed as:
d(p, c_i) = ‖p + offset(p) − c_i‖
where offset(p) is the center offset of pixel p.
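The nearest-center grouping defined by d(p, c_i) = ‖p + offset(p) − c_i‖ can be sketched as follows (a minimal vectorized illustration; the pixel coordinates, offsets, and centers would come from the instance head):

```python
import numpy as np

def group_foreground_pixels(pixels, offsets, centers):
    """Assign each foreground BEV pixel to its nearest predicted center.

    pixels: (M, 2) pixel coordinates p; offsets: (M, 2) predicted
    offset(p); centers: (k, 2) instance centroids c_i. Each pixel is
    grouped with the center minimizing ||p + offset(p) - c_i||.
    """
    shifted = pixels + offsets                       # p + offset(p)
    d = np.linalg.norm(shifted[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)                          # index of nearest center
```

Because each pixel first moves by its predicted offset, pixels belonging to the same object collapse toward one centroid before the distance test, which is what makes the grouping class-agnostic.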
In any of the above aspects, preferably, the instance label is obtained from the semantic segmentation probabilities as:
L(G_i) = argmax_c Σ_{p∈G_i} Pr(c | p)
where Pr(c | p) is the predicted probability that pixel p belongs to class c.
the invention provides a LiDAR point cloud panorama segmentation method based on a polar coordinate BEV map, which simultaneously learns semantic and instance characteristics on a discretized BEV map, rapidly and robustly realizes the point cloud panorama segmentation based on LiDAR, effectively solves panorama segmentation with minimum conflict between predicted instance and class, and realizes real-time or near real-time speed under the condition of not affecting accuracy.
BEV: birds eye view, i.e., a bird's eye view.
LiDAR: is a system integrating laser, global positioning system and inertial navigation system.
Polar net network: is a lightweight neural network used for realizing real-time on-line semantic segmentation for single laser radar scanning data.
The thins class: i.e. class of things.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of a LiDAR point cloud panorama segmentation method based on polar BEV graphs according to the present invention.
FIG. 2 is a flow chart of another preferred embodiment of a LiDAR point cloud panorama segmentation method based on polar BEV graphs according to the present invention.
FIG. 3 is a schematic diagram of one embodiment of a reference network model of a LiDAR point cloud panorama segmentation method based on polar BEV graphs, according to the present invention.
FIG. 4 is a semantic/instance segmentation visualization on the COCO dataset for one embodiment of a LiDAR point cloud panorama segmentation method based on polar BEV graphs according to the present invention.
FIG. 5 is a schematic view of a panoramic segmentation visualization on a Semantic KITTI dataset of an embodiment of a LiDAR point cloud panoramic segmentation method based on polar BEV graphs in accordance with the present invention.
Detailed Description
The invention is further illustrated by the following figures and specific examples.
Example 1
As shown in fig. 1, step 100 is performed, including obtaining original point cloud data comprising points of random size.
Step 111 is executed to group the original point cloud data according to its position in the polar BEV map, grouping the point cloud data P ∈ ℝ^{N×D} into G ∈ ℝ^{H×W×N*×D}, where ℝ denotes real space, D is the dimension of the input feature, N is the number of points, and N* is the number of points in each BEV grid.
Step 112 is executed to encode the point cloud groups G with a shared multi-layer perceptron (MLP) using the PolarNet network.
Step 113 is performed, loading a max-pooling layer on each BEV grid to create a fixed-size BEV code F ∈ ℝ^{H×W×C}, where H and W are the grid sizes of the BEV map and C is the feature channel.
Step 120 is performed to carry out semantic/instance segmentation prediction on the fixed-size BEV code: given a LiDAR point cloud space, segmentation prediction of semantics and instances is realized for all points P ∈ ℝ^{N×D}. This step includes the following sub-steps:
step 121 is performed to traverse all points in the LiDAR point cloud, and calculate the visibility of the entire 3D space, i.e., under a polar coordinate system, all points (x, y, z) along the same direction α (x, y, z) and satisfying 0< α <1 are included in the visibility space.
Step 122 is performed to construct a reference depth network with U-Net as the basic framework, comprising a depth network model with 4 encoding layers and 4 decoding layers.
Step 123 is performed to connect the visibility features with the feature representation generated by the polar BEV encoder, input them into the reference depth network, and generate a 2D instance head and a 3D semantic head. Each layer of the encoding portion of the reference depth network consists of a 3×3 convolution, batch normalization, a rectified linear unit, and a max-pooling operation; each layer of the decoding portion consists of an up-sampling convolution, attention-gate-based feature concatenation, and a 3×3 convolution; the last layer in FCN-1 uses the sigmoid function as the activation function to normalize the output to a probability map in [0, 1].
Step 124 is performed to process the 2D instance head: the center heat map of each BEV pixel and the offset to the center of the object are predicted with the 2D instance head; pixels with the same nearest center are grouped together; a bottom-up method provides class-independent instance groupings, avoiding conflicts between class prediction and training the instance head; and, without marking bounding boxes, the ground-truth center map is encoded for training with a two-dimensional Gaussian distribution centered on each instance centroid. Denoting each pixel in the BEV map by p, the center heat map H_p is expressed as:
H_p = max_i exp(−‖p − C_i‖² / (2σ²))
where C_i is the centroid of the i-th instance in the polar BEV and σ is the standard deviation of the Gaussian.
Step 125 is executed to process the 3D semantic header, share the first 3 decoding layers with the instance segmentation, generate multiple predictions at each pixel point, and recombine into 3D voxels to separate markers at different heights along the Z-axis, calculate voxel level loss for multiple points within the same voxel using a voting algorithm, and generate a 3D semantic segmentation prediction.
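The voxel-level voting of step 125 can be sketched as follows. This is a minimal stand-in: the per-point predicted labels falling in one voxel vote, and the majority label becomes the voxel's label; the patent's actual voxel-level loss computation is not reproduced here.

```python
from collections import Counter, defaultdict

def voxel_majority_vote(voxel_ids, point_labels):
    """Aggregate per-point semantic predictions into per-voxel labels.

    voxel_ids: iterable of voxel indices, one per point;
    point_labels: iterable of predicted labels, one per point.
    Points sharing a voxel vote, and the most common label wins.
    """
    buckets = defaultdict(list)
    for vid, lab in zip(voxel_ids, point_labels):
        buckets[vid].append(lab)
    return {vid: Counter(labs).most_common(1)[0][0]
            for vid, labs in buckets.items()}
```

Voting resolves the conflict that arises when several points at different heights map to the same BEV pixel but receive different per-point predictions.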
Step 130 is executed to perform panorama segmentation fusion on the semantic/instance segmentation prediction result to form a 3D panorama segmentation result. In this step, the following sub-steps are included:
step 131 is performed to select the first k centers from the 2D BEV center heat map by a non-maximum suppression operation.
Step 132 is performed to create a 2D BEV foreground mask using the 3D semantic segmentation prediction, while ensuring that at least one things class can be detected for each BEV pixel.
Step 133 is performed to calculate, for each foreground pixel p, the minimum distance d(p, c_i) to the k instance centroids c_i (i = 1, 2, …, k) and group the pixels, where the minimum distance d(p, c_i) is expressed as:
d(p, c_i) = ‖p + offset(p) − c_i‖
where offset(p) is the center offset of pixel p.
Step 134 is executed to predict the things class in the semantic segmentation head using majority voting according to the semantic segmentation probabilities, assigning a unique instance label L to each group G_i in the BEV:
L(G_i) = argmax_c Σ_{p∈G_i} Pr(c | p)
where Pr(c | p) is the predicted probability that pixel p belongs to class c.
step 135 is executed, in which the generated class-agnostic instance clusters are fused with 3D semantic segmentation predictions, and the 3D panorama segmentation result is finally output through a majority voting mechanism.
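The majority-voting fusion of steps 134 and 135 can be sketched as a soft majority vote: each class-agnostic group takes the class with the largest summed semantic probability over its pixels. This is a minimal illustration, not the patent's exact fusion procedure.

```python
import numpy as np

def assign_instance_classes(group_ids, sem_probs):
    """Give each class-agnostic instance group a single semantic label.

    group_ids: (M,) instance group index per foreground pixel;
    sem_probs: (M, num_classes) per-pixel semantic probabilities.
    The group label is the class with the largest summed probability,
    i.e. a soft majority vote over the group's pixels.
    """
    labels = {}
    for g in np.unique(group_ids):
        labels[int(g)] = int(sem_probs[group_ids == g].sum(axis=0).argmax())
    return labels
```

Summing probabilities rather than counting hard labels lets confident pixels outweigh uncertain ones, which reduces conflicts between the instance clustering and the semantic prediction.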
Example two
The invention provides a panoramic segmentation framework which can learn the semantics and the example characteristics on a discretized BEV map at the same time and realize the point cloud panoramic segmentation based on LiDAR rapidly and robustly. Considering the unique features of LiDAR data, panoramic segmentation is effectively resolved with minimal prediction conflicts (instances and classes) and real-time or near real-time speeds are achieved without affecting accuracy.
As shown in fig. 2, the method for segmenting the LiDAR point cloud panorama based on the polar coordinate BEV graph provided by the invention comprises the following specific steps:
and 1, performing polar coordinate BEV coding on the original point cloud data. First, the point cloud data is calculated from the position in the polar BEV graphGrouping->Where D is the input feature dimension, H and W are the mesh size of the BEV map, N * Is the number of points in each BEV grid; then, sharing a multi-layer perceptron MLP, grouping point clouds by using a polar Net network +.>Coding; finally, a Max-pooling layer is loaded on each BEV grid creating a fixed-size representation +.>Wherein C is a characteristic channel, and c=512 is taken here.
And 2, carrying out semantic/instance segmentation prediction on the fixed-size BEV codes. Given a LiDAR point cloud space, segmentation prediction of semantics and instances is realized for all points P ∈ ℝ^{N×D}. The specific implementation method is as follows:
1) Traversing all points in the LiDAR point cloud and calculating the visibility of the whole 3D space: under the polar coordinate system, for each point (x, y, z), all points α·(x, y, z) along the same direction satisfying 0 < α < 1 are included in the visibility space;
2) A reference depth network is designed, based on U-Net, comprising a depth network model with 4 encoding layers and 4 decoding layers. Each layer of the encoding section consists of a 3×3 convolution, batch normalization, a rectified linear unit (ReLU), and a max-pooling operation; each layer of the decoding section consists of an up-sampling convolution, Attention Gate (AG) based feature concatenation, and a 3×3 convolution. The last layer in FCN-1 uses the sigmoid function as the activation function, normalizing the output to a probability map in [0, 1]; the specific network model is shown in FIG. 3;
3) Connecting the visibility feature with the feature representation generated by the polar BEV encoder and inputting the connection result in the implementation into the reference depth network shown in fig. 3, generating a 2D instance header and a 3D semantic header;
4) The center heat map of each BEV pixel and the offset to the center of the object are predicted with the 2D instance head; pixels with the same nearest center are divided into the same group; a bottom-up method provides class-independent instance groupings, avoiding conflicts between class prediction and training the instance head; and, without marking bounding boxes, the ground-truth center map is trained with a two-dimensional Gaussian distribution centered on each instance centroid. Denoting each pixel in the BEV map by p, the center heat map H_p can be expressed as:
H_p = max_i exp(−‖p − C_i‖² / (2σ²)) (1)
where C_i is the centroid of the i-th instance in the polar BEV and σ is the standard deviation of the Gaussian;
5) The first 3 decoding layers are shared with the example segmentation, a plurality of predictions are generated at each pixel point and recombined into 3D voxels to separate marks at different heights along a Z axis, a voting algorithm is utilized for a plurality of points in the same voxel, voxel level losses are calculated, and 3D semantic segmentation predictions are generated.
And 3, carrying out panoramic segmentation fusion on the semantic/instance segmentation prediction result to form a 3D panoramic segmentation result. The specific calculation method is as follows:
1) Selecting the first k centers from the 2D BEV center heat map by a non-maximum suppression operation;
2) Creating a 2D BEV foreground mask using the 3D semantic segmentation predictions generated in steps 2-5), while ensuring that each BEV pixel can detect at least one "things" class;
3) Calculating, for each foreground pixel p, the minimum distance d(p, c_i) to the k instance centroids c_i (i = 1, 2, …, k) and grouping them; the minimum distance is expressed as follows:
d(p, c_i) = ‖p + offset(p) − c_i‖ (2)
wherein offset(p) is the center offset of pixel p;
4) Predicting the "things" class in the semantic segmentation head using majority voting based on the semantic segmentation probabilities; each group G_i in the BEV is assigned a unique instance label L, expressed as follows:
L(G_i) = argmax_c Σ_{p∈G_i} Pr(c | p) (3)
wherein Pr(c | p) is the predicted probability that pixel p belongs to class c;
5) And fusing the generated class agnostic instance cluster with 3D semantic segmentation prediction, and finally outputting a 3D panoramic segmentation result through a majority voting mechanism.
Example III
The invention provides a LiDAR point cloud panorama segmentation method based on a polar coordinate BEV graph, which comprises the following steps:
step 1. Polar BEV encoding step
The polar BEV encoding creates a fixed-size representation by projection and quantization to process a point cloud containing a random number of points, and uses the polar representation to balance the distribution of points across different ranges. The specific steps are:
(1.1) grouping the raw point cloud data according to the position in the polar BEV plot;
(1.2) adopting a shared multi-layer perceptron mechanism and performing block coding with a simplified PolarNet point cloud encoder;
(1.3) loading a max pooling layer on each BEV grid, creating a fixed size BEV code;
step 2. Semantic/instance segmentation prediction step
The semantic/instance segmentation prediction uses a U-Net with a symmetric structure as the reference network and comprises the following steps:
(2.1) fusing the BEV code output in step (1.3) with the visibility features, and feeding the fused result as input into the reference depth network for training, generating a 2D instance head and a 3D semantic head;
(2.2) processing the 2D instance head output from (2.1), calculating the pixel-level loss, and generating a 2D BEV center heat map and 2D instance center offsets;
(2.3) processing the 3D semantic head output from (2.1), calculating the voxel-level loss, and generating the 3D semantic segmentation prediction.
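The two-head layout of step 2 can be sketched as a toy forward pass: one shared layer stands in for the U-Net backbone, the instance head emits a center heat map and 2D offsets, and the semantic head emits logits reshaped over height bins. The layer sizes, random weights, and the name `dual_head_forward` are assumptions made only for this sketch.

```python
import numpy as np

def dual_head_forward(bev_code, n_classes=3, n_z=4, seed=0):
    """Sketch of step 2: shared backbone features feed a 2D instance
    head (sigmoid center heat map + per-pixel 2D offsets) and a 3D
    semantic head (per-height-bin class logits)."""
    rng = np.random.default_rng(seed)
    H, W, C = bev_code.shape
    shared = np.maximum(bev_code @ rng.standard_normal((C, C)), 0.0)
    # 2D instance head: center heat map in [0, 1] and (dy, dx) offsets
    heatmap = 1.0 / (1.0 + np.exp(-(shared @ rng.standard_normal((C, 1)))[..., 0]))
    offsets = shared @ rng.standard_normal((C, 2))
    # 3D semantic head: BEV logits reorganized into n_z height bins
    sem = (shared @ rng.standard_normal((C, n_z * n_classes))
           ).reshape(H, W, n_z, n_classes)
    return heatmap, offsets, sem
```

Sharing the backbone between the two heads is what later allows the early-fusion/shared-decoder strategy described in the advantages section.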
step 3. Panoramic segmentation fusion step
The panoramic segmentation fusion fuses the prediction results from the semantic head and the instance head to create the final panoramic segmentation result, and comprises the following steps:
(3.1) projecting the 3D semantic segmentation prediction output in (2.3) onto the BEV plane to generate a 2D BEV "things" mask;
(3.2) merging the 2D BEV "things" mask generated in (3.1) with the 2D BEV center heat map and the 2D instance center offsets output in (2.2) to generate class-agnostic instance clusters;
(3.3) performing majority-voting fusion of the class-agnostic instance clusters generated in (3.2) with the 3D semantic segmentation prediction output in (2.3), finally generating the 3D panoramic segmentation result.
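The majority-voting fusion of step (3.3) reduces, per point, to the following sketch. The 1-D per-point layout, the convention that instance id 0 means "stuff", and the function name are assumptions for illustration; the patent itself operates on the full 3D prediction.

```python
import numpy as np

def majority_vote_fusion(instance_ids, semantic_labels):
    """Sketch of step (3.3): each class-agnostic instance cluster takes
    the majority semantic label of its member points, so every point of
    an instance ends up with one consistent (class, instance id) pair."""
    fused = semantic_labels.copy()
    for inst in np.unique(instance_ids):
        if inst == 0:            # stuff points keep their semantic label
            continue
        member = instance_ids == inst
        fused[member] = np.bincount(semantic_labels[member]).argmax()
    return np.stack([fused, instance_ids], axis=1)  # per-point (class, id)
```

The vote resolves the semantic/instance prediction conflicts mentioned above: even if a few points of a cluster were assigned a different class, the cluster as a whole receives a single label.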
The invention discloses a LiDAR point cloud panorama segmentation method based on a polar coordinate BEV diagram, which belongs to the proposal-free panoramic segmentation frameworks. Aiming at the requirements of 3D point cloud panoramic segmentation, the invention carries out technical research and algorithm improvement on point cloud encoding, semantic/instance segmentation prediction conflicts, the panoramic segmentation strategy, and related aspects, and provides effective processing strategies. Compared with the prior art, the invention has the following advantages:
1) In the raw point cloud encoding stage, a polar coordinate BEV encoding scheme is adopted. The polar coordinates balance the distribution of points across different ranges, which offers better potential to the neural network: discriminative features can be learned close to the sensor, and the information loss caused by quantization is minimized. At the same time, the BEV representation provides a compromise between computational cost and accuracy, enabling the use of more efficient 2D convolutional networks to process the data and obtaining a projection well suited to object detection;
2) In the semantic/instance segmentation stage, a proposal-free design is used and the instance head is trained without bounding-box annotations, which effectively avoids conflicting class predictions;
3) In the panoramic segmentation stage, a strategy is designed in which the semantic head and the instance head share decoding layers and are fused early at the feature extraction level, which reduces redundancy between the networks and improves computational efficiency.
The foregoing description of the invention has been presented for purposes of illustration and description, but is not intended to be limiting. Any simple modification of the above embodiments according to the technical substance of the present invention still falls within the scope of the technical solution of the present invention. In this specification, each embodiment is described mainly in terms of its differences from the other embodiments, and the same or similar parts between embodiments may be referred to each other. Since the system embodiments essentially correspond to the method embodiments, their description is relatively brief, and reference should be made to the description of the method embodiments for the relevant points.
Claims (10)
1. A LiDAR point cloud panorama segmentation method based on a polar coordinate BEV graph, comprising obtaining raw point cloud data containing an arbitrary number of points, characterized by further comprising the following steps:
step 1: performing polar coordinate BEV coding on the original point cloud data;
step 2: given a LiDAR point cloud space, performing semantic/instance segmentation prediction on the fixed-size BEV code;
step 3: and carrying out panorama segmentation fusion on the semantic/instance segmentation prediction result to form a 3D panorama segmentation result.
2. The method for segmenting the LiDAR point cloud panorama based on the polar BEV graph according to claim 1, wherein the step 1 comprises the following steps:
step 11: grouping the raw point cloud data according to their positions in the polar BEV graph;
step 12: performing block coding using the PolarNet point cloud encoder;
3. The method for panoramic segmentation of LiDAR point clouds based on polar BEV graphs according to claim 2, wherein said step 11 comprises mapping the point cloud data P ∈ ℝ^(N×D) into groups {P_j ∈ ℝ^(N*×D)}, wherein D is the input feature dimension, N is the number of points in the point cloud, and N* is the number of points in each BEV grid.
5. The method for panoramic segmentation of LiDAR point clouds based on polar BEV graphs according to claim 4, wherein said step 2 comprises the sub-steps of:
step 21: traversing all points in the LiDAR point cloud, and calculating the visibility of the whole 3D space, namely under a polar coordinate system, taking all points (x, y, z) which are along the same direction alpha (x, y, z) and meet 0< alpha <1 into the visibility space;
step 22: constructing a reference depth network by taking a Unet as a basic framework, wherein the reference depth network comprises a depth network model of 4 coding layers and 4 decoding layers;
step 23: connecting the visibility feature with a feature representation generated by a polar BEV encoder, inputting the visibility feature into the reference depth network, and generating a 2D instance header and a 3D semantic header;
step 24: processing the 2D instance header;
step 25: and processing the 3D semantic header.
6. The method of claim 5, wherein each layer of the encoding part of the reference depth network consists of a 3 × 3 convolution, batch normalization, a rectified linear unit, and a max pooling operation; each layer of the decoding part consists of an up-sampling convolution, attention-gate-based feature concatenation, and a 3 × 3 convolution; the last layer in FCN-1 normalizes the output to a probability map in [0, 1] using the sigmoid function as the activation function.
7. The method of claim 6, wherein step 24 further comprises: using the 2D instance head to predict, for each BEV pixel, a center heat map and an offset to its object center; grouping pixels having the same nearest center into the same instance; providing class-independent instance groupings in a bottom-up manner; and, during training, encoding the ground-truth center map as a two-dimensional Gaussian distribution centered on each instance centroid, without labeled bounding boxes.
8. The LiDAR point cloud panorama segmentation method based on polar BEV graphs according to claim 7, wherein, with p denoting a pixel in the BEV graph, the center heat map H_p is expressed as follows:
H_p = max_i exp(−‖p − C_i‖² / (2σ²))
wherein C_i is the centroid of an instance in the polar BEV and σ is the standard deviation of the Gaussian kernel.
9. The method of claim 8, wherein step 25 further comprises: sharing the first 3 decoding layers with the instance segmentation branch; generating multiple predictions at each pixel and reorganizing them into 3D voxels so as to separate labels at different heights along the Z axis; computing the voxel-level loss, using a voting algorithm for multiple points within the same voxel; and generating the 3D semantic segmentation prediction.
10. The method for panoramic segmentation of LiDAR point clouds based on polar BEV graphs according to claim 9, wherein said step 3 comprises the sub-steps of:
step 31: selecting the first k centers from the 2D BEV center heat map by a non-maximum suppression operation;
step 32: creating a 2D BEV foreground mask using the 3D semantic segmentation prediction while ensuring that at least one thongs class can be detected for each BEV pixel;
step 33: calculating the foreground pixels p to k example centroids c i Minimum distance d (p, c) of (i=1, 2, …, k) i ) And groups them;
step 34: prediction of thins classes in semantic segmentation heads using majority voting based on semantic segmentation probabilities for each group G in BEV i Designating a unique instance tag L;
step 35: and fusing the generated class agnostic instance cluster with 3D semantic segmentation prediction, and finally outputting a 3D panoramic segmentation result through a majority voting mechanism.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310273933.7A CN116385452A (en) | 2023-03-20 | 2023-03-20 | LiDAR point cloud panorama segmentation method based on polar coordinate BEV graph |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116385452A true CN116385452A (en) | 2023-07-04 |
Family
ID=86962623
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310273933.7A Pending CN116385452A (en) | 2023-03-20 | 2023-03-20 | LiDAR point cloud panorama segmentation method based on polar coordinate BEV graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116385452A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111145174A (en) * | 2020-01-02 | 2020-05-12 | 南京邮电大学 | 3D target detection method for point cloud screening based on image semantic features |
CN114529727A (en) * | 2022-04-25 | 2022-05-24 | 武汉图科智能科技有限公司 | Street scene semantic segmentation method based on LiDAR and image fusion |
JP7224682B1 (en) * | 2021-08-17 | 2023-02-20 | 忠北大学校産学協力団 | 3D multiple object detection device and method for autonomous driving |
US20230072731A1 (en) * | 2021-08-30 | 2023-03-09 | Thomas Enxu LI | System and method for panoptic segmentation of point clouds |
Non-Patent Citations (1)
Title |
---|
贾喆姝 (Jia Zheshu): "Research on Image Semantic Segmentation Technology Based on Deep Learning", China Doctoral Dissertations Full-text Database, no. 01, pages 27-30 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Huang et al. | Autonomous driving with deep learning: A survey of state-of-art technologies | |
CN111626217B (en) | Target detection and tracking method based on two-dimensional picture and three-dimensional point cloud fusion | |
CN110827415B (en) | All-weather unknown environment unmanned autonomous working platform | |
CN111666921B (en) | Vehicle control method, apparatus, computer device, and computer-readable storage medium | |
CN111080659A (en) | Environmental semantic perception method based on visual information | |
KR20210074353A (en) | Point cloud segmentation method, computer readable storage medium and computer device | |
CN110956651A (en) | Terrain semantic perception method based on fusion of vision and vibrotactile sense | |
CN110852182B (en) | Depth video human body behavior recognition method based on three-dimensional space time sequence modeling | |
CN110688905B (en) | Three-dimensional object detection and tracking method based on key frame | |
Paz et al. | Probabilistic semantic mapping for urban autonomous driving applications | |
US20230072731A1 (en) | System and method for panoptic segmentation of point clouds | |
CN114972763A (en) | Laser radar point cloud segmentation method, device, equipment and storage medium | |
Ouyang et al. | A cgans-based scene reconstruction model using lidar point cloud | |
Maalej et al. | Vanets meet autonomous vehicles: A multimodal 3d environment learning approach | |
Berrio et al. | Octree map based on sparse point cloud and heuristic probability distribution for labeled images | |
CN115984586A (en) | Multi-target tracking method and device under aerial view angle | |
Liu et al. | Layered interpretation of street view images | |
Florea et al. | Enhanced perception for autonomous driving using semantic and geometric data fusion | |
Dewangan et al. | Towards the design of vision-based intelligent vehicle system: methodologies and challenges | |
Gosala et al. | Skyeye: Self-supervised bird's-eye-view semantic mapping using monocular frontal view images | |
Pu et al. | Visual SLAM integration with semantic segmentation and deep learning: A review | |
WO2023155903A1 (en) | Systems and methods for generating road surface semantic segmentation map from sequence of point clouds | |
CN116664851A (en) | Automatic driving data extraction method based on artificial intelligence | |
Zhao et al. | DHA: Lidar and vision data fusion-based on road object classifier | |
CN116385452A (en) | LiDAR point cloud panorama segmentation method based on polar coordinate BEV graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||