CN113011430B - Large-scale point cloud semantic segmentation method and system - Google Patents
Large-scale point cloud semantic segmentation method and system
- Publication number: CN113011430B
- Application number: CN202110309423.1A
- Authority: CN (China)
- Legal status: Active
Classifications
- G06V 10/267 — Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
- G06F 18/22 — Pattern recognition; matching criteria, e.g. proximity measures
- G06F 18/253 — Fusion techniques of extracted features
- G06V 10/44 — Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
Abstract
The invention relates to a large-scale point cloud semantic segmentation method and system. The semantic segmentation method comprises: extracting point-by-point features of a point cloud to be identified, the point cloud being composed of a plurality of points to be identified; gradually encoding each point-by-point feature based on the point cloud spatial information of each point to be identified to obtain corresponding point cloud features; gradually decoding each point cloud feature to obtain corresponding decoding features; and determining a semantic segmentation prediction result for the 3D point cloud to be identified from the decoding features via a semantic segmentation network model. By extracting point-by-point features of the point cloud to be identified, the method extracts more effective spatial features from large-scale point cloud information; the point-by-point features are gradually encoded into point cloud features based on the spatial information of each point, the point cloud features are decoded into decoding features, and the semantic segmentation prediction result of the 3D point cloud is determined from the decoding features, yielding semantic information about the surrounding spatial environment and thereby improving semantic segmentation precision.
Description
Technical Field
The invention relates to the technical field of computer vision, in particular to a large-scale point cloud semantic segmentation method and system based on spatial context feature learning.
Background
In a mobile robot's environment perception system, semantic segmentation of the surroundings is an important component: it provides the robot's decision and control system with a semantic understanding of the environment the robot occupies. Compared with 2D image sensors, 3D sensors (such as lidar) provide richer spatial geometric information and are more helpful for a mobile robot to understand the three-dimensional space it is located in. With the rapid development of 3D sensors, semantic segmentation of 3D point clouds has therefore attracted attention from both academia and industry, and semantic segmentation of large-scale point clouds, which carry a large amount of information yet can describe the spatial environment in detail, is a computer vision problem of particular interest to researchers.
Because 3D point cloud information is unstructured and unordered, semantic segmentation of point clouds is a challenging task, especially for large-scale point clouds. In recent years, many methods based on deep neural networks (DNNs) have been applied to point cloud semantic segmentation. Existing methods fall mainly into three categories: spatial-projection-based methods, spatial-discretization-based methods, and point-based methods. Spatial-projection-based methods first project the 3D point cloud onto a 2D plane, then apply a 2D semantic segmentation method, and finally back-project the 2D segmentation result into 3D space. Information is inevitably lost during projection, and the loss of critical detail hinders the perception system's accurate understanding of the environment. Spatial-discretization-based methods first discretize the 3D point cloud into voxels and then perform semantic segmentation on the voxels. These methods introduce discretization error, and the final segmentation precision and environment understanding depend on the degree of discretization. Moreover, both categories require additional, complex point cloud processing steps, such as projection and discretization, whose high computational complexity makes it impractical to process large-scale point clouds. How to efficiently extract more effective features from a large-scale point cloud is therefore the key problem blocking further improvement of segmentation precision.
Disclosure of Invention
In order to solve the above problems in the prior art, that is, to improve the semantic segmentation precision, the present invention aims to provide a large-scale point cloud semantic segmentation method and system.
In order to solve the technical problem, the invention provides the following scheme:
a large-scale point cloud semantic segmentation method, which comprises the following steps:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
gradually coding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
gradually decoding each point cloud feature to obtain a corresponding decoding feature;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
Optionally, based on the point cloud spatial information of each point to be identified, the point-by-point features are encoded step by step to obtain corresponding point cloud features, which specifically includes:
performing point cloud down-sampling processing on each point to be identified to obtain a plurality of down-sampling points;
screening out the corresponding characteristics of the down-sampling points from the point-by-point characteristics, wherein the screened out characteristics are learned characteristics;
aiming at each down-sampling point, determining corresponding local spatial features according to point cloud spatial information of the down-sampling point and corresponding learned features;
determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points;
and determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics.
Optionally, the determining, according to the point cloud spatial information of the down-sampling point and the corresponding learned feature, the corresponding local spatial feature specifically includes:
according to the point cloud space information of the down sampling points and the corresponding learned characteristics, local space context characteristics are learned to obtain local space context characteristics;
obtaining a feature map based on the shared parameter multi-layer perceptron MLP according to the learned features;
and obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map.
Optionally, the learning of the local spatial context features according to the point cloud spatial information of the down-sampling point and the corresponding learned features specifically includes:
determining the polar coordinates and the geometric distance of the down-sampling points according to the point cloud spatial information of the down-sampling points, wherein the polar coordinates of the down-sampling points represent the local spatial context information of the down-sampling points;
and obtaining local spatial context characteristics by a double-distance-based neighborhood point characteristic self-adaptive fusion method according to the polar coordinates, the geometric distance and the learned characteristics.
Optionally, the determining of the polar coordinates of the down-sampling points according to the point cloud spatial information of the down-sampling points specifically includes:

obtaining the initial polar coordinates (rho_ik, alpha_ik, beta_ik) of the down-sampling point according to the following formula:

rho_ik = sqrt(dx_ik^2 + dy_ik^2 + dz_ik^2), alpha_ik = arctan(dy_ik / dx_ik), beta_ik = arcsin(dz_ik / rho_ik)

wherein a K nearest neighbor (KNN) method based on Euclidean distance is used to obtain the K neighbors of the down-sampling point p_i, the K neighbors comprising K neighbor points p_ik; (dx_ik, dy_ik, dz_ik) are the relative position coordinates of the k-th neighbor point p_ik of the down-sampling point p_i in a rectangular spatial coordinate system; i denotes the down-sampling point p_i; k denotes the neighbor point p_ik, k = 1, 2, ..., K; and K denotes the number of neighbors;

determining the polar coordinate angles alpha_i and beta_i of the local spatial direction, the local spatial direction pointing from the down-sampling point p_i toward the centroid of its K neighbors;

updating the polar coordinates of the down-sampling point according to the following formula, the updated polar representation having local rotation invariance:

alpha'_ik = alpha_ik - alpha_i, beta'_ik = beta_ik - beta_i
optionally, the obtaining of the local spatial context features according to the polar coordinates, the geometric distance, and the learned features by the dual-distance-based neighborhood point feature adaptive fusion method specifically includes:

determining a feature distance and a geometric feature according to the polar coordinates and the learned features of the down-sampling points;

determining the weighted fusion parameters w_ik according to the following formula:

w_ik = softmax(MLP(d_ik))

wherein softmax() represents the normalized exponential function; MLP() represents a multi-layer perceptron function; d_ik represents the dual-distance feature, obtained by connecting (connection operator ⊕) the geometric distance d_g_ik of the neighbor point p_ik with the feature distance d_f_ik of the neighbor point p_ik, the feature distance being determined from the learned features g_i and g_k; lambda is the weight adjusting the feature distance term; mean() represents the averaging function; and f_ik represents the feature of the neighbor point p_ik determined from the geometric feature and the learned feature;

fusing each neighborhood point feature f_ik with the weighted fusion parameter w_ik to obtain the local spatial context feature f_iL:

f_iL = sum_{k=1}^{K} w_ik · f_ik

wherein · represents the dot product operator, i denotes the down-sampling point p_i, k denotes the neighbor point p_ik, k = 1, 2, ..., K, and K denotes the number of neighbor points.
Alternatively, the global spatial context feature f_iG of the down-sampling point p_i is determined according to the following formula:

f_iG = (x_i, y_i, z_i) ⊕ r_i, with r_i = v_i / v_g

wherein ⊕ is the connection operator; (x_i, y_i, z_i) are the spatial coordinates of the down-sampling point p_i in a rectangular spatial coordinate system; r_i is a volume ratio; v_i is the volume of the minimum circumscribed sphere of the neighborhood of the down-sampling point p_i; and v_g is the volume of the minimum circumscribed sphere of the point cloud to be identified.
In order to solve the technical problems, the invention also provides the following scheme:
a large-scale point cloud semantic segmentation system, the semantic segmentation system comprising:
the device comprises an extraction unit, a recognition unit and a processing unit, wherein the extraction unit is used for extracting point-by-point characteristics of a point cloud to be recognized, and the point cloud to be recognized is composed of a plurality of points to be recognized;
the encoding unit is used for gradually encoding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
the decoding unit is used for gradually decoding each point cloud feature to obtain a corresponding decoding feature;
and the prediction unit is used for determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
In order to solve the technical problem, the invention also provides the following scheme:
a large-scale point cloud semantic segmentation system comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
gradually encoding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
gradually decoding each point cloud feature to obtain a corresponding decoding feature;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
In order to solve the technical problems, the invention also provides the following scheme:
a computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
gradually encoding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
gradually decoding each point cloud feature to obtain a corresponding decoding feature;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
According to the embodiment of the invention, the invention discloses the following technical effects:
the method extracts point-by-point characteristics of the point cloud to be identified, extracts more effective spatial characteristics from large-scale point cloud information, gradually encodes the point-by-point characteristics based on the point cloud spatial information of each point to be identified to obtain point cloud characteristics, further decodes the point cloud characteristics to obtain decoding characteristics, and determines the semantic segmentation prediction result of the 3D point cloud to be identified according to the decoding characteristics to obtain the semantic information of the surrounding spatial environment, so that the semantic segmentation precision is improved.
Drawings
FIG. 1 is a flow chart of a large-scale point cloud semantic segmentation method of the present invention;
FIG. 2 is a flow chart of polar coordinate determination of a local spatial context representation in the present invention;
FIG. 3 is a diagram of the process of angle update in polar coordinates of a local spatial context representation in the present invention;
FIG. 4 is a flow chart of a method for adaptive fusion of neighborhood point features based on dual distance in the present invention;
FIG. 5 is a detailed flowchart of a neighborhood point feature adaptive fusion method based on dual distance according to the present invention;
FIG. 6 is a point cloud distribution plot;
FIG. 7 is a flow chart of spatial context feature determination in the present invention;
FIG. 8 is a schematic diagram of a modular structure of the large-scale point cloud semantic segmentation system according to the present invention.
Description of the symbols:
an extraction unit-1, an encoding unit-2, a decoding unit-3, and a prediction unit-4.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.
The invention aims to provide a large-scale point cloud semantic segmentation method that extracts point-by-point features of the point cloud to be identified so as to extract more effective spatial features from large-scale point cloud information, gradually encodes the point-by-point features into point cloud features based on the spatial information of each point, decodes the point cloud features into decoding features, and determines the semantic segmentation prediction result of the 3D point cloud from the decoding features, thereby obtaining semantic information of the surrounding spatial environment and improving semantic segmentation precision.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in FIG. 1, the large-scale point cloud semantic segmentation method of the invention comprises the following steps:
step 100: and extracting point-by-point characteristics of the point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified.
And extracting point-by-point characteristics of the points to be identified through a full connection layer according to the point cloud data.
The point cloud data is N x d point cloud information, where N is the number of points in the point cloud and d is the dimension of the input point cloud information. In some preferred embodiments, d = 6, comprising three dimensions of position information and three dimensions of color information.
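As a rough illustration only (the patent does not specify the layer's weights or activation), the point-by-point extraction step can be sketched as a single fully connected layer mapping each d-dimensional point (d = 6 here) to the 8-dimensional feature mentioned later in the text; the ReLU and the random weights below are assumptions:

```python
import numpy as np

def extract_pointwise_features(points, w, b):
    """Per-point feature extraction with one fully connected layer.

    points: (N, d) array, here d = 6 (xyz position + rgb color).
    w: (d, 8) weights, b: (8,) bias -- illustrative shapes only.
    Returns an (N, 8) feature map, matching the d -> 8 dimension
    change described for the first fully connected layer.
    """
    return np.maximum(points @ w + b, 0.0)  # linear map + ReLU (assumed)

rng = np.random.default_rng(0)
pts = rng.random((1000, 6))                 # N = 1000 points, d = 6
feats = extract_pointwise_features(pts, rng.standard_normal((6, 8)) * 0.1, np.zeros(8))
```

Every point shares the same weights, so the layer applies "point by point" regardless of N.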
Step 200: and gradually coding each point-by-point feature based on the point cloud space information of each point to be identified to obtain the corresponding point cloud feature.
Step 300: and gradually decoding each point cloud feature to obtain the corresponding decoding features.
Step 400: and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
In step 200, four feature encoding layers encode step by step to obtain the corresponding point cloud features.
Specifically, the step of gradually encoding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature includes:
step 210: and carrying out point cloud downsampling processing on each point to be identified to obtain a plurality of downsampling points.
Preferably, a point cloud random down-sampling algorithm is adopted for down-sampling processing.
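A minimal sketch of random down-sampling, assuming uniform selection without replacement and a hypothetical 4:1 ratio; returning the kept indices lets the same index set screen the corresponding learned features out of the point-by-point features (step 220):

```python
import numpy as np

def random_downsample(points, ratio=4, rng=None):
    """Keep N // ratio points chosen uniformly at random without
    replacement; also return the kept indices so that the matching
    'learned features' can be selected with the same index set."""
    rng = rng if rng is not None else np.random.default_rng()
    n_keep = points.shape[0] // ratio
    idx = rng.choice(points.shape[0], size=n_keep, replace=False)
    return points[idx], idx
```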
Step 220: and screening out the corresponding characteristics of the down-sampling points from the point-by-point characteristics, wherein the screened out characteristics are learned characteristics.
Step 230: and aiming at each down-sampling point, determining corresponding local spatial characteristics according to the point cloud spatial information of the down-sampling point and the corresponding learned characteristics.
Step 240: and determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points.
Step 250: and determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics.
As shown in fig. 7, in step 230, determining a corresponding local spatial feature according to the point cloud spatial information of the down-sampling point and the corresponding learned feature specifically includes:
step 231: and according to the point cloud space information of the down-sampling points and the corresponding learned characteristics, learning the local space context characteristics to obtain the local space context characteristics.
Step 232: and obtaining a feature map based on the shared parameter multi-layer perceptron MLP according to the learned features.
Step 233: and obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map.
In step 231, the local spatial context feature learning is performed according to the point cloud spatial information of the down-sampling point and the corresponding learned features, so as to obtain the local spatial context feature, which specifically includes:
step 2311: and determining the Polar coordinate Representation (LPR) and the geometric distance of the down-sampling points according to the point cloud space information of the down-sampling points, wherein the Polar coordinate Representation of the down-sampling points is used for representing the Local space context information of the down-sampling points.
Determining the polar coordinates of the down-sampling points according to their point cloud spatial information specifically comprises (as shown in fig. 2 and 3):

Step A1: obtaining the initial polar coordinates (rho_ik, alpha_ik, beta_ik) of the down-sampling point according to the following formula:

rho_ik = sqrt(dx_ik^2 + dy_ik^2 + dz_ik^2), alpha_ik = arctan(dy_ik / dx_ik), beta_ik = arcsin(dz_ik / rho_ik)

Here a K Nearest Neighbors (KNN) method based on Euclidean distance is used to obtain the K neighbors of the down-sampling point p_i, comprising the K neighbor points p_ik; (dx_ik, dy_ik, dz_ik) are the relative position coordinates of the k-th neighbor point p_ik of p_i in a rectangular spatial coordinate system; i indexes the down-sampling point p_i; k indexes the neighbor point p_ik, k = 1, 2, ..., K; and K is the number of neighbor points. In this embodiment, K is 16.

Step A2: determining the polar coordinate angles alpha_i and beta_i of the local spatial direction, the local spatial direction pointing from the down-sampling point p_i toward the centroid of its K neighbors.

Determining the local spatial direction from p_i toward the neighborhood centroid has two main advantages: (1) the centroid effectively reflects the overall layout of the local neighborhood; (2) the averaging operation in computing the centroid effectively attenuates the random factors introduced by random down-sampling.

Step A3: updating the polar coordinates of the down-sampling point according to the following formula; the updated polar representation has local rotation invariance (as shown in fig. 3 (a)-(c)):

alpha'_ik = alpha_ik - alpha_i, beta'_ik = beta_ik - beta_i

The polar coordinates characterize spatial context information with local rotation invariance. In most practical scenes, objects of the same semantic category appear with different pose orientations, such as chairs facing different directions in a conference room, so features learned directly from the points are orientation-sensitive, which in some situations degrades point cloud semantic segmentation. The invention therefore represents the local spatial context information of points in a spherical polar coordinate system; compared with a rectangular coordinate system, only the angles are orientation-sensitive there. The updated alpha'_ik and beta'_ik are angles relative to the local spatial direction, so their values remain unchanged as the pose orientation changes, and the resulting local spatial context representation is locally rotation invariant.
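The local polar representation and its rotation-invariant update can be sketched as follows. This is an illustrative reading, not the patent's exact formula: the azimuth/elevation convention (arctan2 for alpha, arcsin for beta) and subtraction of the centroid-direction angles as the update are assumptions consistent with the description above:

```python
import numpy as np

def knn_relative(points, i, K=16):
    """Euclidean K nearest neighbors of p_i, as relative coordinates."""
    diff = points - points[i]
    d = np.linalg.norm(diff, axis=1)
    nbr = np.argsort(d)[1:K + 1]          # skip the point itself
    return diff[nbr]

def polar_angles(rel):
    """Spherical-style polar representation of relative coordinates."""
    rho = np.linalg.norm(rel, axis=1)
    alpha = np.arctan2(rel[:, 1], rel[:, 0])                          # azimuth
    beta = np.arcsin(np.clip(rel[:, 2] / np.maximum(rho, 1e-12), -1.0, 1.0))
    return rho, alpha, beta

def rotation_invariant_angles(alpha, beta, rel):
    """Update neighbor angles relative to the local spatial direction
    (p_i toward the centroid of its K neighbors): rotating the whole
    neighborhood about the vertical axis leaves the updated azimuths
    unchanged."""
    c = rel.mean(axis=0)                                              # neighborhood centroid
    alpha_i = np.arctan2(c[1], c[0])
    beta_i = np.arcsin(np.clip(c[2] / max(np.linalg.norm(c), 1e-12), -1.0, 1.0))
    return alpha - alpha_i, beta - beta_i
```

Rotating all neighbors about the z-axis shifts every alpha and alpha_i by the same angle, so alpha - alpha_i (up to 2*pi wrap-around) and beta are unchanged, which is the local rotation invariance claimed above.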
Step 2312: and obtaining local spatial context characteristics by a neighborhood point characteristic self-adaptive fusion method based on double distances according to the polar coordinates, the geometric distances and the learned characteristics.
The Dual-Distance adaptive fusion (DDAP) method adaptively learns local spatial context features from neighborhood point features. Distance is an important measure of the correlation between points: the correlation increases as the distance decreases. The dual distance comprises the geometric distance in physical space and the feature distance in feature space.
As shown in fig. 4 and 5, step 2312 specifically includes:

Step B1: determining the feature distance and the geometric feature according to the polar coordinates and the learned features of the down-sampling points.

The geometric feature is obtained by connecting the polar coordinates and the absolute coordinates of the down-sampling point and then processing them with a shared-parameter Multi-Layer Perceptron (MLP).

Step B2: determining the weighted fusion parameters w_ik according to the following formula (the invention uses a shared-parameter MLP and softmax to adaptively learn the weighted fusion parameters of the neighborhood point features):

w_ik = softmax(MLP(d_ik))

where softmax() represents the normalized exponential function; MLP() represents a multi-layer perceptron function; d_ik represents the dual-distance feature, obtained by connecting (connection operator ⊕) the geometric distance d_g_ik of the neighbor point p_ik with its feature distance d_f_ik, the feature distance being determined from the learned features g_i and g_k; lambda is the weight adjusting the feature distance term; mean() represents the averaging function; and f_ik represents the feature of the neighbor point p_ik determined from the geometric feature and the learned feature. In this embodiment, lambda is 0.1.

Step B3: fusing each neighborhood point feature f_ik with the weighted fusion parameter w_ik to obtain the local spatial context feature f_iL:

f_iL = sum_{k=1}^{K} w_ik · f_ik

where · represents the dot product operator, i denotes the down-sampling point p_i, k denotes the neighbor point p_ik, k = 1, 2, ..., K, and K denotes the number of neighbor points.
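A numeric sketch of the dual-distance adaptive fusion under stated assumptions: the "MLP" is reduced to one linear map, and the feature-distance term is normalized by its mean and scaled by lambda, which is one plausible reading of the description; the exact combination in the patent's formula image is not reproduced here:

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def ddap_fuse(geo_dist, g_i, g_nbr, feat_nbr, w_mlp, lam=0.1):
    """Dual-distance adaptive fusion sketch.

    geo_dist: (K,) geometric distances of the K neighbors.
    g_i: (C,) learned feature of p_i; g_nbr: (K, C) neighbor features.
    feat_nbr: (K, F) neighbor features f_ik to be fused.
    w_mlp: (2, F) single linear layer standing in for the shared MLP.
    Returns f_iL of shape (F,): a per-channel convex combination of
    the neighbor features, weighted by a softmax over the K neighbors.
    """
    feat_dist = np.linalg.norm(g_nbr - g_i, axis=1)          # feature-space distance
    feat_dist = lam * feat_dist / max(feat_dist.mean(), 1e-12)
    dual = np.stack([geo_dist, feat_dist], axis=1)           # (K, 2) dual-distance feature
    w = softmax(dual @ w_mlp, axis=0)                        # (K, F) fusion weights
    return (w * feat_nbr).sum(axis=0)                        # weighted sum over neighbors
```

Because the weights are a softmax over neighbors, each output channel of f_iL lies between the minimum and maximum of that channel across the K neighbor features.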
As shown in fig. 6, in step 240, the global spatial context feature f_iG of the down-sampling point p_i is determined according to the following formula:

f_iG = (x_i, y_i, z_i) ⊕ r_i, with r_i = v_i / v_g

where ⊕ is the connection operator; (x_i, y_i, z_i) are the spatial coordinates of the down-sampling point p_i in a rectangular spatial coordinate system; r_i is a volume ratio; v_i is the volume of the minimum circumscribed sphere of the neighborhood of p_i; and v_g is the volume of the minimum circumscribed sphere of the point cloud to be identified.
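An illustrative computation of f_iG under an approximation: a true minimum enclosing sphere requires e.g. Welzl's algorithm, so the sketch below uses a centroid-centered bounding sphere instead, which only approximates v_i and v_g:

```python
import numpy as np

def bounding_sphere_volume(p):
    """Volume of a centroid-centered sphere enclosing the points --
    an upper-bound stand-in for the minimum circumscribed sphere."""
    c = p.mean(axis=0)
    r = np.linalg.norm(p - c, axis=1).max()
    return 4.0 / 3.0 * np.pi * r ** 3

def global_context_feature(points, i, nbr_idx):
    """f_iG: connect the coordinates of p_i with the volume ratio
    r_i = v_i / v_g of the neighborhood sphere to the whole-cloud sphere."""
    v_i = bounding_sphere_volume(points[nbr_idx])
    v_g = bounding_sphere_volume(points)
    return np.concatenate([points[i], [v_i / v_g]])
```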
The local spatial context feature effectively describes the context information between points within a neighborhood. To obtain a more discriminative spatial context feature, the invention also performs Global spatial Context Feature learning (GCF) to learn the global spatial context feature of each point.
Further, in spatial context feature learning, local spatial context feature learning is performed twice in succession to enlarge the local receptive field; the resulting local spatial context feature is added to a feature map learned directly from the point features with an MLP to obtain the local feature, and the local feature is connected with the global spatial context feature to obtain the final spatial context feature (as shown in fig. 7).
Further, in step 300, four feature decoding layers decode step by step to obtain the corresponding decoding features.
Specifically, step 300 includes:
step 310: and performing up-sampling on the down-sampling points corresponding to the cloud features of each point to obtain a plurality of up-sampling points and corresponding point cloud features.
In the present embodiment, the up-sampling process is performed using a nearest neighbor interpolation algorithm.
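A sketch of nearest-neighbor interpolation for this up-sampling step: each up-sampled point simply inherits the feature of its nearest down-sampled point (brute-force distances for clarity; a KD-tree would be used at scale):

```python
import numpy as np

def nn_upsample(coarse_pts, coarse_feats, fine_pts):
    # pairwise distances between fine (up-sampled) and coarse (down-sampled) points
    d = np.linalg.norm(fine_pts[:, None, :] - coarse_pts[None, :, :], axis=2)
    nearest = d.argmin(axis=1)     # index of each fine point's nearest coarse point
    return coarse_feats[nearest]   # copy that coarse point's feature
```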
Step 320: and determining corresponding decoding characteristics according to each pair of the up-sampling points and the point cloud characteristics based on the MLP of the shared parameters.
In step 400, the semantic segmentation prediction result of the 3D point cloud to be identified is determined from the decoding features through three fully connected layers and the semantic segmentation network model.
The large-scale point cloud semantic segmentation system first trains the semantic segmentation network: training uses a cross-entropy loss function, and the learnable network parameters are iteratively optimized with an Adam optimizer. In the present embodiment, the initial learning rate is set to 10^-2, and after each iteration the learning rate is reduced to 95% of its previous value. The trained model is then used to perform semantic segmentation on large-scale point clouds.
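The training setup can be sketched framework-agnostically: the cross-entropy below operates on per-point logits, and the schedule reproduces the "multiply by 0.95 each iteration" decay (a sketch under these assumptions, not the original implementation):

```python
import numpy as np

def cross_entropy(logits, labels):
    # logits: (N, c) per-point class scores; labels: (N,) integer class ids
    z = logits - logits.max(axis=1, keepdims=True)               # stability shift
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def learning_rates(initial_lr=1e-2, gamma=0.95, iters=3):
    # initial rate 10^-2, multiplied by 0.95 after each iteration
    lr, out = initial_lr, []
    for _ in range(iters):
        out.append(lr)
        lr *= gamma
    return out
```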
A fully connected layer for extracting point-by-point features changes the feature dimension from d to 8. Through the four feature encoding layers, the scale of the information involved in the computation is progressively reduced from the initial N×d; through the four feature decoding layers and three fully connected layers, the scale becomes N×c, where c is the number of semantic segmentation categories, yielding the N×c semantic segmentation prediction.
In addition, the invention also provides a large-scale point cloud semantic segmentation system which can improve the semantic segmentation precision.
As shown in fig. 8, the large-scale point cloud semantic segmentation system according to the present invention includes an extraction unit 1, an encoding unit 2, a decoding unit 3, and a prediction unit 4.
Specifically: the extraction unit 1 is used for extracting point-by-point features of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
the encoding unit 2 is used for gradually encoding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
the decoding unit 3 is used for gradually decoding each point cloud feature to obtain the corresponding decoding feature;
the prediction unit 4 is configured to determine a semantic segmentation prediction result of the to-be-identified 3D point cloud based on a semantic segmentation network model according to each decoding feature.
In addition, the invention also provides the following scheme:
a large-scale point cloud semantic segmentation system, comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
gradually coding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
gradually decoding each point cloud feature to obtain the corresponding decoding feature;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
Further, the invention also provides the following scheme:
a computer readable storage medium storing one or more programs that, when executed by an electronic device that includes a plurality of application programs, cause the electronic device to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
gradually coding each point-by-point feature based on the point cloud space information of each point to be identified to obtain a corresponding point cloud feature;
gradually decoding each point cloud feature to obtain the corresponding decoding feature;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
Compared with the prior art, the large-scale point cloud semantic segmentation system and the computer-readable storage medium have the same beneficial effects as the large-scale point cloud semantic segmentation method, and are not repeated herein.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.
Claims (7)
1. A large-scale point cloud semantic segmentation method is characterized by comprising the following steps:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified consists of a plurality of points to be identified;
based on the point cloud space information of each point to be identified, gradually encoding each point-by-point feature to obtain a corresponding point cloud feature:
performing point cloud down-sampling processing on each point to be identified to obtain a plurality of down-sampling points;
screening out the characteristics corresponding to the down-sampling points from the point-by-point characteristics, wherein the screened characteristics are learned characteristics;
aiming at each down-sampling point, determining corresponding local spatial features according to point cloud spatial information of the down-sampling point and corresponding learned features;
determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points;
determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics;
determining corresponding local spatial features according to the point cloud spatial information of the down-sampling points and the corresponding learned features, specifically comprising:
according to the point cloud space information of the down-sampling points and the corresponding learned characteristics, local space context characteristics are learned to obtain local space context characteristics;
obtaining a feature map based on a shared parameter multi-layer perceptron MLP according to the learned features;
obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map;
according to the point cloud spatial information of the down-sampling points and the corresponding learned characteristics, local spatial context characteristics are learned to obtain local spatial context characteristics, and the method specifically comprises the following steps:
determining the polar coordinates and the geometric distance of the down-sampling points according to the point cloud space information of the down-sampling points, wherein the polar coordinates of the down-sampling points represent the local space context information of the down-sampling points;
obtaining local spatial context characteristics by a neighborhood point characteristic self-adaptive fusion method based on double distances according to the polar coordinates, the geometric distances and the learned characteristics;
gradually decoding each point cloud feature to obtain the corresponding decoding feature;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
2. The large-scale point cloud semantic segmentation method according to claim 1, wherein the determining the polar coordinates of the downsampling points according to the point cloud space information of the downsampling points specifically comprises:
obtaining the initial polar coordinates of the down-sampling point according to the following formula
Wherein a K-nearest-neighbor (KNN) method based on Euclidean distance is used to obtain the K nearest neighbors of the down-sampled point p i , the K nearest neighbors comprising K neighboring points; the relative position coordinates of the k-th neighboring point of the down-sampled point p i are given in the rectangular spatial coordinate system; i denotes the down-sampled point p i , k denotes the neighboring point, k = 1, 2, …, K, and K denotes the number of neighboring points;
determining polar coordinate angles α i and β i according to the local spatial direction, the local spatial direction pointing from the down-sampled point p i to the centroid of its K-nearest-neighbor neighborhood;
updating the polar coordinates of the down-sampled point according to the following formula, wherein the updated polar representation has local rotational invariance:
3. the large-scale point cloud semantic segmentation method according to claim 1, wherein the local spatial context features are obtained by a double-distance-based neighborhood point feature adaptive fusion method according to polar coordinates, geometric distances and learned features, and specifically comprises:
determining a characteristic distance and a geometric characteristic according to the polar coordinates and the learned characteristics of the down-sampling points;
Wherein softmax() represents a normalized exponential function and MLP() represents a multi-layer perceptron function; the weighting parameter is computed from the concatenated feature and the dual-distance feature, joined by a concatenation operator; the dual distance comprises the geometric distance of the neighboring point and its average L1 feature distance, the latter determined from the learned features g i and g k ; λ is the weight that adjusts the feature-distance term; mean() represents the averaging function; the feature of each neighboring point is determined from its geometric features and its learned features;
fusing each neighborhood point feature with its weighting parameter to obtain the local spatial context feature f iL :
4. The large-scale point cloud semantic segmentation method according to claim 1, characterized in that the global spatial context feature f iG of the down-sampled point p i is determined according to the following formula:
Wherein a concatenation operator joins the terms; (x i , y i , z i ) are the spatial coordinates of the down-sampled point p i in the rectangular spatial coordinate system; r i is the volume ratio; v i is the volume of the minimum circumscribed sphere of the neighborhood of the down-sampled point p i ; and v g is the volume of the minimum circumscribed sphere of the point cloud to be identified.
5. A large-scale point cloud semantic segmentation system, comprising:
the device comprises an extraction unit, a recognition unit and a processing unit, wherein the extraction unit is used for extracting point-by-point characteristics of a point cloud to be recognized, and the point cloud to be recognized is composed of a plurality of points to be recognized;
the coding unit is used for gradually coding each point-by-point feature based on the point cloud space information of each point to be identified to obtain the corresponding point cloud feature:
performing point cloud down-sampling processing on each point to be identified to obtain a plurality of down-sampling points;
screening out the corresponding characteristics of the down-sampling points from the point-by-point characteristics, wherein the screened out characteristics are learned characteristics;
aiming at each down-sampling point, determining corresponding local spatial features according to point cloud spatial information of the down-sampling point and corresponding learned features;
determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points;
determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics;
the method for determining the corresponding local spatial features according to the point cloud spatial information of the down-sampling points and the corresponding learned features specifically comprises the following steps:
according to the point cloud space information of the down sampling points and the corresponding learned characteristics, local space context characteristics are learned to obtain local space context characteristics;
obtaining a feature map based on a shared parameter multi-layer perceptron MLP according to the learned features;
obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map;
according to the point cloud spatial information of the down-sampling points and the corresponding learned characteristics, local spatial context characteristics are learned to obtain local spatial context characteristics, and the method specifically comprises the following steps:
determining the polar coordinates and the geometric distance of the down-sampling points according to the point cloud space information of the down-sampling points, wherein the polar coordinates of the down-sampling points represent the local space context information of the down-sampling points;
obtaining local spatial context characteristics by a neighborhood point characteristic self-adaptive fusion method based on double distances according to the polar coordinates, the geometric distances and the learned characteristics;
the decoding unit is used for gradually decoding each point cloud feature to obtain the corresponding decoding feature;
and the prediction unit is used for determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
6. A large-scale point cloud semantic segmentation system comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified is composed of a plurality of points to be identified;
based on the point cloud space information of each point to be identified, gradually encoding each point-by-point feature to obtain a corresponding point cloud feature:
performing point cloud down-sampling processing on each point to be identified to obtain a plurality of down-sampling points;
screening out the corresponding characteristics of the down-sampling points from the point-by-point characteristics, wherein the screened out characteristics are learned characteristics;
aiming at each down-sampling point, determining corresponding local spatial characteristics according to the point cloud spatial information of the down-sampling point and the corresponding learned characteristics;
determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points;
determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics;
determining corresponding local spatial features according to the point cloud spatial information of the down-sampling points and the corresponding learned features, specifically comprising:
according to the point cloud space information of the down sampling points and the corresponding learned characteristics, local space context characteristics are learned to obtain local space context characteristics;
obtaining a feature map based on a shared parameter multi-layer perceptron MLP according to the learned features;
obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map;
the local spatial context feature learning is carried out according to the point cloud spatial information of the down-sampling points and the corresponding learned features to obtain the local spatial context features, and the method specifically comprises the following steps:
determining the polar coordinates and the geometric distance of the down-sampling points according to the point cloud space information of the down-sampling points, wherein the polar coordinates of the down-sampling points represent the local space context information of the down-sampling points;
obtaining local spatial context characteristics by a neighborhood point characteristic self-adaptive fusion method based on double distances according to the polar coordinates, the geometric distances and the learned characteristics;
gradually decoding each point cloud feature to obtain the corresponding decoding feature;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
7. A computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:
extracting point-by-point characteristics of a point cloud to be identified, wherein the point cloud to be identified consists of a plurality of points to be identified;
based on the point cloud space information of each point to be identified, gradually encoding each point-by-point feature to obtain a corresponding point cloud feature:
performing point cloud down-sampling processing on each point to be identified to obtain a plurality of down-sampling points;
screening out the characteristics corresponding to the down-sampling points from the point-by-point characteristics, wherein the screened characteristics are learned characteristics;
aiming at each down-sampling point, determining corresponding local spatial characteristics according to the point cloud spatial information of the down-sampling point and the corresponding learned characteristics;
determining corresponding global spatial context characteristics according to the point cloud spatial information of the down-sampling points;
determining corresponding spatial context characteristics according to the local spatial characteristics and the global spatial context characteristics of the down-sampling points, wherein the spatial context characteristics of the down-sampling points are point cloud characteristics;
determining corresponding local spatial features according to the point cloud spatial information of the down-sampling points and the corresponding learned features, specifically comprising:
according to the point cloud space information of the down-sampling points and the corresponding learned characteristics, local space context characteristics are learned to obtain local space context characteristics;
obtaining a feature map based on the shared parameter multi-layer perceptron MLP according to the learned features;
obtaining the local spatial features of the down-sampling points according to the local spatial context features and the feature map;
the local spatial context feature learning is carried out according to the point cloud spatial information of the down-sampling points and the corresponding learned features to obtain the local spatial context features, and the method specifically comprises the following steps:
determining the polar coordinates and the geometric distance of the down-sampling points according to the point cloud spatial information of the down-sampling points, wherein the polar coordinates of the down-sampling points represent the local spatial context information of the down-sampling points;
obtaining local spatial context characteristics by a neighborhood point characteristic self-adaptive fusion method based on double distances according to the polar coordinates, the geometric distances and the learned characteristics;
gradually decoding each point cloud feature to obtain the corresponding decoding feature;
and determining a semantic segmentation prediction result of the 3D point cloud to be recognized based on a semantic segmentation network model according to each decoding characteristic.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110309423.1A CN113011430B (en) | 2021-03-23 | 2021-03-23 | Large-scale point cloud semantic segmentation method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110309423.1A CN113011430B (en) | 2021-03-23 | 2021-03-23 | Large-scale point cloud semantic segmentation method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113011430A CN113011430A (en) | 2021-06-22 |
CN113011430B true CN113011430B (en) | 2023-01-20 |
Family
ID=76405543
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110309423.1A Active CN113011430B (en) | 2021-03-23 | 2021-03-23 | Large-scale point cloud semantic segmentation method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113011430B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113989504A (en) * | 2021-10-28 | 2022-01-28 | 广东工业大学 | Semantic segmentation method for three-dimensional point cloud data |
CN114782684B (en) * | 2022-03-08 | 2023-04-07 | 中国科学院半导体研究所 | Point cloud semantic segmentation method and device, electronic equipment and storage medium |
CN114898094B (en) * | 2022-04-22 | 2024-07-12 | 湖南大学 | Point cloud upsampling method and device, computer equipment and storage medium |
CN115169556B (en) * | 2022-07-25 | 2023-08-04 | 美的集团(上海)有限公司 | Model pruning method and device |
WO2024113078A1 (en) * | 2022-11-28 | 2024-06-06 | 中国科学院深圳先进技术研究院 | Local context feature extraction module for semantic segmentation in 3d point cloud scenario |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111489358A (en) * | 2020-03-18 | 2020-08-04 | 华中科技大学 | Three-dimensional point cloud semantic segmentation method based on deep learning |
CN111860138A (en) * | 2020-06-09 | 2020-10-30 | 中南民族大学 | Three-dimensional point cloud semantic segmentation method and system based on full-fusion network |
CN112396137A (en) * | 2020-12-14 | 2021-02-23 | 南京信息工程大学 | Point cloud semantic segmentation method fusing context semantics |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108229479B (en) * | 2017-08-01 | 2019-12-31 | 北京市商汤科技开发有限公司 | Training method and device of semantic segmentation model, electronic equipment and storage medium |
US11004202B2 (en) * | 2017-10-09 | 2021-05-11 | The Board Of Trustees Of The Leland Stanford Junior University | Systems and methods for semantic segmentation of 3D point clouds |
CN112149677A (en) * | 2020-09-14 | 2020-12-29 | 上海眼控科技股份有限公司 | Point cloud semantic segmentation method, device and equipment |
Non-Patent Citations (1)
Title |
---|
SCF-Net: Learning Spatial Contextual Features for Large-Scale Point Cloud Segmentation; Siqi Fan et al.; 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2021-11-02; full text *
Also Published As
Publication number | Publication date |
---|---|
CN113011430A (en) | 2021-06-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113011430B (en) | Large-scale point cloud semantic segmentation method and system | |
CN111798475B (en) | Indoor environment 3D semantic map construction method based on point cloud deep learning | |
CN109685152B (en) | Image target detection method based on DC-SPP-YOLO | |
CN111191566B (en) | Optical remote sensing image multi-target detection method based on pixel classification | |
CN108734210B (en) | Object detection method based on cross-modal multi-scale feature fusion | |
CN111862289B (en) | Point cloud up-sampling method based on GAN network | |
CN110738697A (en) | Monocular depth estimation method based on deep learning | |
CN113139470B (en) | Glass identification method based on Transformer | |
CN111667535B (en) | Six-degree-of-freedom pose estimation method for occlusion scene | |
CN113191387A (en) | Cultural relic fragment point cloud classification method combining unsupervised learning and data self-enhancement | |
CN112819080B (en) | High-precision universal three-dimensional point cloud identification method | |
CN113420590B (en) | Robot positioning method, device, equipment and medium in weak texture environment | |
CN111368759A (en) | Monocular vision-based semantic map construction system for mobile robot | |
CN109242019A (en) | A kind of water surface optics Small object quickly detects and tracking | |
CN114723764A (en) | Parameterized edge curve extraction method for point cloud object | |
CN116563682A (en) | Attention scheme and strip convolution semantic line detection method based on depth Hough network | |
CN117522990B (en) | Category-level pose estimation method based on multi-head attention mechanism and iterative refinement | |
CN116721206A (en) | Real-time indoor scene vision synchronous positioning and mapping method | |
CN117078956A (en) | Point cloud classification segmentation network based on point cloud multi-scale parallel feature extraction and attention mechanism | |
CN117351198A (en) | Point cloud semantic segmentation method based on dynamic convolution | |
CN115944868A (en) | Control method for ship-borne fire water monitor | |
CN115235505A (en) | Visual odometer method based on nonlinear optimization | |
CN117036966B (en) | Learning method, device, equipment and storage medium for point feature in map | |
CN116704464A (en) | Three-dimensional target detection method, system and storage medium based on auxiliary task learning network | |
CN118238832B (en) | Intelligent driving method and device based on visual perception |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||