WO2022088676A1 - Three-dimensional point cloud semantic segmentation method, apparatus, device and medium - Google Patents
- Publication number
- WO2022088676A1 (PCT/CN2021/097548)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- point cloud, data, feature vector, semantic category, deep learning
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Definitions
- the present application relates to the technical field of artificial intelligence, and in particular, to a method, apparatus, device and medium for semantic segmentation of three-dimensional point clouds.
- 3D point cloud semantic segmentation technologies include: voxel-based deep learning segmentation, multi-view-based deep learning segmentation, and point-based deep learning segmentation.
- In the multi-view approach, the network's input of multi-view images is limited, and a fixed number of views may not fully represent the 3D model, resulting in the loss of information about the target structure (for example, through self-occlusion of objects).
- The two-dimensional images themselves also lose accuracy, so they cannot be used for semantic segmentation of point clouds with complex and fine structures.
- The point-based deep learning segmentation technology is a deep learning method that directly takes point cloud data as input for processing; however, existing methods of this type remain limited, which makes them difficult to apply to the semantic segmentation of complex target object point clouds.
- The main purpose of this application is to provide a 3D point cloud semantic segmentation method, apparatus, device and medium, aiming to solve the technical problem that prior-art 3D point cloud semantic segmentation technology is difficult to apply to the point cloud semantic segmentation of complex target objects.
- The present application proposes a 3D point cloud semantic segmentation method. The method includes: obtaining 3D point cloud data to be predicted; using preset space cells to perform point cloud division and quantitative discrimination on the 3D point cloud data to be predicted to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model for probability prediction of semantic categories to obtain the point cloud semantic category probability prediction value of the target point cloud data, where the point cloud semantic category prediction model is a model trained based on the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction value of the target point cloud data.
- The present application also proposes a 3D point cloud semantic segmentation apparatus. The apparatus includes: a point cloud acquisition module for acquiring the 3D point cloud data to be predicted; a point cloud segmentation processing module for using preset space cells to perform point cloud division and quantitative discrimination on the 3D point cloud data to be predicted to obtain target point cloud data; a probability prediction module for inputting the target point cloud data into a point cloud semantic category prediction model to perform probability prediction of semantic categories and obtain the point cloud semantic category probability prediction value of the target point cloud data, where the point cloud semantic category prediction model is a model trained based on the PointSIFT neural network module and the PointNet++ neural network; and a semantic category determination module for determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction value of the target point cloud data.
- The present application also proposes a computer device comprising a memory and a processor. The memory stores a computer program, and the processor implements the following method steps when executing the computer program: acquiring 3D point cloud data to be predicted; using preset space cells to perform point cloud division and quantitative discrimination on the 3D point cloud data to be predicted to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model to perform probability prediction of semantic categories and obtain the point cloud semantic category probability prediction value of the target point cloud data, where the point cloud semantic category prediction model is a model trained based on the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction value.
- the present application also proposes a computer-readable storage medium on which a computer program is stored.
- When the computer program is executed by a processor, the following method steps are implemented: acquiring three-dimensional point cloud data to be predicted; using preset space cells to perform point cloud division and quantitative discrimination on the three-dimensional point cloud data to be predicted to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model to perform probability prediction of semantic categories and obtain the point cloud semantic category probability prediction value of the target point cloud data, where the point cloud semantic category prediction model is a model trained based on the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction value.
- The three-dimensional point cloud semantic segmentation method, apparatus, device and medium of the present application obtain target point cloud data by using preset space cells to perform point cloud division and quantitative discrimination on the three-dimensional point cloud data to be predicted, thereby realizing fast and accurate logical division of the point cloud of a complex, large-scale target object and ensuring a good representation of the target object, which improves the recognition accuracy of point cloud semantic segmentation.
- The target point cloud data is input into the point cloud semantic category prediction model for probability prediction of semantic categories; this model is trained based on the PointSIFT neural network module and the PointNet++ neural network. Because the PointNet++ neural network extends the PointNet feature extraction block with a hierarchical structure for processing local features, it achieves better segmentation results, so the point cloud semantic category prediction model can better handle the fine features of complex target objects. Furthermore, because the scale perception of the PointSIFT neural network module can select the most representative shape scale, and its direction encoding can comprehensively perceive point cloud information in different directions, the accuracy of the model's semantic category prediction is improved.
- FIG. 1 is a schematic flowchart of a method for semantic segmentation of a 3D point cloud according to an embodiment of the present application.
- FIG. 2 is a schematic structural block diagram of a 3D point cloud semantic segmentation apparatus according to an embodiment of the present application.
- FIG. 3 is a schematic structural block diagram of a computer device according to an embodiment of the present application.
- Semantic segmentation in this application is classification at the pixel level: pixels belonging to the same category are all assigned to one class, so semantic segmentation understands images from the pixel level. For example, in a photo, the pixels belonging to people are assigned to one class, the pixels belonging to motorcycles to another, along with the background pixels. Note that semantic segmentation differs from instance segmentation: if there are many people in a photo, semantic segmentation only needs to assign all person pixels to one class, while instance segmentation must additionally assign the pixels of different people to different classes. That is to say, instance segmentation goes a step further than semantic segmentation.
- The PointNet of this application is essentially a network structure that takes point cloud data as input according to certain rules and obtains classification or segmentation results through layer-by-layer calculations.
- Its special feature is the existence of two transformation matrices (input transform & feature transform). According to the original text, these two transformation matrices can maintain the spatial invariance of the point cloud data during the deep learning process.
- the PointNet++ of this application is an improvement on PointNet, considering the extraction of local features of point clouds, so as to better classify and segment point clouds.
- The RGB color mode of this application is an industry color standard in which colors are obtained by varying and superimposing the three color channels of red (R), green (G), and blue (B).
- RGB denotes the three channels of red, green, and blue.
- This standard covers almost all colors perceivable by human vision and is one of the most widely used color systems.
- the point cloud of the present application is a collection of point data on the appearance surface of the product obtained by measuring instruments in reverse engineering.
- The number of points obtained with a three-dimensional coordinate measuring machine is relatively small and the spacing between points relatively large; this is called a sparse point cloud. A point cloud obtained with a 3D laser scanner or camera-based scanner has a relatively large, dense number of points; this is called a dense point cloud.
- the present application proposes a 3D point cloud semantic segmentation method, which is applied to the field of artificial intelligence technology.
- the method is further applied to the field of artificial intelligence neural network technology.
- The method first uses spatial cells to perform point cloud division and quantitative discrimination on the three-dimensional point cloud data to be predicted so as to ensure a good representation of the target object, and then uses a model trained based on the PointSIFT neural network module and the PointNet++ neural network to perform probabilistic prediction of semantic categories, improving the recognition accuracy of point cloud segmentation.
- the three-dimensional point cloud semantic segmentation method includes:
- S1 Obtain the three-dimensional point cloud data to be predicted;
- S2 Use preset space cells to perform point cloud division and quantitative discrimination on the to-be-predicted 3D point cloud data to obtain target point cloud data;
- S3 Input the target point cloud data into a point cloud semantic category prediction model for probability prediction of semantic categories, and obtain a point cloud semantic category probability prediction value of the target point cloud data, where the point cloud semantic category prediction model is based on Models trained by PointSIFT neural network module and PointNet++ neural network;
- S4 Determine the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction value of the target point cloud data.
- The target point cloud data is obtained by dividing the 3D point cloud data to be predicted with preset space cells and performing quantitative discrimination, thereby realizing fast and accurate logical division of the point cloud of a complex, large-scale target object and ensuring a good representation of the target object, which improves the recognition accuracy of point cloud semantic segmentation. The target point cloud data is input into the point cloud semantic category prediction model for probability prediction of semantic categories; the model is trained based on the PointSIFT neural network module and the PointNet++ neural network. Because the PointNet++ neural network extends the PointNet feature extraction block with a hierarchical structure for processing local features, it achieves better segmentation results, so the point cloud semantic category prediction model can better handle the fine features of complex target objects. And because the scale perception of the PointSIFT neural network module can select the most representative shape scale, while its direction encoding can comprehensively perceive point cloud information in different directions, the accuracy of semantic category prediction by the model is improved.
- the 3D point cloud data to be predicted can be obtained from the database.
- the three-dimensional point cloud data to be predicted refers to a set of point data obtained from the appearance surface of the target object.
- Methods for extracting point data sets from the appearance surface of the target object include, but are not limited to, three-dimensional camera shooting and radar scanning.
- the three-dimensional point cloud data to be predicted includes: point description data of multiple points.
- the point description data includes: three-dimensional coordinates of the point.
- the three-dimensional coordinate of a point is the coordinate data of the point in the three-dimensional coordinate system, which is expressed as (x, y, z).
- the point description data further includes: the color value of the point.
- the color value of the point can be expressed in the RGB color mode.
- The step of acquiring the three-dimensional point cloud data to be predicted includes: S11: acquiring all three-dimensional point cloud data of the target object; S12: randomly selecting a point from all the three-dimensional point cloud data of the target object as a selection point; S13: extracting, from all the three-dimensional point cloud data of the target object, the three-dimensional point cloud data within a preset range centered on the selected point, and using the extracted three-dimensional point cloud data as the three-dimensional point cloud data to be predicted.
- For S11: all point cloud data of the target object is obtained from the database.
- For S12: a point is randomly selected as the selection point from the point cloud corresponding to all the three-dimensional point cloud data of the target object.
- For S13: the selected point and the points within the preset range around it in the point cloud corresponding to all the three-dimensional point cloud data of the target object are used as the target point cloud, and the point description data corresponding to the target point cloud is used as the three-dimensional point cloud data to be predicted.
- a value corresponding to 1% of the point cloud volume corresponding to all three-dimensional point cloud data of the target object is used as the preset range.
- Point cloud volume refers to the volume of the smallest straight parallelepiped that can accommodate all point clouds.
- Straight parallelepipeds include: cuboids and cubes.
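As a minimal sketch (assuming the "smallest straight parallelepiped" is an axis-aligned bounding cuboid, which the text does not state explicitly), the point cloud volume and the 1% preset range described above could be computed as:

```python
def point_cloud_volume(points):
    """Volume of the smallest axis-aligned cuboid enclosing all points.

    `points` is an iterable of (x, y, z) tuples; a degenerate (flat)
    point cloud yields volume 0.
    """
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs)) * (max(ys) - min(ys)) * (max(zs) - min(zs))

# The preset range is then taken as 1% of this volume.
preset_range = 0.01 * point_cloud_volume([(0, 0, 0), (1, 2, 3), (1, 0, 2)])
```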
- Using preset space cells to perform point cloud division on the three-dimensional point cloud data to be predicted means dividing the points in the corresponding point cloud into preset space cells, where each point belongs to only one preset space cell. The points in each preset space cell are then quantitatively discriminated, and when the discrimination meets the requirements, the preset space cell is regarded as an effective space cell. Finally, points are selected from each effective space cell, and the point description data corresponding to the selected points is used as the target point cloud data of that effective space cell; that is, each effective space cell corresponds to one set of target point cloud data.
- the target point cloud data includes point description data of a plurality of points (that is, the three-dimensional coordinates of the points).
- the point description data of the target point cloud data includes: the three-dimensional coordinates of the point and the color value of the point, thereby helping to improve the accuracy of probabilistic prediction of the semantic category of the target point cloud data.
- each point in the point cloud corresponding to the target point cloud data includes a plurality of semantic category probability prediction values.
- the specific number of the plurality of semantic category probability prediction values is the same as the number of semantic categories.
- Semantic category is a classification of points determined according to the role of the target object and/or the application scene.
- the semantic categories include but are not limited to: bottom segment structure, side segment structure, deck segment structure, and bulkhead structure, which are not specifically limited by examples herein.
- the model to be trained is obtained according to the PointSIFT neural network module and the PointNet++ neural network, the training sample is used to train the to-be-trained model, and the trained to-be-trained model is used as a point cloud semantic category prediction model.
- the target semantic category of the point is determined.
- The above step of using preset space cells to perform point cloud division and quantitative discrimination on the to-be-predicted 3D point cloud data to obtain target point cloud data includes:
- S21 Divide the to-be-predicted 3D point cloud data into a plurality of to-be-processed space cells using the preset space cells;
- S22 Calculate the total volume of the plurality of space cells to be processed to obtain the total volume of the space cells;
- S23 Perform volume calculation on the point cloud in each to-be-processed space cell to obtain the point cloud volume of the to-be-processed space cell;
- S24 Divide the point cloud volume of each to-be-processed space cell by the total volume of the space cells to obtain the point cloud volume ratio of each to-be-processed space cell;
- S25 Determine whether the point cloud volume ratio of each to-be-processed space cell is greater than a preset ratio threshold;
- S26 If so, take the to-be-processed space cell as a valid space cell; otherwise, discard it;
- S27 Select points from the valid space cells to obtain the target point cloud data.
- the volume of each space cell to be processed is calculated, and the volumes of all space cells to be processed are added to obtain the total volume of space cells.
- volume calculation is performed on the point cloud in each of the plurality of spatial cells to be processed.
- the smallest straight parallelepiped that can accommodate all the points in the space cell to be processed is found, the volume of the found straight parallelepiped is calculated, and the calculated volume is used as the point cloud volume of the space cell to be processed.
- For each of the space cells to be processed, the point cloud volume of the space cell is divided by the total volume of the space cells to obtain its point cloud volume ratio; that is, each to-be-processed space cell corresponds to one point cloud volume ratio.
- the preset scale threshold is a scale value.
- If the point cloud volume ratio of a to-be-processed space cell is greater than the preset ratio threshold, the space cell is regarded as an effective space cell, which helps ensure a good representation of the target object.
- Otherwise, the to-be-processed space cell is discarded.
- a preset number of points are selected from the point cloud of the effective space unit, and the point description data (that is, the three-dimensional coordinates of the points) corresponding to the selected points are used as the target point cloud data.
- the preset number is 8192.
- the preset number is 16384, thereby realizing point cloud increment.
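The discrimination-and-selection flow of S22-S27 can be sketched as follows. This is a simplified illustration, not the patent's exact procedure: the axis-aligned bounding-box volume, equal-sized cells, and sampling with replacement (for cells holding fewer than `n_points` points) are all assumptions.

```python
import random

def bbox_volume(points):
    """Point cloud volume: smallest axis-aligned cuboid enclosing the points."""
    xs, ys, zs = zip(*points)
    return (max(xs) - min(xs)) * (max(ys) - min(ys)) * (max(zs) - min(zs))

def select_target_clouds(cells, cell_volume, ratio_threshold, n_points=8192):
    """S22-S27: discard cells whose point cloud volume ratio is too small,
    then randomly sample a fixed number of points from each valid cell.

    `cells` maps a cell index to the list of (x, y, z) points it contains;
    all cells are assumed to share the same volume `cell_volume`.
    """
    total_volume = cell_volume * len(cells)          # S22: total cell volume
    targets = []
    for points in cells.values():
        ratio = bbox_volume(points) / total_volume   # S23-S24: volume ratio
        if ratio > ratio_threshold:                  # S25-S26: valid cell?
            sampled = [random.choice(points) for _ in range(n_points)]  # S27
            targets.append(sampled)
    return targets
```

With a threshold of 0.1, a cell whose point cloud spans most of its cell is kept, while a cell containing only a tiny cluster of points is discarded.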
- the above-mentioned steps of selecting points from the effective space cells to obtain the target point cloud data include:
- S271 randomly select points according to a preset number on the point cloud in the effective space unit to obtain point cloud data to be processed;
- S272 Perform center point calculation on the to-be-processed point cloud data to obtain center point coordinate data
- S273 subtract the coordinate data of the center point from the coordinate data of each point in the point cloud data to be processed to obtain the coordinate difference value of each point in the point cloud data to be processed;
- S274 Perform standard deviation calculation according to the coordinate data of all points of the point cloud data to be processed and the coordinate data of the center point, to obtain the point cloud standard deviation of the point cloud data to be processed;
- S275 Divide the coordinate difference of each point in the point cloud data to be processed by the standard deviation of the point cloud to obtain the target point cloud data.
- This embodiment realizes the normalization operation of the point cloud data to be processed, which is beneficial to improve the accuracy of semantic recognition.
- center point calculation is performed according to the three-dimensional coordinates of all point description data in the point cloud data to be processed to obtain center point coordinate data, that is, the center point coordinate data is coordinate data in a three-dimensional coordinate system.
- each coordinate difference includes an x-difference, a y-difference, and a z-difference at the same time.
- the number of coordinate differences can be one or more.
- The standard deviation calculation is performed per axis: the x-axis coordinates of all points of the point cloud data to be processed and the x-axis coordinate of the center point coordinate data yield the x standard deviation; the y-axis coordinates of all points and the y-axis coordinate of the center point yield the y standard deviation; and the z-axis coordinates of all points and the z-axis coordinate of the center point yield the z standard deviation. The x standard deviation, the y standard deviation, and the z standard deviation together are used as the point cloud standard deviation; that is to say, the point cloud standard deviation simultaneously includes an x standard deviation, a y standard deviation, and a z standard deviation.
- Dividing the coordinate difference of each point by the point cloud standard deviation yields a target x value, a target y value, and a target z value, which are used as the three-dimensional coordinates of that point's point description data. That is, the target point cloud data includes the point description data of multiple points, and the three-dimensional coordinates of each point's description data simultaneously include a target x value, a target y value, and a target z value.
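Steps S272-S275 amount to a standard per-axis normalization; a minimal sketch (using NumPy, which the patent does not specify):

```python
import numpy as np

def normalize_cell(points):
    """S272-S275: center the cell's points and divide by the per-axis std.

    `points` is an (N, 3) array-like of (x, y, z) coordinates; returns the
    normalized target point cloud data.
    """
    pts = np.asarray(points, dtype=float)
    center = pts.mean(axis=0)   # S272: center point coordinate data
    diff = pts - center         # S273: coordinate difference of each point
    std = pts.std(axis=0)       # S274: x, y and z standard deviations
    return diff / std           # S275: target point cloud data
```

After normalization, each axis of the cell's point cloud has zero mean and unit standard deviation, which is the benefit for semantic recognition noted above.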
- Before the step of inputting the target point cloud data into a point cloud semantic category prediction model to perform probability prediction of semantic categories and obtaining the point cloud semantic category probability prediction value of the target point cloud data, the method further includes:
- S031 Obtain training samples, where the training samples include: point cloud sample data and point cloud semantic category calibration data;
- S032 Input the point cloud sample data of the training sample into the model to be trained to perform probability prediction of semantic categories, and obtain the sample semantic category probability prediction data of the training sample, wherein the model to be trained is based on the PointSIFT neural network module and the model determined by the PointNet++ neural network training;
- S033 Train the model to be trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and use the to-be-trained model after training as the point cloud semantic category prediction model.
- This embodiment determines the model to be trained from the PointSIFT neural network module and the PointNet++ neural network, and obtains the point cloud semantic category prediction model by training the model to be trained; the PointNet++ neural network is an extension of the PointNet feature extraction block.
- Each training sample includes a point cloud sample data and a point cloud semantic category calibration data.
- the point cloud sample data includes point description data of multiple points (that is, the three-dimensional coordinates of the points), and the point cloud semantic category calibration data includes the semantic category calibration values of the multiple points. It can be understood that each point in the point cloud sample data corresponds to a semantic category calibration value in the point cloud semantic category calibration data.
- The semantic category calibration value can be expressed as a vector. For example, with 5 semantic categories, if the semantic category calibration value corresponding to point A of the point cloud sample data is [0, 1, 0, 0, 0], this indicates that the second semantic category is the professionals' calibration result for that point's semantic category.
- the semantic category calibration value is the result of semantic category calibration performed by professionals on the point cloud sample data according to the point description data of the point.
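The calibration value described above is a one-hot vector; a trivial sketch (the function name is illustrative, not from the patent):

```python
def calibration_vector(category_index, num_categories=5):
    """One-hot semantic category calibration value.

    With 5 categories, index 1 (the second category) yields [0, 1, 0, 0, 0],
    matching the example for point A above.
    """
    vector = [0] * num_categories
    vector[category_index] = 1
    return vector
```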
- the point cloud sample data of all the training samples are sequentially input into the model to be trained to perform probability prediction of semantic categories, so as to obtain sample semantic category probability prediction data of a plurality of the training samples. That is, each training sample corresponds to a sample semantic category probability prediction data.
- the model to be trained is determined according to the PointSIFT neural network module and the point set abstraction module and the feature propagation module of the PointNet++ neural network.
- the PointSIFT neural network module is used for orientation encoding and scale perception.
- the point set abstraction module is used for downsampling, the feature propagation module is used for upsampling, and the downsampling and upsampling processes are aligned.
- PointSIFT neural network modules are interspersed between adjacent point set abstraction modules and feature propagation modules. After upsampling, the model to be trained obtains the sample semantic category probability prediction data through a fully connected layer.
- SA refers to the Set Abstraction (point set abstraction) module; its specific method can be selected from the prior art and is not repeated here.
- FP refers to the Feature Propagation module; its specific method can likewise be selected from the prior art and is not repeated here.
- The model to be trained sequentially includes: a multi-layer perceptron, a first deep learning module, a first downsampling layer, a second deep learning module, a second downsampling layer, a third deep learning module, a third downsampling layer, a fourth deep learning module, a first upsampling layer, a fifth deep learning module, a second upsampling layer, a sixth deep learning module, a third upsampling layer, a seventh deep learning module, a discarding (dropout) layer, and a fully connected layer.
- The first through seventh deep learning modules adopt the PointSIFT neural network module.
- The first, second, and third downsampling layers adopt the point set abstraction module of the PointNet++ neural network.
- The first, second, and third upsampling layers adopt the feature propagation module of the PointNet++ neural network.
- The step of inputting the point cloud sample data of the training sample into the model to be trained for probability prediction of semantic categories and obtaining the sample semantic category probability prediction data of the training sample includes:
- S03201: Input the point cloud sample data of the training sample into the multi-layer perceptron for feature extraction to obtain a first feature vector;
- S03202: Input the first feature vector into the first deep learning module for direction encoding and scale perception to obtain a second feature vector;
- S03203: Input the second feature vector into the first downsampling layer for downsampling to obtain a third feature vector;
- S03204: Input the third feature vector into the second deep learning module for direction encoding and scale perception to obtain a fourth feature vector;
- S03205: Input the fourth feature vector into the second downsampling layer for downsampling to obtain a fifth feature vector;
- S03206: Input the fifth feature vector into the third deep learning module for direction encoding and scale perception to obtain a sixth feature vector;
- S03207: Input the sixth feature vector into the third downsampling layer for downsampling to obtain a seventh feature vector;
- S03208: Input the seventh feature vector into the fourth deep learning module for direction encoding and scale perception to obtain an eighth feature vector;
- S03209: Input the eighth feature vector into the first upsampling layer for upsampling to obtain a ninth feature vector;
- S03210: Input the ninth feature vector into the fifth deep learning module for direction encoding and scale perception to obtain a tenth feature vector;
- S03211: Input the tenth feature vector into the second upsampling layer for upsampling to obtain an eleventh feature vector;
- S03212: Input the eleventh feature vector into the sixth deep learning module for direction encoding and scale perception to obtain a twelfth feature vector;
- S03213: Input the twelfth feature vector into the third upsampling layer for upsampling to obtain a thirteenth feature vector;
- S03214: Input the thirteenth feature vector into the seventh deep learning module for direction encoding and scale perception to obtain a fourteenth feature vector;
- S03215: Input the fourteenth feature vector into the discarding layer for random discarding to obtain a fifteenth feature vector;
- S03216: Input the fifteenth feature vector into the fully connected layer to obtain the sample semantic category probability prediction data of the training sample.
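The step sequence S03201–S03216 can be sketched as a shape-only walk-through (pure bookkeeping, no actual network computation; the sizes follow the 16384-point example of this embodiment, and c = 5 categories is an assumed value):

```python
# Shape-level sketch of the forward pass S03201-S03216. Each "layer" only
# tracks (num_points, feature_dim); the real MLP/PointSIFT/SA/FP math is
# omitted. Sizes follow the 16384-point example of this embodiment.

def mlp(shape):                       return (shape[0], 64)  # feature extraction
def pointsift(shape, dim):            return (shape[0], dim) # direction encoding + scale perception
def set_abstraction(shape, pts, dim): return (pts, dim)      # downsampling (SA)
def feature_prop(shape, pts, dim):    return (pts, dim)      # upsampling (FP)

x = (16384, 3)                      # point cloud sample data
x = mlp(x)                          # S03201: first feature vector, 16384x64
x = pointsift(x, 64)                # S03202: second feature vector
x = set_abstraction(x, 2048, 128)   # S03203: third feature vector, 2048x128
x = pointsift(x, 128)               # S03204: fourth feature vector
x = set_abstraction(x, 256, 256)    # S03205: fifth feature vector, 256x256
x = pointsift(x, 256)               # S03206: sixth feature vector
x = set_abstraction(x, 64, 512)     # S03207: seventh feature vector, 64x512
x = pointsift(x, 512)               # S03208: eighth feature vector
x = feature_prop(x, 256, 256)       # S03209: ninth feature vector
x = pointsift(x, 256)               # S03210: tenth feature vector
x = feature_prop(x, 2048, 128)      # S03211: eleventh feature vector
x = pointsift(x, 128)               # S03212: twelfth feature vector
x = feature_prop(x, 16384, 64)      # S03213: thirteenth feature vector
x = pointsift(x, 64)                # S03214: fourteenth feature vector
# S03215: dropout keeps the shape; S03216: fully connected -> c categories
c = 5
output = (x[0], c)
print(output)  # (16384, 5)
```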
- The three point set abstraction modules are used for downsampling and the three feature propagation modules for upsampling; the hierarchical structure thus added processes local features and yields good segmentation results, so the point cloud semantic category prediction model can better handle the fine features of complex target objects. Moreover, because the scale perception of the seven PointSIFT neural network modules can select the most representative shape scale, and the orientation encoding of the PointSIFT neural network modules interspersed between adjacent point set abstraction and feature propagation modules can comprehensively perceive point cloud information in different directions, the accuracy of semantic category prediction is improved.
- The input layer converts the input data into three-channel feature vectors. For example, the point description data of 16384 points (that is, their three-dimensional coordinates) are converted into a 16384×3 feature matrix, where 16384 is the number of rows (the number of points) and 3 is the number of columns (the feature dimension: the x-axis, y-axis, and z-axis coordinates of each point). This example is not limiting.
- In this example, the feature vector sizes (rows = number of points, columns = feature dimension) are:
- the point cloud sample data of the training sample: 16384×3;
- the first feature vector: 16384×64;
- the third feature vector: 2048×128;
- the fifth feature vector: 256×256;
- the seventh feature vector: 64×512;
- the sample semantic category probability prediction data of the training sample: 16384×c, where c is the number of semantic categories. This example is not limiting.
- The input point description data of the low-dimensional point cloud is mapped into point-by-point high-dimensional feature vectors while maintaining symmetry (permutation) invariance.
- Denote the point cloud sample data as the set {x_1, x_2, …, x_N}, where D represents the feature dimension measured at each point and the N points are distributed non-uniformly in the discrete metric space.
- The symmetric function g is realized by max pooling; that is, for each of the D feature dimensions, the largest feature value among the N points is selected.
- The multi-layer perceptron MLP serves as the h function for feature extraction; the resulting set of per-point functions is input into the max pooling function (the symmetric function g) in a high-dimensional space, and the network γ further digests the pooled point cloud information to obtain the attributes of the point cloud set. The formula is:

  f({x_1, …, x_N}) ≈ γ( MAX_{i=1,…,N} { h(x_i) } )

- Both the γ() and h() functions belong to the network structure of the multi-layer perceptron MLP.
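The symmetric max-pooling function g described above can be sketched as follows (pure Python; the per-point feature vectors stand in for the outputs of h):

```python
def symmetric_max(features):
    """g: dimension-wise max over N per-point D-dimensional features.
    The result is invariant to input ordering (point cloud disorder)."""
    d = len(features[0])
    return [max(f[i] for f in features) for i in range(d)]

# Three points, each already mapped by some h to a 4-dimensional feature.
feats = [[0.1, 0.9, 0.2, 0.4],
         [0.7, 0.3, 0.8, 0.0],
         [0.5, 0.5, 0.5, 0.5]]
print(symmetric_max(feats))                  # [0.7, 0.9, 0.8, 0.5]
print(symmetric_max(list(reversed(feats))))  # same result: order-invariant
```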
- the SIFT feature descriptor considers two basic features of morphological expression: one is direction encoding, which assigns directions to each point after obtaining the matching feature point positions; the other is scale perception, It can select the most suitable size for feature extraction based on the data input to the PointSIFT neural network module.
- PointSIFT is a neural network module, which can realize self-optimization according to the pre-training process.
- the basic module of PointSIFT is the orientation encoding unit, the Orientation-encoding unit, or OE unit for short, which can perform convolution in 8 directions and extract features.
- The three-dimensional space is divided into eight subspaces (octants) centered at the point P_n, each subspace corresponding to a different direction.
- Within each subspace, the nearest neighbor point K_n of P_n within the search radius is found, and its feature characterizes that subspace; thus the number of neighbors K_n is 8, one per subspace. If a subspace contains no target point within the search radius, it is represented by the feature vector Q_n instead.
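One common way to assign neighbors to the eight subspaces is by the signs of the coordinate offsets from P_n. A minimal sketch (the octant numbering scheme is an assumption; search-radius handling and feature lookup are omitted):

```python
def octant_index(p_n, q):
    """Map a neighbor q of center p_n to one of 8 subspaces (octants)
    by the sign of each coordinate offset: 3 sign bits -> index 0..7."""
    return sum((1 << axis) for axis in range(3) if q[axis] >= p_n[axis])

center = (0.0, 0.0, 0.0)
print(octant_index(center, ( 1.0,  1.0,  1.0)))  # 7: +x, +y, +z
print(octant_index(center, (-1.0, -1.0, -1.0)))  # 0: -x, -y, -z
print(octant_index(center, ( 1.0, -1.0,  1.0)))  # 5: +x, -y, +z
```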
- A third-order directional convolution is then performed along the x-axis, y-axis, and z-axis in turn. The feature codes of the searched neighbors K_n are gathered into a tensor N, whose first three dimensions of R^{a×b×c} correspond to the x-axis, y-axis, and z-axis. The third-order directional convolution formulas are:

  N_1 = g[Conv_x(A_x, N)] ∈ R^{2×2×1×d}
  N_2 = g[Conv_y(A_y, N_1)] ∈ R^{2×1×1×d}
  N_3 = g[Conv_z(A_z, N_2)] ∈ R^{1×1×1×d}

- where A_x, A_y, and A_z are parameters of the model to be trained that are updated during training.
- After the third-order directional convolution, each point P_n is transformed into a d-dimensional vector that contains the shape information of the neighborhood around P_n. By stacking multiple orientation-encoding units, the orientation-encoding units of different convolutional layers can perceive the scale information in each direction; shortcuts (direct connections) then connect the orientation-encoding units of the previous layers to extract the final scale-invariant feature information, thereby addressing the disorder and invariance of point clouds. Shortcuts include add (element-wise addition) and concat (vector concatenation).
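The two shortcut variants named above (add and concat) can be sketched on plain feature vectors:

```python
def shortcut_add(prev, cur):
    """add: element-wise sum; requires equal feature dimensions."""
    return [a + b for a, b in zip(prev, cur)]

def shortcut_concat(prev, cur):
    """concat: vector concatenation; feature dimensions are summed."""
    return prev + cur

prev_layer = [1.0, 2.0]   # feature from an earlier orientation-encoding unit
cur_layer  = [0.5, 0.5]   # feature from the current layer
print(shortcut_add(prev_layer, cur_layer))     # [1.5, 2.5]
print(shortcut_concat(prev_layer, cur_layer))  # [1.0, 2.0, 0.5, 0.5]
```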
- the model to be trained is trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and the to-be-trained model after training is used as the point cloud semantic category prediction model steps, including:
- S0331 Input the sample semantic category probability prediction data and the point cloud semantic category calibration data into a loss function for calculation to obtain a loss value of the model to be trained, and update the parameters of the model to be trained according to the loss value, The updated model to be trained is used for the next calculation of the sample semantic category probability prediction data;
- the loss function adopts a cross entropy function.
- This embodiment realizes the training of the to-be-trained model.
- the first convergence condition means that the magnitude of the loss value calculated twice adjacently satisfies the Lipschitz condition (the Lipschitz continuity condition).
- the number of iterations refers to the number of times that the model to be trained is used to calculate the probability prediction data of the semantic category of the sample, that is, the number of iterations increases by 1 for one calculation.
- The second convergence condition is a preset number of iterations. The cross-entropy loss function is:

  loss = -Σ_i y_i · log(ŷ_i)

- where y_i represents the i-th component of the one-hot vector converted from the point cloud semantic category calibration data, and ŷ_i represents the i-th component of the sample semantic category probability prediction data of the training sample.
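The cross-entropy calculation for a single point can be sketched as follows (pure Python; the 5-category probabilities are hypothetical values):

```python
import math

def cross_entropy(y_true_one_hot, y_pred_probs):
    """loss = -sum_i y_i * log(y_hat_i); with a one-hot target only the
    calibrated category's predicted probability contributes."""
    return -sum(y * math.log(p)
                for y, p in zip(y_true_one_hot, y_pred_probs) if y > 0)

y_true = [0, 1, 0, 0, 0]             # point calibrated as the second category
y_pred = [0.1, 0.6, 0.1, 0.1, 0.1]   # model's probability prediction
loss = cross_entropy(y_true, y_pred)
print(round(loss, 4))  # -log(0.6) ~= 0.5108
```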
- the above-mentioned step of determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction value of the target point cloud data includes:
- S41: Extract the semantic category probability prediction values of the same point from the point cloud semantic category probability prediction values to obtain the target semantic category probability prediction values;
- S42: Find the maximum among the target semantic category probability prediction values, and take the semantic category corresponding to that maximum as the target semantic category of the point to which the target semantic category probability prediction values belong.
- the target semantic category of each point is determined according to the predicted value of the semantic category probability of the point cloud.
- the maximum value is found from the probability prediction values of all target semantic categories corresponding to the same point, and the semantic category corresponding to the found maximum value is used as the target semantic category of the point.
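Steps S41–S42 above can be sketched as follows (the 3-point, 4-category example is hypothetical):

```python
def target_semantic_category(point_probs):
    """S42: the index of the maximum probability is the target semantic category."""
    return max(range(len(point_probs)), key=lambda i: point_probs[i])

# Semantic category probability predictions for 3 points, c = 4 categories
# (S41 has already grouped the values per point).
predictions = [[0.10, 0.70, 0.15, 0.05],
               [0.80, 0.05, 0.05, 0.10],
               [0.25, 0.25, 0.40, 0.10]]
categories = [target_semantic_category(p) for p in predictions]
print(categories)  # [1, 0, 2]
```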
- the present application also proposes a 3D point cloud semantic segmentation device, the device includes:
- the point cloud acquisition module 100 is used to acquire the three-dimensional point cloud data to be predicted;
- the point cloud segmentation processing module 200 is configured to perform point cloud division and quantitative discrimination on the to-be-predicted 3D point cloud data by using a preset space cell to obtain target point cloud data;
- The probability prediction module 300 is configured to input the target point cloud data into a point cloud semantic category prediction model for probability prediction of semantic categories to obtain the point cloud semantic category probability prediction value of the target point cloud data, the point cloud semantic category prediction model being a model trained based on the PointSIFT neural network module and the PointNet++ neural network;
- the semantic category determination module 400 is configured to determine the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction value of the target point cloud data.
- The target point cloud data is obtained by performing point cloud division and quantitative discrimination on the 3D point cloud data to be predicted with the preset space cells, realizing fast and accurate logical division of the point cloud of complex, large-scale target objects and ensuring a good representation of the target object, thereby improving the recognition accuracy of point cloud semantic segmentation. The target point cloud data is then input into the point cloud semantic category prediction model for probability prediction of semantic categories; this model is trained based on the PointSIFT neural network module and the PointNet++ neural network. Because the PointNet++ neural network extends the PointNet feature extraction block with a hierarchical structure for processing local features and achieves better segmentation results, the point cloud semantic category prediction model can better handle the fine features of complex target objects; and because the scale perception of the PointSIFT neural network module can select the most representative shape scale while its direction encoding comprehensively perceives point cloud information in different directions, the accuracy of semantic category prediction by the point cloud semantic category prediction model is improved.
- The point cloud segmentation processing module 200 includes a point cloud division sub-module, a quantization discrimination sub-module, and a point selection sub-module. The point cloud division sub-module discretely divides the three-dimensional point cloud data to be predicted with the preset space cells to obtain a plurality of space cells to be processed. The quantitative discrimination sub-module calculates the total volume of the space cells to be processed, calculates the volume of the point cloud in each space cell to be processed to obtain that cell's point cloud volume, divides each cell's point cloud volume by the total volume of the space cells to obtain the point cloud volume ratio of each space cell to be processed, and judges whether each ratio is greater than the preset ratio threshold; a space cell to be processed whose point cloud volume ratio is greater than the preset ratio threshold is taken as an effective space cell. The point selection sub-module selects points from the effective space cells to obtain the target point cloud data.
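The volume-ratio discrimination above can be sketched as follows (a minimal illustration; per-cell point cloud volumes are taken as given since no particular volume-estimation method is fixed here, and the cell names and numbers are hypothetical):

```python
def effective_cells(cell_point_cloud_volumes, total_cell_volume, ratio_threshold):
    """Keep the space cells whose point cloud volume ratio (the cell's point
    cloud volume / total volume of all space cells) exceeds the threshold."""
    return [cell for cell, vol in cell_point_cloud_volumes.items()
            if vol / total_cell_volume > ratio_threshold]

# Hypothetical point cloud volumes of four space cells to be processed,
# whose total cell volume is 40.0.
volumes = {"cell_0": 9.0, "cell_1": 0.2, "cell_2": 10.0, "cell_3": 0.8}
kept = effective_cells(volumes, total_cell_volume=40.0, ratio_threshold=0.01)
print(kept)  # ['cell_0', 'cell_2', 'cell_3']
```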
- The point selection sub-module includes a to-be-processed point cloud determination unit and a normalization processing unit. The to-be-processed point cloud determination unit randomly selects a preset number of points from the point cloud in the effective space cells to obtain the point cloud data to be processed. The normalization processing unit performs center point calculation on the point cloud data to be processed to obtain the center point coordinate data, subtracts the center point coordinate data from the coordinate data of each point to obtain the coordinate difference value of each point, performs a standard deviation calculation on the coordinate data relative to the center point coordinate data to obtain the point cloud standard deviation, and divides the coordinate difference value of each point by the point cloud standard deviation to obtain the target point cloud data.
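The centering and scaling steps above can be sketched as follows (pure Python; computing a single standard deviation over all coordinate offsets from the center is one plausible reading of the step):

```python
import math

def normalize_point_cloud(points):
    """Center the points on their mean (center point) and divide by the
    point cloud standard deviation, computed over all coordinate offsets."""
    n = len(points)
    center = [sum(p[d] for p in points) / n for d in range(3)]
    diffs = [[p[d] - center[d] for d in range(3)] for p in points]
    std = math.sqrt(sum(v * v for diff in diffs for v in diff) / (3 * n))
    return [[v / std for v in diff] for diff in diffs]

points = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0]]
normalized = normalize_point_cloud(points)
# The normalized cloud is centered at the origin:
print([round(sum(p[d] for p in normalized), 6) for d in range(3)])  # [0.0, 0.0, 0.0]
```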
- The apparatus further includes a model training module comprising a sample acquisition sub-module and a training sub-module. The sample acquisition sub-module acquires a plurality of training samples, each including point cloud sample data and point cloud semantic category calibration data. The training sub-module inputs the point cloud sample data of the training sample into the model to be trained for probability prediction of semantic categories to obtain the sample semantic category probability prediction data of the training sample, where the model to be trained is determined according to the PointSIFT neural network module and the PointNet++ neural network; it then trains the model to be trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and uses the trained model as the point cloud semantic category prediction model.
- The model to be trained sequentially includes: a multi-layer perceptron, a first deep learning module, a first downsampling layer, a second deep learning module, a second downsampling layer, a third deep learning module, a third downsampling layer, a fourth deep learning module, a first upsampling layer, a fifth deep learning module, a second upsampling layer, a sixth deep learning module, a third upsampling layer, a seventh deep learning module, a discarding (dropout) layer, and a fully connected layer.
- The first through seventh deep learning modules adopt the PointSIFT neural network module.
- The first, second, and third downsampling layers adopt the point set abstraction module of the PointNet++ neural network.
- The first, second, and third upsampling layers adopt the feature propagation module of the PointNet++ neural network.
- The training sub-module includes a sample prediction unit. The sample prediction unit inputs the point cloud sample data of the training sample into the multi-layer perceptron for feature extraction to obtain a first feature vector; inputs the first feature vector into the first deep learning module for direction encoding and scale perception to obtain a second feature vector; inputs the second feature vector into the first downsampling layer to obtain a third feature vector; inputs the third feature vector into the second deep learning module for direction encoding and scale perception to obtain a fourth feature vector; inputs the fourth feature vector into the second downsampling layer to obtain a fifth feature vector; inputs the fifth feature vector into the third deep learning module for direction encoding and scale perception to obtain a sixth feature vector; inputs the sixth feature vector into the third downsampling layer to obtain a seventh feature vector; inputs the seventh feature vector into the fourth deep learning module for direction encoding and scale perception to obtain an eighth feature vector; inputs the eighth feature vector into the first upsampling layer to obtain a ninth feature vector; inputs the ninth feature vector into the fifth deep learning module for direction encoding and scale perception to obtain a tenth feature vector; inputs the tenth feature vector into the second upsampling layer to obtain an eleventh feature vector; inputs the eleventh feature vector into the sixth deep learning module for direction encoding and scale perception to obtain a twelfth feature vector; inputs the twelfth feature vector into the third upsampling layer to obtain a thirteenth feature vector; inputs the thirteenth feature vector into the seventh deep learning module for direction encoding and scale perception to obtain a fourteenth feature vector; inputs the fourteenth feature vector into the discarding layer for random discarding to obtain a fifteenth feature vector; and inputs the fifteenth feature vector into the fully connected layer to obtain the sample semantic category probability prediction data of the training sample.
- The training sub-module includes a training unit. The training unit inputs the sample semantic category probability prediction data and the point cloud semantic category calibration data into a loss function for calculation to obtain the loss value of the model to be trained, updates the parameters of the model to be trained according to the loss value, uses the updated model for the next calculation of the sample semantic category probability prediction data, and repeats the above steps until the loss value reaches the first convergence condition or the number of iterations reaches the second convergence condition; the model to be trained that satisfies either condition is determined as the point cloud semantic category prediction model. The loss function adopts a cross-entropy function.
- The semantic category determination module 400 includes a target semantic category probability prediction value extraction sub-module and a target semantic category determination sub-module. The extraction sub-module extracts the semantic category probability prediction values of the same point from the point cloud semantic category probability prediction values to obtain the target semantic category probability prediction values; the determination sub-module finds the maximum among the target semantic category probability prediction values and takes the semantic category corresponding to that maximum as the target semantic category of the point.
- an embodiment of the present application further provides a computer device.
- the computer device may be a server, and its internal structure may be as shown in FIG. 3 .
- The computer device includes a processor, memory, a network interface, and a database connected by a system bus, where the processor of the computer device provides computing and control capabilities.
- the memory of the computer device includes a non-volatile storage medium, an internal memory.
- the nonvolatile storage medium stores an operating system, a computer program, and a database.
- The internal memory provides an environment for the execution of the operating system and the computer program in the non-volatile storage medium.
- The database of the computer device is used for storing data involved in the three-dimensional point cloud semantic segmentation method.
- the network interface of the computer device is used to communicate with an external terminal through a network connection.
- the computer program implements a three-dimensional point cloud semantic segmentation method when executed by the processor.
- The three-dimensional point cloud semantic segmentation method includes: obtaining the three-dimensional point cloud data to be predicted; performing point cloud division and quantitative discrimination on the three-dimensional point cloud data to be predicted with preset space cells to obtain target point cloud data;
- inputting the target point cloud data into the point cloud semantic category prediction model for probability prediction of semantic categories to obtain the point cloud semantic category probability prediction value of the target point cloud data, the point cloud semantic category prediction model being a model trained based on the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction value.
- The target point cloud data is obtained by performing point cloud division and quantitative discrimination on the 3D point cloud data to be predicted with the preset space cells, realizing fast and accurate logical division of the point cloud of complex, large-scale target objects and ensuring a good representation of the target object, thereby improving the recognition accuracy of point cloud semantic segmentation.
- The target point cloud data is input into the point cloud semantic category prediction model for probability prediction of semantic categories; the model is trained based on the PointSIFT neural network module and the PointNet++ neural network. Because the PointNet++ neural network extends the PointNet feature extraction block with a hierarchical structure for processing local features and achieves better segmentation results, the point cloud semantic category prediction model can better handle the fine features of complex target objects; and because the scale perception of the PointSIFT neural network module selects the most representative shape scale while its direction encoding comprehensively perceives point cloud information in different directions, the accuracy of semantic category prediction by the point cloud semantic category prediction model is improved.
- An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements a method for semantic segmentation of a 3D point cloud, including the steps of: acquiring 3D point cloud data to be predicted; Use preset space cells to perform point cloud division and quantitative discrimination on the three-dimensional point cloud data to be predicted, and obtain target point cloud data; input the target point cloud data into the point cloud semantic category prediction model for probability prediction of semantic categories , obtain the point cloud semantic category probability prediction value of the target point cloud data, and the point cloud semantic category prediction model is a model obtained based on PointSIFT neural network module and PointNet++ neural network training; according to the point cloud of the target point cloud data The semantic category probability prediction value determines the target semantic category of each point in the target point cloud data.
- The 3D point cloud semantic segmentation method implemented above obtains the target point cloud data by performing point cloud division and quantitative discrimination on the 3D point cloud data to be predicted with the preset space cells, realizing fast and accurate logical division of the point cloud of complex, large-scale target objects and ensuring a good representation of the target object, thereby improving the recognition accuracy of point cloud semantic segmentation. The target point cloud data is input into the point cloud semantic category prediction model for probability prediction of semantic categories; the model is trained based on the PointSIFT neural network module and the PointNet++ neural network. Because the PointNet++ neural network extends the PointNet feature extraction block with a hierarchical structure for processing local features and achieves good segmentation results, the point cloud semantic category prediction model can better handle the fine features of complex target objects; and because the scale perception of the PointSIFT neural network module selects the most representative shape scale while its direction encoding comprehensively perceives point cloud information in different directions, the accuracy of semantic category prediction by the point cloud semantic category prediction model is improved.
- the computer-readable storage medium may be non-volatile or volatile.
- Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
- Volatile memory may include random access memory (RAM) or external cache memory.
- RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
Abstract
This application relates to the field of artificial intelligence and discloses a three-dimensional point cloud semantic segmentation method, apparatus, device, and medium. The method includes: performing point cloud division and quantitative discrimination on the three-dimensional point cloud data to be predicted with preset space cells to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model for probability prediction of semantic categories to obtain the point cloud semantic category probability prediction value of the target point cloud data, the point cloud semantic category prediction model being a model trained based on the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction value. This realizes fast and accurate logical division of the point cloud of complex, large-scale target objects, improves the recognition accuracy of point cloud segmentation, handles the fine features of complex target objects well, and improves the accuracy of semantic category prediction.
Description
本申请要求于2020年10月29日提交中国专利局、申请号为2020111821784,发明名称为“三维点云语义分割方法、装置、设备及介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
本申请涉及到人工智能技术领域,特别是涉及到一种三维点云语义分割方法、装置、设备及介质。
In recent years, with the rise of point-cloud-based intelligent applications such as autonomous driving, medical diagnosis, and augmented and mixed reality, research on and application of deep learning for three-dimensional point cloud semantic segmentation has become particularly urgent and important. Existing three-dimensional point cloud semantic segmentation techniques include deep learning segmentation based on voxels, deep learning segmentation based on multi-view images, and deep learning segmentation based on raw point clouds.
The inventors realized that voxel-based deep learning segmentation usually requires a relatively high resolution to keep the target information complete when voxel data represents an object, which incurs excessive computational cost when the spatial complexity is high; lowering the resolution to preserve efficiency, in turn, sacrifices accuracy, so neural networks tend to perform poorly on densely structured target objects, making this technique hard to apply to point cloud semantic segmentation of complex objects.
With multi-view deep learning segmentation, the network accepts only a limited number of view images, and a fixed number of views may not fully represent a three-dimensional model, losing information about the target structure such as self-occlusion of the object; two-dimensional images themselves also lose accuracy, so point cloud semantic segmentation cannot be applied to complex, fine structures.
Point-cloud-based deep learning segmentation studies deep learning methods that process point cloud data input directly, and improves on the sparsity problem of the three-dimensional point cloud data to be predicted, but it still has not departed from extracting features in the manner of two-dimensional images, making it difficult to apply to semantic segmentation of the point clouds of complex target objects.
The present application is directed to solving the technical problem that existing three-dimensional point cloud semantic segmentation techniques are difficult to apply to the point cloud semantic segmentation of complex target objects.
The main purpose of the present application is to provide a three-dimensional point cloud semantic segmentation method, apparatus, device and medium, aiming to solve the technical problem that existing three-dimensional point cloud semantic segmentation techniques are difficult to apply to the point cloud semantic segmentation of complex target objects.
To achieve the above object, the present application proposes a three-dimensional point cloud semantic segmentation method, the method including: obtaining three-dimensional point cloud data to be predicted; partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the point cloud semantic category prediction model being a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data.
The present application further proposes a three-dimensional point cloud semantic segmentation apparatus, the apparatus including: a point cloud acquisition module for obtaining three-dimensional point cloud data to be predicted; a point cloud segmentation processing module for partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data; a probability prediction module for inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the point cloud semantic category prediction model being a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network; and a semantic category determination module for determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data.
The present application further proposes a computer device including a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the following method steps: obtaining three-dimensional point cloud data to be predicted; partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the point cloud semantic category prediction model being a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data.
The present application further proposes a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following method steps: obtaining three-dimensional point cloud data to be predicted; partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the point cloud semantic category prediction model being a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data.
With the three-dimensional point cloud semantic segmentation method, apparatus, device and medium of the present application, partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain the target point cloud data achieves fast and precise logical partitioning of the point cloud of a complex, large-scale target object and ensures a good representation of the target object, thereby improving the recognition accuracy of point cloud semantic segmentation. The target point cloud data is input into a point cloud semantic category prediction model, trained on the basis of the PointSIFT neural network module and the PointNet++ neural network, for semantic category probability prediction. Because the PointNet++ neural network extends the PointNet feature extraction block with a hierarchical structure for processing local features and achieves good segmentation results, the point cloud semantic category prediction model handles the fine features of complex target objects well; and because the scale awareness of the PointSIFT neural network module can select the most representative shape scale while its orientation encoding can comprehensively perceive point cloud information in different directions, the accuracy of the model's semantic category prediction is improved.
FIG. 1 is a schematic flowchart of a three-dimensional point cloud semantic segmentation method according to an embodiment of the present application;
FIG. 2 is a schematic structural block diagram of a three-dimensional point cloud semantic segmentation apparatus according to an embodiment of the present application;
FIG. 3 is a schematic structural block diagram of a computer device according to an embodiment of the present application.
The realization of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings in conjunction with embodiments.
The technical terms used in the present application are explained as follows:
Semantic segmentation in the present application is classification at the pixel level: pixels belonging to the same class are all grouped into one class, so semantic segmentation understands an image from the pixel level. For example, in a photo, pixels belonging to people are grouped into one class, pixels belonging to a motorcycle into another class, and background pixels into yet another class. Note that semantic segmentation differs from instance segmentation: if a photo contains several people, semantic segmentation only needs to group the pixels of all people into one class, whereas instance segmentation further assigns the pixels of different people to different classes; in other words, instance segmentation goes one step further than semantic segmentation.
PointNet in the present application is essentially a network structure: point cloud data is input according to certain rules and, through layer-by-layer computation, a classification or segmentation result is produced. Its distinctive feature is the presence of two transformation matrices (input transform & feature transform) which, according to the original paper, maintain the spatial invariance of the point cloud data during deep learning.
PointNet++ in the present application improves on PointNet by considering local feature extraction of the point cloud, enabling better point cloud classification and segmentation.
The RGB color model in the present application is an industry color standard that produces a wide range of colors by varying the red (R), green (G) and blue (B) channels and superimposing them on one another; RGB stands for the colors of the red, green and blue channels. This standard covers almost all colors perceptible to human vision and is one of the most widely used color systems.
A point cloud in the present application is the set of point data on the outer surface of a product obtained with measuring instruments in reverse engineering. The point clouds obtained with three-dimensional coordinate measuring machines usually have relatively few points with large spacing between them and are called sparse point clouds; those obtained with three-dimensional laser scanners or photographic scanners have many, densely spaced points and are called dense point clouds.
To solve the technical problem that existing three-dimensional point cloud semantic segmentation techniques are difficult to apply to the point cloud semantic segmentation of complex target objects, the present application proposes a three-dimensional point cloud semantic segmentation method, applied in the field of artificial intelligence and, further, in the field of neural networks. The method first partitions the three-dimensional point cloud data to be predicted with spatial cells and performs quantitative discrimination, ensuring a good representation of the target object, and then performs semantic category probability prediction with a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network, improving the recognition accuracy of point cloud segmentation.
Referring to FIG. 1, the three-dimensional point cloud semantic segmentation method includes:
S1: obtaining three-dimensional point cloud data to be predicted;
S2: partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data;
S3: inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the point cloud semantic category prediction model being a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network;
S4: determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data.
In this embodiment, partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain the target point cloud data achieves fast and precise logical partitioning of the point cloud of a complex, large-scale target object and ensures a good representation of the target object, thereby improving the recognition accuracy of point cloud semantic segmentation. The target point cloud data is input into a point cloud semantic category prediction model, trained on the basis of the PointSIFT neural network module and the PointNet++ neural network, for semantic category probability prediction. Because the PointNet++ neural network extends the PointNet feature extraction block with a hierarchical structure for processing local features and achieves good segmentation results, the model handles the fine features of complex target objects well; and because the scale awareness of the PointSIFT neural network module can select the most representative shape scale while its orientation encoding can comprehensively perceive point cloud information in different directions, the accuracy of the model's semantic category prediction is improved.
For S1, the three-dimensional point cloud data to be predicted can be obtained from a database. The three-dimensional point cloud data to be predicted refers to the set of point data collected from the outer surface of a target object; methods for collecting such point data include, but are not limited to, three-dimensional camera capture and radar scanning. The data includes the point description data of multiple points. Point description data includes the point's three-dimensional coordinates, i.e., the point's coordinate data (x, y, z) in a three-dimensional coordinate system.
Preferably, the point description data further includes the point's color value, which may be expressed in the RGB color model.
Preferably, obtaining the three-dimensional point cloud data to be predicted includes: S11: obtaining all three-dimensional point cloud data of the target object; S12: randomly selecting one point from all the three-dimensional point cloud data of the target object as the selected point; S13: extracting, from all the three-dimensional point cloud data of the target object, the three-dimensional point cloud data within a preset range centered on the selected point, and taking the extracted data as the three-dimensional point cloud data to be predicted.
For S11, all point cloud data of the target object is obtained from a database. For S12, one point is randomly selected from the point cloud corresponding to all the three-dimensional point cloud data of the target object as the selected point. For S13, the selected point and the points within the preset range around it are taken as the target point cloud, and the point description data of the target point cloud is taken as the three-dimensional point cloud data to be predicted.
Preferably, the value corresponding to 1% of the point cloud volume of all the three-dimensional point cloud data of the target object is used as the preset range.
The point cloud volume is the volume of the smallest right parallelepiped that can contain the entire point cloud; right parallelepipeds include cuboids and cubes.
For S2, the three-dimensional point cloud data to be predicted is partitioned with preset spatial cells, i.e., the points of the corresponding point cloud are assigned to preset spatial cells, each point belonging to exactly one preset spatial cell. Quantitative discrimination is then performed on the points in each preset spatial cell, and a cell that meets the requirement is taken as a valid spatial cell. Finally, points are selected from each valid spatial cell, and the point description data of the selected points is taken as the target point cloud data of that valid spatial cell; in other words, each valid spatial cell corresponds to one set of target point cloud data.
The target point cloud data includes the point description data (i.e., the three-dimensional coordinates) of multiple points.
Preferably, the point description data of the target point cloud data includes the point's three-dimensional coordinates and the point's color value, which helps improve the accuracy of the semantic category probability prediction on the target point cloud data.
For S3, all point description data in the point cloud corresponding to the target point cloud data is input into the point cloud semantic category prediction model for semantic category probability prediction, yielding the semantic category probability prediction values of each point in the point cloud; the semantic category probability prediction values of all points are taken as the point cloud semantic category probability prediction values of the target point cloud data.
It should be understood that each point in the point cloud corresponding to the target point cloud data has multiple semantic category probability prediction values, whose number equals the number of semantic categories.
A semantic category is a classification of points determined according to the function and/or application scenario of the target object. For example, when the target object is a ship, semantic categories include, but are not limited to, bottom section structure, side-shell section structure, deck section structure and bulkhead structure; this example is not limiting.
A model to be trained is obtained from the PointSIFT neural network module and the PointNet++ neural network, trained with training samples, and the trained model is taken as the point cloud semantic category prediction model.
For S4, the target semantic category of each point in the point cloud corresponding to the target point cloud data is determined according to that point's point cloud semantic category probability prediction values.
In one embodiment, partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain the target point cloud data includes:
S21: discretely partitioning the three-dimensional point cloud data to be predicted with the preset spatial cells to obtain a plurality of spatial cells to be processed;
S22: computing the total volume of the plurality of spatial cells to be processed to obtain a total spatial cell volume;
S23: computing the volume of the point cloud in each spatial cell to be processed to obtain the point cloud volume of the cell;
S24: dividing the point cloud volume of each spatial cell to be processed by the total spatial cell volume to obtain the point cloud volume ratios of the plurality of cells;
S25: judging whether the point cloud volume ratio of each spatial cell to be processed is greater than a preset ratio threshold;
S26: when the point cloud volume ratio of a spatial cell to be processed is greater than the preset ratio threshold, taking the cell corresponding to that ratio as a valid spatial cell;
S27: selecting points from the valid spatial cells to obtain the target point cloud data.
For S21, the smallest right parallelepiped that can contain the point cloud corresponding to the three-dimensional point cloud data to be predicted is found and divided, using the dimensions of the preset spatial cell, into a plurality of spatial cells to be processed, thereby assigning the points of the point cloud to the cells. Adjacent cells do not overlap, and each point of the point cloud is assigned to exactly one cell. The dimensions of the preset spatial cell include length, width and height.
For S22, the volume of each spatial cell to be processed is computed, and the volumes of all cells are summed to obtain the total spatial cell volume.
For S23, the volume of the point cloud in each of the spatial cells to be processed is computed: the smallest right parallelepiped that can contain all points in the cell is found, its volume is computed, and the computed volume is taken as the point cloud volume of the cell.
For S24, the point cloud volume of each spatial cell to be processed is divided in turn by the total spatial cell volume to obtain the point cloud volume ratios of the cells; in other words, each cell corresponds to one point cloud volume ratio.
For S25, the preset ratio threshold is a ratio value.
For S26, taking a spatial cell to be processed as a valid spatial cell when its point cloud volume ratio is greater than the preset ratio threshold helps ensure a good representation of the target object.
Preferably, when the point cloud volume ratio of a spatial cell to be processed is less than or equal to the preset ratio threshold, the cell corresponding to that ratio is discarded.
For S27, a preset number of points is selected from the point cloud of each valid spatial cell, and the point description data (i.e., the three-dimensional coordinates) of the selected points is taken as the target point cloud data.
Preferably, the preset number is 8192.
Preferably, the preset number is 16384, thereby achieving point cloud augmentation.
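As an illustration of steps S21 through S26 above, a minimal NumPy sketch of the cell partition and volume-ratio discrimination might look as follows. This is not part of the patent; the function name, parameters, and the bounding-box volume computation are assumptions made for illustration only.

```python
import numpy as np

def select_valid_cells(points, cell_size, ratio_threshold):
    """Sketch of S21-S26: assign each point to exactly one non-overlapping
    cell, then keep cells whose point-cloud bounding-box volume exceeds
    ratio_threshold of the total volume of all occupied cells."""
    origin = points.min(axis=0)
    idx = np.floor((points - origin) / cell_size).astype(int)   # S21: one cell per point
    cells = {}
    for key, p in zip(map(tuple, idx), points):
        cells.setdefault(key, []).append(p)
    total = len(cells) * float(np.prod(cell_size))              # S22: total cell volume
    valid = {}
    for key, pts in cells.items():
        pts = np.asarray(pts)
        extent = pts.max(axis=0) - pts.min(axis=0)              # S23: bounding box of the cell's points
        if float(np.prod(extent)) / total > ratio_threshold:    # S24-S26: keep cells above threshold
            valid[key] = pts
    return valid
```

A cell containing a single point has a degenerate (zero-volume) bounding box, so it is discarded, matching the preferred behavior of dropping cells at or below the threshold.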
In one embodiment, selecting points from the valid spatial cells to obtain the target point cloud data includes:
S271: randomly selecting a preset number of points from the point cloud in the valid spatial cell to obtain point cloud data to be processed;
S272: computing the center point of the point cloud data to be processed to obtain center point coordinate data;
S273: subtracting the center point coordinate data from the coordinate data of each point in the point cloud data to be processed to obtain the coordinate difference of each point;
S274: computing the standard deviation from the coordinate data of all points of the point cloud data to be processed and the center point coordinate data to obtain the point cloud standard deviation of the point cloud data to be processed;
S275: dividing the coordinate difference of each point in the point cloud data to be processed by the point cloud standard deviation to obtain the target point cloud data.
This embodiment normalizes the point cloud data to be processed, which helps improve the accuracy of semantic recognition.
For S271, a preset number of points is randomly selected from the point cloud in the valid spatial cell, and the point description data (i.e., the three-dimensional coordinates) of the selected points is taken as the point cloud data to be processed; in other words, the number of point description data entries in the point cloud data to be processed equals the preset number.
For S272, the center point is computed from the three-dimensional coordinates of all point description data in the point cloud data to be processed, yielding the center point coordinate data; that is, the center point coordinate data is coordinate data in the three-dimensional coordinate system.
For S273, the x-axis coordinate of the center point is subtracted from the x-axis coordinate of each point in the point cloud data to be processed to obtain an x difference; likewise for the y-axis and z-axis coordinates to obtain a y difference and a z difference. The x, y and z differences together constitute the coordinate difference; that is, each coordinate difference contains one x difference, one y difference and one z difference, and there may be one or more coordinate differences.
For S274, the standard deviation is computed from the x-axis coordinates of all points and the x-axis coordinate of the center point to obtain an x standard deviation; likewise for the y-axis and z-axis coordinates to obtain a y standard deviation and a z standard deviation. The x, y and z standard deviations together constitute the point cloud standard deviation; that is, the point cloud standard deviation contains one x standard deviation, one y standard deviation and one z standard deviation.
For S275, the x difference of each point is divided by the x standard deviation to obtain the point's target x value; likewise, the y and z differences are divided by the y and z standard deviations to obtain the target y and z values. The target x, y and z values of a point are taken as the three-dimensional coordinates of that point's point description data; in other words, the target point cloud data includes the point description data of multiple points, and the three-dimensional coordinates of each point's description data contain one target x value, one target y value and one target z value.
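The normalization of steps S272 through S275 can be sketched in a few lines of NumPy (an illustrative sketch, not the patent's implementation; the function name is hypothetical):

```python
import numpy as np

def normalize_points(points):
    """S272-S275: center the points on their centroid and divide each
    axis by that axis's standard deviation about the centroid."""
    center = points.mean(axis=0)      # S272: center point coordinate data
    diff = points - center            # S273: per-point coordinate differences
    std = points.std(axis=0)          # S274: per-axis standard deviation
    return diff / std                 # S275: normalized target point cloud
```

After this transformation the point set has zero mean and unit per-axis standard deviation, which is what makes the downstream semantic recognition less sensitive to the cell's absolute position and scale.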
In one embodiment, before inputting the target point cloud data into the point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the method further includes:
S031: obtaining a plurality of training samples, each training sample including point cloud sample data and point cloud semantic category calibration data;
S032: inputting the point cloud sample data of the training sample into a model to be trained for semantic category probability prediction to obtain the sample semantic category probability prediction data of the training sample, the model to be trained being a model determined from the PointSIFT neural network module and the PointNet++ neural network;
S033: training the model to be trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and taking the trained model as the point cloud semantic category prediction model.
This embodiment determines the model to be trained from the PointSIFT neural network module and the PointNet++ neural network, and then trains it to obtain the point cloud semantic category prediction model. Because the PointNet++ neural network extends the PointNet feature extraction block with a hierarchical structure for processing local features and achieves good segmentation results, the model handles the fine features of complex target objects well; and because the scale awareness of the PointSIFT neural network module can select the most representative shape scale, and the module, through its key properties of scale awareness and orientation encoding, encodes scale-invariant information of the three-dimensional point cloud in different directions to complete point cloud segmentation, the accuracy of the model's semantic category prediction is improved.
For S031, a plurality of training samples can be obtained from a database. Each training sample includes one set of point cloud sample data and one set of point cloud semantic category calibration data. The point cloud sample data includes the point description data (i.e., the three-dimensional coordinates) of multiple points, and the calibration data includes the semantic category calibration values of multiple points. It should be understood that each point in the point cloud sample data corresponds to one semantic category calibration value in the calibration data.
Preferably, the semantic category calibration value can be expressed as a vector. For example, if there are 5 semantic categories in total and point A of the point cloud sample data has the calibration value [0 1 0 0 0], the vector indicates that the professional calibrated point A as belonging to the second semantic category; this example is not limiting.
A semantic category calibration value is the result of a professional calibrating the semantic category of a point in the point cloud sample data according to that point's point description data.
For S032, the point cloud sample data of all training samples is input in turn into the model to be trained for semantic category probability prediction, yielding the sample semantic category probability prediction data of the training samples; in other words, each training sample corresponds to one set of sample semantic category probability prediction data.
The model to be trained is determined from the PointSIFT neural network module and from the set abstraction modules and feature propagation modules of the PointNet++ neural network. The PointSIFT neural network module performs orientation encoding and scale awareness. The set abstraction modules perform downsampling, the feature propagation modules perform upsampling, and the downsampling and upsampling processes are aligned. PointSIFT neural network modules are interleaved between adjacent set abstraction and feature propagation modules. After upsampling, the model obtains the sample semantic category probability prediction data through a fully connected layer.
The set abstraction module, also called the SA module (SA: Set Abstraction), can be implemented in any existing manner and is not detailed here.
The feature propagation module, also called the FP module (FP: feature propagation), can be implemented in any existing manner and is not detailed here.
For S033, a loss value is computed from the sample semantic category probability prediction data and the point cloud semantic category calibration data, the parameters of the model to be trained are updated accordingly, and when the training end condition is met, the model whose parameters have been updated is taken as the point cloud semantic category prediction model.
In one embodiment, the model to be trained includes, in order: a multilayer perceptron, a first deep learning module, a first downsampling layer, a second deep learning module, a second downsampling layer, a third deep learning module, a third downsampling layer, a fourth deep learning module, a first upsampling layer, a fifth deep learning module, a second upsampling layer, a sixth deep learning module, a third upsampling layer, a seventh deep learning module, a dropout layer and a fully connected layer. The first through seventh deep learning modules use the PointSIFT neural network module; the first, second and third downsampling layers use the set abstraction module of the PointNet++ neural network; and the first, second and third upsampling layers use the feature propagation module of the PointNet++ neural network; and,
inputting the point cloud sample data of the training sample into the model to be trained for semantic category probability prediction to obtain the sample semantic category probability prediction data of the training sample includes:
S03201: inputting the point cloud sample data of the training sample into the multilayer perceptron for feature extraction to obtain a first feature vector;
S03202: inputting the first feature vector into the first deep learning module for orientation encoding and scale awareness to obtain a second feature vector;
S03203: inputting the second feature vector into the first downsampling layer for downsampling to obtain a third feature vector;
S03204: inputting the third feature vector into the second deep learning module for orientation encoding and scale awareness to obtain a fourth feature vector;
S03205: inputting the fourth feature vector into the second downsampling layer for downsampling to obtain a fifth feature vector;
S03206: inputting the fifth feature vector into the third deep learning module for orientation encoding and scale awareness to obtain a sixth feature vector;
S03207: inputting the sixth feature vector into the third downsampling layer for downsampling to obtain a seventh feature vector;
S03208: inputting the seventh feature vector into the fourth deep learning module for orientation encoding and scale awareness to obtain an eighth feature vector;
S03209: inputting the eighth feature vector into the first upsampling layer for upsampling to obtain a ninth feature vector;
S03210: inputting the ninth feature vector into the fifth deep learning module for orientation encoding and scale awareness to obtain a tenth feature vector;
S03211: inputting the tenth feature vector into the second upsampling layer for upsampling to obtain an eleventh feature vector;
S03212: inputting the eleventh feature vector into the sixth deep learning module for orientation encoding and scale awareness to obtain a twelfth feature vector;
S03213: inputting the twelfth feature vector into the third upsampling layer for upsampling to obtain a thirteenth feature vector;
S03214: inputting the thirteenth feature vector into the seventh deep learning module for orientation encoding and scale awareness to obtain a fourteenth feature vector;
S03215: inputting the fourteenth feature vector into the dropout layer for random dropout to obtain a fifteenth feature vector;
S03216: inputting the fifteenth feature vector into the fully connected layer to obtain the sample semantic category probability prediction data of the training sample.
This embodiment performs downsampling through the set abstraction modules and upsampling through the three feature propagation modules, adding a hierarchical structure for processing local features that achieves good segmentation results, so the point cloud semantic category prediction model handles the fine features of complex target objects well. Moreover, the scale awareness of the seven PointSIFT neural network modules can select the most representative shape scale, and because the PointSIFT modules are interleaved between adjacent set abstraction and feature propagation modules, their orientation encoding can comprehensively perceive point cloud information in different directions, improving the accuracy of semantic category prediction.
The input layer converts the input data into a three-channel feature vector. For example, the point description data (i.e., three-dimensional coordinates) of 16384 points is converted into a 16384×3 feature vector, where the number of rows (16384) is the number of points and the number of columns (3) is the feature dimension, the 3 feature dimensions describing the point's x-axis, y-axis and z-axis coordinate data. As a non-limiting example of the tensor sizes through the network (rows are points, columns are feature dimensions): point cloud sample data of the training sample, 16384×3; first feature vector, 16384×64; third feature vector, 2048×128; fifth feature vector, 256×256; seventh feature vector, 64×512; ninth feature vector, 256×512; eleventh feature vector, 2048×256; thirteenth feature vector, 16384×128; sample semantic category probability prediction data, 16384×c, where c is the number of semantic categories.
For the multilayer perceptron, the MLP function and a max-pooling symmetric function map the input low-dimensional point description data of the point cloud to point-wise high-dimensional feature vectors while maintaining symmetry invariance. First, assume the point cloud sample data is x, with x = (N, D) existing in a discrete metric space R^n, where N denotes the set of point cloud points, D denotes the feature dimension measuring each point, and the density of N in the discrete metric space is non-uniform. To obtain the geometric information of the unordered point cloud without loss, a symmetric function g (the max-pooling symmetric function) is constructed, and each point carrying point description data is mapped into a redundant high-dimensional space. Here, the point cloud sample data x and the feature information it contains are taken as input, and a transformation function f labels and segments each point in the set N one by one. On the above assumption, for a series of unordered point cloud data {x_1, x_2, …, x_n} (i.e., the point cloud sample data) with x_i ∈ R^D:

f(x_1, x_2, …, x_n) ≈ g(h(x_1), h(x_2), …, h(x_n))

where the symmetric function g is implemented by max pooling, i.e., for each of the D feature dimensions, the sum or the maximum of the corresponding feature values over the N points is taken. Overall, the formula uses a multilayer perceptron MLP as the function h for feature extraction; in the high-dimensional space, the resulting set of single-valued functions is fed into the max-pooling function (the symmetric function g), and a γ network further digests the point cloud information to obtain the attributes of the point cloud set:

f(x_1, x_2, …, x_n) = γ(g(h(x_1), h(x_2), …, h(x_n)))

where the functions γ() and h() belong to the network structure of the multilayer perceptron MLP.
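The symmetric-function idea above can be illustrated with a toy sketch. Everything below is an assumption for illustration only: the fixed feature map h stands in for the learned MLP, and g is max pooling as in the formula.

```python
import numpy as np

def set_feature(points, h):
    """f(x1..xn) ≈ g(h(x1), ..., h(xn)) with g = max pooling over points,
    so the aggregated feature is invariant to the ordering of the points."""
    return np.stack([h(p) for p in points]).max(axis=0)

# Toy stand-in for the MLP h: lift each 3-D point to 6 dimensions.
h = lambda p: np.concatenate([p, p ** 2])
```

Because max pooling is commutative, permuting the input points leaves the aggregated feature unchanged, which is exactly the unordered-point-cloud property the construction is after.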
For the PointSIFT neural network module, the SIFT feature descriptor considers two basic features of shape expression: one is orientation encoding, which assigns an orientation to each point after the matched feature point positions are obtained; the other is scale awareness, which selects, according to the data input to the PointSIFT neural network module, the size best suited for feature extraction. Unlike the hand-crafted SIFT, PointSIFT is a neural network module that can optimize itself through prior training. The basic building block of PointSIFT is the orientation-encoding unit (OE unit), which performs convolution and extracts features in 8 directions.
To better capture the feature information of the point cloud, PointSIFT stacks information from different directions. First, the three-dimensional space is divided into eight subspaces centered on a point P_n, each subspace carrying distinct orientation information. For the center point P_n and the corresponding n×d-dimensional feature vector Q_n, the neighbor feature representing each subspace is obtained by searching for the nearest neighbor K_n of P_n; it should be understood that there are 8 nearest neighbors K_n of P_n, i.e., each subspace corresponds to one nearest neighbor K_n. If no target point exists within the search radius in some subspace, the feature vector Q_n is used instead. Meanwhile, so that the convolution can perceive orientation information, three-stage oriented convolutions are performed along the x-axis, y-axis and z-axis respectively, and the feature encodings of the found neighbors K_n are recorded in a tensor in R^{a×b×c}, whose three dimensions correspond to the x-axis, y-axis and z-axis. The three-stage oriented convolution is:

N_1 = g[Conv_x(A_x, N)] ∈ R^{2×2×1×d}
N_2 = g[Conv_y(A_y, N_1)] ∈ R^{2×1×1×d}
N_3 = g[Conv_z(A_z, N_2)] ∈ R^{1×1×1×d}

where A_x, A_y and A_z are parameters of the model to be updated.
After the three stacked convolutions, each point P_n is transformed into a d-dimensional vector containing the shape information of the neighborhood around P_n. It can be seen that by stacking multiple orientation-encoding units through convolution, the orientation-encoding units of different convolution layers can perceive the scale information of each direction; shortcuts (direct connections) then connect the orientation-encoding units of the preceding layers to extract the final scale-invariant feature information, thereby solving the disorder and invariance problems of point clouds. Shortcut modes include add (element-wise addition) and concat (vector concatenation).
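The eight-subspace nearest-neighbor search that feeds the OE unit can be sketched as follows. This is an illustrative assumption, not the patent's implementation; the fallback to the center's own feature Q_n for an empty octant is only indicated here by a None entry.

```python
import numpy as np

def octant_nearest(center, points, radius):
    """For each of the 8 octants (subspaces) around `center`, return the
    index of the nearest point within `radius`, or None if the octant is
    empty (the OE unit then falls back to the center's feature Q_n)."""
    rel = points - center
    dist = np.linalg.norm(rel, axis=1)
    # Encode the octant of each point as a 3-bit index from the signs of x, y, z.
    octant = (rel[:, 0] >= 0) * 4 + (rel[:, 1] >= 0) * 2 + (rel[:, 2] >= 0)
    nearest = [None] * 8
    for o in range(8):
        mask = (octant == o) & (dist <= radius) & (dist > 0)
        if mask.any():
            idx = np.where(mask)[0]
            nearest[o] = int(idx[np.argmin(dist[idx])])
    return nearest
```

The eight returned neighbors (one per subspace) are what would be arranged into the 2×2×2×d tensor consumed by the three-stage oriented convolution above.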
In one embodiment, training the model to be trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and taking the trained model as the point cloud semantic category prediction model, includes:
S0331: inputting the sample semantic category probability prediction data and the point cloud semantic category calibration data into a loss function for computation to obtain the loss value of the model to be trained, and updating the parameters of the model to be trained according to the loss value, the updated model being used for the next computation of the sample semantic category probability prediction data;
S0332: repeating the above method steps until the loss value reaches a first convergence condition or the number of iterations reaches a second convergence condition, and determining the model whose loss value reaches the first convergence condition or whose number of iterations reaches the second convergence condition as the point cloud semantic category prediction model;
wherein the loss function is a cross-entropy function.
This embodiment realizes the training of the model to be trained.
The first convergence condition means that the loss values of two adjacent computations satisfy the Lipschitz condition. The number of iterations is the number of times the model to be trained has been used to compute the sample semantic category probability prediction data; that is, each computation increments the count by 1. The second convergence condition is a preset count. The cross-entropy function is:

loss = −Σ_i y_i log(ŷ_i)

where y_i denotes the i-th component of the one-hot vector converted from the point cloud semantic category calibration data, and ŷ_i denotes the i-th component of the predicted semantic category probability.
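The cross-entropy loss above, with y the one-hot calibration vector and ŷ the predicted probabilities, can be computed as in this illustrative sketch (function name and epsilon guard are assumptions, not from the patent):

```python
import numpy as np

def cross_entropy(y_onehot, probs, eps=1e-12):
    """loss = -sum_i y_i * log(p_i), averaged over the points;
    eps guards against log(0)."""
    return float(-(y_onehot * np.log(probs + eps)).sum(axis=-1).mean())
```

When the prediction concentrates probability on the calibrated category, the loss tends toward zero; a confident wrong prediction drives it up sharply, which is what makes the gradient updates of S0331 effective.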
In one embodiment, determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data includes:
S41: extracting the semantic category probability prediction values of the same point from the point cloud semantic category probability prediction values to obtain target semantic category probability prediction values;
S42: finding the maximum among the target semantic category probability prediction values, and taking the semantic category corresponding to the found maximum as the target semantic category of the point corresponding to those values.
This embodiment determines the target semantic category of each point from the point cloud semantic category probability prediction values.
For S41, all semantic category probability prediction values corresponding to the same point are extracted from the point cloud semantic category probability prediction values, and the extracted values are taken as the target semantic category probability prediction values.
For S42, the maximum is found among all target semantic category probability prediction values corresponding to the same point, and the semantic category corresponding to the found maximum is taken as that point's target semantic category.
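Steps S41 and S42 amount to an arg-max over each point's category probabilities, e.g. (illustrative sketch; the function name is hypothetical):

```python
import numpy as np

def target_categories(prob_predictions):
    """S41-S42: for each point (row), pick the index of the semantic
    category (column) with the largest predicted probability."""
    return prob_predictions.argmax(axis=1)
```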
Referring to FIG. 2, the present application further proposes a three-dimensional point cloud semantic segmentation apparatus, the apparatus including:
a point cloud acquisition module 100 for obtaining three-dimensional point cloud data to be predicted;
a point cloud segmentation processing module 200 for partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data;
a probability prediction module 300 for inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the point cloud semantic category prediction model being a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network;
a semantic category determination module 400 for determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data.
In this embodiment, partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain the target point cloud data achieves fast and precise logical partitioning of the point cloud of a complex, large-scale target object and ensures a good representation of the target object, thereby improving the recognition accuracy of point cloud semantic segmentation. The target point cloud data is input into a point cloud semantic category prediction model, trained on the basis of the PointSIFT neural network module and the PointNet++ neural network, for semantic category probability prediction. Because the PointNet++ neural network extends the PointNet feature extraction block with a hierarchical structure for processing local features and achieves good segmentation results, the model handles the fine features of complex target objects well; and because the scale awareness of the PointSIFT neural network module can select the most representative shape scale while its orientation encoding can comprehensively perceive point cloud information in different directions, the accuracy of the model's semantic category prediction is improved.
In one embodiment, the point cloud segmentation processing module 200 includes a point cloud partition sub-module, a quantitative discrimination sub-module and a point selection sub-module. The point cloud partition sub-module is used to discretely partition the three-dimensional point cloud data to be predicted with the preset spatial cells to obtain a plurality of spatial cells to be processed. The quantitative discrimination sub-module is used to compute the total volume of the plurality of spatial cells to be processed to obtain the total spatial cell volume; compute the volume of the point cloud in each cell to obtain the cell's point cloud volume; divide the point cloud volume of each cell by the total spatial cell volume to obtain the point cloud volume ratios of the cells; judge whether the point cloud volume ratio of each cell is greater than a preset ratio threshold; and, when a cell's point cloud volume ratio is greater than the preset ratio threshold, take the cell corresponding to that ratio as a valid spatial cell. The point selection sub-module is used to select points from the valid spatial cells to obtain the target point cloud data.
In one embodiment, the point selection sub-module includes a to-be-processed point cloud determination unit and a normalization processing unit. The to-be-processed point cloud determination unit is used to randomly select a preset number of points from the point cloud in the valid spatial cell to obtain point cloud data to be processed. The normalization processing unit is used to compute the center point of the point cloud data to be processed to obtain center point coordinate data; subtract the center point coordinate data from the coordinate data of each point to obtain each point's coordinate difference; compute the standard deviation from the coordinate data of all points and the center point coordinate data to obtain the point cloud standard deviation; and divide each point's coordinate difference by the point cloud standard deviation to obtain the target point cloud data.
In one embodiment, the apparatus further includes a model training module comprising a sample acquisition sub-module and a training sub-module. The sample acquisition sub-module is used to obtain a plurality of training samples, each training sample including point cloud sample data and point cloud semantic category calibration data. The training sub-module is used to input the point cloud sample data of the training sample into a model to be trained for semantic category probability prediction to obtain the sample semantic category probability prediction data of the training sample, the model to be trained being a model determined from the PointSIFT neural network module and the PointNet++ neural network; train the model according to the sample semantic category probability prediction data and the point cloud semantic category calibration data; and take the trained model as the point cloud semantic category prediction model.
In one embodiment, the model to be trained includes, in order: a multilayer perceptron, a first deep learning module, a first downsampling layer, a second deep learning module, a second downsampling layer, a third deep learning module, a third downsampling layer, a fourth deep learning module, a first upsampling layer, a fifth deep learning module, a second upsampling layer, a sixth deep learning module, a third upsampling layer, a seventh deep learning module, a dropout layer and a fully connected layer. The first through seventh deep learning modules use the PointSIFT neural network module; the first, second and third downsampling layers use the set abstraction module of the PointNet++ neural network; and the first, second and third upsampling layers use the feature propagation module of the PointNet++ neural network; and,
the training sub-module includes a sample prediction unit, which is used to input the point cloud sample data of the training sample into the multilayer perceptron for feature extraction to obtain a first feature vector; input the first feature vector into the first deep learning module for orientation encoding and scale awareness to obtain a second feature vector; input the second feature vector into the first downsampling layer for downsampling to obtain a third feature vector; input the third feature vector into the second deep learning module for orientation encoding and scale awareness to obtain a fourth feature vector; input the fourth feature vector into the second downsampling layer for downsampling to obtain a fifth feature vector; input the fifth feature vector into the third deep learning module for orientation encoding and scale awareness to obtain a sixth feature vector; input the sixth feature vector into the third downsampling layer for downsampling to obtain a seventh feature vector; input the seventh feature vector into the fourth deep learning module for orientation encoding and scale awareness to obtain an eighth feature vector; input the eighth feature vector into the first upsampling layer for upsampling to obtain a ninth feature vector; input the ninth feature vector into the fifth deep learning module for orientation encoding and scale awareness to obtain a tenth feature vector; input the tenth feature vector into the second upsampling layer for upsampling to obtain an eleventh feature vector; input the eleventh feature vector into the sixth deep learning module for orientation encoding and scale awareness to obtain a twelfth feature vector; input the twelfth feature vector into the third upsampling layer for upsampling to obtain a thirteenth feature vector; input the thirteenth feature vector into the seventh deep learning module for orientation encoding and scale awareness to obtain a fourteenth feature vector; input the fourteenth feature vector into the dropout layer for random dropout to obtain a fifteenth feature vector; and input the fifteenth feature vector into the fully connected layer to obtain the sample semantic category probability prediction data of the training sample.
In one embodiment, the training sub-module includes a training unit, which is used to input the sample semantic category probability prediction data and the point cloud semantic category calibration data into a loss function for computation to obtain the loss value of the model to be trained; update the parameters of the model to be trained according to the loss value, the updated model being used for the next computation of the sample semantic category probability prediction data; repeat the above method steps until the loss value reaches a first convergence condition or the number of iterations reaches a second convergence condition; and determine the model whose loss value reaches the first convergence condition or whose number of iterations reaches the second convergence condition as the point cloud semantic category prediction model, wherein the loss function is a cross-entropy function.
In one embodiment, the semantic category determination module 400 includes a target semantic category probability prediction value extraction sub-module and a target semantic category determination sub-module. The extraction sub-module is used to extract the semantic category probability prediction values of the same point from the point cloud semantic category probability prediction values to obtain target semantic category probability prediction values; the determination sub-module is used to find the maximum among the target semantic category probability prediction values and take the semantic category corresponding to the found maximum as the target semantic category of the point corresponding to those values.
Referring to FIG. 3, an embodiment of the present application further provides a computer device, which may be a server whose internal structure may be as shown in FIG. 3. The computer device includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system, a computer program and a database, and the internal memory provides an environment for running the operating system and the computer program on the non-volatile storage medium. The database of the computer device stores data such as that of the three-dimensional point cloud semantic segmentation method. The network interface of the computer device communicates with external terminals through a network connection. When executed by the processor, the computer program implements a three-dimensional point cloud semantic segmentation method including: obtaining three-dimensional point cloud data to be predicted; partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the model being trained on the basis of the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values. In this embodiment, the cell partition and quantitative discrimination achieve fast and precise logical partitioning of the point cloud of a complex, large-scale target object and ensure a good representation of the target object, improving the recognition accuracy of point cloud semantic segmentation; because PointNet++ extends the PointNet feature extraction block with a hierarchical structure for local features and achieves good segmentation results, the model handles the fine features of complex target objects well; and because the scale awareness of the PointSIFT module selects the most representative shape scale while its orientation encoding comprehensively perceives point cloud information in different directions, the accuracy of semantic category prediction is improved.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements a three-dimensional point cloud semantic segmentation method including the steps of: obtaining three-dimensional point cloud data to be predicted; partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the model being trained on the basis of the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values. The method so executed obtains the same beneficial effects as described above: the cell partition and quantitative discrimination achieve fast and precise logical partitioning and a good representation of the target object, improving the recognition accuracy of point cloud semantic segmentation; the hierarchical local-feature structure of PointNet++ lets the model handle the fine features of complex target objects well; and the scale awareness and orientation encoding of the PointSIFT module improve the accuracy of semantic category prediction.
The computer-readable storage medium may be non-volatile or volatile.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by instructing the relevant hardware through a computer program, which can be stored in a non-volatile computer-readable storage medium; when executed, the computer program may include the processes of the embodiments of the above methods. Any reference to memory, storage, databases or other media provided in the present application and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (SSRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM) and Rambus dynamic RAM (RDRAM).
The above are only preferred embodiments of the present application and do not thereby limit its patent scope; any equivalent structural or process transformation made using the contents of the specification and accompanying drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of the present application.
Claims (20)
- A three-dimensional point cloud semantic segmentation method, wherein the method comprises: obtaining three-dimensional point cloud data to be predicted; partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the point cloud semantic category prediction model being a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data.
- The three-dimensional point cloud semantic segmentation method of claim 1, wherein partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain the target point cloud data comprises: discretely partitioning the three-dimensional point cloud data to be predicted with the preset spatial cells to obtain a plurality of spatial cells to be processed; computing the total volume of the plurality of spatial cells to be processed to obtain a total spatial cell volume; computing the volume of the point cloud in each spatial cell to be processed to obtain the point cloud volume of the cell; dividing the point cloud volume of each cell by the total spatial cell volume to obtain the point cloud volume ratios of the plurality of cells; judging whether the point cloud volume ratio of each cell is greater than a preset ratio threshold; when a cell's point cloud volume ratio is greater than the preset ratio threshold, taking the cell corresponding to that ratio as a valid spatial cell; and selecting points from the valid spatial cells to obtain the target point cloud data.
- The three-dimensional point cloud semantic segmentation method of claim 2, wherein selecting points from the valid spatial cells to obtain the target point cloud data comprises: randomly selecting a preset number of points from the point cloud in the valid spatial cell to obtain point cloud data to be processed; computing the center point of the point cloud data to be processed to obtain center point coordinate data; subtracting the center point coordinate data from the coordinate data of each point to obtain the coordinate difference of each point; computing the standard deviation from the coordinate data of all points and the center point coordinate data to obtain the point cloud standard deviation; and dividing the coordinate difference of each point by the point cloud standard deviation to obtain the target point cloud data.
- The three-dimensional point cloud semantic segmentation method of claim 1, wherein before inputting the target point cloud data into the point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the method further comprises: obtaining a plurality of training samples, each training sample comprising point cloud sample data and point cloud semantic category calibration data; inputting the point cloud sample data of the training sample into a model to be trained for semantic category probability prediction to obtain the sample semantic category probability prediction data of the training sample, the model to be trained being a model determined from the PointSIFT neural network module and the PointNet++ neural network; and training the model to be trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and taking the trained model as the point cloud semantic category prediction model.
- The three-dimensional point cloud semantic segmentation method of claim 4, wherein the model to be trained comprises, in order: a multilayer perceptron, a first deep learning module, a first downsampling layer, a second deep learning module, a second downsampling layer, a third deep learning module, a third downsampling layer, a fourth deep learning module, a first upsampling layer, a fifth deep learning module, a second upsampling layer, a sixth deep learning module, a third upsampling layer, a seventh deep learning module, a dropout layer and a fully connected layer; the first through seventh deep learning modules use the PointSIFT neural network module, the first, second and third downsampling layers use the set abstraction module of the PointNet++ neural network, and the first, second and third upsampling layers use the feature propagation module of the PointNet++ neural network; and inputting the point cloud sample data of the training sample into the model to be trained for semantic category probability prediction to obtain the sample semantic category probability prediction data of the training sample comprises: inputting the point cloud sample data of the training sample into the multilayer perceptron for feature extraction to obtain a first feature vector; inputting the first feature vector into the first deep learning module for orientation encoding and scale awareness to obtain a second feature vector; inputting the second feature vector into the first downsampling layer for downsampling to obtain a third feature vector; inputting the third feature vector into the second deep learning module for orientation encoding and scale awareness to obtain a fourth feature vector; inputting the fourth feature vector into the second downsampling layer for downsampling to obtain a fifth feature vector; inputting the fifth feature vector into the third deep learning module for orientation encoding and scale awareness to obtain a sixth feature vector; inputting the sixth feature vector into the third downsampling layer for downsampling to obtain a seventh feature vector; inputting the seventh feature vector into the fourth deep learning module for orientation encoding and scale awareness to obtain an eighth feature vector; inputting the eighth feature vector into the first upsampling layer for upsampling to obtain a ninth feature vector; inputting the ninth feature vector into the fifth deep learning module for orientation encoding and scale awareness to obtain a tenth feature vector; inputting the tenth feature vector into the second upsampling layer for upsampling to obtain an eleventh feature vector; inputting the eleventh feature vector into the sixth deep learning module for orientation encoding and scale awareness to obtain a twelfth feature vector; inputting the twelfth feature vector into the third upsampling layer for upsampling to obtain a thirteenth feature vector; inputting the thirteenth feature vector into the seventh deep learning module for orientation encoding and scale awareness to obtain a fourteenth feature vector; inputting the fourteenth feature vector into the dropout layer for random dropout to obtain a fifteenth feature vector; and inputting the fifteenth feature vector into the fully connected layer to obtain the sample semantic category probability prediction data of the training sample.
- The three-dimensional point cloud semantic segmentation method of claim 4, wherein training the model to be trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and taking the trained model as the point cloud semantic category prediction model, comprises: inputting the sample semantic category probability prediction data and the point cloud semantic category calibration data into a loss function for computation to obtain the loss value of the model to be trained, and updating the parameters of the model to be trained according to the loss value, the updated model being used for the next computation of the sample semantic category probability prediction data; and repeating the above method steps until the loss value reaches a first convergence condition or the number of iterations reaches a second convergence condition, and determining the model whose loss value reaches the first convergence condition or whose number of iterations reaches the second convergence condition as the point cloud semantic category prediction model; wherein the loss function is a cross-entropy function.
- The three-dimensional point cloud semantic segmentation method of claim 1, wherein determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data comprises: extracting the semantic category probability prediction values of the same point from the point cloud semantic category probability prediction values to obtain target semantic category probability prediction values; and finding the maximum among the target semantic category probability prediction values, and taking the semantic category corresponding to the found maximum as the target semantic category of the point corresponding to those values.
- A three-dimensional point cloud semantic segmentation apparatus, wherein the apparatus comprises: a point cloud acquisition module for obtaining three-dimensional point cloud data to be predicted; a point cloud segmentation processing module for partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data; a probability prediction module for inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the point cloud semantic category prediction model being a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network; and a semantic category determination module for determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data.
- A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the following method steps: obtaining three-dimensional point cloud data to be predicted; partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the point cloud semantic category prediction model being a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data.
- The computer device of claim 9, wherein partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain the target point cloud data comprises: discretely partitioning the three-dimensional point cloud data to be predicted with the preset spatial cells to obtain a plurality of spatial cells to be processed; computing the total volume of the plurality of spatial cells to be processed to obtain a total spatial cell volume; computing the volume of the point cloud in each spatial cell to be processed to obtain the point cloud volume of the cell; dividing the point cloud volume of each cell by the total spatial cell volume to obtain the point cloud volume ratios of the plurality of cells; judging whether the point cloud volume ratio of each cell is greater than a preset ratio threshold; when a cell's point cloud volume ratio is greater than the preset ratio threshold, taking the cell corresponding to that ratio as a valid spatial cell; and selecting points from the valid spatial cells to obtain the target point cloud data.
- The computer device of claim 10, wherein selecting points from the valid spatial cells to obtain the target point cloud data comprises: randomly selecting a preset number of points from the point cloud in the valid spatial cell to obtain point cloud data to be processed; computing the center point of the point cloud data to be processed to obtain center point coordinate data; subtracting the center point coordinate data from the coordinate data of each point to obtain the coordinate difference of each point; computing the standard deviation from the coordinate data of all points and the center point coordinate data to obtain the point cloud standard deviation; and dividing the coordinate difference of each point by the point cloud standard deviation to obtain the target point cloud data.
- The computer device of claim 9, wherein before inputting the target point cloud data into the point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the method further comprises: obtaining a plurality of training samples, each training sample comprising point cloud sample data and point cloud semantic category calibration data; inputting the point cloud sample data of the training sample into a model to be trained for semantic category probability prediction to obtain the sample semantic category probability prediction data of the training sample, the model to be trained being a model determined from the PointSIFT neural network module and the PointNet++ neural network; and training the model to be trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and taking the trained model as the point cloud semantic category prediction model.
- The computer device of claim 12, wherein the model to be trained comprises, in order: a multilayer perceptron, a first deep learning module, a first downsampling layer, a second deep learning module, a second downsampling layer, a third deep learning module, a third downsampling layer, a fourth deep learning module, a first upsampling layer, a fifth deep learning module, a second upsampling layer, a sixth deep learning module, a third upsampling layer, a seventh deep learning module, a dropout layer and a fully connected layer; the first through seventh deep learning modules use the PointSIFT neural network module, the first, second and third downsampling layers use the set abstraction module of the PointNet++ neural network, and the first, second and third upsampling layers use the feature propagation module of the PointNet++ neural network; and inputting the point cloud sample data of the training sample into the model to be trained for semantic category probability prediction to obtain the sample semantic category probability prediction data of the training sample comprises: inputting the point cloud sample data of the training sample into the multilayer perceptron for feature extraction to obtain a first feature vector; inputting the first feature vector into the first deep learning module for orientation encoding and scale awareness to obtain a second feature vector; inputting the second feature vector into the first downsampling layer for downsampling to obtain a third feature vector; inputting the third feature vector into the second deep learning module for orientation encoding and scale awareness to obtain a fourth feature vector; inputting the fourth feature vector into the second downsampling layer for downsampling to obtain a fifth feature vector; inputting the fifth feature vector into the third deep learning module for orientation encoding and scale awareness to obtain a sixth feature vector; inputting the sixth feature vector into the third downsampling layer for downsampling to obtain a seventh feature vector; inputting the seventh feature vector into the fourth deep learning module for orientation encoding and scale awareness to obtain an eighth feature vector; inputting the eighth feature vector into the first upsampling layer for upsampling to obtain a ninth feature vector; inputting the ninth feature vector into the fifth deep learning module for orientation encoding and scale awareness to obtain a tenth feature vector; inputting the tenth feature vector into the second upsampling layer for upsampling to obtain an eleventh feature vector; inputting the eleventh feature vector into the sixth deep learning module for orientation encoding and scale awareness to obtain a twelfth feature vector; inputting the twelfth feature vector into the third upsampling layer for upsampling to obtain a thirteenth feature vector; inputting the thirteenth feature vector into the seventh deep learning module for orientation encoding and scale awareness to obtain a fourteenth feature vector; inputting the fourteenth feature vector into the dropout layer for random dropout to obtain a fifteenth feature vector; and inputting the fifteenth feature vector into the fully connected layer to obtain the sample semantic category probability prediction data of the training sample.
- The computer device of claim 12, wherein training the model to be trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and taking the trained model as the point cloud semantic category prediction model, comprises: inputting the sample semantic category probability prediction data and the point cloud semantic category calibration data into a loss function for computation to obtain the loss value of the model to be trained, and updating the parameters of the model to be trained according to the loss value, the updated model being used for the next computation of the sample semantic category probability prediction data; and repeating the above method steps until the loss value reaches a first convergence condition or the number of iterations reaches a second convergence condition, and determining the model whose loss value reaches the first convergence condition or whose number of iterations reaches the second convergence condition as the point cloud semantic category prediction model; wherein the loss function is a cross-entropy function.
- A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following method steps: obtaining three-dimensional point cloud data to be predicted; partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain target point cloud data; inputting the target point cloud data into a point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the point cloud semantic category prediction model being a model trained on the basis of the PointSIFT neural network module and the PointNet++ neural network; and determining the target semantic category of each point in the target point cloud data according to the point cloud semantic category probability prediction values of the target point cloud data.
- The computer-readable storage medium of claim 15, wherein partitioning the three-dimensional point cloud data to be predicted with preset spatial cells and performing quantitative discrimination to obtain the target point cloud data comprises: discretely partitioning the three-dimensional point cloud data to be predicted with the preset spatial cells to obtain a plurality of spatial cells to be processed; computing the total volume of the plurality of spatial cells to be processed to obtain a total spatial cell volume; computing the volume of the point cloud in each spatial cell to be processed to obtain the point cloud volume of the cell; dividing the point cloud volume of each cell by the total spatial cell volume to obtain the point cloud volume ratios of the plurality of cells; judging whether the point cloud volume ratio of each cell is greater than a preset ratio threshold; when a cell's point cloud volume ratio is greater than the preset ratio threshold, taking the cell corresponding to that ratio as a valid spatial cell; and selecting points from the valid spatial cells to obtain the target point cloud data.
- The computer-readable storage medium of claim 16, wherein selecting points from the valid spatial cells to obtain the target point cloud data comprises: randomly selecting a preset number of points from the point cloud in the valid spatial cell to obtain point cloud data to be processed; computing the center point of the point cloud data to be processed to obtain center point coordinate data; subtracting the center point coordinate data from the coordinate data of each point to obtain the coordinate difference of each point; computing the standard deviation from the coordinate data of all points and the center point coordinate data to obtain the point cloud standard deviation; and dividing the coordinate difference of each point by the point cloud standard deviation to obtain the target point cloud data.
- The computer-readable storage medium of claim 15, wherein before inputting the target point cloud data into the point cloud semantic category prediction model for semantic category probability prediction to obtain the point cloud semantic category probability prediction values of the target point cloud data, the method further comprises: obtaining a plurality of training samples, each training sample comprising point cloud sample data and point cloud semantic category calibration data; inputting the point cloud sample data of the training sample into a model to be trained for semantic category probability prediction to obtain the sample semantic category probability prediction data of the training sample, the model to be trained being a model determined from the PointSIFT neural network module and the PointNet++ neural network; and training the model to be trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and taking the trained model as the point cloud semantic category prediction model.
- The computer-readable storage medium of claim 18, wherein the model to be trained comprises, in order: a multilayer perceptron, a first deep learning module, a first downsampling layer, a second deep learning module, a second downsampling layer, a third deep learning module, a third downsampling layer, a fourth deep learning module, a first upsampling layer, a fifth deep learning module, a second upsampling layer, a sixth deep learning module, a third upsampling layer, a seventh deep learning module, a dropout layer and a fully connected layer; the first through seventh deep learning modules use the PointSIFT neural network module, the first, second and third downsampling layers use the set abstraction module of the PointNet++ neural network, and the first, second and third upsampling layers use the feature propagation module of the PointNet++ neural network; and inputting the point cloud sample data of the training sample into the model to be trained for semantic category probability prediction to obtain the sample semantic category probability prediction data of the training sample comprises: inputting the point cloud sample data of the training sample into the multilayer perceptron for feature extraction to obtain a first feature vector; inputting the first feature vector into the first deep learning module for orientation encoding and scale awareness to obtain a second feature vector; inputting the second feature vector into the first downsampling layer for downsampling to obtain a third feature vector; inputting the third feature vector into the second deep learning module for orientation encoding and scale awareness to obtain a fourth feature vector; inputting the fourth feature vector into the second downsampling layer for downsampling to obtain a fifth feature vector; inputting the fifth feature vector into the third deep learning module for orientation encoding and scale awareness to obtain a sixth feature vector; inputting the sixth feature vector into the third downsampling layer for downsampling to obtain a seventh feature vector; inputting the seventh feature vector into the fourth deep learning module for orientation encoding and scale awareness to obtain an eighth feature vector; inputting the eighth feature vector into the first upsampling layer for upsampling to obtain a ninth feature vector; inputting the ninth feature vector into the fifth deep learning module for orientation encoding and scale awareness to obtain a tenth feature vector; inputting the tenth feature vector into the second upsampling layer for upsampling to obtain an eleventh feature vector; inputting the eleventh feature vector into the sixth deep learning module for orientation encoding and scale awareness to obtain a twelfth feature vector; inputting the twelfth feature vector into the third upsampling layer for upsampling to obtain a thirteenth feature vector; inputting the thirteenth feature vector into the seventh deep learning module for orientation encoding and scale awareness to obtain a fourteenth feature vector; inputting the fourteenth feature vector into the dropout layer for random dropout to obtain a fifteenth feature vector; and inputting the fifteenth feature vector into the fully connected layer to obtain the sample semantic category probability prediction data of the training sample.
- The computer-readable storage medium of claim 18, wherein training the model to be trained according to the sample semantic category probability prediction data and the point cloud semantic category calibration data, and taking the trained model as the point cloud semantic category prediction model, comprises: inputting the sample semantic category probability prediction data and the point cloud semantic category calibration data into a loss function for computation to obtain the loss value of the model to be trained, and updating the parameters of the model to be trained according to the loss value, the updated model being used for the next computation of the sample semantic category probability prediction data; and repeating the above method steps until the loss value reaches a first convergence condition or the number of iterations reaches a second convergence condition, and determining the model whose loss value reaches the first convergence condition or whose number of iterations reaches the second convergence condition as the point cloud semantic category prediction model; wherein the loss function is a cross-entropy function.
Applications Claiming Priority (2)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202011182178.4 | 2020-10-29 | | |
| CN202011182178.4A (CN112287939B) | 2020-10-29 | 2020-10-29 | Three-dimensional point cloud semantic segmentation method, apparatus, device and medium |

Publications (1)

| Publication Number | Publication Date |
|---|---|
| WO2022088676A1 | 2022-05-05 |

Family ID: 74354070

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2021/097548 | Three-dimensional point cloud semantic segmentation method, apparatus, device and medium | 2020-10-29 | 2021-05-31 |

Country Status (2)

| Country | Link |
|---|---|
| CN | CN112287939B |
| WO | WO2022088676A1 |
Cited By (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114821074A (zh) * | 2022-07-01 | 2022-07-29 | 湖南盛鼎科技发展有限责任公司 | 机载liDAR点云语义分割方法、电子设备及存储介质 |
CN114882113A (zh) * | 2022-05-23 | 2022-08-09 | 大连理工大学 | 基于同类物体形状对应性的五指机械灵巧手抓取迁移方法 |
CN114882224A (zh) * | 2022-06-06 | 2022-08-09 | 中国电建集团中南勘测设计研究院有限公司 | 模型结构、模型训练方法、单体化方法、设备及介质 |
CN114926636A (zh) * | 2022-05-12 | 2022-08-19 | 合众新能源汽车有限公司 | 一种点云语义分割方法、装置、设备及存储介质 |
CN114998604A (zh) * | 2022-05-09 | 2022-09-02 | 中国地质大学(武汉) | 一种基于局部点云位置关系的点云特征提取方法 |
CN115050019A (zh) * | 2022-06-27 | 2022-09-13 | 西安交通大学 | 三维点云场景中的交互关系检测方法、系统、装置及存储介质 |
CN115082498A (zh) * | 2022-05-24 | 2022-09-20 | 河南中原动力智能制造有限公司 | 一种机器人抓取位姿估计方法、装置、设备及存储介质 |
CN115082726A (zh) * | 2022-05-18 | 2022-09-20 | 同济大学 | 一种基于PointNet优化的座便器陶瓷素坯产品分类方法 |
CN115170585A (zh) * | 2022-07-12 | 2022-10-11 | 上海人工智能创新中心 | 三维点云语义分割方法 |
CN115311274A (zh) * | 2022-10-11 | 2022-11-08 | 四川路桥华东建设有限责任公司 | 一种基于空间变换自注意力模块的焊缝检测方法及系统 |
CN115393597A (zh) * | 2022-10-31 | 2022-11-25 | 之江实验室 | 基于脉冲神经网络与激光雷达点云的语义分割方法及装置 |
CN115457496A (zh) * | 2022-09-09 | 2022-12-09 | 北京百度网讯科技有限公司 | 自动驾驶的挡墙检测方法、装置及车辆 |
CN115841585A (zh) * | 2022-05-31 | 2023-03-24 | 上海人工智能创新中心 | 一种对点云分割网络进行知识蒸馏的方法 |
CN115862013A (zh) * | 2023-02-09 | 2023-03-28 | 南方电网数字电网研究院有限公司 | 基于注意力机制的输配电场景点云语义分割模型训练方法 |
CN115880685A (zh) * | 2022-12-09 | 2023-03-31 | 之江实验室 | 一种基于votenet模型的三维目标检测方法和系统 |
CN115908425A (zh) * | 2023-02-14 | 2023-04-04 | 四川大学 | 一种基于边缘检测的堆石级配信息检测方法 |
CN115953410A (zh) * | 2023-03-15 | 2023-04-11 | 安格利(成都)仪器设备有限公司 | 一种基于目标检测无监督学习的腐蚀坑自动检测方法 |
CN116030200A (zh) * | 2023-03-27 | 2023-04-28 | 武汉零点视觉数字科技有限公司 | 一种基于视觉融合的场景重构方法与装置 |
CN116092038A (zh) * | 2023-04-07 | 2023-05-09 | 中国石油大学(华东) | 一种基于点云的大件运输关键道路空间通行性判定方法 |
CN116206306A (zh) * | 2022-12-26 | 2023-06-02 | 山东科技大学 | 一种类间表征对比驱动的图卷积点云语义标注方法 |
CN116229057A (zh) * | 2022-12-22 | 2023-06-06 | 之江实验室 | 一种基于深度学习的三维激光雷达点云语义分割的方法和装置 |
CN116416586A (zh) * | 2022-12-19 | 2023-07-11 | 香港中文大学(深圳) | 基于rgb点云的地图元素感知方法、终端及存储介质 |
CN116468892A (zh) * | 2023-04-24 | 2023-07-21 | 北京中科睿途科技有限公司 | 三维点云的语义分割方法、装置、电子设备和存储介质 |
CN116524197A (zh) * | 2023-06-30 | 2023-08-01 | 厦门微亚智能科技有限公司 | 一种结合边缘点和深度网络的点云分割方法、装置及设备 |
CN116704137A (zh) * | 2023-07-27 | 2023-09-05 | 山东科技大学 | 一种海上石油钻井平台点云深度学习逆向建模方法 |
CN116824188A (zh) * | 2023-06-05 | 2023-09-29 | 腾晖科技建筑智能(深圳)有限公司 | 一种基于多神经网络集成学习的吊物类型识别方法及系统 |
CN116993728A (zh) * | 2023-09-26 | 2023-11-03 | 中铁水利信息科技有限公司 | 一种基于点云数据的大坝裂缝监测系统及方法 |
CN117473105A (zh) * | 2023-12-28 | 2024-01-30 | 浪潮电子信息产业股份有限公司 | 基于多模态预训练模型的三维内容生成方法及相关组件 |
CN117496309A (zh) * | 2024-01-03 | 2024-02-02 | 华中科技大学 | 建筑场景点云分割不确定性评估方法、系统及电子设备 |
CN117541799A (zh) * | 2024-01-09 | 2024-02-09 | 四川大学 | 基于在线随机森林模型复用的大规模点云语义分割方法 |
CN117576786A (zh) * | 2024-01-16 | 2024-02-20 | 北京大学深圳研究生院 | 基于视觉语言模型的三维人体行为识别网络训练方法 |
CN117710977A (zh) * | 2024-02-02 | 2024-03-15 | 西南石油大学 | 基于点云数据的大坝bim三维模型语义快速提取方法及系统 |
CN118096756A (zh) * | 2024-04-26 | 2024-05-28 | 南京航空航天大学 | 一种基于三维点云的牵引编织芯模同心度检测方法 |
CN118135289A (zh) * | 2024-02-02 | 2024-06-04 | 青海师范大学 | 一种三维点云处理方法及装置、存储介质、设备 |
CN118247512A (zh) * | 2024-05-27 | 2024-06-25 | 华中科技大学 | 点云语义分割模型建立方法及点云语义分割方法 |
Families Citing this family (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112287939B (zh) * | 2020-10-29 | 2024-05-31 | Ping An Technology (Shenzhen) Co., Ltd. | Three-dimensional point cloud semantic segmentation method, apparatus, device, and medium |
CN112966696B (zh) * | 2021-02-05 | 2023-10-27 | Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences | Method, apparatus, device, and storage medium for processing three-dimensional point clouds |
CN112907735B (zh) * | 2021-03-10 | 2023-07-25 | Nanjing University of Science and Technology | Point cloud-based flexible cable recognition and three-dimensional reconstruction method |
CN113129372B (zh) * | 2021-03-29 | 2023-11-03 | Shenzhen Qingyuan Culture Technology Co., Ltd. | Three-dimensional scene semantic analysis method based on HoloLens spatial mapping |
CN112862017B (zh) * | 2021-04-01 | 2023-08-01 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Point cloud data annotation method, apparatus, device, and medium |
CN113837215B (zh) * | 2021-04-27 | 2024-01-12 | Northwestern Polytechnical University | Conditional random field-based point cloud semantic and instance segmentation method |
CN113205531B (zh) * | 2021-04-30 | 2024-03-08 | Beijing Yunsheng Intelligent Technology Co., Ltd. | Three-dimensional point cloud segmentation method, apparatus, and server |
CN113239829B (zh) * | 2021-05-17 | 2022-10-04 | Harbin Engineering University | Cross-dimensional remote sensing data target recognition method based on spatial occupancy probability features |
CN113298822B (zh) * | 2021-05-18 | 2023-04-18 | Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences | Point cloud data selection method and apparatus, device, and storage medium |
CN113298781B (zh) * | 2021-05-24 | 2022-09-16 | Nanjing University of Posts and Telecommunications | Martian surface three-dimensional terrain detection method based on image and point cloud fusion |
CN113256640B (zh) * | 2021-05-31 | 2022-05-24 | Beijing Institute of Technology | PointNet-based point cloud segmentation and virtual environment generation method and apparatus |
CN113392841B (zh) * | 2021-06-03 | 2022-11-18 | University of Electronic Science and Technology of China | Three-dimensional point cloud semantic segmentation method based on multi-feature information-enhanced encoding |
CN113705655B (zh) * | 2021-08-24 | 2023-07-18 | Beijing University of Civil Engineering and Architecture | Fully automatic three-dimensional point cloud classification method and deep neural network model |
CN113781433A (zh) * | 2021-09-10 | 2021-12-10 | Jiangsu Tingsheng Technology Co., Ltd. | Real-time point cloud object detection method based on voxel partitioning |
CN113888736A (zh) * | 2021-10-22 | 2022-01-04 | Chengdu University of Information Technology | Three-dimensional point cloud segmentation method based on the PointNet++ neural network |
CN114004934B (zh) * | 2021-11-02 | 2024-07-26 | Huzhou Power Supply Company of State Grid Zhejiang Electric Power Co., Ltd. | Power transmission line point cloud classification method based on grouped batch normalization |
CN114092580B (zh) * | 2021-11-03 | 2022-10-21 | East China Jiaotong University | Deep learning-based three-dimensional point cloud data compression method and system |
CN114638954B (zh) * | 2022-02-22 | 2024-04-19 | Shenzhen DeepRoute.ai Co., Ltd. | Point cloud segmentation model training method, point cloud data segmentation method, and related apparatus |
CN114612740B (zh) * | 2022-03-01 | 2024-06-21 | JD Technology Information Technology Co., Ltd. | Model generation method, point cloud classification method, apparatus, device, and storage medium |
CN114387289B (zh) * | 2022-03-24 | 2022-07-29 | Digital Grid Research Institute of China Southern Power Grid | Three-dimensional point cloud semantic segmentation method and apparatus for overhead power transmission and distribution lines |
CN114648676B (zh) * | 2022-03-25 | 2024-05-24 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Point cloud processing model training and point cloud instance segmentation method and apparatus |
CN114882046B (zh) * | 2022-03-29 | 2024-08-02 | UISEE Technologies (Beijing) Co., Ltd. | Panoptic segmentation method, apparatus, device, and medium for three-dimensional point cloud data |
CN114927215B (zh) * | 2022-04-27 | 2023-08-25 | Soochow University | Method and system for directly predicting tumor respiratory motion from body-surface point cloud data |
CN114926690 (zh) * | 2022-05-31 | 2022-08-19 | Surveying and Mapping Institute of Guangdong Nuclear Industry Geological Bureau | Computer vision-based automated point cloud classification method |
CN117635810 (zh) * | 2022-08-17 | 2024-03-01 | Beijing Zitiao Network Technology Co., Ltd. | Three-dimensional model processing method, apparatus, device, and medium |
CN115205717B (zh) * | 2022-09-14 | 2022-12-20 | Guangdong Huitian Aerospace Technology Co., Ltd. | Obstacle point cloud data processing method and flight device |
WO2024108341A1 (zh) * | 2022-11-21 | 2024-05-30 | Shenzhen Institute of Advanced Technology | Automatic tooth arrangement method, apparatus, device, and storage medium based on point cloud understanding |
CN115908734B (zh) * | 2022-11-25 | 2023-07-07 | Information Center of Guizhou Power Grid Co., Ltd. | Power grid map updating method, apparatus, device, and storage medium |
CN115546785 (zh) * | 2022-11-29 | 2022-12-30 | China FAW Co., Ltd. | Three-dimensional object detection method and apparatus |
CN116030190B (zh) * | 2022-12-20 | 2023-06-20 | Aerospace Information Research Institute, Chinese Academy of Sciences | Target three-dimensional model generation method based on point clouds and target polygons |
CN116091777 (zh) * | 2023-02-27 | 2023-05-09 | Alibaba Damo Academy (Hangzhou) Technology Co., Ltd. | Point cloud panoptic segmentation and model training method therefor, and electronic device |
CN116413740B (zh) * | 2023-06-09 | 2023-09-05 | GAC Aion New Energy Automobile Co., Ltd. | LiDAR point cloud ground detection method and apparatus |
CN116721221B (zh) * | 2023-08-08 | 2024-01-12 | Inspur Electronic Information Industry Co., Ltd. | Multimodal three-dimensional content generation method, apparatus, device, and storage medium |
CN117152363B (zh) * | 2023-10-30 | 2024-02-13 | Inspur Electronic Information Industry Co., Ltd. | Three-dimensional content generation method, apparatus, and device based on pre-trained language models |
CN117291845B (zh) * | 2023-11-27 | 2024-03-19 | Chengdu University of Technology | Point cloud ground filtering method, system, electronic device, and storage medium |
CN118334278B (zh) * | 2024-06-17 | 2024-08-27 | Zhejiang Lab | Point cloud data processing method and apparatus, storage medium, and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109635685A (zh) * | 2018-11-29 | 2019-04-16 | Beijing SenseTime Technology Development Co., Ltd. | Target object 3D detection method, apparatus, medium, and device |
CN111199206A (zh) * | 2019-12-30 | 2020-05-26 | Shanghai Eye Control Technology Co., Ltd. | Three-dimensional object detection method, apparatus, computer device, and storage medium |
CN111784699A (zh) * | 2019-04-03 | 2020-10-16 | TCL Corporation | Method, apparatus, and terminal device for object segmentation of three-dimensional point cloud data |
CN112287939A (zh) * | 2020-10-29 | 2021-01-29 | Ping An Technology (Shenzhen) Co., Ltd. | Three-dimensional point cloud semantic segmentation method, apparatus, device, and medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11004202B2 (en) * | 2017-10-09 | 2021-05-11 | The Board Of Trustees Of The Leland Stanford Junior University | Systems and methods for semantic segmentation of 3D point clouds |
CN109711410A (zh) * | 2018-11-20 | 2019-05-03 | North China University of Technology | Fast three-dimensional object segmentation and recognition method, apparatus, and system |
CN109829399B (zh) * | 2019-01-18 | 2022-07-05 | Wuhan University | Deep learning-based automatic classification method for vehicle-mounted road scene point clouds |
CN111310765A (zh) * | 2020-02-14 | 2020-06-19 | Beijing Jingwei Hirain Technologies Co., Ltd. | Laser point cloud semantic segmentation method and apparatus |
- 2020-10-29: Chinese application CN202011182178.4A filed; granted as patent CN112287939B (status: active)
- 2021-05-31: PCT application PCT/CN2021/097548 filed as WO2022088676A1 (application filing)
Cited By (52)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114998604A (zh) * | 2022-05-09 | 2022-09-02 | China University of Geosciences (Wuhan) | Point cloud feature extraction method based on local point cloud positional relationships |
CN114926636A (zh) * | 2022-05-12 | 2022-08-19 | Hozon New Energy Automobile Co., Ltd. | Point cloud semantic segmentation method, apparatus, device, and storage medium |
CN115082726A (zh) * | 2022-05-18 | 2022-09-20 | Tongji University | Toilet ceramic green-body product classification method based on optimized PointNet |
CN114882113A (zh) * | 2022-05-23 | 2022-08-09 | Dalian University of Technology | Grasp transfer method for a five-fingered dexterous robotic hand based on shape correspondence of same-class objects |
CN115082498A (zh) * | 2022-05-24 | 2022-09-20 | Henan Zhongyuan Dynamics Intelligent Manufacturing Co., Ltd. | Robot grasping pose estimation method, apparatus, device, and storage medium |
CN115841585A (zh) * | 2022-05-31 | 2023-03-24 | Shanghai Artificial Intelligence Innovation Center | Method for knowledge distillation of point cloud segmentation networks |
CN115841585B (zh) * | 2022-05-31 | 2024-06-11 | Shanghai Artificial Intelligence Innovation Center | Method for knowledge distillation of point cloud segmentation networks |
CN114882224A (zh) * | 2022-06-06 | 2022-08-09 | PowerChina Zhongnan Engineering Corporation Limited | Model structure, model training method, monomerization method, device, and medium |
CN114882224B (zh) * | 2022-06-06 | 2024-04-05 | PowerChina Zhongnan Engineering Corporation Limited | Model structure, model training method, monomerization method, device, and medium |
CN115050019A (zh) * | 2022-06-27 | 2022-09-13 | Xi'an Jiaotong University | Interaction relationship detection method, system, apparatus, and storage medium for three-dimensional point cloud scenes |
CN114821074A (zh) * | 2022-07-01 | 2022-07-29 | Hunan Shengding Technology Development Co., Ltd. | Airborne LiDAR point cloud semantic segmentation method, electronic device, and storage medium |
CN115170585B (zh) * | 2022-07-12 | 2024-06-14 | Shanghai Artificial Intelligence Innovation Center | Three-dimensional point cloud semantic segmentation method |
CN115170585A (zh) * | 2022-07-12 | 2022-10-11 | Shanghai Artificial Intelligence Innovation Center | Three-dimensional point cloud semantic segmentation method |
CN115457496A (zh) * | 2022-09-09 | 2022-12-09 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Retaining wall detection method and apparatus for autonomous driving, and vehicle |
CN115457496B (zh) * | 2022-09-09 | 2023-12-08 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Retaining wall detection method and apparatus for autonomous driving, and vehicle |
CN115311274A (зh) * | 2022-10-11 | 2022-11-08 | Sichuan Road & Bridge Huadong Construction Co., Ltd. | Weld seam detection method and system based on a spatial transformation self-attention module |
CN115393597A (zh) * | 2022-10-31 | 2022-11-25 | Zhejiang Lab | Semantic segmentation method and apparatus based on spiking neural networks and LiDAR point clouds |
CN115880685A (zh) * | 2022-12-09 | 2023-03-31 | Zhejiang Lab | Three-dimensional object detection method and system based on the VoteNet model |
CN115880685B (zh) * | 2022-12-09 | 2024-02-13 | Zhejiang Lab | Three-dimensional object detection method and system based on the VoteNet model |
CN116416586A (zh) * | 2022-12-19 | 2023-07-11 | The Chinese University of Hong Kong, Shenzhen | Map element perception method based on RGB point clouds, terminal, and storage medium |
CN116416586B (zh) * | 2022-12-19 | 2024-04-02 | The Chinese University of Hong Kong, Shenzhen | Map element perception method based on RGB point clouds, terminal, and storage medium |
CN116229057A (zh) * | 2022-12-22 | 2023-06-06 | Zhejiang Lab | Deep learning-based method and apparatus for semantic segmentation of three-dimensional LiDAR point clouds |
CN116229057B (zh) * | 2022-12-22 | 2023-10-27 | Zhejiang Lab | Deep learning-based method and apparatus for semantic segmentation of three-dimensional LiDAR point clouds |
CN116206306A (zh) * | 2022-12-26 | 2023-06-02 | Shandong University of Science and Technology | Graph convolution point cloud semantic labeling method driven by inter-class representation contrast |
CN115862013B (zh) * | 2023-02-09 | 2023-06-27 | Digital Grid Research Institute of China Southern Power Grid | Attention mechanism-based training method for point cloud semantic segmentation models in power transmission and distribution scenes |
CN115862013A (zh) * | 2023-02-09 | 2023-03-28 | Digital Grid Research Institute of China Southern Power Grid | Attention mechanism-based training method for point cloud semantic segmentation models in power transmission and distribution scenes |
CN115908425A (zh) * | 2023-02-14 | 2023-04-04 | Sichuan University | Rockfill gradation information detection method based on edge detection |
CN115953410A (zh) * | 2023-03-15 | 2023-04-11 | Angli (Chengdu) Instrument Equipment Co., Ltd. | Automatic corrosion pit detection method based on object detection and unsupervised learning |
CN116030200A (zh) * | 2023-03-27 | 2023-04-28 | Wuhan Zero Point Vision Digital Technology Co., Ltd. | Scene reconstruction method and apparatus based on visual fusion |
CN116092038A (zh) * | 2023-04-07 | 2023-05-09 | China University of Petroleum (East China) | Point cloud-based method for determining spatial passability of key roads for oversized cargo transport |
CN116468892A (zh) * | 2023-04-24 | 2023-07-21 | Beijing Zhongke Ruitu Technology Co., Ltd. | Semantic segmentation method and apparatus for three-dimensional point clouds, electronic device, and storage medium |
CN116824188B (zh) * | 2023-06-05 | 2024-04-09 | Tenghui Technology Building Intelligence (Shenzhen) Co., Ltd. | Hoisted object type recognition method and system based on multi-neural-network ensemble learning |
CN116824188A (zh) * | 2023-06-05 | 2023-09-29 | Tenghui Technology Building Intelligence (Shenzhen) Co., Ltd. | Hoisted object type recognition method and system based on multi-neural-network ensemble learning |
CN116524197B (zh) * | 2023-06-30 | 2023-09-29 | Xiamen Weiya Intelligent Technology Co., Ltd. | Point cloud segmentation method, apparatus, and device combining edge points and deep networks |
CN116524197A (zh) * | 2023-06-30 | 2023-08-01 | Xiamen Weiya Intelligent Technology Co., Ltd. | Point cloud segmentation method, apparatus, and device combining edge points and deep networks |
CN116704137A (zh) * | 2023-07-27 | 2023-09-05 | Shandong University of Science and Technology | Deep learning-based reverse modeling method for offshore oil drilling platform point clouds |
CN116704137B (zh) * | 2023-07-27 | 2023-10-24 | Shandong University of Science and Technology | Deep learning-based reverse modeling method for offshore oil drilling platform point clouds |
CN116993728B (zh) * | 2023-09-26 | 2023-12-01 | China Railway Water Conservancy Information Technology Co., Ltd. | Dam crack monitoring system and method based on point cloud data |
CN116993728A (zh) * | 2023-09-26 | 2023-11-03 | China Railway Water Conservancy Information Technology Co., Ltd. | Dam crack monitoring system and method based on point cloud data |
CN117473105B (zh) * | 2023-12-28 | 2024-04-05 | Inspur Electronic Information Industry Co., Ltd. | Three-dimensional content generation method based on multimodal pre-trained models, and related components |
CN117473105A (zh) * | 2023-12-28 | 2024-01-30 | Inspur Electronic Information Industry Co., Ltd. | Three-dimensional content generation method based on multimodal pre-trained models, and related components |
CN117496309A (zh) * | 2024-01-03 | 2024-02-02 | Huazhong University of Science and Technology | Uncertainty evaluation method, system, and electronic device for building scene point cloud segmentation |
CN117496309B (zh) * | 2024-01-03 | 2024-03-26 | Huazhong University of Science and Technology | Uncertainty evaluation method, system, and electronic device for building scene point cloud segmentation |
CN117541799B (zh) * | 2024-01-09 | 2024-03-08 | Sichuan University | Large-scale point cloud semantic segmentation method based on online random forest model reuse |
CN117541799A (zh) * | 2024-01-09 | 2024-02-09 | Sichuan University | Large-scale point cloud semantic segmentation method based on online random forest model reuse |
CN117576786B (zh) * | 2024-01-16 | 2024-04-16 | Peking University Shenzhen Graduate School | Training method for three-dimensional human action recognition networks based on vision-language models |
CN117576786A (zh) * | 2024-01-16 | 2024-02-20 | Peking University Shenzhen Graduate School | Training method for three-dimensional human action recognition networks based on vision-language models |
CN117710977A (zh) * | 2024-02-02 | 2024-03-15 | Southwest Petroleum University | Method and system for rapid semantic extraction of dam BIM three-dimensional models based on point cloud data |
CN117710977B (zh) * | 2024-02-02 | 2024-04-26 | Southwest Petroleum University | Method and system for rapid semantic extraction of dam BIM three-dimensional models based on point cloud data |
CN118135289A (zh) * | 2024-02-02 | 2024-06-04 | Qinghai Normal University | Three-dimensional point cloud processing method and apparatus, storage medium, and device |
CN118096756A (zh) * | 2024-04-26 | 2024-05-28 | Nanjing University of Aeronautics and Astronautics | Concentricity detection method for traction-braided core molds based on three-dimensional point clouds |
CN118247512A (zh) * | 2024-05-27 | 2024-06-25 | Huazhong University of Science and Technology | Point cloud semantic segmentation model construction method and point cloud semantic segmentation method |
Also Published As
Publication number | Publication date |
---|---|
CN112287939A (zh) | 2021-01-29 |
CN112287939B (zh) | 2024-05-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022088676A1 (zh) | Three-dimensional point cloud semantic segmentation method, apparatus, device, and medium | |
US11810377B2 (en) | Point cloud segmentation method, computer-readable storage medium, and computer device | |
CN110322453B (zh) | 3D point cloud semantic segmentation method based on positional attention and auxiliary networks | |
CN112990010B (zh) | Point cloud data processing method, apparatus, computer device, and storage medium | |
CN111028327B (zh) | Three-dimensional point cloud processing method, apparatus, and device | |
CN112966696A (zh) | Method, apparatus, device, and storage medium for processing three-dimensional point clouds | |
CN114255238A (zh) | Three-dimensional point cloud scene segmentation method and system fusing image features | |
US20220292728A1 (en) | Point cloud data processing method and device, computer device, and storage medium | |
CN115170746B (zh) | Deep learning-based multi-view three-dimensional reconstruction method, system, and device | |
US11615612B2 (en) | Systems and methods for image feature extraction | |
Zhang et al. | Learning rotation-invariant representations of point clouds using aligned edge convolutional neural networks | |
CN114299405A (zh) | Real-time object detection method for UAV images | |
CN114120067A (zh) | Object recognition method, apparatus, device, and medium | |
Ahmad et al. | 3D capsule networks for object classification from 3D model data | |
CN117237623B (zh) | Semantic segmentation method and system for UAV remote sensing images | |
Wang et al. | Calyolov4: lightweight yolov4 target detection based on coordinated attention | |
CN111860668B (zh) | Point cloud recognition method using deep convolutional networks for raw 3D point cloud processing | |
CN117671666A (zh) | Target recognition method based on adaptive graph convolutional neural networks | |
CN117253222A (zh) | Natural scene text detection method and apparatus based on a multi-level information fusion mechanism | |
CN117132964A (zh) | Model training method, point cloud encoding method, object processing method, and apparatus | |
CN116452940A (zh) | Three-dimensional instance segmentation method and system based on fusion of dense and sparse convolutions | |
CN109583584B (zh) | Method and system enabling CNNs with fully connected layers to accept inputs of variable shape | |
CN112990336B (zh) | Construction method for deep three-dimensional point cloud classification networks based on competitive attention fusion | |
CN112614199B (zh) | Semantic segmentation image conversion method, apparatus, computer device, and storage medium | |
Qiang et al. | Hierarchical point cloud transformer: A unified vegetation semantic segmentation model for multisource point clouds based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23/06/2023) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21884419 Country of ref document: EP Kind code of ref document: A1 |