CN112149677A - Point cloud semantic segmentation method, device and equipment - Google Patents

Point cloud semantic segmentation method, device and equipment

Info

Publication number
CN112149677A
CN112149677A (application CN202010963729.4A)
Authority
CN
China
Prior art keywords
point cloud
cloud data
semantic segmentation
data
voxelized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010963729.4A
Other languages
Chinese (zh)
Inventor
魏宇飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Eye Control Technology Co Ltd
Original Assignee
Shanghai Eye Control Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Eye Control Technology Co Ltd filed Critical Shanghai Eye Control Technology Co Ltd
Priority to CN202010963729.4A
Publication of CN112149677A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Optical Radar Systems And Details Thereof (AREA)

Abstract

A method, apparatus and device for point cloud semantic segmentation: point cloud data to be detected is voxelized to obtain voxelized point cloud data; the voxelized point cloud data is processed with a preset point cloud semantic segmentation model to determine a semantic segmentation result; the semantic segmentation result is decoded to determine a decoding result, and the decoding result is computed from the index of each of its points along each dimension to obtain target point cloud data; finally, all the point cloud data to be detected is classified according to the target point cloud data to determine its category information. This reduces the computational cost of semantic segmentation while ensuring the accuracy and completeness of the segmentation results in all scenes.

Description

Point cloud semantic segmentation method, device and equipment
Technical Field
The present application relates to the field of computers, and in particular, to a method, an apparatus, and a device for point cloud semantic segmentation.
Background
A semantic segmentation algorithm based on 2D images partitions the pixels of a picture into regions belonging to different targets. Similarly, a semantic segmentation algorithm based on 3D point clouds partitions the points in a scene into regions belonging to different targets. Existing point cloud semantic segmentation algorithms include PointNet, PointNet++, PointSIFT, etc. These methods process the point cloud data directly and output a semantic segmentation category for each point. Although they achieve high segmentation accuracy, processing the point cloud directly is computationally expensive, which limits their practical application.
Disclosure of Invention
An object of the present application is to provide a method, an apparatus and a device for point cloud semantic segmentation that address the excessive computational cost of directly processing point cloud data in the prior art.
According to one aspect of the application, a method for point cloud semantic segmentation is provided, and the method comprises the following steps:
performing voxelization processing on the point cloud data to be detected to obtain voxelized point cloud data;
calculating the voxelized point cloud data by using a preset point cloud semantic segmentation model to determine a semantic segmentation result;
decoding the semantic segmentation result to determine a decoding result, and calculating the decoding result based on the index of each point of the decoding result in each dimension direction to obtain target point cloud data;
and classifying all point cloud data to be detected according to the target point cloud data to determine the category information of all point cloud data to be detected.
Further, before the calculating the voxelized point cloud data by using a preset point cloud semantic segmentation model to determine a semantic segmentation result, the method comprises the following steps:
acquiring training data, and performing voxelization processing on the training data to obtain voxelized training data;
and training the point cloud semantic segmentation model by using the voxelized training data to obtain a preset point cloud semantic segmentation model.
Further, the training of the point cloud semantic segmentation model using the voxelized training data comprises:
classifying the voxelized training data to determine a plurality of category information, generating a label according to the category information and labeling the voxelized training data with the label;
calculating marked voxelized training data by using a high-resolution network to determine a training feature map;
and coding the label to obtain a coded label, and calculating the cross entropy loss of the coded label and the training feature map to train a point cloud semantic segmentation model.
Further, the calculating the voxelized point cloud data by using a preset point cloud semantic segmentation model to determine a semantic segmentation result comprises:
calculating the voxelized point cloud data by using a preset point cloud semantic segmentation model to determine a feature map;
and determining a semantic segmentation result according to the feature map.
Further, the decoding the semantic segmentation result to determine a decoding result includes:
and decoding the semantic segmentation result by using one-hot coding to determine a decoding result.
Further, the calculating the decoding result based on the index of each point of the decoding result in each dimension direction to obtain the target point cloud data includes:
screening all points in the decoding result according to the numerical value corresponding to each point to determine a target point;
taking the voxel serial numbers of the target points on all axes as indexes, and generating an index matrix according to the indexes;
and calculating the index matrix to obtain target point cloud data.
Further, the target point cloud data includes a central point coordinate of the target point cloud, and the step of classifying all the point cloud data to be detected according to the target point cloud data to determine category information of all the point cloud data to be detected includes:
calculating a coordinate threshold of the target point cloud according to the center point coordinates of the target point cloud;
and classifying all point cloud data to be detected according to the coordinate threshold value to determine the category information of all point cloud data to be detected.
Further, classifying all point cloud data to be detected according to the coordinate threshold to determine category information of all point cloud data to be detected, including:
screening point cloud data to be detected in the target point cloud space according to the coordinate threshold;
and classifying the point cloud data to be detected in the target point cloud space into target categories.
According to another aspect of the present application, there is also provided an apparatus for point cloud semantic segmentation, wherein the apparatus includes:
the data processing module is used for carrying out voxelization processing on the point cloud data to be detected to obtain voxelized point cloud data;
the identification module is used for calculating the voxelized point cloud data by using a preset point cloud semantic segmentation model so as to determine a semantic segmentation result;
the decoding module is used for decoding the semantic segmentation result to determine a decoding result, and calculating the decoding result based on the index of each point of the decoding result in each dimension direction to obtain target point cloud data;
and the classification module is used for classifying all point cloud data to be detected according to the target point cloud data so as to determine the category information of all point cloud data to be detected.
According to yet another aspect of the application, there is also provided a computer readable medium having computer readable instructions stored thereon, the computer readable instructions being executable by a processor to implement the method of any of the preceding claims.
According to yet another aspect of the present application, there is also provided an apparatus for point cloud semantic segmentation, wherein the apparatus comprises:
one or more processors; and
a memory storing computer readable instructions that, when executed, cause the processor to perform operations of any of the methods described above.
Compared with the prior art, the present application voxelizes the point cloud data to be detected to obtain voxelized point cloud data; processes the voxelized point cloud data with a preset point cloud semantic segmentation model to determine a semantic segmentation result; decodes the semantic segmentation result to determine a decoding result, and computes the decoding result from the index of each of its points along each dimension to obtain target point cloud data; and classifies all the point cloud data to be detected according to the target point cloud data to determine its category information. This reduces the computational cost of semantic segmentation while ensuring the accuracy and completeness of the segmentation results in all scenes.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 illustrates a flow diagram of a method of point cloud semantic segmentation provided in accordance with an aspect of the present application;
FIG. 2 is a schematic diagram of a vehicle labeling in a practical application scenario in a preferred embodiment of the present application;
FIG. 3 is a schematic diagram of three-dimensional spatial point cloud data and annotation collected by a laser radar in a preferred embodiment of the present application;
FIG. 4 illustrates a schematic flow diagram of a high resolution network in a preferred embodiment of the present application;
fig. 5 shows a schematic diagram of a framework structure of an apparatus for point cloud semantic segmentation provided according to another aspect of the present application.
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
The present application is described in further detail below with reference to the attached figures.
In a typical configuration of the present application, the terminal, the device serving the network, and the trusted party each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
Fig. 1 shows a schematic flow chart of a method for point cloud semantic segmentation provided according to an aspect of the present application. The method includes steps S11 to S14. In step S11, the point cloud data to be detected is voxelized to obtain voxelized point cloud data. In step S12, the voxelized point cloud data is processed with a preset point cloud semantic segmentation model to determine a semantic segmentation result. In step S13, the semantic segmentation result is decoded to determine a decoding result, and the decoding result is computed from the index of each of its points along each dimension to obtain target point cloud data. In step S14, all the point cloud data to be detected is classified according to the target point cloud data to determine its category information. This reduces the computational cost of semantic segmentation while ensuring the accuracy and completeness of the segmentation results in all scenes.
Specifically, in step S11, the point cloud data to be detected is voxelized to obtain voxelized point cloud data. Voxelization divides a continuous space into a uniform grid of cells, discretizing it. Voxelizing the point cloud data to be detected yields point cloud data composed of voxels, i.e., voxelized point cloud data, which is binary voxel data containing only 0s and 1s.
Fig. 2 shows a vehicle labeling diagram in an actual application scenario in a preferred embodiment of the present application, and Fig. 3 shows three-dimensional space point cloud data and labeling acquired by a laser radar in a preferred embodiment. Here, the scene of Fig. 2 is scanned by a laser radar to obtain Fig. 3. The three-dimensional space point cloud data is preferably KITTI point cloud data, which is distributed in the space spanned by the x-axis (0, 70.4), the y-axis (-40, 40) and the z-axis (-3, 1). The space is divided into 1408 parts along the x-axis, 1600 parts along the y-axis and 40 parts along the z-axis, yielding 1600 × 1408 × 40 small cells of length 0.05, width 0.05 and height 0.1, i.e., voxels. Note that voxelization amounts to setting a custom resolution: the user can freely choose the resolution as needed, for example according to precision requirements or to the available computing resources.
In the above embodiment, an all-zero matrix of shape 1600 × 1408 × 40 is first generated. Each position in the matrix corresponds one-to-one to a small cell of the point cloud space. Each position is then assigned a value: if any point falls inside a voxel, i.e., its coordinates lie within the cell's coordinate range, the voxel is considered to contain point cloud. For example, if the spatial range of a voxel is 0.1 < x ≤ 0.2, 0.0 < y ≤ 0.1 and 0.2 < z ≤ 0.3, and the coordinates of any point fall within that region, a point cloud is deemed to exist in the voxel. The matrix position corresponding to such a voxel is set to 1, and to 0 otherwise. This yields three-dimensional voxel data in which positions containing point cloud are 1 and all other positions are 0. Although the exact coordinates of the points are discarded during voxelization, the geometric information of the point cloud is fully preserved and can represent the shape and position of objects in the scene, thereby simplifying computation and reducing resource consumption without affecting subsequent accuracy.
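The voxelization described above can be sketched in NumPy. The function name and argument defaults are illustrative, with the KITTI-style ranges and the 1600 × 1408 × 40 grid taken from this embodiment:

```python
import numpy as np

def voxelize(points, x_range=(0.0, 70.4), y_range=(-40.0, 40.0),
             z_range=(-3.0, 1.0), grid=(1600, 1408, 40)):
    """Turn an (N, 3) array of (x, y, z) points into a binary occupancy grid.

    Grid axes are ordered (y, x, z) to match the 1600 x 1408 x 40 layout
    of this embodiment; points outside the ranges are dropped.
    """
    ny, nx, nz = grid
    sy = (y_range[1] - y_range[0]) / ny   # 80 / 1600  = 0.05
    sx = (x_range[1] - x_range[0]) / nx   # 70.4 / 1408 = 0.05
    sz = (z_range[1] - z_range[0]) / nz   # 4 / 40      = 0.1
    vox = np.zeros(grid, dtype=np.uint8)  # the all-zero matrix
    x, y, z = points.T
    keep = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    iy = ((y[keep] - y_range[0]) / sy).astype(int)
    ix = ((x[keep] - x_range[0]) / sx).astype(int)
    iz = ((z[keep] - z_range[0]) / sz).astype(int)
    vox[iy, ix, iz] = 1  # any point inside a voxel marks it occupied
    return vox
```

A voxel receives a 1 as soon as a single point lands in it, matching the assignment rule above.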
In step S12, the voxelized point cloud data is processed with a preset point cloud semantic segmentation model to determine a semantic segmentation result. The preset model is preferably a trained three-dimensional (3D) point cloud semantic segmentation model, which computes on the voxelized point cloud data to obtain data corresponding to the point cloud shape, i.e., the semantic segmentation result.
In step S13, the semantic segmentation result is decoded to determine a decoding result, and the decoding result is computed from the index of each of its points along each dimension to obtain target point cloud data. Here, the index is the ordinal number of a voxel along an axis: for example, if the space is divided into 1600 voxels along the y-axis, the first voxel has index 0, the second index 1, and so on. Computing on the decoding result using these per-dimension indexes determines the coordinate range of each voxel, from which the target point cloud data is obtained.
In step S14, all the point cloud data to be detected is classified according to the target point cloud data to determine its category information. Specifically, each point to be detected is classified against the coordinate range of the voxel space corresponding to the target point cloud data: when a point's coordinates fall within that range, it is assigned the category of the target point cloud data, and so on until the category information of all points to be detected has been determined.
In a preferred embodiment of the application, before the voxelized point cloud data is processed with a preset point cloud semantic segmentation model to determine a semantic segmentation result, training data is acquired and voxelized to obtain voxelized training data, and the point cloud semantic segmentation model is trained with it to obtain the preset model. The preset model is preferably a trained 3D point cloud semantic segmentation model, and the training data may be KITTI point cloud data: after the KITTI point cloud data is voxelized, the 3D point cloud semantic segmentation model is trained on the voxelized KITTI data to obtain the trained model.
In a preferred embodiment of the present application, the voxelized training data is classified to determine a plurality of category information, labels are generated from the category information, and the voxelized training data is labeled with them; the labeled voxelized training data is processed with a high-resolution network to determine a training feature map; and the labels are encoded, after which the cross-entropy loss between the encoded labels and the training feature map is computed to train the point cloud semantic segmentation model. Here, the voxelized training data is divided into several categories to obtain the category information, and labels are generated accordingly. For example, under a binary classification where the category of interest is "vehicle", classifying the training data yields the points belonging to vehicles and the points not belonging to vehicles; the label of a vehicle point is set to 1 and that of a non-vehicle point to 0. An all-zero matrix of the same size as the voxelized training data can be generated in advance; if a voxel contains a point belonging to a vehicle, the corresponding matrix position is set to 1, completing the labeling of the vehicle point cloud data.
Next, the labeled voxelized training data is processed with a high-resolution network (HRNet) to determine a training feature map. HRNet maintains high-resolution features throughout. High-resolution features retain more local detail of the point cloud, which helps in judging segmentation boundaries. HRNet's parallel multi-scale branches extract features at several scales simultaneously, and its cross-branch information-exchange mechanism repeatedly fuses them. Extracting and fusing multi-scale features helps the network handle objects of different sizes and capture global features over different spatial extents; global features support the judgment of segmentation categories. A final fusion layer merges the features of different scales, and this fusion of local and global features benefits the final segmentation decision.
Fig. 4 shows a schematic flow diagram of the high-resolution network in a preferred embodiment of the present application. The voxelized training data has size 1600 × 1408 × 40 and is classified into voxels belonging to vehicles and voxels not belonging to vehicles, where 40 is the height dimension. Since color image data conventionally treats the third dimension as the channel dimension, the spatial size is taken to be 1600 × 1408 with 40 channels. The training data is fed into HRNet for feature extraction, producing depth feature data of size 400 × 352 × 128. After 4× upsampling, the depth feature size becomes 1600 × 1408 × 128. A convolutional layer then adjusts the number of channels to 80, yielding a first feature map of size 1600 × 1408 × 80, where 80 corresponds to the 40 channels of the binary-classified voxelized training data times the 2 classes. Finally, the first feature map is reshaped into a feature map of size 1600 × 1408 × 40 × 2, whose last dimension is determined by one-hot encoding.
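The final shape-resetting step can be illustrated with a scaled-down NumPy example: a hypothetical 4 × 4 grid with 6 channels stands in for 1600 × 1408 with 80, and the channel axis is split into (height × classes):

```python
import numpy as np

# Scaled-down stand-in for the 1600 x 1408 x 80 first feature map:
# 4 x 4 spatial grid, 6 channels split into height 3 times 2 classes.
first_map = np.arange(4 * 4 * 6, dtype=np.float32).reshape(4, 4, 6)

# "Shape resetting": split the channel axis into (height, classes),
# mirroring 1600 x 1408 x 80 -> 1600 x 1408 x 40 x 2.
seg_map = first_map.reshape(4, 4, 3, 2)
```

The reshape only regroups the channel values; no data is changed, just as in the full-size 80 → 40 × 2 split.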
After the label is encoded, e.g., one-hot encoded, its cross-entropy loss against the training feature map is computed to train the point cloud semantic segmentation model. For example, the cross-entropy loss can be optimized with stochastic gradient descent: the initial learning rate is set to 0.001, the learning rate is multiplied by 0.1 every 10 epochs, and training finishes after 100 epochs.
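A minimal NumPy sketch of the one-hot encoding, the cross-entropy loss, and the step-wise learning-rate schedule described above (all names are illustrative; in practice the loss would be optimized inside an SGD training loop over the feature-map logits):

```python
import numpy as np

def one_hot(labels, num_classes=2):
    """One-hot encode an integer label volume along a new last axis."""
    return np.eye(num_classes, dtype=np.float32)[labels]

def cross_entropy(pred_logits, labels_1hot, eps=1e-9):
    """Mean cross-entropy between softmax(pred_logits) and one-hot labels."""
    e = np.exp(pred_logits - pred_logits.max(axis=-1, keepdims=True))
    probs = e / e.sum(axis=-1, keepdims=True)
    return float(-(labels_1hot * np.log(probs + eps)).sum(axis=-1).mean())

def learning_rate(epoch, base=1e-3):
    """Start at 0.001 and multiply by 0.1 every 10 epochs, as above."""
    return base * 0.1 ** (epoch // 10)
```

A confident, correct prediction (large logit on the true class) drives the loss toward zero, which is what the optimizer exploits during training.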
In a preferred embodiment of the present application, in step S12, the voxelized point cloud data is processed with a preset point cloud semantic segmentation model to determine a feature map, and a semantic segmentation result is determined from the feature map. For example, a trained 3D point cloud semantic segmentation model computes a feature map from the voxelized point cloud data and then derives the semantic segmentation result from that feature map.
In a preferred embodiment of the present application, in step S13, the semantic segmentation result is decoded with respect to its one-hot encoding to determine a decoding result. The data values of the semantic segmentation result lie in the range 0 to 1, and the last dimension of its size results from one-hot encoding, so one-hot decoding is performed first. For example, one-hot decoding a semantic segmentation result of size 1600 × 1408 × 40 × 2 yields data of shape 1600 × 1408 × 40. Note that under binary classification each position of the decoded result takes one of two values, e.g., 0 or 1: 0 indicates the position does not belong to a vehicle, and 1 indicates that it does.
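One plausible implementation of this one-hot decoding is an argmax over the last (class) axis; a miniature example with a hypothetical 2 × 2 × 1 × 2 segmentation result:

```python
import numpy as np

# Hypothetical miniature segmentation result: 2 x 2 x 1 grid, 2 classes,
# standing in for the full 1600 x 1408 x 40 x 2 output.
seg = np.array([[[[0.9, 0.1]], [[0.2, 0.8]]],
                [[[0.3, 0.7]], [[0.6, 0.4]]]])  # shape (2, 2, 1, 2)

# One-hot decode: collapse the class axis, leaving 0 (not vehicle)
# or 1 (vehicle) at each voxel position.
decoded = seg.argmax(axis=-1)
```

The decoded volume has the class axis removed, matching the 1600 × 1408 × 40 shape described above.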
In a preferred embodiment of the present application, in step S13, all points in the decoding result are screened according to their values to determine target points; the voxel numbers of the target points on each axis are used as indexes, from which an index matrix is generated; and the target point cloud data is computed from the index matrix. The value at each position of the decoding result distinguishes its category: for example, 1 means the point belongs to a vehicle and 0 means it does not. All points in the decoding result are classified by value to determine the target points; for example, if the target points are vehicle points, the points with value 1 are screened out as belonging to a vehicle. Note that 0 and 1 are merely an example of two categories; in practical applications, multiple categories can be used, with a distinct value assigned to each.
Then the voxel number of each target point on each axis is used as an index. The voxel number is the ordinal of the voxel containing the target point along each axis: for example, if the space is divided into 1600 parts along the y-axis, the first voxel has index 0, the second index 1, and so on. The corresponding point cloud coordinate can then be recovered from the voxel's starting position and the coordinate range given by the voxel size. An index matrix is generated from these indexes, and the target point cloud data is computed from it.
In a preferred embodiment of the present application, the decoded point cloud semantic segmentation result has size 1600 × 1408 × 40, and all points with value 1 belong to a vehicle. The indexes of all value-1 points along each dimension are combined into an N × 3 index matrix, each row of which is

(index_y, index_x, index_z)

where N ranges over [0, 1600 × 1408 × 40]; index_y is the y-axis index of a point with value 1, ranging from 0 to 1599; index_x is its x-axis index, ranging from 0 to 1407; and index_z is its z-axis index, ranging from 0 to 39.
x = (index_x + 0.5) × 0.05 + 0
y = (index_y + 0.5) × 0.05 − 40
z = (index_z + 0.5) × 0.1 − 3
Next, the three equations above are applied to each row of the N × 3 index matrix to compute the center point coordinates (x, y, z) of the target point cloud data. In this way, the computational cost of semantic segmentation is reduced and processing efficiency is improved.
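The index-to-coordinate conversion can be sketched as follows (the function name is illustrative; the constants follow the three equations and the 0.05 / 0.05 / 0.1 voxel sizes of this embodiment):

```python
import numpy as np

def indices_to_centers(index_matrix):
    """Map an N x 3 matrix of (index_y, index_x, index_z) voxel indexes
    to voxel-center coordinates (x, y, z), per the three equations above."""
    iy, ix, iz = index_matrix[:, 0], index_matrix[:, 1], index_matrix[:, 2]
    x = (ix + 0.5) * 0.05 + 0    # x-axis starts at 0,   voxel size 0.05
    y = (iy + 0.5) * 0.05 - 40   # y-axis starts at -40, voxel size 0.05
    z = (iz + 0.5) * 0.1 - 3     # z-axis starts at -3,  voxel size 0.1
    return np.stack([x, y, z], axis=1)
```

The index matrix itself could be built with `np.argwhere(decoded == 1)` on the decoded result, so the whole step stays vectorized.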
In a preferred embodiment of the present application, in step S14, the target point cloud data includes the center point coordinates of the target point cloud; a coordinate threshold of the target point cloud is computed from these center point coordinates, and all point cloud data to be detected is classified against the threshold to determine its category information. The center point coordinates are those of the voxels containing the target points. The coordinate threshold is computed from all the center point coordinates, for example from the centers of all voxels belonging to a vehicle; when a point to be detected lies within the coordinate threshold of the target point cloud, it is determined to belong to a vehicle, and its category information is "vehicle".
In a preferred embodiment of the present application, in step S14, the point cloud data to be detected that lies in the target point cloud space is screened out according to the coordinate threshold and classified into the target category. The target point cloud spatial range is determined by the coordinate threshold: when the coordinates of a point to be detected fall within the threshold, the point lies in the target point cloud space and is assigned the target category. For example, if the target category is "vehicle", a point to be detected within the coordinate threshold of the target point cloud is determined to belong to a vehicle, and its category information is "vehicle".
In a preferred embodiment of the present application, whether each point to be detected belongs to a vehicle is determined from the center point coordinates of the voxels belonging to the vehicle. For a voxel with a length of 0.05, a width of 0.05, and a height of 0.1, it is checked whether the coordinate values of the point to be detected fall within the coordinate range spanned by the voxel space: if so, the point is judged to belong to the vehicle; if not, it does not. After all points to be detected are processed in this way, each point is marked with a segmentation category, completing the classification of all point cloud data to be detected.
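The per-voxel containment check described above can be sketched as follows; this is an illustration only, with hypothetical function names, assuming each voxel's extent is its center plus or minus half its size along each axis:

```python
def point_in_voxel(point, center, size=(0.05, 0.05, 0.1)):
    """True if the point's coordinates fall within the spatial extent of
    the voxel, defined by its center and its length/width/height."""
    return all(abs(p - c) <= s / 2.0 for p, c, s in zip(point, center, size))

def label_points(points, vehicle_centers, size=(0.05, 0.05, 0.1)):
    """Mark each point to be detected as vehicle or not by testing it
    against every voxel center that belongs to the vehicle."""
    labels = []
    for p in points:
        hit = any(point_in_voxel(p, c, size) for c in vehicle_centers)
        labels.append("vehicle" if hit else "not vehicle")
    return labels
```

For a voxel centered at the origin, a point within 0.025 of the center in x and y and 0.05 in z is judged to belong to the vehicle.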
Fig. 5 is a schematic diagram illustrating the framework of an apparatus for point cloud semantic segmentation according to another aspect of the present application, wherein the apparatus includes: a data processing module 100, configured to perform voxelization processing on the point cloud data to be detected to obtain voxelized point cloud data; an identification module 200, configured to calculate the voxelized point cloud data using a preset point cloud semantic segmentation model to determine a semantic segmentation result; a decoding module 300, configured to decode the semantic segmentation result to determine a decoding result, and to calculate the decoding result based on the index of each point of the decoding result in each dimension direction to obtain target point cloud data; and a classification module 400, configured to classify all point cloud data to be detected according to the target point cloud data to determine their category information. This reduces the computational cost of semantic segmentation while ensuring the accuracy and completeness of the point cloud semantic segmentation results in all scenes.
It should be noted that the content executed by the data processing module 100, the identification module 200, the decoding module 300, and the classification module 400 is the same as, or corresponds to, the content of steps S11, S12, S13, and S14 above; for brevity, it is not repeated here.
Furthermore, a computer-readable medium is provided on which computer-readable instructions are stored, the instructions being executable by a processor to implement the aforementioned method of point cloud semantic segmentation.
According to still another aspect of the present application, there is also provided an apparatus for point cloud semantic segmentation, wherein the apparatus includes:
one or more processors; and
a memory having computer-readable instructions stored thereon that, when executed, cause the one or more processors to perform the operations of the method for point cloud semantic segmentation described above.
For example, the computer readable instructions, when executed, cause the one or more processors to:
performing voxelization processing on the point cloud data to be detected to obtain voxelized point cloud data; calculating the voxelized point cloud data by using a preset point cloud semantic segmentation model to determine a semantic segmentation result; decoding the semantic segmentation result to determine a decoding result, and calculating the decoding result based on the index of each point of the decoding result in each dimension direction to obtain target point cloud data; and classifying all point cloud data to be detected according to the target point cloud data to determine the category information of all point cloud data to be detected.
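The four steps above can be sketched end to end as follows. This is a simplified illustration, not the claimed implementation: the "model" is omitted and replaced by an arbitrary per-voxel score tensor, voxelization is reduced to a binary occupancy grid, and decoding is the argmax inverse of one-hot coding; all names are hypothetical:

```python
import numpy as np

def voxelize(points, voxel_size, grid_shape):
    """Occupancy-grid voxelization of raw points (a simplified stand-in
    for the feature encoding consumed by the segmentation model)."""
    grid = np.zeros(grid_shape, dtype=np.float32)
    idx = np.floor(np.asarray(points, dtype=float) /
                   np.asarray(voxel_size, dtype=float)).astype(int)
    idx = idx[np.all((idx >= 0) & (idx < grid_shape), axis=1)]  # clip out-of-range
    grid[tuple(idx.T)] = 1.0
    return grid

def decode_and_index(scores, target_class, voxel_size):
    """Argmax-decode per-voxel class scores, then use each target voxel's
    serial number (index) along every axis to build an index matrix and
    recover the voxel center coordinates (the target point cloud data)."""
    labels = scores.argmax(axis=-1)                 # inverse of one-hot coding
    index_matrix = np.argwhere(labels == target_class)
    centers = (index_matrix + 0.5) * np.asarray(voxel_size, dtype=float)
    return index_matrix, centers
```

In a full pipeline the score tensor would come from the preset point cloud semantic segmentation model applied to the voxelized data, and the recovered centers would then feed the coordinate-threshold classification of step S14.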
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware, for example, implemented using Application Specific Integrated Circuits (ASICs), general purpose computers or any other similar hardware devices. In one embodiment, the software programs of the present application may be executed by a processor to implement the steps or functions described above. Likewise, the software programs (including associated data structures) of the present application may be stored in a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. Additionally, some of the steps or functions of the present application may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
In addition, some of the present application may be implemented as a computer program product, such as computer program instructions, which when executed by a computer, may invoke or provide methods and/or techniques in accordance with the present application through the operation of the computer. Program instructions which invoke the methods of the present application may be stored on a fixed or removable recording medium and/or transmitted via a data stream on a broadcast or other signal-bearing medium and/or stored within a working memory of a computer device operating in accordance with the program instructions. An embodiment according to the present application comprises an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform a method and/or a solution according to the aforementioned embodiments of the present application.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Claims (11)

1. A method of point cloud semantic segmentation, wherein the method comprises:
performing voxelization processing on the point cloud data to be detected to obtain voxelized point cloud data;
calculating the voxelized point cloud data by using a preset point cloud semantic segmentation model to determine a semantic segmentation result;
decoding the semantic segmentation result to determine a decoding result, and calculating the decoding result based on the index of each point of the decoding result in each dimension direction to obtain target point cloud data;
and classifying all point cloud data to be detected according to the target point cloud data to determine the category information of all point cloud data to be detected.
2. The method of claim 1, wherein prior to said calculating the voxelized point cloud data using a preset point cloud semantic segmentation model to determine a semantic segmentation result, the method comprises:
acquiring training data, and performing voxelization processing on the training data to obtain voxelized training data;
and training the point cloud semantic segmentation model by using the voxelized training data to obtain a preset point cloud semantic segmentation model.
3. The method of claim 2, wherein the training of the point cloud semantic segmentation model using the voxelized training data comprises:
classifying the voxelized training data to determine a plurality of category information, generating a label according to the category information and labeling the voxelized training data with the label;
calculating marked voxelized training data by using a high-resolution network to determine a training feature map;
and coding the label to obtain a coded label, and calculating the cross entropy loss of the coded label and the training feature map to train a point cloud semantic segmentation model.
4. The method of claim 1, wherein the computing the voxelized point cloud data using a preset point cloud semantic segmentation model to determine a semantic segmentation result comprises:
calculating the voxelized point cloud data by using a preset point cloud semantic segmentation model to determine a feature map;
and determining a semantic segmentation result according to the feature map.
5. The method of claim 1, wherein the decoding the semantic segmentation result to determine a decoding result comprises:
and decoding the semantic segmentation result by using one-hot coding to determine a decoding result.
6. The method of claim 1, wherein the calculating the decoding result based on the index of each point of the decoding result in each dimension direction to obtain the target point cloud data comprises:
screening all points in the decoding result according to the numerical value corresponding to each point to determine a target point;
taking the voxel serial numbers of the target points on all axes as indexes, and generating an index matrix according to the indexes;
and calculating the index matrix to obtain target point cloud data.
7. The method of claim 6, wherein the target point cloud data comprises center point coordinates of a target point cloud, and classifying all point cloud data to be detected according to the target point cloud data to determine category information of all point cloud data to be detected comprises:
calculating a coordinate threshold of the target point cloud according to the center point coordinates of the target point cloud;
and classifying all point cloud data to be detected according to the coordinate threshold value to determine the category information of all point cloud data to be detected.
8. The method of claim 7, wherein classifying all point cloud data to be detected according to the coordinate threshold to determine category information of all point cloud data to be detected comprises:
screening point cloud data to be detected in the target point cloud space according to the coordinate threshold;
and classifying the point cloud data to be detected in the target point cloud space into target categories.
9. An apparatus for point cloud semantic segmentation, wherein the apparatus comprises:
the data processing module is used for carrying out voxelization processing on the point cloud data to be detected to obtain voxelized point cloud data;
the identification module is used for calculating the voxelized point cloud data by using a preset point cloud semantic segmentation model so as to determine a semantic segmentation result;
the decoding module is used for decoding the semantic segmentation result to determine a decoding result, and calculating the decoding result based on the index of each point of the decoding result in each dimension direction to obtain target point cloud data;
and the classification module is used for classifying all point cloud data to be detected according to the target point cloud data so as to determine the category information of all point cloud data to be detected.
10. A computer readable medium having computer readable instructions stored thereon which are executable by a processor to implement the method of any one of claims 1 to 8.
11. An apparatus for point cloud semantic segmentation, wherein the apparatus comprises:
one or more processors; and
a memory storing computer readable instructions that, when executed, cause the processor to perform the operations of the method of any of claims 1 to 8.
CN202010963729.4A 2020-09-14 2020-09-14 Point cloud semantic segmentation method, device and equipment Pending CN112149677A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010963729.4A CN112149677A (en) 2020-09-14 2020-09-14 Point cloud semantic segmentation method, device and equipment


Publications (1)

Publication Number Publication Date
CN112149677A true CN112149677A (en) 2020-12-29

Family

ID=73892727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010963729.4A Pending CN112149677A (en) 2020-09-14 2020-09-14 Point cloud semantic segmentation method, device and equipment

Country Status (1)

Country Link
CN (1) CN112149677A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109410307A (en) * 2018-10-16 2019-03-01 大连理工大学 A kind of scene point cloud semantic segmentation method
US20190147250A1 (en) * 2017-11-15 2019-05-16 Uber Technologies, Inc. Semantic Segmentation of Three-Dimensional Data
CN109829399A (en) * 2019-01-18 2019-05-31 武汉大学 A kind of vehicle mounted road scene point cloud automatic classification method based on deep learning
CN111144304A (en) * 2019-12-26 2020-05-12 上海眼控科技股份有限公司 Vehicle target detection model generation method, vehicle target detection method and device
DE102018128531A1 (en) * 2018-11-14 2020-05-14 Valeo Schalter Und Sensoren Gmbh System and method for analyzing a three-dimensional environment represented by a point cloud through deep learning


Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113011430A (en) * 2021-03-23 2021-06-22 中国科学院自动化研究所 Large-scale point cloud semantic segmentation method and system
CN117321438B (en) * 2021-04-14 2024-06-04 利尼芝物流有限责任公司 Point cloud filtering
US12002156B2 (en) 2021-04-14 2024-06-04 Lineage Logistics, LLC Point cloud filtering
CN117321438A (en) * 2021-04-14 2023-12-29 利尼芝物流有限责任公司 Point cloud filtering
CN113375556A (en) * 2021-06-18 2021-09-10 盎锐(上海)信息科技有限公司 Full-stack actual measurement system, measurement method and laser radar
CN113375556B (en) * 2021-06-18 2024-06-04 盎锐(杭州)信息科技有限公司 Full stack type actual measurement real quantity system, measurement method and laser radar
CN113569856A (en) * 2021-07-13 2021-10-29 盎锐(上海)信息科技有限公司 Model semantic segmentation method for actual measurement actual quantity and laser radar
CN113569856B (en) * 2021-07-13 2024-06-04 盎锐(杭州)信息科技有限公司 Model semantic segmentation method for actual measurement and laser radar
CN114638953A (en) * 2022-02-22 2022-06-17 深圳元戎启行科技有限公司 Point cloud data segmentation method and device and computer readable storage medium
CN114638953B (en) * 2022-02-22 2023-12-22 深圳元戎启行科技有限公司 Point cloud data segmentation method and device and computer readable storage medium
CN114387289A (en) * 2022-03-24 2022-04-22 南方电网数字电网研究院有限公司 Semantic segmentation method and device for three-dimensional point cloud of power transmission and distribution overhead line
CN114743001A (en) * 2022-04-06 2022-07-12 合众新能源汽车有限公司 Semantic segmentation method and device, electronic equipment and storage medium
CN114743001B (en) * 2022-04-06 2024-06-25 合众新能源汽车股份有限公司 Semantic segmentation method, semantic segmentation device, electronic equipment and storage medium
CN115131562B (en) * 2022-07-08 2023-06-13 北京百度网讯科技有限公司 Three-dimensional scene segmentation method, model training method, device and electronic equipment
CN115131562A (en) * 2022-07-08 2022-09-30 北京百度网讯科技有限公司 Three-dimensional scene segmentation method, model training method and device and electronic equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination