CN110322453B - 3D point cloud semantic segmentation method based on position attention and auxiliary network - Google Patents

3D point cloud semantic segmentation method based on position attention and auxiliary network

Info

Publication number
CN110322453B
CN110322453B (Application CN201910604264.0A)
Authority
CN
China
Prior art keywords
network
feature
point cloud
training
segmentation
Prior art date
Legal status
Active
Application number
CN201910604264.0A
Other languages
Chinese (zh)
Other versions
CN110322453A (en)
Inventor
焦李成
冯志玺
张格格
杨淑媛
程曦娜
马清华
张杰
郭雨薇
丁静怡
唐旭
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201910604264.0A priority Critical patent/CN110322453B/en
Publication of CN110322453A publication Critical patent/CN110322453A/en
Application granted granted Critical
Publication of CN110322453B publication Critical patent/CN110322453B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds


Abstract

The invention provides a 3D point cloud semantic segmentation method based on position attention and an auxiliary network, which mainly addresses the low segmentation accuracy of the prior art. The implementation scheme is as follows: acquire a training set T and a test set V; construct a 3D point cloud semantic segmentation network and set its loss function, the network comprising a feature downsampling network, a position attention module, a feature upsampling network and an auxiliary network cascaded in sequence; perform P rounds of supervised training on the segmentation network with the training set T, adjusting the network parameters according to the loss function during each round and, after the P rounds are completed, taking the network model with the highest segmentation accuracy as the trained model; input the test set V into the trained network model for semantic segmentation to obtain the segmentation result of each point. The method improves the semantic segmentation accuracy of 3D point clouds and can be used for autonomous driving, robotics, 3D scene reconstruction, quality inspection, 3D mapping and smart city construction.

Description

3D point cloud semantic segmentation method based on position attention and auxiliary network

Technical Field

The present invention belongs to the field of data processing technology and particularly relates to a 3D point cloud semantic segmentation method, which can be used for autonomous driving, robotics, 3D scene reconstruction, quality inspection, 3D mapping and smart city construction.

Background Art

In recent years, with the widespread application of 3D sensors such as LiDAR and RGB-D cameras in robotics and autonomous driving, applying deep learning to 3D point cloud data has become one of the research hotspots. 3D point cloud data is a set of vectors in a three-dimensional coordinate system; these vectors are usually expressed as x, y, z coordinates and are generally used to represent the outer surface shape of an object. Besides the geometric information represented by (x, y, z), a point may also carry RGB color, intensity, grayscale value, depth or number of returns. Point cloud data is usually obtained with 3D scanning devices such as LiDAR or RGB-D cameras: these sensors measure a large number of points on an object's surface in an automated way and then output the point cloud in some data file format. Point cloud data is unordered and unstructured, and its density may vary across 3D space, which makes applying deep learning to 3D point cloud data a significant challenge.

3D point cloud semantic segmentation assigns a category to every point of the input point cloud. In early research work, 3D point cloud data was usually converted into hand-crafted voxel grid features or multi-view image features and then fed into a deep learning network for feature extraction. Such feature conversion not only produces a large amount of data but is also computationally complex, and if the resolution is reduced, the segmentation accuracy drops. It is therefore particularly important to process point cloud data directly with deep learning methods.

In 2017, Charles R. Qi et al. published a paper at CVPR titled "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation", which disclosed a deep learning framework that processes 3D point cloud data directly. The method uses the symmetric max-pooling function to handle the unordered nature of point clouds and thereby extracts global features for the points. However, it only considers global features and ignores the local features of each point. Shortly after PointNet, Charles R. Qi's team published "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space" at NIPS. PointNet++ is a hierarchical version of PointNet in which each layer has three stages: sampling, grouping and feature extraction. It first selects some important points as the centers of the local regions, then selects the k nearest neighbors of each center according to Euclidean distance, treats the k neighbors as a local point cloud whose features are extracted with a PointNet network, and finally propagates the deep features back to obtain the semantic segmentation result of the 3D point cloud. This method improves accuracy over PointNet.

Compared with traditional methods, the above two methods process 3D point cloud data directly, are computationally simple, effectively handle the unordered nature of point clouds and improve segmentation accuracy. However, PointNet++ does not consider the relationships between the features of the center points, i.e., the context information, so its feature representation is relatively weak. In addition, PointNet++ follows the general encoder-decoder framework and does not exploit more of the low-level information. Its segmentation accuracy is therefore not high, and there is still room for improvement.

Summary of the Invention

The purpose of the present invention is to address the deficiencies of the above prior art and propose a 3D point cloud semantic segmentation method based on position attention and an auxiliary network, which uses position attention to associate contextual features and an auxiliary network to reconstruct low-level information, thereby improving segmentation accuracy.

To achieve the above object, the technical solution of the present invention includes the following steps:

(1) Download the training and test files of the 3D point cloud data from the ScanNet official website, perform category statistics and block cropping on them, and obtain a training set T and a test set V;

(2) Construct a 3D point cloud semantic segmentation network comprising a feature downsampling network, a position attention module, a feature upsampling network and an auxiliary network cascaded in sequence;

(3) Use the multi-class cross-entropy loss function as the loss function of the 3D point cloud semantic segmentation network;

(4) Using the training set T, perform P rounds of supervised training on the 3D point cloud semantic segmentation network, P ≥ 500;

(4a) In each round of training, adjust the network parameters according to the loss function of the semantic segmentation network to obtain a network model;

(4b) Every P1 rounds, evaluate the segmentation accuracy of the current network model with samples from the test set; if the segmentation accuracy of the current network model is higher than that of the previously saved model, save it, P1 ≥ 2;

(4c) After the P rounds of training are completed, take the network model with the highest segmentation accuracy as the trained network model;

(5) Input the test set V into the trained network model for semantic segmentation to obtain the segmentation result of each point.

Compared with the prior art, the present invention has the following advantages:

The present invention constructs a 3D point cloud semantic segmentation network whose position attention module computes the correlation between the features represented by the centroids of its input data, adding contextual information to the local centroid features; at the same time, its auxiliary network propagates the low-level features of the network back and reconstructs the low-level information of the network, which effectively improves the accuracy of 3D point cloud semantic segmentation.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an implementation flow chart of the present invention;

FIG. 2 is the overall structure diagram of the 3D point cloud semantic segmentation network constructed in the present invention;

FIG. 3 is a structure diagram of the position attention module in the present invention.

DETAILED DESCRIPTION

The present invention is further described in detail below in conjunction with the accompanying drawings and specific embodiments.

Referring to FIG. 1, the implementation steps of this example are as follows.

Step 1: Obtain the training set T and the test set V.

1.1) Download the training file and test file of the 3D point cloud data from the ScanNet official website, where the training file contains f0 point cloud scenes and the test file contains f1 point cloud scenes; in this embodiment f0 = 1201 and f1 = 312;

1.2) Use a histogram to count the number of points of each category over all f0 scenes in the training file, and compute the weight wk of each category:

[the expression for wk is given as an image formula in the original]

where Gk denotes the number of points of the k-th category, M denotes the total number of points, and L denotes the number of segmentation categories, L ≥ 2; in this embodiment L = 21;
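For illustration, a small NumPy sketch of the per-category counting in step 1.2 is given below. The patent's exact weighting expression appears only as an image in the original filing, so the inverse-log-frequency weight used here is purely an assumed stand-in, not the claimed formula.

```python
import numpy as np

def category_weights(labels, num_classes=21):
    """Count points per category over all training scenes and derive weights.

    `labels` is a 1-D integer array concatenating the labels of all f0 scenes.
    The weighting expression below (inverse log-frequency) is only an assumed
    placeholder; the patent defines its own w_k in terms of G_k, M and L.
    """
    counts = np.bincount(labels, minlength=num_classes).astype(np.float64)  # G_k
    total = counts.sum()                                                    # M
    freq = counts / max(total, 1.0)
    weights = 1.0 / np.log(1.2 + freq)        # assumed stand-in for w_k
    return counts, weights

# Example with random labels standing in for the ScanNet training scenes
rng = np.random.default_rng(0)
fake_labels = rng.integers(0, 21, size=100000)
G_k, w_k = category_weights(fake_labels, num_classes=21)
print(G_k[:5], w_k[:5])
```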

1.3) For each scene in the training file, randomly select a point as the center, with coordinates (x, y, z), and take the points within the ranges (x-0.75, x+0.75), (y-0.75, y+0.75), (z-0.75, z+0.75) around it to form a data block;

1.4) Set the number of sampled points N0 and compare the number of points in the data block obtained in 1.3) with N0 to judge whether the block is valid: if the number of points in the block is greater than N0, the block is judged valid and N0 points are randomly sampled from it to form one sample; otherwise the block is discarded. The training set T is obtained in this way; in this embodiment N0 = 8192;

1.5) For each of the f1 scenes in the test file, use a cubic window of size 1.5 × 1.5 × 3 for sliding-window block cropping; for each data block, randomly sample N0 points to form one sample, obtaining the test set V.
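A minimal NumPy sketch of the block cropping and fixed-size sampling described in steps 1.3 to 1.5 is shown below; the array layout (points as an (M, 3+C) matrix with xyz first), the helper names and the with-replacement sampling for small test windows are assumptions for illustration.

```python
import numpy as np

def crop_training_block(scene_points, half_extent=0.75, n0=8192, rng=None):
    """Randomly pick a centre point, keep points whose x/y/z fall within
    +-half_extent of it, and sample exactly n0 points if the block is valid.

    `scene_points` is an (M, 3+C) array whose first three columns are x, y, z.
    Returns an (n0, 3+C) sample or None when the block holds too few points.
    """
    rng = rng or np.random.default_rng()
    centre = scene_points[rng.integers(len(scene_points)), :3]
    inside = np.all(np.abs(scene_points[:, :3] - centre) <= half_extent, axis=1)
    block = scene_points[inside]
    if len(block) <= n0:          # block judged invalid -> discard
        return None
    choice = rng.choice(len(block), size=n0, replace=False)
    return block[choice]

def sliding_test_blocks(scene_points, window=(1.5, 1.5, 3.0), n0=8192, rng=None):
    """Cut a test scene into cubic windows of the given size and sample n0
    points from every non-empty window (replacement is used when a window
    holds fewer than n0 points, an assumption not stated in the patent)."""
    rng = rng or np.random.default_rng()
    mins = scene_points[:, :3].min(axis=0)
    steps = np.ceil((scene_points[:, :3].max(axis=0) - mins) / window).astype(int)
    for idx in np.ndindex(*np.maximum(steps, 1)):
        lo = mins + np.array(idx) * window
        inside = np.all((scene_points[:, :3] >= lo) &
                        (scene_points[:, :3] < lo + window), axis=1)
        block = scene_points[inside]
        if len(block) == 0:
            continue
        choice = rng.choice(len(block), size=n0, replace=len(block) < n0)
        yield block[choice]
```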

Step 2: Construct the 3D point cloud semantic segmentation network.

Referring to FIG. 2, the 3D point cloud semantic segmentation network constructed in this step includes a feature downsampling network, a position attention module, a feature upsampling network and an auxiliary network cascaded in sequence.

2.1) Set up the feature downsampling network:

The feature downsampling network includes n cascaded PointSA modules; each PointSA module includes a point cloud centroid sampling and grouping layer and a point cloud feature extraction layer cascaded in sequence, where n ≥ 2; in this embodiment n = 4;

For the centroid sampling and grouping layer of the m-th PointSA module, m = 1, 2, ..., n: first, the iterative farthest point sampling method is used to sample a preset number of points from the input point set as centroid points; then, centered at each sampled centroid, a spherical search within a module-specific radius rm gathers a fixed number of neighboring points to form one group [the per-module centroid and group-size counts are given as image formulas in the original]. In this embodiment, the first PointSA module is set with r1 = 0.1, the second with r2 = 0.2, the third with r3 = 0.4 and the fourth with r4 = 0.8;

The point cloud feature extraction layer of the m-th PointSA module consists of three cascaded 2D convolutional layers that extract features from the output of the centroid sampling and grouping layer, and the extracted regional features are pooled with a max-pooling strategy. In this embodiment, the three 2D convolutional layers of the first PointSA module all have 1×1 kernels and stride 1, with 32, 32 and 64 output channels respectively; those of the second PointSA module all have 1×1 kernels and stride 1, with 64, 64 and 128 output channels; those of the third PointSA module all have 1×1 kernels and stride 1, with 128, 128 and 256 output channels; and those of the fourth PointSA module all have 1×1 kernels and stride 1, with 256, 256 and 512 output channels;
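The centroid sampling and grouping layer can be illustrated with the following NumPy sketch of iterative farthest point sampling and spherical (ball) search; the per-module centroid and neighborhood sizes are left as parameters because the patent gives those counts only as image formulas, and the example values below are assumptions.

```python
import numpy as np

def farthest_point_sampling(xyz, n_centroids, rng=None):
    """Iteratively pick the point farthest from the already selected set.

    xyz: (M, 3) coordinates; returns indices of n_centroids sampled points.
    """
    rng = rng or np.random.default_rng()
    m = xyz.shape[0]
    chosen = np.empty(n_centroids, dtype=np.int64)
    chosen[0] = rng.integers(m)
    dist = np.full(m, np.inf)
    for i in range(1, n_centroids):
        dist = np.minimum(dist, np.linalg.norm(xyz - xyz[chosen[i - 1]], axis=1))
        chosen[i] = np.argmax(dist)
    return chosen

def ball_query(xyz, centroids_xyz, radius, k):
    """For every centroid, gather k point indices lying within `radius`
    (padding by repetition when fewer than k points fall in the ball)."""
    groups = np.empty((len(centroids_xyz), k), dtype=np.int64)
    for i, c in enumerate(centroids_xyz):
        d = np.linalg.norm(xyz - c, axis=1)
        idx = np.flatnonzero(d <= radius)
        if len(idx) == 0:
            idx = np.array([np.argmin(d)])   # fall back to the nearest point
        groups[i] = np.resize(idx, k)        # truncate or repeat to exactly k
    return groups

# Example: the first PointSA module with radius r1 = 0.1 (centroid/group sizes assumed)
pts = np.random.default_rng(0).random((8192, 3))
centre_idx = farthest_point_sampling(pts, n_centroids=1024)
grouped_idx = ball_query(pts, pts[centre_idx], radius=0.1, k=32)
print(grouped_idx.shape)   # (1024, 32)
```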

2.2) Set up the position attention module, which computes the correlation between the features represented by the centroids of its input data F and outputs the attention-enhanced feature E:

Referring to FIG. 3, the module works as follows:

2.2.1) The input data F is passed through the first 1D convolutional layer Q to obtain the feature Qi of the i-th centroid, i = 1, 2, ..., N, where N is the number of centroids of F; through the second 1D convolutional layer U to obtain the feature Uj of the j-th centroid, j = 1, 2, ..., N; and through the third 1D convolutional layer V to obtain the feature Vj of the j-th centroid. The three 1D convolutional layers Q, U and V all have kernel size 1 and stride 1; the output channel counts of the first layer Q and the second layer U are a reduced fraction of the number of feature channels of the input data F [the reduction ratio is given as an image formula in the original], while the third layer V outputs the same number of feature channels as F;

2.2.2) Compute the attention influence value tij between the features represented by the centroids from the Q and U projections [the expression for tij is given as an image formula in the original], and arrange the values tij into the N×N matrix A;

2.2.3) Compute the position attention feature Ji of each centroid as the attention-weighted aggregation of the V projections, Ji = Σj tij · Vj;

2.2.4) Output the attention-enhanced feature E:

E = [E1; E2; ...; Ei; ...; EN],

where Ei = αJi + Fi is the feature of the i-th centroid in E, α is the weight of the position attention feature Ji, and Fi is the feature of the i-th input centroid;
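The working principle of the position attention module can be sketched as follows in NumPy; since the exact expressions for tij and the channel-reduction ratio appear only as image formulas in the original, the softmax normalization over j, the reduction ratio and the value of α used here are assumptions for illustration.

```python
import numpy as np

def position_attention(F, Wq, Wu, Wv, alpha):
    """Position attention over N centroid features.

    F:  (N, C)   input centroid features
    Wq: (C, C_r) projection of the Q branch (reduced channels, assumed ratio)
    Wu: (C, C_r) projection of the U branch
    Wv: (C, C)   projection of the V branch (same channel count as F)
    alpha: scalar weight of the attention term (learned in the real network)
    Returns E = alpha * J + F, where J_i aggregates V_j with weights t_ij.
    """
    Q = F @ Wq                      # (N, C_r)
    U = F @ Wu                      # (N, C_r)
    V = F @ Wv                      # (N, C)
    logits = U @ Q.T                # entry (i, j) combines U_i and Q_j^T
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    A = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # t_ij (assumed softmax over j)
    J = A @ V                       # J_i = sum_j t_ij * V_j
    return alpha * J + F            # E_i = alpha * J_i + F_i

# Toy example: 16 centroids with 64-dim features, channel reduction to 8
rng = np.random.default_rng(0)
F = rng.standard_normal((16, 64))
E = position_attention(F,
                       Wq=rng.standard_normal((64, 8)),
                       Wu=rng.standard_normal((64, 8)),
                       Wv=rng.standard_normal((64, 64)),
                       alpha=0.1)
print(E.shape)   # (16, 64)
```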

2.3) Set up the feature upsampling network:

The feature upsampling network includes a PointFP modules, a 1D convolutional layer, a Dropout layer and a 1D convolutional layer for classification cascaded in sequence; each PointFP module includes a feature interpolation layer and a feature extraction layer cascaded in sequence, where a ≥ 2; in this embodiment a = 4;

The a PointFP modules differ in the structure of their feature interpolation and feature extraction layers, as follows:

For the first PointFP module, the feature interpolation layer interpolates the output features of the position attention module and concatenates the interpolated features with the output features of the third PointSA module to obtain its output; the feature extraction layer contains two cascaded 2D convolutional layers that further process this output, both with 1×1 kernels and stride 1 and with 256 and 256 output channels respectively;

For the second PointFP module, the feature interpolation layer interpolates the output features of the first PointFP module and concatenates them with the output features of the second PointSA module; the feature extraction layer contains two cascaded 2D convolutional layers, both with 1×1 kernels and stride 1 and with 256 and 256 output channels respectively;

For the third PointFP module, the feature interpolation layer interpolates the output features of the second PointFP module and concatenates them with the output features of the first PointSA module; the feature extraction layer contains two cascaded 2D convolutional layers, both with 1×1 kernels and stride 1 and with 256 and 128 output channels respectively;

For the fourth PointFP module, the feature interpolation layer interpolates the output features of the third PointFP module, and the interpolated features are its output; the feature extraction layer contains three cascaded 2D convolutional layers, all with 1×1 kernels and stride 1 and with 128, 128 and 128 output channels respectively.

The 1D convolutional layer has kernel size 1, stride 1 and 128 output feature channels;

The Dropout layer has a keep probability of 0.5;

The 1D convolutional layer for classification has kernel size 1, stride 1, and its number of output feature channels is set to the number of segmentation categories L.
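The feature interpolation used by each PointFP module can be illustrated with the inverse-distance, three-nearest-neighbor scheme popularized by PointNet++; the choice of three neighbors is an assumption here, since the patent only states that the coarser features are interpolated back to the finer point set and concatenated with the skip features.

```python
import numpy as np

def interpolate_features(coarse_xyz, coarse_feat, fine_xyz, k=3, eps=1e-8):
    """Propagate features from a coarse point set to a finer one.

    coarse_xyz:  (M, 3) coordinates of the coarser level
    coarse_feat: (M, C) features of the coarser level
    fine_xyz:    (N, 3) coordinates to interpolate onto (N >= M)
    Each fine point receives an inverse-distance weighted average of the
    features of its k nearest coarse points (a PointNet++-style scheme used
    here as an assumed concrete form of the patent's interpolation layer).
    """
    d = np.linalg.norm(fine_xyz[:, None, :] - coarse_xyz[None, :, :], axis=2)  # (N, M)
    nn = np.argsort(d, axis=1)[:, :k]                      # k nearest coarse points
    nd = np.take_along_axis(d, nn, axis=1)                 # their distances
    w = 1.0 / (nd + eps)
    w /= w.sum(axis=1, keepdims=True)                      # normalized weights
    return (coarse_feat[nn] * w[..., None]).sum(axis=1)    # (N, C)

# Example: interpolate 64 coarse features onto 256 finer points, then
# concatenate with the skip features from the matching PointSA module.
rng = np.random.default_rng(0)
coarse_xyz, coarse_feat = rng.random((64, 3)), rng.standard_normal((64, 512))
fine_xyz, skip_feat = rng.random((256, 3)), rng.standard_normal((256, 256))
up = interpolate_features(coarse_xyz, coarse_feat, fine_xyz)
fused = np.concatenate([up, skip_feat], axis=1)   # input to the 1x1 conv stack
print(fused.shape)   # (256, 768)
```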

2.4) Set up the auxiliary network:

The auxiliary network includes b cascaded PointAux modules and a 1D convolutional layer for classification; each PointAux module includes a 1D convolutional layer and a feature interpolation layer, where b ≥ 1; in this embodiment b = 2;

For the first PointAux module, the 1D convolutional layer extracts features from the output of the second PointFP module, with kernel size 1, stride 1 and L output feature channels (the number of segmentation categories); its feature interpolation layer interpolates the features extracted by the 1D convolutional layer;

For the second PointAux module, the 1D convolutional layer extracts features from the output of the first PointAux module, with kernel size 1, stride 1 and L output feature channels; its feature interpolation layer interpolates the features extracted by the 1D convolutional layer;

The 1D convolutional layer for classification classifies the output features of the second PointAux module; its kernel size is 1, its stride is 1, and its number of output feature channels is set to the number of segmentation categories L.
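A rough sketch of this auxiliary branch is given below: two cheap per-point projections to L channels interleaved with interpolation onto finer point sets, followed by a final classification projection. The nearest-neighbor propagation, the intermediate point-set sizes and the random weights are all assumptions standing in for the patent's learned layers and interpolation scheme.

```python
import numpy as np

def nearest_propagate(src_xyz, src_feat, dst_xyz):
    """Copy each destination point's feature from its nearest source point
    (a simplified stand-in for the patent's feature interpolation layer)."""
    d = np.linalg.norm(dst_xyz[:, None, :] - src_xyz[None, :, :], axis=2)
    return src_feat[np.argmin(d, axis=1)]

def auxiliary_branch(mid_xyz, mid_feat, finer_xyz, full_xyz, num_classes=21, rng=None):
    """Sketch of the auxiliary network with b = 2 PointAux modules: each module
    applies a per-point 1x1 projection to L channels and interpolates the result
    onto a finer point set; a final per-point projection yields the logits.
    The weights here are random placeholders; in the real network they are learned."""
    rng = rng or np.random.default_rng(0)
    w1 = rng.standard_normal((mid_feat.shape[1], num_classes)) * 0.01
    w2 = rng.standard_normal((num_classes, num_classes)) * 0.01
    w_cls = rng.standard_normal((num_classes, num_classes)) * 0.01
    x = nearest_propagate(mid_xyz, mid_feat @ w1, finer_xyz)   # PointAux module 1
    x = nearest_propagate(finer_xyz, x @ w2, full_xyz)         # PointAux module 2
    return x @ w_cls                                           # per-point auxiliary logits

rng = np.random.default_rng(0)
aux_logits = auxiliary_branch(rng.random((64, 3)), rng.standard_normal((64, 256)),
                              rng.random((256, 3)), rng.random((1024, 3)))
print(aux_logits.shape)   # (1024, 21)
```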

Step 3: Set the loss function of the 3D point cloud semantic segmentation network.

In this example, the multi-class cross-entropy loss function is used as the loss function of the 3D point cloud semantic segmentation network, expressed as follows:

Loss = − Σi=1..C Σk=1..L wk · pi,k · ( log p̂(1)i,k + wa · log p̂(2)i,k )

where C is the number of training sample points, L is the total number of categories, wk is the weight of the k-th category, and wa ∈ [0, 1] is the weight of the auxiliary network's loss; in this embodiment wa = 0.5;

pi,k is the true probability that the i-th sample point belongs to the k-th category: the value is 1 if the i-th sample point belongs to the k-th category and 0 otherwise;

p̂(1)i,k and p̂(2)i,k are the probabilities, predicted by the feature upsampling network and the auxiliary network respectively, that the i-th sample point belongs to the k-th category, computed as follows:

p̂(1)i,k = exp(y(1)i,k) / Σk'=1..L exp(y(1)i,k')

p̂(2)i,k = exp(y(2)i,k) / Σk'=1..L exp(y(2)i,k')

where y(1)i,k and y(2)i,k denote the k-th channel value of the i-th sample point output by the feature upsampling network and the auxiliary network respectively, computed as follows:

y(1)i = f1(xi; θ1)

y(2)i = f2(xi; θ2)

where xi is the input feature of the i-th sample point, f1 is the feature upsampling network with parameters θ1, and f2 is the auxiliary network with parameters θ2.
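A small NumPy sketch of this weighted multi-class cross entropy with the auxiliary term follows; since the patent gives the loss only as an image formula, the exact combination used here (per-category weights applied to both terms, a plain sum over the C sample points) is an assumption for illustration.

```python
import numpy as np

def softmax(y):
    e = np.exp(y - y.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def segmentation_loss(main_logits, aux_logits, labels, class_weights, w_a=0.5):
    """Weighted multi-class cross entropy over C sample points with an
    auxiliary term (a sketch of the reconstructed loss, not a verbatim copy).

    main_logits, aux_logits: (C, L) outputs of the feature upsampling network
    and the auxiliary network; labels: (C,) integer categories;
    class_weights: (L,) the per-category weights w_k; w_a in [0, 1].
    """
    p1 = softmax(main_logits)            # \hat p^(1)_{i,k}
    p2 = softmax(aux_logits)             # \hat p^(2)_{i,k}
    idx = np.arange(len(labels))
    wk = class_weights[labels]
    per_point = -wk * (np.log(p1[idx, labels] + 1e-12)
                       + w_a * np.log(p2[idx, labels] + 1e-12))
    return per_point.sum()               # summed over the C training points

rng = np.random.default_rng(0)
loss = segmentation_loss(rng.standard_normal((8192, 21)),
                         rng.standard_normal((8192, 21)),
                         rng.integers(0, 21, 8192),
                         class_weights=np.ones(21), w_a=0.5)
print(float(loss))
```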

Step 4: Using the training set T, perform P rounds of supervised training on the 3D point cloud semantic segmentation network, P ≥ 500.

In this embodiment P = 1000, and the training steps are as follows:

4.1) In the q-th round of training, let lq be the learning rate of the q-th round and θq the parameters of the network model in the q-th round; according to the loss function set in Step 3, the parameters are adjusted as θq+1 = θq − lq · ∂Loss/∂θq, giving the parameters θq+1 used in the (q+1)-th round and hence the network model after the q-th round of training;

4.2) Every P1 rounds, input the test set into the current network model to obtain the predicted category of every point in the test set, P1 ≥ 2; in this embodiment P1 = 5;

4.3) Count the number of points in the test set whose predicted category equals their true category, and compute the segmentation accuracy acc = R / H, where R is the number of points in the test set whose predicted category equals their true category and H is the total number of points in the test set;

4.4) Compare the segmentation accuracy acc of the current network model with that of the previously saved network model; if the current model's accuracy is higher, the current model is better and is saved, otherwise it is not saved.
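A schematic training and evaluation loop corresponding to steps 4.1 to 4.4 is sketched below in Python; the optimiser form, the batch iteration and the `model` interface (`params`, `loss_and_grad`, `predict`) are placeholders assumed for illustration, not a real API.

```python
import numpy as np

def train(model, train_batches, test_points, test_labels,
          num_rounds=1000, eval_every=5, lr_schedule=lambda q: 1e-3):
    """Supervised training with periodic evaluation and best-model keeping.

    `model` is assumed to expose `params`, `loss_and_grad(batch, params)` and
    `predict(points, params)`; these names are placeholders, not a real API.
    """
    best_acc, best_params = -1.0, None
    for q in range(num_rounds):                       # P rounds of training
        lq = lr_schedule(q)
        for batch in train_batches:
            _, grad = model.loss_and_grad(batch, model.params)
            model.params = {k: v - lq * grad[k]       # theta_{q+1} = theta_q - l_q * dLoss/dtheta
                            for k, v in model.params.items()}
        if (q + 1) % eval_every == 0:                 # every P1 rounds
            pred = model.predict(test_points, model.params)
            acc = float(np.mean(pred == test_labels)) # acc = R / H
            if acc > best_acc:                        # keep only the best model so far
                best_acc = acc
                best_params = {k: v.copy() for k, v in model.params.items()}
    return best_params, best_acc
```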

4.5) After the P rounds of training are completed, take the network model with the highest segmentation accuracy as the trained network model.

Step 5: Input the test set V into the trained network model obtained in step 4.5) for semantic segmentation to obtain the segmentation result of each point.

The technical effects of the present invention are described below in conjunction with simulation experiments.

1. Simulation conditions

The simulation experiments of the present invention were carried out in the following environment.

Hardware platform: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20 GHz, 64 GB RAM, Ubuntu 16.04 operating system, GeForce GTX TITAN X;

Software platform: TensorFlow deep learning framework, Python 3.5; the dataset used in the experiments is the ScanNet point cloud dataset.

ScanNet is a dataset of indoor scene point clouds scanned and reconstructed with RGB-D cameras. It contains 1513 scenes in total, of which 1201 are used as the training set and 312 as the test set, and it covers 21 categories.

2. Simulation experiment:

Following the present invention, the training set and test set are obtained and the 3D point cloud semantic segmentation network is constructed and trained on the training set in a supervised manner; the trained network model is then used to predict the points in the test set, and the segmentation accuracy of the network on the test set V is computed according to the method of step 4.3.

The semantic segmentation accuracy of the present invention on the point cloud data is compared with that of the existing PointNet++ method, using segmentation accuracy as the evaluation index; the results are shown in Table 1:

Table 1. Comparison of segmentation accuracy on the ScanNet dataset

Evaluation index          Prior art    The present invention
Segmentation accuracy     0.836        0.852

As can be seen from Table 1, the segmentation accuracy of the present invention on the ScanNet dataset exceeds that of the prior-art PointNet++ by 1.6 percentage points, indicating that the semantic segmentation performance of the present invention on 3D point clouds is stronger than that of PointNet++.

Claims (7)

1. A 3D point cloud semantic segmentation method based on position attention and an auxiliary network, characterized by comprising:

(1) downloading the training and test files of 3D point cloud data from the ScanNet official website, performing category statistics and block cropping on them, and obtaining a training set T and a test set V;

(2) constructing a 3D point cloud semantic segmentation network comprising a feature downsampling network, a position attention module, a feature upsampling network and an auxiliary network cascaded in sequence;

wherein the position attention module comprises three independent 1D convolutional layers Q, U and V, which extract features from the input data F of the module, and computes the attention influence value tij between the features represented by the centroids [the expression for tij is given as an image formula in the original] as well as the attention-enhanced feature

E = [E1; E2; ...; Ei; ...; EN],

where Ui denotes the feature of the i-th centroid extracted from the input data F by the 1D convolutional layer U, QjT denotes the transpose of the feature of the j-th centroid extracted from F by the 1D convolutional layer Q, N denotes the number of centroids of F, and Ei denotes the feature of the i-th centroid in E, computed as Ei = αJi + Fi, where Vj denotes the feature of the j-th centroid extracted from F by the 1D convolutional layer V, Ji = Σj tij · Vj denotes the feature of the i-th centroid after position attention, α denotes the weight of the position attention feature, and Fi denotes the feature of the i-th input centroid;

wherein the auxiliary network comprises b cascaded PointAux modules and a 1D convolutional layer for classification, each PointAux module comprising a 1D convolutional layer and a feature interpolation layer, where b ≥ 1;

(3) using the multi-class cross-entropy loss function as the loss function of the 3D point cloud semantic segmentation network;

(4) using the training set T, performing P rounds of supervised training on the 3D point cloud semantic segmentation network, P ≥ 500:

(4a) in each round of training, adjusting the network parameters according to the loss function of the semantic segmentation network to obtain a network model;

(4b) every P1 rounds, evaluating the segmentation accuracy of the current network model with samples from the test set, and saving the current model if its segmentation accuracy is higher than that of the previously saved model, P1 ≥ 2;

(4c) after the P rounds of training are completed, taking the network model with the highest segmentation accuracy as the trained network model;

(5) inputting the test set V into the trained network model for semantic segmentation to obtain the segmentation result of each point.
2. The method according to claim 1, characterized in that the category statistics and block cropping of the point cloud data in (1) are implemented as follows:

(1a) using a histogram to count the number of points of each category over all f0 scenes in the training file, and computing the weight wk of each category [the expression for wk is given as an image formula in the original], where Gk denotes the number of points of the k-th category, M denotes the total number of points, L denotes the number of segmentation categories, f0 ≥ 1000 and L ≥ 2;

(1b) for each scene in the training file, randomly selecting a point as the center, with coordinates (x, y, z), taking the points within the ranges (x-0.75, x+0.75), (y-0.75, y+0.75), (z-0.75, z+0.75) around it to form a data block, and comparing the number of points in the data block with the number of sampled points N0 to judge whether the block is valid: if the number of points in the block is greater than N0, the block is judged valid and N0 points are randomly sampled from it to form one sample; otherwise the block is discarded; the training set T is obtained in this way, where N0 ≥ 4096;

(1c) for each of the f1 scenes in the test file, using a cubic window of size 1.5 × 1.5 × 3 for sliding-window block cropping, and for each data block randomly sampling N0 points to form one sample, obtaining the test set V, where f1 ≥ 300.
3. The method according to claim 1, characterized in that the feature downsampling network in (2) comprises n cascaded PointSA modules, each PointSA module comprising a point cloud centroid sampling and grouping layer and a point cloud feature extraction layer cascaded in sequence, where n ≥ 2.

4. The method according to claim 1, characterized in that the feature upsampling network in (2) comprises a PointFP modules, a 1D convolutional layer, a Dropout layer and a 1D convolutional layer for classification cascaded in sequence, each PointFP module comprising a feature interpolation layer and a feature extraction layer cascaded in sequence, where a ≥ 2.

5. The method according to claim 1, characterized in that the loss function of the 3D point cloud semantic segmentation network in step (3) is computed as follows:
Figure FDA0004064341810000031
Figure FDA0004064341810000031
其中,C代表训练的样本点数,L代表类别总数,wk为第k类的权重,wa为辅助网络的loss的权重,wa∈[0,1];pi,k代表第i个样本点属于第k类的真实概率,若第i个样本点属于第k类,则概率值为1,否则,概率值为0;
Figure FDA0004064341810000032
Figure FDA0004064341810000033
分别表示特征上采样网络和辅助网络预测的第i个样本点属于第k类的概率,
Figure FDA0004064341810000034
Figure FDA0004064341810000035
的计算公式如下:
Where C represents the number of training sample points, L represents the total number of categories, wk is the weight of the kth category, wa is the weight of the auxiliary network loss, wa∈ [0,1]; pi ,k represents the true probability that the ith sample point belongs to the kth category. If the ith sample point belongs to the kth category, the probability value is 1, otherwise, the probability value is 0;
Figure FDA0004064341810000032
and
Figure FDA0004064341810000033
They represent the probability that the i-th sample point predicted by the feature upsampling network and the auxiliary network belongs to the k-th class, respectively.
Figure FDA0004064341810000034
and
Figure FDA0004064341810000035
The calculation formula is as follows:
Figure FDA0004064341810000036
Figure FDA0004064341810000036
Figure FDA0004064341810000037
Figure FDA0004064341810000037
其中,
Figure FDA0004064341810000038
分别表示特征上采样网络和辅助网络输出的第i个样本点的第k个通道特征值,计算公式如下:
in,
Figure FDA0004064341810000038
They represent the k-th channel feature value of the i-th sample point output by the feature upsampling network and the auxiliary network respectively. The calculation formula is as follows:
Figure FDA0004064341810000039
Figure FDA0004064341810000039
Figure FDA00040643418100000310
Figure FDA00040643418100000310
其中,xi表示第i个样本点的输入特征,f1表示特征上采样网络,θ1表示特征上采样网络的参数,f2表示辅助网络,θ2表示辅助网络的参数。Among them, xi represents the input feature of the i-th sample point, f1 represents the feature upsampling network, θ1 represents the parameters of the feature upsampling network, f2 represents the auxiliary network, and θ2 represents the parameters of the auxiliary network.
6. The method according to claim 5, characterized in that in (4a) the network parameters are adjusted according to the loss function of the semantic segmentation network by

θq+1 = θq − lq · ∂Loss/∂θq,

where lq is the learning rate of the q-th round of training, θq denotes the parameters of the 3D point cloud semantic segmentation network in the q-th round, and θq+1 denotes the adjusted parameters used in the (q+1)-th round.
7. The method according to claim 1, characterized in that in (4b) the segmentation accuracy of the current network model is evaluated every P1 rounds as follows:

(4b1) every P1 rounds, inputting the test set into the current network model to obtain the predicted category of every point in the test set;

(4b2) counting the number of points in the test set whose predicted category equals their true category, and computing the segmentation accuracy acc = R / H, where R is the number of points in the test set whose predicted category equals their true category and H is the total number of points in the test set;

(4b3) comparing the segmentation accuracy of the current network model with that of the previously saved network model; if the current model's accuracy is higher, the current model is better and is saved, otherwise it is not saved.
CN201910604264.0A 2019-07-05 2019-07-05 3D point cloud semantic segmentation method based on position attention and auxiliary network Active CN110322453B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910604264.0A CN110322453B (en) 2019-07-05 2019-07-05 3D point cloud semantic segmentation method based on position attention and auxiliary network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910604264.0A CN110322453B (en) 2019-07-05 2019-07-05 3D point cloud semantic segmentation method based on position attention and auxiliary network

Publications (2)

Publication Number Publication Date
CN110322453A CN110322453A (en) 2019-10-11
CN110322453B true CN110322453B (en) 2023-04-18

Family

ID=68122807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910604264.0A Active CN110322453B (en) 2019-07-05 2019-07-05 3D point cloud semantic segmentation method based on position attention and auxiliary network

Country Status (1)

Country Link
CN (1) CN110322453B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110827398B (en) * 2019-11-04 2023-12-26 北京建筑大学 Automatic semantic segmentation method for indoor three-dimensional point cloud based on deep neural network
CN111127493A (en) * 2019-11-12 2020-05-08 中国矿业大学 Remote sensing image semantic segmentation method based on attention multi-scale feature fusion
CN111223120B (en) * 2019-12-10 2023-08-04 南京理工大学 Point cloud semantic segmentation method
CN111192270A (en) * 2020-01-03 2020-05-22 中山大学 Point cloud semantic segmentation method based on point global context reasoning
CN111428619B (en) * 2020-03-20 2022-08-05 电子科技大学 Three-dimensional point cloud head attitude estimation system and method based on ordered regression and soft labels
CN111583263B (en) * 2020-04-30 2022-09-23 北京工业大学 A point cloud segmentation method based on joint dynamic graph convolution
CN112633330B (en) * 2020-12-06 2024-02-02 西安电子科技大学 Point cloud segmentation method, system, medium, computer equipment, terminal and application
CN112560865B (en) * 2020-12-23 2022-08-12 清华大学 A Semantic Segmentation Method for Point Clouds in Large Outdoor Scenes
CN112927248B (en) * 2021-03-23 2022-05-10 重庆邮电大学 Point cloud segmentation method based on local feature enhancement and conditional random field
CN113205509B (en) * 2021-05-24 2021-11-09 山东省人工智能研究院 Blood vessel plaque CT image segmentation method based on position convolution attention network
CN113554653B (en) * 2021-06-07 2024-10-29 之江实验室 Semantic segmentation method based on mutual information calibration point cloud data long tail distribution
CN113470048B (en) * 2021-07-06 2023-04-25 北京深睿博联科技有限责任公司 Scene segmentation method, device, equipment and computer readable storage medium
CN114140841A (en) * 2021-10-30 2022-03-04 华为技术有限公司 Processing method of point cloud data, training method of neural network and related equipment
CN115619963B (en) * 2022-11-14 2023-06-02 吉奥时空信息技术股份有限公司 Urban building entity modeling method based on content perception

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102034267A (en) * 2010-11-30 2011-04-27 中国科学院自动化研究所 Three-dimensional reconstruction method of target based on attention
CN102036073B (en) * 2010-12-21 2012-11-28 西安交通大学 Method for encoding and decoding JPEG2000 image based on vision potential attention target area
US11094137B2 (en) * 2012-02-24 2021-08-17 Matterport, Inc. Employing three-dimensional (3D) data predicted from two-dimensional (2D) images using neural networks for 3D modeling applications and other applications
CN103871050B (en) * 2014-02-19 2017-12-29 小米科技有限责任公司 icon dividing method, device and terminal
US11004202B2 (en) * 2017-10-09 2021-05-11 The Board Of Trustees Of The Leland Stanford Junior University Systems and methods for semantic segmentation of 3D point clouds
US10824862B2 (en) * 2017-11-14 2020-11-03 Nuro, Inc. Three-dimensional object detection for autonomous robotic systems using image proposals
CN109871532B (en) * 2019-01-04 2022-07-08 平安科技(深圳)有限公司 Text theme extraction method and device and storage medium

Also Published As

Publication number Publication date
CN110322453A (en) 2019-10-11

Similar Documents

Publication Publication Date Title
CN110322453B (en) 3D point cloud semantic segmentation method based on position attention and auxiliary network
CN109410321B (en) Three-dimensional reconstruction method based on convolutional neural network
CN111299815B (en) A method for visual inspection and laser cutting trajectory planning for low-gray rubber pads
CN110245709B (en) 3D point cloud data semantic segmentation method based on deep learning and self-attention
CN109029363A (en) A kind of target ranging method based on deep learning
CN111860587B (en) Detection method for small targets of pictures
CN114973002A (en) Improved YOLOv 5-based ear detection method
CN109377555B (en) Target feature extraction and recognition method for 3D reconstruction of autonomous underwater robot's foreground field of view
CN108629288A (en) A kind of gesture identification model training method, gesture identification method and system
CN110070574B (en) Binocular vision stereo matching method based on improved PSMAT net
CN103310481A (en) Point cloud reduction method based on fuzzy entropy iteration
CN107977660A (en) Region of interest area detecting method based on background priori and foreground node
CN113610905B (en) Deep learning remote sensing image registration method based on sub-image matching and application
CN110222767A (en) Three-dimensional point cloud classification method based on nested neural and grating map
CN116703932A (en) CBAM-HRNet model wheat spike grain segmentation and counting method based on convolution attention mechanism
CN116645595A (en) Method, device, equipment and medium for recognizing building roof contours from remote sensing images
CN117496384A (en) A method for object detection in drone images
CN111339924A (en) Polarized SAR image classification method based on superpixel and full convolution network
CN112329662B (en) Multi-view saliency estimation method based on unsupervised learning
CN113989296A (en) Unmanned aerial vehicle wheat field remote sensing image segmentation method based on improved U-net network
CN112819832A (en) Urban scene semantic segmentation fine-grained boundary extraction method based on laser point cloud
CN114399728B (en) Foggy scene crowd counting method
CN115272278A (en) Method for constructing change detection model for remote sensing image change detection
CN112967296B (en) Point cloud dynamic region graph convolution method, classification method and segmentation method
CN112990336B (en) Deep three-dimensional point cloud classification network construction method based on competitive attention fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant