CN114694022A - Spherical neighborhood based multi-scale multi-feature algorithm semantic segmentation method
- Publication number: CN114694022A
- Application number: CN202210244408.8A
- Authority: CN (China)
- Prior art keywords: point cloud, neighborhood, pointnet, scale, features
- Prior art date: 2022-03-11
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06F—ELECTRIC DIGITAL DATA PROCESSING; G06F18/00—Pattern recognition; G06F18/20—Analysing
  - G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    - G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
  - G06F18/22—Matching criteria, e.g. proximity measures
  - G06F18/24—Classification techniques
  - G06F18/25—Fusion techniques
    - G06F18/253—Fusion techniques of extracted features
Abstract
A semantic segmentation method using a multi-scale, multi-feature algorithm based on spherical neighborhoods, the method comprising: registering acquired point cloud data with a remote sensing image to generate point cloud data fused with RGB information; selecting a spherical neighborhood to obtain local neighborhood features of the fused point cloud, and extracting multi-scale point cloud features by varying the spherical neighborhood radius; and combining the extracted basic features, the 5-dimensional neighborhood features of at least two scales, and the xyz coordinate information of the point cloud, inputting the combination into MSMF-PointNet, an improved PointNet-based model, for semantic segmentation, and outputting the classification result. On outdoor scene point clouds acquired by airborne LiDAR scanning, the method achieves classification accuracy far better than PointNet: adding linearity and verticality features improves the classification of building facades and fences; adding roughness and total variance improves the classification of trees and shrubs; and adding flatness improves the classification of roofs and impervious ground.
Description
Technical Field
The invention belongs to the technical field of remote sensing and photogrammetry, and particularly relates to a semantic segmentation method of a multi-scale and multi-feature algorithm based on a spherical neighborhood.
Background
Deep learning is a technology that automatically learns and extracts high-level features of input data through a deep network structure, and it is among the most influential and fastest-developing frontier technologies in pattern recognition, computer vision, and data analysis. Before being applied to 3D data, deep learning had already become an effective engine for various tasks in 2D computer vision and image processing; notably, in 2012 AlexNet won the ImageNet image recognition competition by applying convolutional neural networks (CNNs), leading the runner-up by more than ten percentage points, and CNN-based deep neural network architectures subsequently made major breakthroughs in image classification, segmentation, recognition, and related fields. However, because three-dimensional laser point clouds are dense, massive, and unstructured, traditional deep learning methods cannot be applied directly to three-dimensional point cloud segmentation.
Many researchers have worked on deep learning for three-dimensional point cloud data. To apply existing neural network structures to three-dimensional laser point clouds, the data must be preprocessed before being input into the network. Three methods are currently common: (1) projecting the 3D point cloud into multi-view 2D images and then applying a traditional convolutional neural network; (2) converting the point cloud into grid voxels and then applying a 3D convolutional neural network; (3) converting the point cloud into a graph structure and then applying a 3D graph convolutional neural network.
However, such preprocessing loses data information. Addressing this defect, in 2017 Qi et al. of Stanford University published pioneering work proposing PointNet, a deep learning model that can be applied directly to three-dimensional point cloud data, further improving semantic segmentation accuracy. However, the PointNet architecture mainly extracts global features across the point cloud and omits the extraction of associated local features between points; this insufficient local feature extraction often leads to problems such as inadequate segmentation accuracy and poor segmentation of object details.
On one hand, because three-dimensional laser point clouds are dense, massive, and unstructured, research into fast and effective semantic segmentation algorithms for three-dimensional scenes has important theoretical value. On the other hand, given the complexity of real natural scenes and phenomena such as overlap and occlusion among three-dimensional targets, research that combines techniques from other fields to provide robust, automatic, and intelligent semantic segmentation of targets in complex three-dimensional scenes has important practical significance for the further study of point cloud semantic segmentation and its application in various fields.
Disclosure of Invention
In order to solve the problems, a semantic segmentation method of a multi-scale and multi-feature algorithm based on a spherical neighborhood is provided.
The object of the invention is achieved in the following way:
a semantic segmentation method of a multi-scale and multi-feature algorithm based on a spherical neighborhood, the method comprising:
S1: registering the acquired point cloud data with the remote sensing image to generate point cloud data fused with RGB information;
S2: performing multi-scale neighborhood design and feature extraction on the point cloud data fused with RGB information: obtaining local neighborhood features of the fused point cloud by studying the point cloud spatial index structure and selecting a spherical neighborhood, and extracting multi-scale point cloud features by changing the spherical neighborhood radius; the point cloud features comprise basic features and covariance-based multi-features; the basic features comprise the xyz coordinate information and RGB information of the point cloud; the covariance-based multi-features comprise 5-dimensional neighborhood features, namely covariance-based total variance, roughness, flatness, linearity, and verticality information;
S3: combining the extracted basic features, the 5-dimensional neighborhood features of at least two scales, and the xyz coordinate information of the point cloud, inputting the combination into MSMF-PointNet, an improved PointNet-based model, for semantic segmentation, and outputting the classification result.
The improved PointNet-based model MSMF-PointNet comprises an improved PointNet network and at least two Mini-PointNet networks. The xyz coordinate information and RGB information of the point cloud are input into the improved PointNet network, which outputs 64-dimensional point features and 1024-dimensional global features. The 5-dimensional neighborhood features of at least two scales are combined with the xyz coordinate information of the point cloud and input into the Mini-PointNet networks. The Mini-PointNet networks output two 256-dimensional feature vectors; the two sets of output data are fully connected and input into a softmax classifier for classification.
The improved PointNet network comprises six layers, namely, from input to output, a first T-Net point cloud rotation transformation, a first perceptron mlp, a second T-Net point cloud rotation transformation, a second perceptron mlp, a third perceptron mlp, and a max pooling network.
The Mini-PointNet network comprises 4 layers, namely, from input to output, a T-Net point cloud rotation transformation, two perceptron mlp layers, and a max pooling network.
The xyz coordinate information of the point cloud is calculated from data recorded by the system's GPS, INS, and laser rangefinder; the RGB information is acquired by an imaging device.
In step S1, the registration mainly assigns the spectral information of the pixels in each band of the image to the point cloud data through the "Extract Values to Points" function in ArcGIS.
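As an illustrative sketch only (assuming an arcpy environment with the Spatial Analyst extension and hypothetical layer names; the patent names only the "Extract Values to Points" tool, and the multi-value variant is used here to attach several bands at once):

```python
# Hedged sketch: attach per-band spectral values to LiDAR points in ArcGIS.
import arcpy
from arcpy.sa import ExtractMultiValuesToPoints

arcpy.CheckOutExtension("Spatial")              # Spatial Analyst is required
arcpy.env.workspace = r"C:\data\vaihingen.gdb"  # hypothetical workspace

# Assign the NIR, red, and green band values of the registered image to the
# point feature class derived from the point cloud (new fields NIR, R, G).
ExtractMultiValuesToPoints(
    "lidar_points",
    [["ortho_nir", "NIR"], ["ortho_red", "R"], ["ortho_green", "G"]],
    "NONE",                                     # nearest cell value, no interpolation
)
```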
An electronic device, comprising:
at least one processor;
and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method described above.
A non-transitory computer readable storage medium storing computer instructions, wherein,
the computer instructions are for causing the computer to perform the method described above.
A computer program product comprising a computer program which, when executed by a processor, implements the method described above.
The beneficial effects of the invention are as follows: (1) On outdoor scene point clouds acquired by airborne LiDAR scanning, the method achieves classification accuracy far better than PointNet: adding linearity and verticality features improves the classification of building facades and fences; adding roughness and total variance improves the classification of trees and shrubs; and adding flatness improves the classification of roofs and impervious ground.
(2) The method adds spectral information and other geometric features and trains with deep learning, effectively compensating for the limitations of the purely spatial geometric features of the point cloud and improving classification accuracy, with particularly high accuracy for roofs, impervious ground, and trees.
Drawings
Fig. 1 is a schematic diagram of index creation based on a virtual regular grid.
Fig. 2 is a comparison graph of classification effect of different radii.
Figure 3 is the MSMF-PointNet network model architecture.
FIG. 4 shows the classification accuracy of various types of ground features and the overall classification accuracy at different scales.
FIG. 5 is a flow chart of a semantic segmentation model of a multi-scale multi-feature algorithm based on spherical neighborhood.
Detailed Description
The present invention will be described in further detail with reference to fig. 1 to 5 and the following detailed description.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure as claimed. Unless otherwise defined, all technical and scientific terms used herein have the same technical meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
A semantic segmentation method of a multi-scale and multi-feature algorithm based on a spherical neighborhood, the method comprising:
S1: registering the acquired point cloud data with the remote sensing image to generate point cloud data fused with RGB information;
S2: performing multi-scale neighborhood design and feature extraction on the point cloud data fused with RGB information: obtaining local neighborhood features of the fused point cloud by studying the point cloud spatial index structure and selecting a spherical neighborhood, and extracting multi-scale point cloud features by changing the spherical neighborhood radius; the point cloud features comprise basic features and covariance-based multi-features; the basic features comprise the xyz coordinate information and RGB information of the point cloud; the covariance-based multi-features comprise 5-dimensional neighborhood features, namely covariance-based total variance, roughness, flatness, linearity, and verticality information;
S3: combining the extracted basic features, the 5-dimensional neighborhood features of at least two scales, and the xyz coordinate information of the point cloud, inputting the combination into MSMF-PointNet, an improved PointNet-based model, for semantic segmentation, and outputting the classification result.
As shown in fig. 3, the improved PointNet-based model MSMF-PointNet comprises an improved PointNet network and at least two Mini-PointNet networks. The xyz coordinate information and RGB information of the point cloud are input into the improved PointNet network, which outputs 64-dimensional point features and 1024-dimensional global features. The 5-dimensional neighborhood features of at least two scales are combined with the xyz coordinate information of the point cloud and input into the Mini-PointNet networks. The Mini-PointNet networks output two 256-dimensional feature vectors; the two sets of output data are fully connected and input into a softmax classifier for classification.
The improved PointNet network comprises six layers, namely, from input to output, a first T-Net point cloud rotation transformation, a first perceptron mlp, a second T-Net point cloud rotation transformation, a second perceptron mlp, a third perceptron mlp, and a max pooling network.
The Mini-PointNet network comprises 4 layers, namely, from input to output, a T-Net point cloud rotation transformation, two perceptron mlp layers, and a max pooling network.
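To make the branch structure concrete, the following is a minimal PyTorch sketch of one Mini-PointNet as just described (T-Net input transform, two shared mlp layers, max pooling to a 256-dimensional vector). The 8-dimensional per-point input (xyz plus the 5 covariance features at one scale) follows the text, while the class names and the layer widths inside the T-Net are illustrative assumptions rather than the patent's reference implementation:

```python
import torch
import torch.nn as nn

class TNet(nn.Module):
    """Learns a k x k transform that aligns the input point features (T-Net)."""
    def __init__(self, k: int):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Conv1d(k, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.BatchNorm1d(1024), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(1024, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, k * k),
        )

    def forward(self, x):                    # x: (B, k, N)
        t = self.mlp(x).max(dim=2).values    # (B, 1024) global feature
        t = self.fc(t).view(-1, self.k, self.k)
        eye = torch.eye(self.k, device=x.device).unsqueeze(0)
        return t + eye                       # bias the transform toward identity

class MiniPointNet(nn.Module):
    """Four layers: T-Net, two shared mlp layers, max pooling -> 256-d vector."""
    def __init__(self, in_dim: int = 8):     # xyz + 5 covariance features, one scale
        super().__init__()
        self.tnet = TNet(in_dim)
        self.mlp = nn.Sequential(
            nn.Conv1d(in_dim, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 256, 1), nn.BatchNorm1d(256), nn.ReLU(),
        )

    def forward(self, x):                    # x: (B, 8, N)
        x = torch.bmm(self.tnet(x), x)       # apply the learned input transform
        x = self.mlp(x)                      # (B, 256, N) per-point features
        return x.max(dim=2).values           # (B, 256) max-pooled scale feature

# One Mini-PointNet per neighborhood scale (R = 0.8 m and R = 1.2 m in the text);
# the two 256-d outputs are concatenated with the improved-PointNet features and
# passed through fully connected layers into a softmax classifier.
branch_r08, branch_r12 = MiniPointNet(), MiniPointNet()
pts = torch.randn(4, 8, 4096)                # batch of 4 blocks, 4096 points each
scale_feats = torch.cat([branch_r08(pts), branch_r12(pts)], dim=1)  # (4, 512)
```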
Further, a point cloud feature extraction method based on spherical neighborhood is researched:
To improve the efficiency of point cloud data query and retrieval, a spatial index of the point cloud must be constructed; common indexing methods include the quadtree, octree, and k-d tree. Because k-d tree indexing is efficient, it is widely applied to point cloud data. Different spatial indexes are adopted for different data processing requirements; for the subsequent filtering algorithm design, the k-d tree is selected to construct the spatial index of the point cloud and to perform space division and neighbor search.
After the k-d tree is established, a neighborhood point query mode must be selected. The scale difference between objects in an outdoor environment is large, and during feature extraction the neighborhood selection mode determines how well the features describe different objects. Per-point classification of the point cloud depends on extracting local features from the neighborhood point set of each selected point, so it is closely related to the chosen local neighborhood region.
A radius query takes a target point and a query-distance threshold (with the target point as center and the query distance as radius R) and finds all points in the dataset whose distance to the query point is smaller than the threshold. Three local neighborhood choices are common, as shown in fig. 1: (a) the spherical radius neighborhood, (b) the cylindrical neighborhood, and (c) the K-nearest-neighbor neighborhood.
Due to the spatial anisotropy of three-dimensional urban scenes, classification under the spherical neighborhood performs well. Compared with the K-nearest-neighbor method, the spherical neighborhood corresponds to a fixed region of space and is relatively insensitive to point cloud density, so this method selects the spherical neighborhood.
Multi-scale point cloud features are obtained by changing the spherical neighborhood radius R.
The choice of scale directly affects the classification accuracy of the point cloud, so an appropriate scale, i.e. the radius value R in the spherical neighborhood method, must be selected according to the ground features in the scene. If the radius is too large, the spherical neighborhood contains too many points, which greatly increases computation time and reduces efficiency. In general, the minimum radius should exceed the average point cloud density and increase upward regularly. To select the most suitable radius parameters and verify that multi-scale fusion of point cloud features yields better segmentation, part of the Vaihingen dataset was selected; point cloud features computed with the radius parameter R = 0.4 m, 0.8 m, 1.2 m, and 2.0 m, and with the four combinations 0.4+0.8, 0.8+1.2, 0.8+2.0, and 1.2+2.0, were each input into the PointNet network, with everything else identical except the input. A histogram of the classification results is shown in fig. 2. Comparing the histograms shows that the multi-scale combination R = 0.8+1.2 has the highest accuracy among the combinations and that R = 0.8 has the highest accuracy among the single scales; at this stage the network had not yet been improved and was used only to determine the most suitable radius parameters, so the absolute accuracies matter less than their relative ordering. Considering all of the above, the final spherical neighborhood radii are R = 0.8 m and 1.2 m.
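For illustration only, a minimal sketch of the spatial index and the two-scale spherical neighborhood queries (assuming a SciPy-based pipeline; the patent does not prescribe an implementation):

```python
# Minimal sketch: build a k-d tree over the point cloud and collect the
# spherical neighborhoods at the two radii selected above (0.8 m and 1.2 m).
import numpy as np
from scipy.spatial import cKDTree

points = np.random.rand(100_000, 3) * 50.0    # stand-in for the xyz of an ALS tile

tree = cKDTree(points)                        # spatial index for neighbor search
radii = (0.8, 1.2)                            # spherical neighborhood radii in meters

# query_ball_point returns, for each query point, the indices of all points
# whose distance is below R, i.e. the spherical (radius) neighborhood.
neighborhoods = {r: tree.query_ball_point(points, r) for r in radii}
print(len(neighborhoods[0.8][0]), "neighbors of point 0 at R = 0.8 m")
```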
Selection of basic point cloud features and covariance-based multi-features
The basic features of the point cloud are its xyz coordinate information and RGB information. The xyz coordinates can be calculated from data recorded by the system's GPS, INS, and laser rangefinder, and the RGB information is acquired by an imaging device. Spectral information helps distinguish vegetation from other ground features, so to improve classification accuracy, airborne LiDAR point clouds are generally fused with multispectral aerial images to generate point cloud data carrying spectral information, supplementing the point cloud data with spectral attributes.
Under the condition that the interior and exterior orientation elements of the aerial image are known, the three-dimensional coordinates of each LiDAR point are substituted into the collinearity condition equations to calculate the pixel position of the corresponding three-dimensional point on the image, after which the gray values of the near-infrared (NIR), red (R), and green (G) channels are obtained by resampling. The collinearity condition equations can be expressed as:

$$\begin{aligned} x &= -f\,\frac{a_1(X-X_S)+b_1(Y-Y_S)+c_1(Z-Z_S)}{a_3(X-X_S)+b_3(Y-Y_S)+c_3(Z-Z_S)} \\ y &= -f\,\frac{a_2(X-X_S)+b_2(Y-Y_S)+c_2(Z-Z_S)}{a_3(X-X_S)+b_3(Y-Y_S)+c_3(Z-Z_S)} \end{aligned} \qquad (1)$$

where $f$ is the image focal length, $(X, Y, Z)$ are the three-dimensional coordinates of the ground point, $(X_S, Y_S, Z_S)$ are the three line elements of the exterior orientation elements, and $(a_1, b_1, c_1, a_2, b_2, c_2, a_3, b_3, c_3)$ are the rotation matrix parameters calculated from the three angular elements of the exterior orientation elements. After the spectral information (NIR, R, and G channel gray values) corresponding to each laser footprint is calculated, the three-dimensional coordinates and the spectral information of the point cloud are combined to obtain enhanced point cloud data, which serves as the input for subsequent point cloud classification; in this way the point cloud and the image are fused.
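As a numerical illustration of this projection (a sketch assuming the rotation matrix has already been composed from the three angular elements; the orientation values below are placeholders, not data from the patent):

```python
# Sketch: project ground points into the image plane with the collinearity
# equations (1); the NIR/R/G gray values would then be resampled at the result.
import numpy as np

def collinearity_project(pts, f, Xs, Ys, Zs, R):
    """pts: (N, 3) ground coordinates; R: 3x3 rotation matrix from the three
    angular elements; returns (N, 2) image-plane coordinates (x, y)."""
    d = pts - np.array([Xs, Ys, Zs])          # offset from the projection center
    u = d @ R.T                               # rotate into the image frame
    x = -f * u[:, 0] / u[:, 2]
    y = -f * u[:, 1] / u[:, 2]
    return np.stack([x, y], axis=1)

# Placeholder exterior orientation: identity rotation, camera 300 m above ground.
pts = np.array([[10.0, 20.0, 5.0], [12.0, 18.0, 7.0]])
xy = collinearity_project(pts, f=0.12, Xs=0.0, Ys=0.0, Zs=300.0, R=np.eye(3))
print(xy)   # convert to pixel rows/columns, then resample NIR, R, G per point
```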
The covariance feature is a common point cloud feature; it can represent the shape of an object or the distribution state of a local point cloud, and has important applications in point cloud data processing. Three-dimensional point cloud data can be written as $D = \{P_i\},\ i = 1, \dots, N$, where $N$ is the number of points in the point cloud $D$. First, the mean and covariance matrix of the current point cloud are calculated:

$$\bar{P} = \frac{1}{N}\sum_{i=1}^{N} P_i \qquad (2)$$

$$C = \frac{1}{N}\sum_{i=1}^{N} (P_i - \bar{P})(P_i - \bar{P})^{T} \qquad (3)$$

where the unbiased estimate of $C$ is:

$$C = \frac{1}{N-1}\sum_{i=1}^{N} (P_i - \bar{P})(P_i - \bar{P})^{T} \qquad (4)$$
Second, since the covariance matrix $C$ is a symmetric positive semi-definite matrix, its eigenvalues satisfy $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge 0$, and the corresponding eigenvectors $e_1, e_2, e_3$ are mutually perpendicular and form an orthogonal coordinate system; that is, $C$ can be expressed as:

$$C = \sum_{i=1}^{3} \lambda_i\, e_i e_i^{T} \qquad (5)$$
The morphological characteristics of the local neighborhood point cloud can be inferred from the relative magnitudes of the eigenvalues. When $\lambda_1 \gg \lambda_2 \approx \lambda_3$, the local neighborhood points are linearly distributed; when $\lambda_1 \approx \lambda_2 \gg \lambda_3$, the points are distributed in a surface shape; and when $\lambda_1 \approx \lambda_2 \approx \lambda_3$, the points are scattered in three dimensions. The combined linearity $L_\lambda$, flatness $P_\lambda$, and dispersion $S_\lambda$ are used to represent one-, two-, and three-dimensional features:
$$L_\lambda = (\lambda_1 - \lambda_2)/\lambda_1 \qquad (6)$$

$$P_\lambda = (\lambda_2 - \lambda_3)/\lambda_1 \qquad (7)$$

$$S_\lambda = \lambda_3/\lambda_1 \qquad (8)$$
The sum of the three is 1. According to formula (3), the covariance matrix constructed from the neighborhood of each selected point is obtained, and its eigenvalues and corresponding eigenvectors are then computed.
The invention selects the whole range difference besides the linearity and the flatness:
perpendicularity:
Vλ=1-|Z·N| (10)
wherein Z is a unit vector in the vertical direction, and N is a normal vector of the point.
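A sketch of the per-point covariance feature computation under the definitions above (formulas (4), (6), (7), (9), (10)); the roughness feature is omitted here because its formula is not reproduced in this text, and the function name is an illustrative assumption:

```python
# Sketch: compute the covariance-based neighborhood features of one point.
import numpy as np

def eigen_features(neighbors: np.ndarray) -> dict:
    """neighbors: (k, 3) xyz of one spherical neighborhood (k >= 3)."""
    mean = neighbors.mean(axis=0)
    d = neighbors - mean
    C = d.T @ d / (len(neighbors) - 1)        # unbiased covariance, formula (4)
    lam, vec = np.linalg.eigh(C)              # eigenvalues in ascending order
    l3, l2, l1 = np.maximum(lam, 0.0)         # relabel so that l1 >= l2 >= l3
    normal = vec[:, 0]                        # eigenvector of smallest eigenvalue
    return {
        "linearity":   (l1 - l2) / l1,                 # formula (6)
        "flatness":    (l2 - l3) / l1,                 # formula (7)
        "total_var":   (l1 * l2 * l3) ** (1.0 / 3.0),  # formula (9)
        "verticality": 1.0 - abs(normal[2]),           # formula (10), Z = (0,0,1)
    }

nbhd = np.random.rand(50, 3)                  # stand-in spherical neighborhood
print(eigen_features(nbhd))
```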
A subset of the Vaihingen experimental dataset was extracted and each neighborhood feature at R = 0.8 was displayed; visualizing the estimated feature values shows clearly that the feature values of different ground objects differ significantly, which benefits their discrimination.
The point cloud segmentation method based on the improved PointNet is researched:
the invention provides an MSMF-PointNet terrain classification segmentation algorithm fusing multi-scale and multi-neighborhood characteristics based on an improved PointNet network, overcomes the defects of insufficient utilization of point-to-point cloud characteristics and lack of local characteristics of the original PointNet network, draws the network parameter settings of PointNet and PointNet + +, and simultaneously sets corresponding R, O and multi-scale local characteristics (xyz, RGB, R is 0.8 and 1.2) based on 16-dimensional spherical neighborhoodλ,Pλ,Lλ,Vλ) And replacing the original single-point xyz information as a new data source for training classification of the classifier.
The specific improvements are as follows: to handle the increased dimensionality of the fused point cloud feature space, the number of channels is increased by adjusting the dimensions of the input transformation matrix, so that the matrix processes the fused six- and eight-dimensional feature vectors instead of the original three-dimensional vectors; to handle the larger data volume caused by the expanded feature space, the depth features of the point cloud are fully extracted by deepening the MLP layers; and to address the lack of local neighborhood features, the point cloud features calculated from the spherical neighborhoods are taken as input to two mini-PointNet feature extraction networks that extract multi-neighborhood point cloud features at different scales. The MSMF-PointNet network model architecture is shown in fig. 3.
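For concreteness, a sketch of how the per-point input streams described above could be assembled (array names and the random data are illustrative assumptions, not from the patent):

```python
# Sketch: assemble the two input streams described above for one block of points.
import numpy as np

n = 4096
xyz   = np.random.rand(n, 3)                  # point coordinates
rgb   = np.random.rand(n, 3)                  # fused spectral information
f_r08 = np.random.rand(n, 5)                  # 5 covariance features at R = 0.8 m
f_r12 = np.random.rand(n, 5)                  # 5 covariance features at R = 1.2 m

pointnet_in = np.hstack([xyz, rgb])           # (n, 6) -> improved PointNet
mini_in_r08 = np.hstack([xyz, f_r08])         # (n, 8) -> first Mini-PointNet
mini_in_r12 = np.hstack([xyz, f_r12])         # (n, 8) -> second Mini-PointNet
assert pointnet_in.shape[1] + f_r08.shape[1] + f_r12.shape[1] == 16  # 16-dim total
```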
Experimental validation and analysis
Test area data: the urban test dataset of Vaihingen, Germany, provided by ISPRS, was used. The dataset is a collection of ALS (Airborne Laser Scanning) point clouds consisting of 10 strips captured by the Leica ALS50 system, with an average overlap of around 30% between adjacent strips. The multispectral aerial images have a ground resolution of 8 cm and a size of 7680 × 13824 pixels each, and the interior and exterior orientation elements of the images are provided. The labeled point cloud is divided into 9 categories serving as the algorithm evaluation standard; it covers rich geographic environments, urban environments, and building types, and can fully validate the algorithm. The Vaihingen dataset provides aerial image data and the corresponding airborne LiDAR point cloud data, whose average point density is 4 points/m². The classification categories of the LiDAR point cloud are labeled in the reference data provided with the dataset; after the relevant building features are extracted, a classifier can be trained directly to classify the LiDAR point cloud test data and finally extract building points accurately. Training and testing were performed with the PointNet network and the proposed algorithm; the semantic labels of the points comprise 9 classes (power line, vehicle, low vegetation, impervious surface, fence, roof, wall, shrub, tree). The training dataset contains 753,876 points in total and the test dataset contains 411,722 points.
TABLE 1 ISPRS Vaihingen dataset Point cloud categories and numbers
TABLE 2 Vaihingen data parameters
To validate the proposed PointNet-based multi-scale, multi-neighborhood feature algorithm, single-scale single-feature (SS, containing only xyz information), single-scale multi-feature (SM), and multi-scale multi-feature (MM) configurations were each verified with the same data. The single-scale single-feature configuration is the PointNet network with only xyz input. The single-scale multi-feature configuration adds only one mini-PointNet to the original PointNet: the neighborhood features at R = 0.8 are calculated and input into the mini network together with xyz, while the fused xyz-RGB information of the point cloud and image serves as input to the original PointNet network. The multi-scale multi-feature configuration is the MSMF-PointNet algorithm proposed herein, which calculates the point cloud features at R = 0.8 and R = 1.2, inputs each together with xyz into a mini network, and inputs the xyz-RGB information into the PointNet network in place of the single point feature. The classification accuracies are shown in fig. 4.
As seen from fig. 4, the proposed PointNet-based multi-scale multi-neighborhood algorithm (MM) has the highest overall classification accuracy and the best classification effect, with overall accuracy reaching 88.1%. The quantitative evaluation shows that fusing spectral information and covariance-based features (SM) greatly improves the point cloud classification accuracy of all terrain features: the overall accuracy of SM is 19.2 percentage points higher than SS. Fusing the spectral information with covariance features at two scales (MM) improves the accuracy further, by 8.4 percentage points over the single-scale SM; the accuracies for impervious ground, trees, and roofs improve most markedly, by 19.7, 30, and 31.3 percentage points respectively. Multi-scale, multi-neighborhood attribute information thus effectively enhances the point cloud and enables more accurate classification of various ground targets. The specific classification result comparison is shown, in sequence, as the ground truth and the final classification maps of SS, SM, and MM in fig. 4. Overall, the final MM classification is almost identical to the ground truth, with particularly prominent results on roofs, trees, and impervious ground. In SS, ground features are misclassified as roofs and trees; when the spherical-neighborhood local features are added in SM, the classification accuracy of houses and trees improves greatly. SS simply distinguishes trees, roofs, and shrubs by terrain and elevation, whereas SM refines the classification into roofs, trees, shrubs, impervious ground, fences, and cars. The multi-scale MM is more accurate and finer than SM: in the circled upper area, a bulge is clearly visible in the middle, which on site is a narrow ditch with large height differences on its left and right sides. At R = 0.8, the higher ground on the right of the ditch is classified as roof; when the local features at R = 1.2 are added, the impervious ground is classified accurately.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various changes and modifications without departing from the overall concept of the invention, and these should also be considered within the protection scope of the present invention.
Claims (9)
1. A semantic segmentation method of a multi-scale, multi-feature algorithm based on a spherical neighborhood, characterized in that the method comprises the following steps:
S1: registering the acquired point cloud data with the remote sensing image to generate point cloud data fused with RGB information;
S2: performing multi-scale neighborhood design and feature extraction on the point cloud data fused with RGB information: obtaining local neighborhood features of the fused point cloud by studying the point cloud spatial index structure and selecting a spherical neighborhood, and extracting multi-scale point cloud features by changing the spherical neighborhood radius; the point cloud features comprise basic features and covariance-based multi-features; the basic features comprise the xyz coordinate information and RGB information of the point cloud; the covariance-based multi-features comprise 5-dimensional neighborhood features, namely covariance-based total variance, roughness, flatness, linearity, and verticality information;
S3: combining the extracted basic features, the 5-dimensional neighborhood features of at least two scales, and the xyz coordinate information of the point cloud, inputting the combination into MSMF-PointNet, an improved PointNet-based model, for semantic segmentation, and outputting the classification result.
2. The semantic segmentation method based on the spherical neighborhood multi-scale multi-feature algorithm of claim 1, characterized in that: the improved PointNet-based model MSMF-PointNet comprises an improved PointNet network and at least two Mini-PointNet networks; the xyz coordinate information and RGB information of the point cloud are input into the improved PointNet network, which outputs 64-dimensional point features and 1024-dimensional global features; the 5-dimensional neighborhood features of at least two scales are combined with the xyz coordinate information of the point cloud and input into the Mini-PointNet networks; the Mini-PointNet networks output two 256-dimensional feature vectors, and the two sets of output data are fully connected and input into a softmax classifier for classification.
3. The semantic segmentation method based on the spherical neighborhood multi-scale multi-feature algorithm of claim 2, characterized in that: the improved PointNet network comprises six layers, namely, from input to output, a first T-Net point cloud rotation transformation, a first perceptron mlp, a second T-Net point cloud rotation transformation, a second perceptron mlp, a third perceptron mlp, and a max pooling network.
4. The semantic segmentation method based on the spherical neighborhood multi-scale multi-feature algorithm of claim 2, characterized in that: the Mini-PointNet network comprises 4 layers, namely, from input to output, a T-Net point cloud rotation transformation, two perceptron mlp layers, and a max pooling network.
5. The semantic segmentation method based on the spherical neighborhood multi-scale multi-feature algorithm of claim 1, characterized in that: the xyz coordinate information of the point cloud is calculated from data recorded by the GPS, INS, and laser rangefinder of the system; the RGB information is acquired by an imaging device.
6. The semantic segmentation method based on the spherical neighborhood multi-scale multi-feature algorithm of claim 1, characterized in that: in S1, the registration mainly assigns the spectral information of different pixels in each band of the image to the point cloud data through the "Extract Values to Points" function in ArcGIS.
7. An electronic device, comprising:
at least one processor;
and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
8. A non-transitory computer readable storage medium storing computer instructions, wherein,
the computer instructions are for causing the computer to perform the method of any one of claims 1-6.
9. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210244408.8A | 2022-03-11 | 2022-03-11 | Spherical neighborhood based multi-scale multi-feature algorithm semantic segmentation method |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210244408.8A | 2022-03-11 | 2022-03-11 | Spherical neighborhood based multi-scale multi-feature algorithm semantic segmentation method |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN114694022A | 2022-07-01 |
Family
ID=82138380
Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210244408.8A | Spherical neighborhood based multi-scale multi-feature algorithm semantic segmentation method | 2022-03-11 | 2022-03-11 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN114694022A (en) |
Patent Citations (1)

| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108241871A * | 2017-12-27 | 2018-07-03 | 华北水利水电大学 | Laser point cloud and visual fusion data classification method based on multiple features |
Non-Patent Citations (1)

| Title |
|---|
| Fei Meiqi (费美琪), "Research on semantic segmentation of airborne LiDAR point clouds based on PointNet", Wanfang Dissertations, 19 December 2021 * |
Cited By (2)

| Publication number | Priority date | Publication date | Title |
|---|---|---|---|
| CN116977572A * | 2023-09-15 | 2023-10-31 | Building elevation structure extraction method for multi-scale dynamic graph convolution |
| CN116977572B * | 2023-09-15 | 2023-12-08 | Building elevation structure extraction method for multi-scale dynamic graph convolution |
Legal Events

| Date | Code | Title |
|---|---|---|
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |