CN117572457A - A cross-scene multispectral point cloud classification method based on pseudo-label learning - Google Patents
- Publication number
- CN117572457A (application CN202410061674.6A)
- Authority
- CN
- China
- Prior art keywords
- target domain
- scene
- multispectral
- domain
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S17/00—Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
- G01S17/88—Lidar systems specially adapted for specific applications
- G01S17/89—Lidar systems specially adapted for specific applications for mapping or imaging
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S7/00—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
- G01S7/48—Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/58—Extraction of image or video features relating to hyperspectral data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/765—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
Abstract
Description
Technical Field
The present invention relates to a cross-scene multispectral point cloud classification method based on pseudo-label learning, and belongs to the technical field of multispectral LiDAR point clouds.
Background Art
A multispectral LiDAR system simultaneously acquires the three-dimensional spatial distribution and the spectral information of a scene, providing richer feature information for remote sensing scene interpretation tasks. In multispectral LiDAR processing tasks, most current classification methods, especially deep learning-based ones, require large training datasets to achieve optimal performance. However, collecting and labeling large numbers of point clouds is laborious and time-consuming. Moreover, such methods apply only to fixed scenes, i.e., they assume that training and test samples are independent and identically distributed, and their performance drops significantly when they are applied to unfamiliar scenes. They therefore cannot be transferred directly to other scenes, nor tested on unlabeled data collected in real time. This has become a major constraint on the interpretation of multispectral LiDAR data.
When a multispectral LiDAR system collects data over a remote sensing scene, factors such as the laser pulse emission angle, the spatial distribution of ground objects, and seasonal and weather changes all affect the intensity of the received laser pulses, producing spectral drift. In addition, both traditional and deep learning-based methods adapt poorly to new scenes: their performance degrades significantly when the training and test samples differ in distribution. Multispectral point clouds carry both the spatial geometric information and the spectral information of ground objects. By learning, from the source-domain multispectral point cloud, the spatial geometry-spectrum consistency information that characterizes the essential attributes of ground objects, high-accuracy pseudo-labels can be generated for the target-domain multispectral point cloud; training the network with these target-domain pseudo-labels improves the performance of the multispectral point cloud classification network in the target scene and strengthens its scene adaptability. Therefore, how to generate high-accuracy target-domain pseudo-labels under spectral drift and inconsistent ground-object distributions across scenes, and how to achieve high-accuracy cross-scene multispectral point cloud classification without true labels for the target-domain scene, are technical problems urgently requiring solutions.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a cross-scene multispectral point cloud classification method based on pseudo-label learning, in order to cope with the spectral drift of multispectral LiDAR point clouds between different scenes, to alleviate the cross-scene classification difficulties that this drift causes, and to achieve high-accuracy cross-scene multispectral point cloud classification without true labels for the target-domain scene.
The technical solution of the present invention is a cross-scene multispectral point cloud classification method based on pseudo-label learning, comprising the following steps:
Step 1: Pre-align the multispectral LiDAR point cloud features of the labeled source-domain scene and the unlabeled target-domain scene using the L2 norm and the Laplacian matrix;
Step 2: Using the pre-aligned features, extract graph features of the two scenes separately with a graph convolutional neural network (GCN);
Step 3: From the extracted graph features of the two scenes and the source-domain labels, compute the source-domain classification loss, the maximum mean discrepancy (MMD) loss, and the target-domain Shannon entropy loss;
Step 4: Iterate Step 3, updating the source-target alignment network parameters, until the model converges; upon convergence, obtain the pseudo-labels of the target domain together with their confidences and proceed to Step 5;
Step 5: Sort the pseudo-labels in descending order of confidence, set a threshold α, and select the top α% of target-domain pseudo-labels as the ground-truth input of the target-domain classification network;
Step 6: Concatenate the adjacency matrix and the feature matrix of the target domain to obtain a new feature matrix as the feature input of the target-domain classification network;
Step 7: Compute the target-domain classification loss from the pseudo-labels selected in Step 5 and the new feature matrix obtained in Step 6;
Step 8: Iterate Step 7, updating the target-domain classification network parameters, until the model converges, finally obtaining the classification result of the target-domain multispectral point cloud. A condensed end-to-end sketch of these eight steps is given below.
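The eight steps above form two training stages: a source-target alignment stage that yields target-domain pseudo-labels, followed by a target-only classification stage. The following toy sketch illustrates this control flow end to end; the two-layer perceptrons standing in for the GCN extractors, the linear-kernel MMD surrogate, the k-NN graph construction, and all sizes and epoch counts are illustrative assumptions rather than the patented implementation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
Ns, Nt, M, C = 100, 120, 8, 7                        # toy sizes: points, feature dim, classes
x_src, y_src = torch.rand(Ns, M), torch.randint(0, C, (Ns,))
x_tgt = torch.rand(Nt, M)

# Step 1: L2-normalize the stacked features, then apply the Laplacian update.
x = torch.cat([x_src, x_tgt])
x = x / x.norm(dim=1, keepdim=True)
d2 = torch.cdist(x, x)                               # pairwise distances
radius = d2.topk(10, largest=False).values[:, -1:]   # distance to the 10th neighbor
mask = (d2 <= radius).float()                        # each point keeps its 10 nearest
w = 0.5 * (mask + mask.T)                            # symmetric adjacency W
lap = torch.diag(w.sum(dim=1)) - w                   # L = D - W
x = lap @ x
xs, xt = x[:Ns], x[Ns:]

# Steps 2-4: train the alignment network with classification + MMD + entropy losses.
net = torch.nn.Sequential(torch.nn.Linear(M, 32), torch.nn.ReLU(), torch.nn.Linear(32, C))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(200):
    ls, lt = net(xs), net(xt)
    pt = F.softmax(lt, dim=1)
    loss = (F.cross_entropy(ls, y_src)                          # source classification loss
            + (ls.mean(0) - lt.mean(0)).pow(2).sum()            # linear-kernel MMD surrogate
            - (pt * pt.clamp_min(1e-12).log()).sum(1).mean())   # Shannon entropy loss
    opt.zero_grad(); loss.backward(); opt.step()

# Steps 4-5: pseudo-labels ranked by confidence; keep the top 50% (alpha = 50).
with torch.no_grad():
    conf, pseudo = F.softmax(net(xt), dim=1).max(dim=1)
keep = conf.argsort(descending=True)[: Nt // 2]

# Step 6: append the target-domain adjacency matrix to the target features.
xt_aug = torch.cat([xt, w[Ns:, Ns:]], dim=1)

# Steps 7-8: train the target classifier on the trusted pseudo-labels only.
net2 = torch.nn.Sequential(torch.nn.Linear(M + Nt, 32), torch.nn.ReLU(), torch.nn.Linear(32, C))
opt2 = torch.optim.Adam(net2.parameters(), lr=1e-2)
for _ in range(200):
    loss = F.cross_entropy(net2(xt_aug)[keep], pseudo[keep])
    opt2.zero_grad(); loss.backward(); opt2.step()

print(net2(xt_aug).argmax(dim=1)[:10])               # final target-domain labels
```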
Specifically, in Step 1, the labeled source-domain multispectral LiDAR point cloud is denoted $(P_s, Y)$ and the unlabeled target-domain scene $P_t$, where $P_s=\{p_i^s\}_{i=1}^{N_s}$ indicates that the source scene contains $N_s$ labeled multispectral points, $p_i^s$ denotes the $i$-th labeled multispectral point of the source scene, $P_t=\{p_i^t\}_{i=1}^{N_t}$ indicates that the target scene contains $N_t$ unlabeled multispectral points, $p_i^t$ denotes the $i$-th unlabeled multispectral point of the target scene, $Y=\{y_i\}_{i=1}^{N_s}$ denotes the ground-truth labels of all source-scene multispectral points, and $y_i$ denotes the ground-truth label of the $i$-th source-scene multispectral point.
Specifically, in Step 1, the feature pre-alignment based on the $L_2$ norm and the Laplacian matrix comprises the following steps:
(1) Transform the source-domain and target-domain features with the $L_2$ norm:
$$\hat{x} = \frac{x}{\|x\|_2}$$
where $x$ denotes a source- or target-domain feature, $\hat{x}$ denotes the feature after transformation, and $\|\cdot\|_2$ denotes the 2-norm.
(2) From the formula of step (1), obtain the $M$-dimensional source-domain features $\hat{X}_s \in \mathbb{R}^{N_s \times M}$ and the $M$-dimensional target-domain features $\hat{X}_t \in \mathbb{R}^{N_t \times M}$, and concatenate $\hat{X}_s$ and $\hat{X}_t$ into the overall feature matrix $X \in \mathbb{R}^{(N_s+N_t)\times M}$. Compute the adjacency matrix $W$ of $X$ with the K-nearest-neighbor algorithm, and then the diagonal matrix $D$ whose entries are $d_{ii}=\sum_j w_{ij}$, where $w_{ij}$ are the entries of $W$. With the Laplacian matrix $L = D - W$, the final overall feature matrix $X$ is updated as
$$\hat{X} = LX = (D - W)X$$
where $\hat{X}$ is the updated feature matrix, $(\cdot)^T$ denotes the matrix transposition operation, $N_s$ is the number of labeled multispectral points in the source scene, and $N_t$ is the number of unlabeled multispectral points in the target scene.
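As a concrete reference for this pre-alignment step, the sketch below assumes a k-NN graph with k = 10 built via scikit-learn and a symmetrized connectivity matrix; the patent fixes neither k nor the exact graph construction in this passage.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def pre_align(x_src, x_tgt, k=10):
    # L2 feature transform applied to both domains: x_hat = x / ||x||_2.
    x = np.vstack([x_src, x_tgt]).astype(np.float64)
    x /= np.linalg.norm(x, axis=1, keepdims=True) + 1e-12
    # Adjacency matrix W from the k nearest neighbors, symmetrized.
    w = kneighbors_graph(x, k, mode="connectivity").toarray()
    w = np.maximum(w, w.T)
    # Diagonal matrix D with d_ii = sum_j w_ij, Laplacian L = D - W.
    lap = np.diag(w.sum(axis=1)) - w
    # Updated overall feature matrix of shape (Ns + Nt, M).
    return lap @ x

# Usage: both scenes must share the feature dimension M.
x_aligned = pre_align(np.random.rand(100, 8), np.random.rand(120, 8))
print(x_aligned.shape)   # (220, 8)
```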
Specifically, Step 3 is as follows:
Denote the source-scene and target-scene graph features extracted in Step 2 by $F_s$ and $F_t$, respectively. The source-domain classification loss is computed as
$$\mathcal{L}_{cls}^{s} = -\frac{1}{N_s}\sum_{y_i \in Y_s} y_i \log \hat{y}_i$$
where $y_i$ is the label of the $i$-th point of the source scene, $\hat{y}_i$ is the predicted label of the $i$-th point of the source scene, $Y_s$ is the source-scene label set, and $N_s$ is the number of labeled multispectral points in the source scene;
To measure the discrepancy between the extracted features, the maximum mean discrepancy (MMD) loss is used to compute the feature deviation between the two scenes, encouraging the GCN to extract domain-invariant features:
$$\mathcal{L}_{MMD} = \left\| \frac{1}{N_s}\sum_{i=1}^{N_s}\varphi(f_i^s) - \frac{1}{N_t}\sum_{j=1}^{N_t}\varphi(f_j^t) \right\|^2$$
where $\varphi(\cdot)$ is the mapping function that maps the original variables into a high-dimensional space, $f_i^s$ is the graph feature of the $i$-th source-domain multispectral point, $f_j^t$ is the graph feature of the $j$-th target-domain multispectral point, and $N_t$ is the number of unlabeled multispectral points in the target scene;
A Shannon entropy loss is used to constrain the network so as to obtain higher-confidence target-domain pseudo-labels:
$$\mathcal{L}_{ent} = \frac{1}{N_t}\sum_{i=1}^{N_t}\sum_{j=1}^{l} h_{ij}$$
where $H$ is the Shannon entropy matrix whose entries $h_{ij}$ are computed as
$$h_{ij} = -p_{ij}\log p_{ij}$$
where $P$ is the probability matrix predicted by the network for the target-domain multispectral LiDAR point cloud from the pre-aligned target-node features $\hat{X}_t$, $p_{ij}$ is a predicted probability, and $l$ is the number of feature channels of the multispectral point cloud.
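The three losses of this step can be sketched as follows; the Gaussian kernel standing in for the mapping $\varphi$ (and its bandwidth) is an assumption, since the patent does not name a specific kernel here.

```python
import torch
import torch.nn.functional as F

def source_cls_loss(logits_src, y_src):
    # Source-domain classification loss: cross-entropy over the labeled points.
    return F.cross_entropy(logits_src, y_src)

def mmd_loss(f_src, f_tgt, bandwidth=1.0):
    # MMD^2 = E[k(s,s')] - 2 E[k(s,t)] + E[k(t,t')], with a Gaussian kernel k
    # playing the role of the inner product after the mapping phi.
    def k(a, b):
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * bandwidth ** 2))
    return k(f_src, f_src).mean() - 2 * k(f_src, f_tgt).mean() + k(f_tgt, f_tgt).mean()

def entropy_loss(logits_tgt):
    # Shannon entropy of the target predictions, h_ij = -p_ij * log(p_ij),
    # averaged over the N_t target points.
    p = F.softmax(logits_tgt, dim=1)
    return -(p * p.clamp_min(1e-12).log()).sum(dim=1).mean()

# Overall Step 4 objective (the embodiment below sets both coefficients to 1):
#   total = source_cls_loss(...) + lambda1 * mmd_loss(...) + lambda2 * entropy_loss(...)
```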
Specifically, updating the source-target alignment network parameters in Step 4 proceeds as follows:
(1) All parameters are optimized with the standard back-propagation algorithm;
(2) During training, the overall loss is the combination of the source-domain classification loss, the maximum mean discrepancy (MMD) loss, and the target-domain Shannon entropy loss:
$$\mathcal{L} = \mathcal{L}_{cls}^{s} + \lambda_1 \mathcal{L}_{MMD} + \lambda_2 \mathcal{L}_{ent}$$
where $\lambda_1$ and $\lambda_2$ are balance coefficients that balance the losses.
Specifically, the concatenation of the adjacency matrix and the feature matrix of the target domain in Step 6 proceeds as follows:
Denote the target-domain adjacency matrix by $W_t$. The $M$-dimensional target-domain features $\hat{X}_t$ and $W_t$ are then concatenated to obtain the updated target-domain features $\tilde{X}_t = [\hat{X}_t, W_t]$.
Specifically, the target-domain classification loss in Step 7 is computed as
$$\mathcal{L}_{cls}^{t} = -\frac{1}{N_t}\sum_{\tilde{y}_i \in \tilde{Y}_t} \tilde{y}_i \log \hat{y}_i^t$$
where $\tilde{y}_i$ is the pseudo-label of the $i$-th point of the target scene, $\hat{y}_i^t$ is the predicted label of the $i$-th point of the target scene, $N_t$ is the number of unlabeled multispectral points in the target scene, $F_t$ is the target-scene graph feature extracted in Step 2, and $\tilde{Y}_t$ is the target-domain pseudo-label set.
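Steps 5-7 reduce to three small operations: rank the pseudo-labels by confidence and keep the top α%, append the adjacency matrix to the feature matrix, and compute the loss on the kept points only. The helper names below are illustrative, not taken from the patent.

```python
import torch
import torch.nn.functional as F

def select_pseudo_labels(probs_tgt, alpha=0.5):
    # Step 5: per-point confidence is the maximum class probability;
    # keep the indices of the top alpha% most confident points.
    conf, pseudo = probs_tgt.max(dim=1)
    keep = conf.argsort(descending=True)[: int(alpha * len(conf))]
    return keep, pseudo

def augment_target_features(x_tgt, adj_tgt):
    # Step 6: concatenate the N_t x N_t adjacency matrix W_t onto the
    # M-dimensional features, giving an (M + N_t)-dimensional input.
    return torch.cat([x_tgt, adj_tgt], dim=1)

def target_cls_loss(logits_tgt, pseudo, keep):
    # Step 7: cross-entropy against the trusted pseudo-labels only.
    return F.cross_entropy(logits_tgt[keep], pseudo[keep])
```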
Specifically, updating the target-domain classification network parameters in Step 8 proceeds as follows:
(1) All parameters are optimized with the standard back-propagation algorithm;
(2) During training, the target-domain classification loss of Step 7 is used as the training loss.
In different scenes, multispectral LiDAR often exhibits the same object with different spectra, or different objects with the same spectrum. Consequently, when the target-domain scene has no labels available for training, a network trained only with source-domain point cloud labels classifies the target-domain point cloud with low accuracy. The present invention aligns the features of the source- and target-domain scenes through a designed feature pre-alignment operation, uses the maximum mean discrepancy (MMD) loss and the Shannon entropy loss to encourage the GCN to extract domain-invariant features, and thereby obtains high-quality target-domain point cloud pseudo-labels. The target-domain features are further enhanced with the target-domain adjacency matrix, so that a graph neural network trained with the labeled source-domain multispectral point cloud can classify the unlabeled target-domain multispectral point cloud with high accuracy.
The beneficial effects of the present invention are as follows. Compared with the prior art, the invention mitigates the negative impact of multispectral point cloud spectral drift between scenes. The feature-domain alignment operation helps the GCN extract domain-invariant features, while the maximum mean discrepancy (MMD) loss and the Shannon entropy loss ensure the accuracy of the target-domain point cloud pseudo-labels. The target-domain features are further enhanced with the adjacency matrix. Under spectral drift and inconsistent ground-object distributions across scenes, effective and reliable information transfer is achieved, enabling ground-object classification of unlabeled target-domain scenes and high-accuracy cross-scene multispectral point cloud classification without true target-domain labels.
Brief Description of the Drawings
Figure 1 is the framework of the cross-scene multispectral point cloud classification method based on pseudo-label learning according to the present invention;
Figure 2 shows the ground-truth distribution of the datasets used in the embodiment: (a) is the visualization of the source scene and (b) is the visualization of the target scene.
Detailed Description of the Embodiments
The present invention is further described below with reference to the accompanying drawings and a specific embodiment.
Embodiment 1: As shown in Figure 1, a cross-scene multispectral point cloud classification method based on pseudo-label learning comprises the following steps:
Step 1: Pre-align the multispectral LiDAR point cloud features of the labeled source-domain scene and the unlabeled target-domain scene using the L2 norm and the Laplacian matrix;
In Step 1, the labeled source-domain multispectral LiDAR point cloud is denoted $(P_s, Y)$ and the unlabeled target-domain scene $P_t$, where $P_s=\{p_i^s\}_{i=1}^{N_s}$ indicates that the source scene contains $N_s$ labeled multispectral points, $p_i^s$ denotes the $i$-th labeled multispectral point of the source scene, $P_t=\{p_i^t\}_{i=1}^{N_t}$ indicates that the target scene contains $N_t$ unlabeled multispectral points, $p_i^t$ denotes the $i$-th unlabeled multispectral point of the target scene, $Y=\{y_i\}_{i=1}^{N_s}$ denotes the ground-truth labels of all source-scene multispectral points, and $y_i$ denotes the ground-truth label of the $i$-th source-scene multispectral point.
In Step 1, the feature pre-alignment based on the $L_2$ norm and the Laplacian matrix comprises the following steps:
(1) Transform the source-domain and target-domain features with the $L_2$ norm:
$$\hat{x} = \frac{x}{\|x\|_2}$$
where $x$ denotes a source- or target-domain feature, $\hat{x}$ denotes the feature after transformation, and $\|\cdot\|_2$ denotes the 2-norm.
(2) From the formula of step (1), obtain the $M$-dimensional source-domain features $\hat{X}_s \in \mathbb{R}^{N_s \times M}$ and the $M$-dimensional target-domain features $\hat{X}_t \in \mathbb{R}^{N_t \times M}$, and concatenate $\hat{X}_s$ and $\hat{X}_t$ into the overall feature matrix $X \in \mathbb{R}^{(N_s+N_t)\times M}$. Compute the adjacency matrix $W$ of $X$ with the K-nearest-neighbor algorithm, and then the diagonal matrix $D$ whose entries are $d_{ii}=\sum_j w_{ij}$, where $w_{ij}$ are the entries of $W$. With the Laplacian matrix $L = D - W$, the final overall feature matrix $X$ is updated as
$$\hat{X} = LX = (D - W)X$$
where $\hat{X}$ is the updated feature matrix, $(\cdot)^T$ denotes the matrix transposition operation, $N_s$ is the number of labeled multispectral points in the source scene, and $N_t$ is the number of unlabeled multispectral points in the target scene.
Step 2: Using the pre-aligned features, extract graph features of the two scenes separately with a graph convolutional neural network (GCN);
Step 3: From the extracted graph features of the two scenes and the source-domain labels, compute the source-domain classification loss, the maximum mean discrepancy (MMD) loss, and the target-domain Shannon entropy loss;
Denote the source-scene and target-scene graph features extracted in Step 2 by $F_s$ and $F_t$, respectively. The source-domain classification loss is computed as
$$\mathcal{L}_{cls}^{s} = -\frac{1}{N_s}\sum_{y_i \in Y_s} y_i \log \hat{y}_i$$
where $y_i$ is the label of the $i$-th point of the source scene, $\hat{y}_i$ is the predicted label of the $i$-th point of the source scene, $Y_s$ is the source-scene label set, and $N_s$ is the number of labeled multispectral points in the source scene;
To measure the discrepancy between the extracted features, the maximum mean discrepancy (MMD) loss is used to compute the feature deviation between the two scenes, encouraging the GCN to extract domain-invariant features:
$$\mathcal{L}_{MMD} = \left\| \frac{1}{N_s}\sum_{i=1}^{N_s}\varphi(f_i^s) - \frac{1}{N_t}\sum_{j=1}^{N_t}\varphi(f_j^t) \right\|^2$$
where $\varphi(\cdot)$ is the mapping function that maps the original variables into a high-dimensional space, $f_i^s$ is the graph feature of the $i$-th source-domain multispectral point, $f_j^t$ is the graph feature of the $j$-th target-domain multispectral point, and $N_t$ is the number of unlabeled multispectral points in the target scene;
A Shannon entropy loss is used to constrain the network so as to obtain higher-confidence target-domain pseudo-labels:
$$\mathcal{L}_{ent} = \frac{1}{N_t}\sum_{i=1}^{N_t}\sum_{j=1}^{l} h_{ij}$$
where $H$ is the Shannon entropy matrix whose entries $h_{ij}$ are computed as
$$h_{ij} = -p_{ij}\log p_{ij}$$
where $P$ is the probability matrix predicted by the network for the target-domain multispectral LiDAR point cloud from the pre-aligned target-node features $\hat{X}_t$, $p_{ij}$ is a predicted probability, and $l$ is the number of feature channels of the multispectral point cloud.
Step 4: Iterate Step 3, updating the source-target alignment network parameters, until the model converges; upon convergence, obtain the pseudo-labels of the target domain together with their confidences and proceed to Step 5;
Updating the source-target alignment network parameters in Step 4 proceeds as follows:
(1) All parameters are optimized with the standard back-propagation algorithm;
(2) During training, the overall loss is the combination of the source-domain classification loss, the maximum mean discrepancy (MMD) loss, and the target-domain Shannon entropy loss:
$$\mathcal{L} = \mathcal{L}_{cls}^{s} + \lambda_1 \mathcal{L}_{MMD} + \lambda_2 \mathcal{L}_{ent}$$
where $\lambda_1$ and $\lambda_2$ are balance coefficients that balance the losses; in the present invention, both $\lambda_1$ and $\lambda_2$ take the value 1.
The source-target alignment network parameters are updated with the standard back-propagation algorithm; whether the model has converged is checked, and if so this stage ends, otherwise Step 3 is repeated until the model converges.
Step 5: Sort the pseudo-labels in descending order of confidence, set a threshold α, and select the top α% of target-domain pseudo-labels as the ground-truth input of the target-domain classification network; in the present invention, α takes the value 50;
Step 6: Concatenate the adjacency matrix and the feature matrix of the target domain to obtain a new feature matrix as the feature input of the target-domain classification network;
Concatenating the adjacency matrix and the feature matrix of the target domain in Step 6 proceeds as follows:
Denote the target-domain adjacency matrix by $W_t$. The $M$-dimensional target-domain features $\hat{X}_t$ and $W_t$ are then concatenated to obtain the updated target-domain features $\tilde{X}_t = [\hat{X}_t, W_t]$.
Step 7: Compute the target-domain classification loss from the pseudo-labels selected in Step 5 and the new feature matrix obtained in Step 6;
The target-domain classification loss in Step 7 is computed as
$$\mathcal{L}_{cls}^{t} = -\frac{1}{N_t}\sum_{\tilde{y}_i \in \tilde{Y}_t} \tilde{y}_i \log \hat{y}_i^t$$
where $\tilde{y}_i$ is the pseudo-label of the $i$-th point of the target scene, $\hat{y}_i^t$ is the predicted label of the $i$-th point of the target scene, $N_t$ is the number of unlabeled multispectral points in the target scene, $F_t$ is the target-scene graph feature extracted in Step 2, and $\tilde{Y}_t$ is the target-domain pseudo-label set.
Step 8: With the target-domain classification loss of Step 7 as the training loss, the target-domain classification network parameters are updated with the standard back-propagation algorithm; whether the model has converged is checked, and if so training ends, otherwise Step 7 is repeated until the model converges.
On the basis of the above implementation, the following experiments demonstrate that the present invention is practical and feasible:
1. Experimental data
Harbor of Tobermory dataset: the scene of this dataset is a small harbor at Tobermory, UK. It consists of three-band point cloud data collected by an Optech Titan LiDAR at wavelengths of 1550 nm, 1064 nm, and 532 nm. The dataset is visualized in Figure 2, where (a) is the source-scene visualization and (b) is the target-scene visualization. According to the height, material, and semantic information of the land cover, the study area is divided into seven classes: bare land, grassland, roads, buildings, trees, power lines, and cars.
University of Houston dataset: the scene of this dataset is part of the University of Houston campus. It consists of three-band point cloud data collected by an Optech Titan LiDAR at wavelengths of 1550 nm, 1064 nm, and 532 nm. According to the height, material, and semantic information of the land cover, the study area is divided into seven classes: bare land, cars, grassland, roads, power lines, buildings, and trees. The F-score is used as an evaluation index. The visualizations of the two datasets are shown in Figure 2.
2. Experimental content
In the experiment, the method of the present invention and the traditional GCN method were used for classification on the above datasets, with the Harbor of Tobermory dataset as the source-domain scene and the University of Houston dataset as the target-domain scene. To save computational resources, a superpoint segmentation method was used to divide each scene into 8000 superpoints as input. Point cloud classification was performed with the method of the present invention, and the results were evaluated with the metric defined by the following formula; Table 1 reports the mean intersection over union (MIoU) of the method of the present invention over the different ground-object classes.
$$\mathrm{IoU} = \frac{TP}{TP + FP + FN}$$
where $TP$ is the number of positive-class points segmented into the positive class, $FP$ is the number of negative-class points segmented into the positive class, and $FN$ is the number of positive-class points segmented into the negative class.
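For reference, a small helper matching this metric, per-class IoU averaged into MIoU, could look like the following sketch:

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    # Per-class IoU = TP / (TP + FP + FN), averaged over the classes that
    # actually occur, giving the MIoU reported in Table 1.
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))
        fp = np.sum((pred == c) & (gt != c))
        fn = np.sum((pred != c) & (gt == c))
        if tp + fp + fn > 0:
            ious.append(tp / (tp + fp + fn))
    return float(np.mean(ious))
```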
Table 1
The present invention can effectively cope with the spectral drift of multispectral LiDAR point clouds between different scenes, alleviate the cross-scene classification difficulties that this drift causes, and achieve high-accuracy cross-scene multispectral point cloud classification without true labels for the target-domain scene.
The specific embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to the above embodiments; within the scope of knowledge possessed by those of ordinary skill in the art, various changes can be made without departing from the spirit of the present invention.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410061674.6A CN117572457B (en) | 2024-01-16 | 2024-01-16 | A cross-scene multispectral point cloud classification method based on pseudo-label learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410061674.6A CN117572457B (en) | 2024-01-16 | 2024-01-16 | A cross-scene multispectral point cloud classification method based on pseudo-label learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117572457A | 2024-02-20 |
CN117572457B CN117572457B (en) | 2024-04-05 |
Family
ID=89892215
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410061674.6A Active CN117572457B (en) | 2024-01-16 | 2024-01-16 | A cross-scene multispectral point cloud classification method based on pseudo-label learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117572457B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117015813A (en) * | 2021-03-16 | 2023-11-07 | 华为技术有限公司 | Apparatus, system, method, and medium for adaptively enhancing point cloud data sets for training |
CN115841574A (en) * | 2022-12-19 | 2023-03-24 | 中国科学技术大学 | Domain-adaptive laser radar point cloud semantic segmentation method, device and storage medium |
CN116403058A (en) * | 2023-06-09 | 2023-07-07 | 昆明理工大学 | Remote sensing cross-scene multispectral laser radar point cloud classification method |
CN117315612A (en) * | 2023-11-13 | 2023-12-29 | 重庆邮电大学 | 3D point cloud target detection method based on dynamic self-adaptive data enhancement |
Non-Patent Citations (2)
Title |
---|
- Yang Dedong: "Semi-supervised 3D object detection based on a confidence-region pseudo-label strategy", Application Research of Computers, vol. 40, no. 6, 30 June 2023 (2023-06-30), pages 1888-1893 *
- Wang Qingwang: "Research on joint classification methods for multi-/hyperspectral images and LiDAR data", China Doctoral Dissertations Full-text Database, Engineering Science and Technology II, vol. 2021, no. 1, 15 January 2021 (2021-01-15), pages 028-25 *
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US12254681B2 (en) * | 2021-09-07 | 2025-03-18 | Nec Corporation | Multi-modal test-time adaptation |
CN117830752A (en) * | 2024-03-06 | 2024-04-05 | 昆明理工大学 | An Adaptive Spatial-Spectral Mask Graph Convolution Method for Multispectral Point Cloud Classification |
CN117830752B (en) * | 2024-03-06 | 2024-05-07 | 昆明理工大学 | An adaptive spatial-spectral mask graph convolution method for multispectral point cloud classification |
CN117953384A (en) * | 2024-03-27 | 2024-04-30 | 昆明理工大学 | A cross-scene multispectral lidar point cloud building extraction and vectorization method |
CN117953384B (en) * | 2024-03-27 | 2024-06-07 | 昆明理工大学 | Cross-scene multispectral laser radar point cloud building extraction and vectorization method |
CN119006944A (en) * | 2024-10-24 | 2024-11-22 | 南京信息工程大学 | Label-free point cloud classification method based on multi-mode comparison learning |
Also Published As
Publication number | Publication date |
---|---|
CN117572457B (en) | 2024-04-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN117572457B (en) | A cross-scene multispectral point cloud classification method based on pseudo-label learning | |
CN111191732B (en) | Target detection method based on full-automatic learning | |
CN113706480B (en) | Point cloud 3D target detection method based on key point multi-scale feature fusion | |
CN112232371B (en) | American license plate recognition method based on YOLOv3 and text recognition | |
CN116403058B (en) | Remote sensing cross-scene multispectral laser radar point cloud classification method | |
CN110334578B (en) | Weak supervision method for automatically extracting high-resolution remote sensing image buildings through image level annotation | |
CN110874590B (en) | Adapter-based mutual learning model training and visible light infrared vision tracking method | |
CN114488194A (en) | Method for detecting and identifying targets under structured road of intelligent driving vehicle | |
CN108280396A (en) | Hyperspectral image classification method based on depth multiple features active migration network | |
CN111461067B (en) | A zero-sample remote sensing image scene recognition method based on prior knowledge mapping and correction | |
CN104680193A (en) | Online target classification method and system based on fast similarity network fusion algorithm | |
CN118279320A (en) | Target instance segmentation model building method based on automatic prompt learning and application thereof | |
CN113781404B (en) | Road disease detection method and system based on self-supervised pre-training | |
CN116434076A (en) | A Target Recognition Method of Remote Sensing Image Integrating Prior Knowledge | |
CN113869418A (en) | Small sample ship target identification method based on global attention relationship network | |
CN114998688A (en) | A large field of view target detection method based on improved YOLOv4 algorithm | |
CN102867192A (en) | Scene semantic shift method based on supervised geodesic propagation | |
CN104463207B (en) | Knowledge autoencoder network and its polarization SAR image terrain classification method | |
CN116484295A (en) | Mineral resources classification and prediction method and system based on multi-source small sample joint learning | |
CN118818222B (en) | Power grid space position analysis method combining GIS service and artificial intelligence technology | |
CN115346055A (en) | A feature extraction and classification method based on multi-core width graph neural network | |
CN115393666A (en) | Small sample expansion method and system based on prototype completion in image classification | |
CN104463205B (en) | Data classification method based on chaos depth wavelet network | |
CN118537612A (en) | Insulator defect detection method under severe environment based on improved DETR algorithm | |
CN117746252A (en) | A landslide detection method based on improved lightweight YOLOv7 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
- PB01 | Publication | |
- SE01 | Entry into force of request for substantive examination | |
- GR01 | Patent grant | |