CN117572457B - Cross-scene multispectral point cloud classification method based on pseudo tag learning - Google Patents

Cross-scene multispectral point cloud classification method based on pseudo tag learning

Info

Publication number
CN117572457B
CN117572457B CN202410061674.6A CN202410061674A
Authority
CN
China
Prior art keywords
scene
target domain
multispectral
domain
point cloud
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410061674.6A
Other languages
Chinese (zh)
Other versions
CN117572457A (en)
Inventor
王青旺
王铭野
王盼新
蒋涛
张梓峰
沈韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology filed Critical Kunming University of Science and Technology
Priority to CN202410061674.6A priority Critical patent/CN117572457B/en
Publication of CN117572457A publication Critical patent/CN117572457A/en
Application granted granted Critical
Publication of CN117572457B publication Critical patent/CN117572457B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/89Lidar systems specially adapted for specific applications for mapping or imaging
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/48Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S17/00
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/042Knowledge-based neural networks; Logical representations of neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/58Extraction of image or video features relating to hyperspectral data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • Molecular Biology (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Electromagnetism (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Radar Systems Or Details Thereof (AREA)

Abstract

The invention relates to a cross-scene multispectral point cloud classification method based on pseudo tag learning, and belongs to the technical field of multispectral laser radar point clouds. The method comprises the following steps: 1) performing feature pre-alignment on the multispectral laser radar point cloud features of the source domain scene and the target domain scene, respectively; 2) extracting graph features of the two scenes, respectively; 3) calculating losses; 4) iteratively performing 3) and updating the source domain-target domain alignment network parameters until the model converges, obtaining pseudo tags of the target domain and their confidence; 5) sorting the pseudo tags in descending order of confidence, setting a threshold α, and selecting the top α% of pseudo tags; 6) concatenating the adjacency matrix and the feature matrix of the target domain to obtain a new feature matrix; 7) calculating a loss according to the pseudo tags selected in step 5) and the feature matrix obtained in step 6); 8) iteratively performing 7) and updating parameters until the model converges, finally obtaining the target domain multispectral point cloud classification result. The method achieves high-precision cross-scene multispectral point cloud classification.

Description

Cross-scene multispectral point cloud classification method based on pseudo tag learning
Technical Field
The invention relates to a cross-scene multispectral point cloud classification method based on pseudo tag learning, and belongs to the technical field of multispectral laser radar point clouds.
Background
A multispectral LiDAR system can synchronously acquire the three-dimensional spatial distribution information and the spectral information of a scene, and can therefore provide richer feature information for remote sensing scene interpretation tasks. In multispectral LiDAR processing tasks, most classification methods, especially deep-learning-based methods, require large training data sets to reach optimal performance. However, collecting and labeling large numbers of point clouds is laborious and time consuming. Moreover, these methods are only applicable to a fixed scene, i.e. they assume that training samples and test samples are independent and identically distributed, and their performance may degrade significantly when they are applied to unseen scenes. They therefore cannot be directly transferred to other scenes, nor can they be tested on unlabeled data collected in real time. This has been the primary factor limiting multispectral LiDAR data interpretation.
When a multispectral LiDAR acquires data over a remote sensing scene, factors such as the laser pulse emission angle, the spatial distribution of ground objects, and seasonal and weather changes all affect the intensity of the received laser pulses, producing the spectral drift phenomenon. Furthermore, both conventional methods and deep-learning-based methods adapt poorly to new scenes, and their performance degrades significantly when there is a distribution difference between training and test samples. A multispectral point cloud carries both the spatial geometric information and the spectral information of ground objects. By learning, from the source domain scene multispectral point cloud, the spatial geometric-spectral consistency information that characterizes the intrinsic attributes of ground objects, high-precision pseudo tags can be generated for the target domain scene multispectral point cloud; training the network with these target domain pseudo tags improves the performance of the multispectral point cloud ground object classification network in the target domain scene and thus the scene adaptation capability of the network. Therefore, how to generate high-precision target domain scene point cloud pseudo tags under multispectral point cloud spectral drift and inconsistent ground object distributions across scenes, and to achieve high-precision cross-scene multispectral point cloud classification without any real labels for the target domain scene, is the technical problem to be solved at present.
Disclosure of Invention
The invention aims to solve the technical problem of providing a cross-scene multispectral point cloud classification method based on pseudo tag learning, so as to cope with the spectral drift of multispectral laser radar point clouds between different scenes, alleviate the difficulty of cross-scene multispectral laser radar point cloud classification caused by spectral drift, and achieve high-precision cross-scene multispectral point cloud classification without real labels for the target domain scene.
The technical scheme of the invention is as follows: a cross-scene multispectral point cloud classification method based on pseudo tag learning comprises the following steps:
step1: respectively enabling the multispectral laser radar point cloud characteristics of the tagged source domain scene and the untagged target domain scene to be according to L 2 Performing feature pre-alignment on the norms and the Laplace matrix;
step2: respectively extracting graph features of two scenes by adopting a graph convolution neural network (Graph Convolution Neural Networks, GCN) according to the pre-aligned features;
step3: calculating source domain classification loss, maximum mean difference (Maximum Mean Discrepancy, MMD) loss and target domain shannon entropy loss according to the extracted graph characteristics and source domain labels of the two scenes;
step4: iteratively performing Step3, updating the source domain-target domain alignment network parameters, judging whether the model is converged, if yes, ending, then performing Step5, otherwise repeating Step3 to obtain a pseudo tag of the target domain and the confidence coefficient thereof;
step5: according to the confidence level, the pseudo labels are arranged in a descending order, a threshold value alpha is set, and the target domain pseudo labels with the alpha percent before are selected to be used as the true value input of the target domain classification network;
step6: splicing the adjacent matrix and the feature matrix in the target domain to obtain a new feature matrix as the feature input of the target domain classification network;
step7: calculating the classification loss of the target domain according to the pseudo tag selected by Step5 and the new feature matrix obtained by Step 6;
step8: and (3) iteratively performing Step7, updating the target domain classification network parameters, judging whether the model is converged, if yes, ending, otherwise repeating Step7, and finally obtaining a target domain multispectral point cloud data classification result.
Specifically, in Step1, the labeled source domain scene multispectral laser radar point cloud data is denoted as $(P^{s}, Y)$ and the unlabeled target domain scene data as $P^{t}$, where $P^{s}=\{p_{i}^{s}\}_{i=1}^{N_{s}}$ indicates that the source domain scene contains $N_{s}$ labeled multispectral points, $p_{i}^{s}$ is the $i$-th labeled multispectral point in the source domain scene, $P^{t}=\{p_{i}^{t}\}_{i=1}^{N_{t}}$ indicates that the target domain scene contains $N_{t}$ unlabeled multispectral points, $p_{i}^{t}$ is the $i$-th unlabeled multispectral point in the target domain scene, $Y=\{y_{i}\}_{i=1}^{N_{s}}$ denotes the ground-truth labels of all source domain scene multispectral points, and $y_{i}$ is the ground-truth label of the $i$-th multispectral point in the source domain scene.
Specifically, in Step1, the feature pre-alignment according to the L2 norm and the Laplacian matrix comprises the following steps:
(1) Transform the source domain and target domain features with the L2 norm; the feature transformation formula is $\tilde{x} = x / \lVert x \rVert_{2}$, where $x$ denotes the source domain or target domain features, $\tilde{x}$ denotes the transformed source domain or target domain features, and $\lVert \cdot \rVert_{2}$ is the 2-norm.
(2) Obtain the M-dimensional source domain features $\tilde{X}^{s}$ and the M-dimensional target domain features $\tilde{X}^{t}$ according to the formula in step (1), and concatenate $\tilde{X}^{s}$ and $\tilde{X}^{t}$ into an M-dimensional overall feature matrix $X$. Compute the adjacency matrix $W$ of the overall feature matrix $X$ with the K-nearest-neighbor algorithm, and further compute the diagonal matrix $D$ whose elements $d_{ii}$ are the row sums of the elements $w_{ij}$ of the adjacency matrix $W$; the Laplacian matrix is then $L=D-W$. The final overall feature matrix $X$ is updated with the Laplacian matrix $L$ to obtain $\tilde{X}$, where $\tilde{X}$ is the updated feature matrix, $T$ denotes the matrix transpose operation, $N_{s}$ is the number of labeled multispectral points in the source domain scene, and $N_{t}$ is the number of unlabeled multispectral points in the target domain scene.
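A minimal sketch of the Step1 feature pre-alignment is given below. It performs the L2 normalization and builds the K-nearest-neighbor adjacency matrix W, the degree matrix D and the Laplacian matrix L = D - W as described above, stopping at the Laplacian that is used for the subsequent feature update; the choice k = 10 and the use of scikit-learn's kneighbors_graph are assumptions made for illustration.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph


def pre_align(x_source, x_target, k=10):
    """Step1 sketch: L2-normalize features, stack them, and build the KNN Laplacian."""
    def l2_normalize(x):
        # feature transformation x_tilde = x / ||x||_2 (row-wise)
        norms = np.linalg.norm(x, axis=1, keepdims=True)
        return x / np.maximum(norms, 1e-12)

    xs, xt = l2_normalize(x_source), l2_normalize(x_target)

    # concatenate into the overall feature matrix X of the two scenes
    x_all = np.concatenate([xs, xt], axis=0)          # (N_s + N_t, M)

    # adjacency W from K nearest neighbors, symmetrized
    w = kneighbors_graph(x_all, n_neighbors=k, mode="connectivity",
                         include_self=False).toarray()
    w = np.maximum(w, w.T)

    # degree matrix D with d_ii = sum_j w_ij, and Laplacian L = D - W
    d = np.diag(w.sum(axis=1))
    laplacian = d - w
    return xs, xt, laplacian
```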
Specifically, Step3 is as follows:
Denote the source domain scene and target domain scene graph features extracted in Step2 as $F^{s}$ and $F^{t}$, respectively. The source domain classification loss $L_{cls}^{s}$ is computed over the labeled source domain points from the predicted and ground-truth labels, where $y_{i}$ is the label of the $i$-th point in the source domain scene, $\hat{y}_{i}$ is the predicted label of the $i$-th point in the source domain scene, $Y$ is the source domain scene label set, and $N_{s}$ is the number of labeled multispectral points in the source domain scene;
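The source domain classification loss can be realized, for example, as a standard cross-entropy over the labeled source points; the sketch below shows this common choice and is an assumption rather than the exact loss prescribed by the invention.

```python
import torch
import torch.nn.functional as F


def source_classification_loss(logits_s, labels_s):
    """Assumed cross-entropy loss over the N_s labeled source domain points.

    logits_s: (N_s, num_classes) class scores predicted for the source points
    labels_s: (N_s,) integer ground-truth labels y_i
    """
    return F.cross_entropy(logits_s, labels_s)
```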
To measure the difference between the extracted features, the feature deviation between the two scenes is computed with the maximum mean discrepancy (MMD) loss, which encourages the GCN to extract domain-invariant features:
$L_{MMD} = \left\lVert \frac{1}{N_{s}}\sum_{i=1}^{N_{s}}\phi\left(f_{i}^{s}\right) - \frac{1}{N_{t}}\sum_{j=1}^{N_{t}}\phi\left(f_{j}^{t}\right)\right\rVert^{2}$
where $\phi(\cdot)$ is a mapping function that maps the original variables into a high-dimensional space, $f_{i}^{s}$ is the graph feature of the $i$-th source domain multispectral point, $f_{j}^{t}$ is the graph feature of the $j$-th target domain multispectral point, and $N_{t}$ is the number of unlabeled multispectral points in the target domain scene;
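A sketch of the MMD loss follows. The mapping φ is realized here implicitly through a Gaussian (RBF) kernel, i.e. the squared MMD is estimated with the kernel trick; the kernel choice and bandwidth are illustrative assumptions, since the text above only states that φ maps features into a high-dimensional space.

```python
import torch


def mmd_loss(f_s, f_t, bandwidth=1.0):
    """Squared MMD between source graph features f_s (N_s, d) and target graph
    features f_t (N_t, d), using k(a, b) = exp(-||a - b||^2 / (2 * bandwidth^2))."""
    def rbf(a, b):
        dist2 = torch.cdist(a, b) ** 2
        return torch.exp(-dist2 / (2.0 * bandwidth ** 2))

    k_ss = rbf(f_s, f_s).mean()   # E[k(source, source)]
    k_tt = rbf(f_t, f_t).mean()   # E[k(target, target)]
    k_st = rbf(f_s, f_t).mean()   # E[k(source, target)]
    return k_ss + k_tt - 2.0 * k_st
```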
A Shannon entropy loss $L_{ent}$ is used to constrain the network so that target domain scene pseudo tags with higher confidence are obtained. The Shannon entropy loss is computed from the Shannon entropy matrix $H$, whose elements $h_{ij}$ are calculated as
$h_{ij} = -p_{ij}\log p_{ij}$
where $P$ is the prediction probability matrix of the network for the target domain multispectral laser radar point cloud, $p_{ij}$ is a predicted probability, $l$ is the number of feature channels of the multispectral point cloud, and $\tilde{x}^{t}$ denotes the pre-aligned features of the target domain nodes.
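A sketch of the target domain Shannon entropy loss follows; it forms the entropy elements h_ij = -p_ij log p_ij from the target prediction probabilities and averages the per-point entropy. Averaging over the target points and the small epsilon for numerical stability are illustrative assumptions.

```python
import torch


def shannon_entropy_loss(probs_t, eps=1e-12):
    """Mean Shannon entropy of the target domain prediction probability matrix P.

    probs_t: (N_t, num_classes) softmax probabilities p_ij for the target points
    """
    h = -probs_t * torch.log(probs_t + eps)   # elementwise h_ij = -p_ij * log(p_ij)
    return h.sum(dim=1).mean()                # average per-point entropy
```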
Specifically, Step4 updates the source domain-target domain alignment network parameters as follows:
(1) All parameters are optimized using a standard back propagation algorithm;
(2) In training, the overall loss is a combination of the source domain classification loss, the maximum mean discrepancy (MMD) loss, and the target domain Shannon entropy loss:
$L_{total} = L_{cls}^{s} + \lambda_{1} L_{MMD} + \lambda_{2} L_{ent}$
where $\lambda_{1}$ and $\lambda_{2}$ are balance coefficients that weight the losses.
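The Step3/Step4 optimization can be sketched as the loop below, which combines the three losses with balance coefficients λ1 = λ2 = 1 (the values used in the embodiment). The Adam optimizer, the learning rate, and the fixed epoch budget standing in for the convergence check are assumptions; the feature extractor, classifier, and loss functions refer to the sketches given earlier.

```python
import torch


def train_alignment_network(model, classifier, x_s, a_s, y_s, x_t, a_t,
                            lambda1=1.0, lambda2=1.0, epochs=200, lr=1e-3):
    """Sketch of Step4: minimize L_total = L_cls + lambda1*L_MMD + lambda2*L_ent."""
    params = list(model.parameters()) + list(classifier.parameters())
    optimizer = torch.optim.Adam(params, lr=lr)

    for _ in range(epochs):                      # stands in for "until convergence"
        f_s = model(x_s, a_s)                    # source domain graph features
        f_t = model(x_t, a_t)                    # target domain graph features
        logits_s, logits_t = classifier(f_s), classifier(f_t)
        loss = (source_classification_loss(logits_s, y_s)
                + lambda1 * mmd_loss(f_s, f_t)
                + lambda2 * shannon_entropy_loss(torch.softmax(logits_t, dim=1)))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # pseudo tags of the target domain and their confidence
    with torch.no_grad():
        probs_t = torch.softmax(classifier(model(x_t, a_t)), dim=1)
    confidence, pseudo_labels = probs_t.max(dim=1)
    return pseudo_labels, confidence
```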
Specifically, the concatenation of the adjacency matrix and the feature matrix of the target domain in Step6 is as follows:
Denote the target domain adjacency matrix as $W^{t}$; concatenate the M-dimensional target domain features $\tilde{X}^{t}$ with $W^{t}$ to obtain the updated target domain features $\hat{X}^{t}$.
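The Step6 concatenation can be sketched as below: each target node's M-dimensional feature vector is extended column-wise with its row of the target domain adjacency matrix, yielding features of size M + N_t. Treating the adjacency matrix as a dense array is an assumption made for illustration.

```python
import numpy as np


def concat_adjacency_features(x_t, w_t):
    """Step6 sketch: concatenate target features X^t (N_t, M) with the target
    adjacency matrix W^t (N_t, N_t), giving a new feature matrix (N_t, M + N_t)."""
    return np.concatenate([x_t, w_t], axis=1)
```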
Specifically, the target domain classification loss in Step7 is computed over the selected pseudo-labeled target domain points, where $\bar{y}_{i}$ is the pseudo tag of the $i$-th point in the target domain scene, $\hat{y}_{i}^{t}$ is the predicted label of the $i$-th point in the target domain scene, $N_{t}$ is the number of unlabeled multispectral points in the target domain scene, $F^{t}$ is the target domain scene graph feature extracted in Step2, and $\bar{Y}^{t}$ is the set of target domain pseudo tags.
Specifically, Step8 updates the target domain classification network parameters as follows:
(1) All parameters were optimized using a standard back propagation algorithm.
(2) In training, the target domain classification loss in Step7 is used as a training loss.
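Steps 5 to 8 can be combined into the short fine-tuning loop sketched below, which trains the target domain classification network only on the selected high-confidence pseudo tags. The cross-entropy loss on the pseudo-labeled points, the Adam optimizer, and the fixed epoch count standing in for the convergence check are assumptions; keep_idx denotes the indices selected in Step5.

```python
import torch
import torch.nn.functional as F


def train_target_classifier(target_net, x_t_new, a_t, pseudo_labels, keep_idx,
                            epochs=200, lr=1e-3):
    """Sketch of Step7/Step8: fit the target classifier to the selected pseudo tags."""
    optimizer = torch.optim.Adam(target_net.parameters(), lr=lr)
    for _ in range(epochs):                          # stands in for "until convergence"
        logits_t = target_net(x_t_new, a_t)          # predictions for all target points
        # target domain classification loss on the pseudo-labeled subset only
        loss = F.cross_entropy(logits_t[keep_idx], pseudo_labels[keep_idx])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return target_net(x_t_new, a_t).argmax(dim=1)    # final classification result
```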
In different scenes, multispectral laser radar data often exhibit the phenomena of different objects sharing the same spectrum or the same object showing different spectra, which leads to low classification accuracy on the target domain point cloud when the network is trained only with source domain point cloud labels and the target domain scene provides no labels for training. In the present method, the features of the source domain scene and the target domain scene are aligned through the designed feature pre-alignment operation, the maximum mean discrepancy (MMD) loss and the Shannon entropy loss encourage the extraction of domain-invariant features, and high-quality target domain point cloud pseudo tags are obtained. The target domain features are then enhanced according to the target domain adjacency matrix, so that a graph neural network trained with the labeled source domain scene multispectral point cloud achieves high-precision classification of the unlabeled target domain multispectral point cloud.
The beneficial effects of the invention are as follows: compared with the prior art, the method alleviates the negative effects caused by multispectral point cloud spectral drift between different scenes. The feature domain alignment operation assists the GCN in extracting domain-invariant features, and the maximum mean discrepancy (MMD) loss and the Shannon entropy loss guarantee the accuracy of the target domain point cloud pseudo tags. The target domain features are further enhanced according to the adjacency matrix. Under conditions such as spectral drift and inconsistent ground object distributions of multispectral point clouds across scenes, effective and reliable information transfer is achieved, enabling ground object classification of the unlabeled target domain scene and high-precision cross-scene multispectral point cloud classification without any real labels for the target domain scene.
Drawings
FIG. 1 is the framework of the cross-scene multispectral point cloud classification method based on pseudo tag learning of the present invention;
FIG. 2 shows the ground-truth ground object distribution of the data set in the embodiment: (a) visualization of the source scene; (b) visualization of the target scene.
Detailed Description
The invention will be further described with reference to the drawings and the specific examples.
Example 1: as shown in fig. 1, a cross-scene multispectral point cloud classification method based on pseudo tag learning includes the following steps:
step1: respectively enabling the multispectral laser radar point cloud characteristics of the tagged source domain scene and the untagged target domain scene to be according to L 2 Performing feature pre-alignment on the norms and the Laplace matrix;
in Step1, the labeled source domain scene multispectral lidar point cloud data is denoted as (P s Y), the unlabeled target domain scene is denoted (P t) WhereinRepresenting a source domain scene contains N s Each of which has a plurality of spectral points of the label,representing the ith labeled multispectral point in the source domain scene,respectively representing that the target domain scene contains N t A single non-labeled multi-spectral point,representing the i-th unlabeled multispectral point in the target domain scene,true value labels corresponding to multispectral points of all source field scenes are represented,representing the ith multiple in a source domain sceneTruth labels corresponding to spectral points.
In Step1, the process is described in terms of L 2 The specific steps of feature pre-alignment of the norm and the Laplace matrix are as follows:
(1) Through L 2 The norm carries out characteristic transformation on the characteristics of the source domain and the target domain, and a specific characteristic transformation formula is as follows:
where x is the source domain, target domain characteristics,is the source domain and target domain characteristics after the characteristic transformation,is 2 norms.
(2) Obtaining source domain features of M dimensions according to the formula of the step (1)And M-dimensional target domain featuresWill beAndsplicing to obtain an overall feature matrix with M dimensionCalculating the overall feature matrix according to the K nearest neighbor algorithmFurther computing a diagonal matrix D, the elements of the diagonal matrix DFor elements in the adjacency matrix W, then the laplace matrix l=d-W, so the final overall feature matrix X is updated according to the following formula:
wherein,in order to update the feature matrix after the update, T n for matrix transpose operation s The number of the multispectral points with labels for the source domain scene is N t The number of unlabeled multispectral points in the target domain scene.
Step2: respectively extracting graph features of two scenes by adopting a graph convolution neural network (Graph Convolution Neural Networks, GCN) according to the pre-aligned features;
step3: calculating source domain classification loss, maximum mean difference (Maximum Mean Discrepancy, MMD) loss and target domain shannon entropy loss according to the extracted graph characteristics and source domain labels of the two scenes;
respectively marking the source domain scene and the target domain scene map features extracted from Step2 asAndthe source domain classification loss calculation formula is:
wherein,is the label of the i-th point in the source domain scene,is the predictive label of the i-th point in the source domain scene,is a source domain scene tag set, N s The number of the multi-spectrum points is the number of the labeled multi-spectrum points for the source field scene;
in order to measure the difference between the extracted features, the feature deviation of two scenes is calculated by using the maximum mean difference (Maximum Mean Discrepancy, MMD) loss, so as to promote the GCN extraction domain invariant feature:
wherein,is a mapping function that maps the original variables into a high-dimensional space,is a graph characteristic of the ith source domain multispectral point,is the graph characteristic of the j-th target domain multispectral point, N t The number of unlabeled multispectral points in the target domain scene;
the shannon entropy loss constraint network is adopted to obtain a target domain scene pseudo tag with higher confidence, and a specific shannon entropy loss formula is as follows:
wherein H is shannon entropy matrix,is an element in H, in particularThe calculation formula is as follows:
wherein P is the multispectral laser of the network to the target domainLei Dadian a predictive probability matrix of a cloud,to predict probability, l is the number of characteristic channels of the multispectral point cloud,pre-aligned features for the target domain node.
Step4: iteratively performing Step3, updating the source domain-target domain alignment network parameters, judging whether the model is converged, if yes, ending, then performing Step5, otherwise repeating Step3 to obtain a pseudo tag of the target domain and the confidence coefficient thereof;
the Step4 updates the source domain-target domain alignment network parameters specifically as follows:
(1) All parameters are optimized using a standard back propagation algorithm;
(2) In training, the overall loss is a combination of the source domain classification loss, the maximum mean discrepancy (MMD) loss, and the target domain Shannon entropy loss:
$L_{total} = L_{cls}^{s} + \lambda_{1} L_{MMD} + \lambda_{2} L_{ent}$
where $\lambda_{1}$ and $\lambda_{2}$ are balance coefficients that weight the losses; in the present invention, both $\lambda_{1}$ and $\lambda_{2}$ are set to 1.
The source domain-target domain alignment network parameters are updated with a standard back propagation algorithm; whether the model has converged is judged, and if so, training ends, otherwise Step3 is repeated until the model converges.
Step5: according to the confidence level, the pseudo labels are arranged in a descending order, a threshold value alpha is set, and the prior alpha percent of the pseudo labels in the target domain are selected as the true value input of the classification network of the target domain, wherein the value of alpha is 50 in the invention;
step6: splicing the adjacent matrix and the feature matrix in the target domain to obtain a new feature matrix as the feature input of the target domain classification network;
the adjacent matrix and the feature matrix in the splicing target domain in Step6 specifically are:
marking the target domain adjacency matrix asThe object domain of M dimension is characterizedAnd (3) withSplicing to obtain updated target domain characteristics
Step7: calculating the classification loss of the target domain according to the pseudo tag selected by Step5 and the new feature matrix obtained by Step 6;
the specific formula for calculating the target domain classification loss in Step7 is as follows:
wherein,is a pseudo tag for the i-th point in the target domain scene,is the predictive label of the ith point in the target domain scene, N t For the number of unlabeled multispectral points in the target domain scene,for the extracted target domain scene graph features in Step2,is a set of target domain pseudo tags.
Step8: and (3) taking the target classification loss in the step (S7) as training loss, updating the target domain classification network parameters by using a standard back propagation algorithm, judging whether the model is converged, if so, ending, otherwise, repeating the step (S7) until the model is converged.
On the basis of the above implementation, the practical feasibility of the invention is demonstrated by the following experiments:
1. Experimental data
Harbor of Tobermory dataset: the scene is a small harbor in Tobermory, United Kingdom. Three-band point cloud data were collected with an Optech Titan laser radar at wavelengths of 1550 nm, 1064 nm and 532 nm; the dataset is visualized in FIG. 2, where (a) is the source scene visualization and (b) is the target scene visualization. According to the height, material and semantic information of the land cover, the study area is divided into 7 categories: bare land, grassland, roads, buildings, trees, power lines and cars.
University of Houston dataset: the scene is a part of the University of Houston campus. Three-band point cloud data were collected with an Optech Titan laser radar at wavelengths of 1550 nm, 1064 nm and 532 nm. According to the height, material and semantic information of the land cover, the study area is divided into 7 categories: bare land, cars, grassland, roads, power lines, buildings and trees. The F score was used as an evaluation index. The visualization of the two data sets is shown in FIG. 2.
2. Experimental details
In the experiments, the data sets are classified and verified with the proposed method and with the conventional GCN method. The Harbor of Tobermory data set is used as the source domain scene and the University of Houston data set as the target domain scene; to save computing resources, a superpoint segmentation method is used to segment each of the two scenes into 8000 superpoints as input. Point cloud classification is performed with the proposed method, and the classification result is evaluated with the intersection-over-union index IoU = TP / (TP + FP + FN); the mean intersection over union (MIoU) of the method over the different ground object classes is shown in Table 1, where TP is the number of positive-class points classified as positive, FP is the number of negative-class points classified as positive, and FN is the number of positive-class points classified as negative.
TABLE 1
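The per-class intersection-over-union IoU = TP / (TP + FP + FN) and its mean over classes (MIoU) used for this evaluation can be computed as in the sketch below; the assumption of integer class labels in flat arrays is made only for illustration.

```python
import numpy as np


def mean_iou(pred, gt, num_classes):
    """MIoU over classes, with IoU_c = TP_c / (TP_c + FP_c + FN_c)."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))   # positive points classified as positive
        fp = np.sum((pred == c) & (gt != c))   # negative points classified as positive
        fn = np.sum((pred != c) & (gt == c))   # positive points classified as negative
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return float(np.mean(ious)) if ious else 0.0
```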
The method effectively addresses the difficulty of cross-scene multispectral laser radar point cloud classification caused by spectral drift, and achieves high-precision cross-scene multispectral point cloud classification without any real labels for the target domain scene.
While the present invention has been described in detail with reference to the drawings, the present invention is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.

Claims (8)

1. A cross-scene multispectral point cloud classification method based on pseudo tag learning is characterized in that: the method comprises the following steps:
step1: respectively enabling the multispectral laser radar point cloud characteristics of the tagged source domain scene and the untagged target domain scene to be according to L 2 Performing feature pre-alignment on the norms and the Laplace matrix;
step2: respectively extracting graph features of two scenes by adopting a graph convolution neural network (GCN) according to the pre-aligned features;
step3: calculating source domain classification loss, maximum mean difference MMD loss and target domain shannon entropy loss according to the extracted graph characteristics and source domain labels of the two scenes;
step4: iteratively performing Step3, updating the source domain-target domain alignment network parameters, judging whether the model is converged, if yes, ending, then performing Step5, otherwise repeating Step3 to obtain a pseudo tag of the target domain and the confidence coefficient thereof;
step5: according to the confidence level, the pseudo labels are arranged in a descending order, a threshold value alpha is set, and the target domain pseudo labels with the alpha percent before are selected to be used as the true value input of the target domain classification network;
step6: splicing the adjacent matrix and the feature matrix in the target domain to obtain a new feature matrix as the feature input of the target domain classification network;
step7: calculating the classification loss of the target domain according to the pseudo tag selected by Step5 and the new feature matrix obtained by Step 6;
step8: and (3) iteratively performing Step7, updating the target domain classification network parameters, judging whether the model is converged, if yes, ending, otherwise repeating Step7, and finally obtaining a target domain multispectral point cloud data classification result.
2. The cross-scene multispectral point cloud classification method based on pseudo tag learning of claim 1, wherein in Step1, the labeled source domain scene multispectral laser radar point cloud data is denoted as $(P^{s}, Y)$ and the unlabeled target domain scene data as $P^{t}$, where $P^{s}=\{p_{i}^{s}\}_{i=1}^{N_{s}}$ indicates that the source domain scene contains $N_{s}$ labeled multispectral points, $p_{i}^{s}$ is the $i$-th labeled multispectral point in the source domain scene, $P^{t}=\{p_{i}^{t}\}_{i=1}^{N_{t}}$ indicates that the target domain scene contains $N_{t}$ unlabeled multispectral points, $p_{i}^{t}$ is the $i$-th unlabeled multispectral point in the target domain scene, $Y=\{y_{i}\}_{i=1}^{N_{s}}$ denotes the ground-truth labels of all source domain scene multispectral points, and $y_{i}$ is the ground-truth label of the $i$-th multispectral point in the source domain scene.
3. The cross-scene multispectral point cloud classification method based on pseudo tag learning of claim 1, wherein in Step1, the feature pre-alignment according to the L2 norm and the Laplacian matrix comprises the following steps:
(1) Transforming the source domain and target domain features with the L2 norm; the feature transformation formula is $\tilde{x} = x / \lVert x \rVert_{2}$, where $x$ denotes the source domain or target domain features, $\tilde{x}$ denotes the transformed source domain or target domain features, and $\lVert \cdot \rVert_{2}$ is the 2-norm;
(2) Obtaining the M-dimensional source domain features $\tilde{X}^{s}$ and the M-dimensional target domain features $\tilde{X}^{t}$ according to the formula in step (1), and concatenating $\tilde{X}^{s}$ and $\tilde{X}^{t}$ into an M-dimensional overall feature matrix $X$; computing the adjacency matrix $W$ of the overall feature matrix $X$ with the K-nearest-neighbor algorithm, and further computing the diagonal matrix $D$ whose elements $d_{ii}$ are the row sums of the elements $w_{ij}$ of the adjacency matrix $W$, the Laplacian matrix then being $L=D-W$; updating the final overall feature matrix $X$ with the Laplacian matrix $L$ to obtain $\tilde{X}$, where $\tilde{X}$ is the updated feature matrix, $T$ denotes the matrix transpose operation, $N_{s}$ is the number of labeled multispectral points in the source domain scene, and $N_{t}$ is the number of unlabeled multispectral points in the target domain scene.
4. The cross-scene multispectral point cloud classification method based on pseudo tag learning of claim 1, wherein Step3 is specifically:
denoting the source domain scene and target domain scene graph features extracted in Step2 as $F^{s}$ and $F^{t}$, respectively, and computing the source domain classification loss $L_{cls}^{s}$ over the labeled source domain points from the predicted and ground-truth labels, where $y_{i}$ is the label of the $i$-th point in the source domain scene, $\hat{y}_{i}$ is the predicted label of the $i$-th point in the source domain scene, $Y$ is the source domain scene label set, and $N_{s}$ is the number of labeled multispectral points in the source domain scene;
computing, to measure the difference between the extracted features, the feature deviation between the two scenes with the maximum mean discrepancy (MMD) loss so as to encourage the GCN to extract domain-invariant features:
$L_{MMD} = \left\lVert \frac{1}{N_{s}}\sum_{i=1}^{N_{s}}\phi\left(f_{i}^{s}\right) - \frac{1}{N_{t}}\sum_{j=1}^{N_{t}}\phi\left(f_{j}^{t}\right)\right\rVert^{2}$
where $\phi(\cdot)$ is a mapping function that maps the original variables into a high-dimensional space, $f_{i}^{s}$ is the graph feature of the $i$-th source domain multispectral point, $f_{j}^{t}$ is the graph feature of the $j$-th target domain multispectral point, and $N_{t}$ is the number of unlabeled multispectral points in the target domain scene;
adopting a Shannon entropy loss $L_{ent}$ to constrain the network so that target domain scene pseudo tags with higher confidence are obtained, the Shannon entropy loss being computed from the Shannon entropy matrix $H$ whose elements $h_{ij}$ are calculated as $h_{ij} = -p_{ij}\log p_{ij}$, where $P$ is the prediction probability matrix of the network for the target domain multispectral laser radar point cloud, $p_{ij}$ is a predicted probability, $l$ is the number of feature channels of the multispectral point cloud, and $\tilde{x}^{t}$ denotes the pre-aligned features of the target domain nodes.
5. The cross-scene multispectral point cloud classification method based on pseudo tag learning of claim 4, wherein the method comprises the following steps: the Step4 updates the source domain-target domain alignment network parameters specifically as follows:
(1) All parameters are optimized using a standard back propagation algorithm;
(2) In training, the overall loss is a combination of the source domain classification loss, the maximum mean discrepancy (MMD) loss, and the target domain Shannon entropy loss:
$L_{total} = L_{cls}^{s} + \lambda_{1} L_{MMD} + \lambda_{2} L_{ent}$
where $\lambda_{1}$ and $\lambda_{2}$ are balance coefficients that weight the losses.
6. The cross-scene multispectral point cloud classification method based on pseudo tag learning as claimed in claim 3, wherein the concatenation of the adjacency matrix and the feature matrix of the target domain in Step6 is specifically:
denoting the target domain adjacency matrix as $W^{t}$, and concatenating the M-dimensional target domain features $\tilde{X}^{t}$ with $W^{t}$ to obtain the updated target domain features $\hat{X}^{t}$.
7. The cross-scene multispectral point cloud classification method based on pseudo tag learning of claim 1, wherein the target domain classification loss in Step7 is computed over the selected pseudo-labeled target domain points, where $\bar{y}_{i}$ is the pseudo tag of the $i$-th point in the target domain scene, $\hat{y}_{i}^{t}$ is the predicted label of the $i$-th point in the target domain scene, $N_{t}$ is the number of unlabeled multispectral points in the target domain scene, $F^{t}$ is the target domain scene graph feature extracted in Step2, and $\bar{Y}^{t}$ is the set of target domain pseudo tags.
8. The cross-scene multispectral point cloud classification method based on pseudo tag learning of claim 1, wherein the method comprises the following steps of: the Step8 updates the target domain classification network parameters specifically as follows:
(1) All parameters are optimized using a standard back propagation algorithm;
(2) In training, the target domain classification loss in Step7 is used as a training loss.
CN202410061674.6A 2024-01-16 2024-01-16 Cross-scene multispectral point cloud classification method based on pseudo tag learning Active CN117572457B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410061674.6A CN117572457B (en) 2024-01-16 2024-01-16 Cross-scene multispectral point cloud classification method based on pseudo tag learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410061674.6A CN117572457B (en) 2024-01-16 2024-01-16 Cross-scene multispectral point cloud classification method based on pseudo tag learning

Publications (2)

Publication Number Publication Date
CN117572457A CN117572457A (en) 2024-02-20
CN117572457B true CN117572457B (en) 2024-04-05

Family

ID=89892215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410061674.6A Active CN117572457B (en) 2024-01-16 2024-01-16 Cross-scene multispectral point cloud classification method based on pseudo tag learning

Country Status (1)

Country Link
CN (1) CN117572457B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117830752B (en) * 2024-03-06 2024-05-07 昆明理工大学 Self-adaptive space-spectrum mask graph convolution method for multi-spectrum point cloud classification
CN117953384B (en) * 2024-03-27 2024-06-07 昆明理工大学 Cross-scene multispectral laser radar point cloud building extraction and vectorization method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115841574A (en) * 2022-12-19 2023-03-24 中国科学技术大学 Domain-adaptive laser radar point cloud semantic segmentation method, device and storage medium
CN116403058A (en) * 2023-06-09 2023-07-07 昆明理工大学 Remote sensing cross-scene multispectral laser radar point cloud classification method
CN117015813A (en) * 2021-03-16 2023-11-07 华为技术有限公司 Apparatus, system, method, and medium for adaptively enhancing point cloud data sets for training
CN117315612A (en) * 2023-11-13 2023-12-29 重庆邮电大学 3D point cloud target detection method based on dynamic self-adaptive data enhancement

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117015813A (en) * 2021-03-16 2023-11-07 华为技术有限公司 Apparatus, system, method, and medium for adaptively enhancing point cloud data sets for training
CN115841574A (en) * 2022-12-19 2023-03-24 中国科学技术大学 Domain-adaptive laser radar point cloud semantic segmentation method, device and storage medium
CN116403058A (en) * 2023-06-09 2023-07-07 昆明理工大学 Remote sensing cross-scene multispectral laser radar point cloud classification method
CN117315612A (en) * 2023-11-13 2023-12-29 重庆邮电大学 3D point cloud target detection method based on dynamic self-adaptive data enhancement

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Semi-supervised 3D object detection based on a confidence-region pseudo-label strategy; Yang Dedong; Application Research of Computers; 2023-06-30; Vol. 40, No. 6; 1888-1893 *
Research on joint classification methods for multi-/hyperspectral images and LiDAR data; Wang Qingwang; China Doctoral Dissertations Full-text Database, Engineering Science and Technology II; 2021-01-15; Vol. 2021, No. 1; C028-25 *

Also Published As

Publication number Publication date
CN117572457A (en) 2024-02-20

Similar Documents

Publication Publication Date Title
CN117572457B (en) Cross-scene multispectral point cloud classification method based on pseudo tag learning
Huang et al. Building extraction from multi-source remote sensing images via deep deconvolution neural networks
CN116403058B (en) Remote sensing cross-scene multispectral laser radar point cloud classification method
CN109063754B (en) Remote sensing image multi-feature joint classification method based on OpenStreetMap
CN108764138B (en) Plateau area cloud and snow classification method based on multidimensional and multi-granularity cascade forest
CN111079847B (en) Remote sensing image automatic labeling method based on deep learning
Xu et al. A supervoxel approach to the segmentation of individual trees from LiDAR point clouds
Sun et al. Roads and Intersections Extraction from High‐Resolution Remote Sensing Imagery Based on Tensor Voting under Big Data Environment
CN111104850A (en) Remote sensing image building automatic extraction method and system based on residual error network
CN111461067B (en) Zero sample remote sensing image scene identification method based on priori knowledge mapping and correction
Cai et al. A comparative study of deep learning approaches to rooftop detection in aerial images
CN113837134A (en) Wetland vegetation identification method based on object-oriented deep learning model and transfer learning
CN113435254A (en) Sentinel second image-based farmland deep learning extraction method
Liu et al. Density saliency for clustered building detection and population capacity estimation
Souffer et al. Automatic extraction of photovoltaic panels from UAV imagery with object-based image analysis and machine learning
CN115393631A (en) Hyperspectral image classification method based on Bayesian layer graph convolution neural network
Matsuoka et al. Automatic detection of stationary fronts around Japan using a deep convolutional neural network
Liu et al. Few-Shot Object Detection in Remote Sensing Images via Label-Consistent Classifier and Gradual Regression
CN112381730B (en) Remote sensing image data amplification method
Andreev et al. Cloud detection from the Himawari-8 satellite data using a convolutional neural network
CN116129280A (en) Method for detecting snow in remote sensing image
CN115841557A (en) Intelligent crane operation environment construction method based on digital twinning technology
CN113256581B (en) Automatic defect sample labeling method and system based on visual attention modeling fusion
CN115346055A (en) Multi-kernel width map based neural network feature extraction and classification method
CN114842330A (en) Multi-scale background perception pooling weak supervised building extraction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant