CN113989535A - Point cloud classification method combining region growing and random forest - Google Patents

Point cloud classification method combining region growing and random forest

Info

Publication number
CN113989535A
Authority
CN
China
Prior art keywords
point cloud
patches
point
random forest
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111239501.1A
Other languages
Chinese (zh)
Inventor
王竞雪
宿颖
刘肃艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liaoning Technical University
Original Assignee
Liaoning Technical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liaoning Technical University filed Critical Liaoning Technical University
Priority to CN202111239501.1A priority Critical patent/CN113989535A/en
Publication of CN113989535A publication Critical patent/CN113989535A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a point cloud classification method combining region growing and random forests. A region growing algorithm segments the airborne LiDAR point cloud data of a training set into a plurality of segmented patches; features of the segmented patches are extracted; an optimal feature combination is determined from the feature importance and the out-of-bag errors of different feature combinations, realizing feature selection for the random forest classifier; a random forest classifier is trained with the optimal feature combination and used to classify the test data set; and topology optimization is performed on the patches in the classification result to obtain the final point cloud classification result. Compared with the prior art, the invention realizes an object-oriented random forest classification algorithm that takes segmented patches as primitives, which makes ground-object features easy to express and extract, and classifies them with a random forest classifier, thereby classifying the point cloud accurately and providing basic data support for subsequent point cloud statistics.

Description

Point cloud classification method combining region growing and random forest
Technical Field
The invention relates to the technical field of remote sensing data processing, in particular to a point cloud classification method combining region growing and random forests.
Background
Airborne laser radar (LiDAR) directly acquires high-precision, high-density three-dimensional point coordinates (LiDAR point clouds for short) and is widely used in three-dimensional reconstruction, city planning and management, vehicle navigation, disaster response and assessment, and similar fields, in all of which point cloud classification plays a key role. Point cloud classification is mainly the process of distinguishing ground points, building points and vegetation points. Existing LiDAR point cloud classification methods fall into unsupervised and supervised methods.
Unsupervised point cloud classification methods construct relationships among three-dimensional points by computing various point cloud features; in effect they cluster the point cloud by feature similarity. One work (A point cloud data classification method based on height difference [J]. mapping and reporting, 2018(06):46-49) proposes a classification method based on the second derivative of the height difference and effectively separates buildings from vegetation. ZHANG W M et al. (ZHANG W M, QI J B, XIE D H. An easy-to-use airborne LiDAR data filtering method based on cloth simulation [J]. Remote Sensing, 2016, 8(6):501.) propose a cloth simulation filtering algorithm that separates ground points from non-ground points using a cloth simulated over the inverted point cloud. Unsupervised algorithms generally extract only certain specific categories of ground objects, so their application scenarios are limited and they lack generality.
Supervised point cloud classification methods mainly include artificial neural networks, support vector machines, random forests and decision trees. These methods first train a classifier on manually labeled training samples and then use the classifier to classify the test samples. One proposed method classifies the point cloud with an information vector machine instead of a support vector machine, addressing the weak model sparsity of support vector machines in point cloud classification. Schlemia beans et al. (Schlemia beans, ChengYing bud, Sao Xiao Song, Qin Xian Xiong, Wen Peu. A point cloud classification algorithm integrating cloth filtering and improved random forest [J]. progress of laser and optoelectronics, 2020,57(22):192 & 200.) improve the traditional random forest algorithm and propose a point cloud classification algorithm that integrates cloth filtering with a weighted, weakly correlated random forest model. Hodelog (Hodelog. Point cloud single point classification method based on a curvature-considered adaptive neighborhood [J]. modern manufacturing technology and equipment, 2021,57(07):119-122.) proposes a point cloud classification method with a curvature-aware adaptive neighborhood, which generates a suitable three-dimensional neighborhood and enhances the separability of point cloud features. These methods compute features with a single laser footprint as the smallest classification unit and train the classifier on all features. They do not consider how feature computation and selection affect classifier performance, and the excessively high feature dimensionality reduces the efficiency of the algorithm.
Disclosure of Invention
To address the above technical problems, the invention provides a point cloud classification method combining region growing and random forests, comprising the following steps:
step 1: segmenting the LiDAR point cloud of a training set using a region growing algorithm;
step 2: extracting features of the patch fitted to each segmented unit;
step 3: determining an optimal feature combination based on feature importance and the out-of-bag errors of different feature combinations, thereby realizing feature selection for the random forest classifier;
step 4: training a random forest classifier with the optimal feature combination, and classifying a test set with the trained classifier;
step 5: performing topology optimization on the classification result to obtain the final point cloud classification result.
The step 1 comprises the following steps:
step 1.1: estimating the normal vector and curvature of the training-set LiDAR point cloud point by point, using the random sample consensus method and principal component analysis;
step 1.2: selecting the point with the minimum curvature as the initial seed point;
step 1.3: searching the k adjacent points of the seed point with a KD tree, and growing the region using the similarity of two features, the vertical distance and the normal vector angle, as the growth conditions;
step 1.4: ending the region growth when no new adjacent point is added, then separating the resulting cluster of points from the original point cloud and storing it as an independent unit;
step 1.5: repeating step 1.2 to step 1.4 until the whole point cloud is segmented, obtaining a plurality of segmentation units;
step 1.6: fitting a patch to each unit, computing the elevation and normal vector features of the segmented patches, and merging adjacent patches to obtain the final point cloud segmentation result.
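The point-level growing in steps 1.2 to 1.5 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `region_grow` and its parameters are hypothetical, the "vertical distance" is assumed to mean the perpendicular distance from a neighbour to the current point's tangent plane, and a brute-force nearest-neighbour search stands in for the KD tree to keep the example short.

```python
import numpy as np

def region_grow(points, normals, curvatures, k=20,
                dist_thresh=0.5, angle_thresh_deg=10.0):
    """Label each point with a segment id using curvature-seeded region growing."""
    n = len(points)
    labels = np.full(n, -1, dtype=int)
    cos_thresh = np.cos(np.radians(angle_thresh_deg))
    segment = 0
    # Visit candidate seeds from the lowest curvature to the highest.
    for seed in np.argsort(curvatures):
        if labels[seed] != -1:
            continue                      # already absorbed by an earlier region
        labels[seed] = segment
        queue = [seed]
        while queue:
            cur = queue.pop()
            # Brute-force k nearest neighbours (stand-in for a KD-tree query).
            d2 = np.sum((points - points[cur]) ** 2, axis=1)
            for nb in np.argsort(d2)[1:k + 1]:
                if labels[nb] != -1:
                    continue
                # Assumed growth test 1: perpendicular distance to the tangent plane.
                perp = abs(np.dot(points[nb] - points[cur], normals[cur]))
                # Growth test 2: angle between (unoriented) normals.
                aligned = abs(np.dot(normals[nb], normals[cur])) >= cos_thresh
                if perp < dist_thresh and aligned:
                    labels[nb] = segment
                    queue.append(nb)      # accepted points become new seeds
        segment += 1                      # region finished, store it as one unit
    return labels
```

Iterating over the points in order of increasing curvature and skipping those already labelled is equivalent to repeatedly picking the minimum-curvature point among the points not yet segmented.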
The step 1.1 comprises the following steps:
step 1.1.1: selecting k adjacent points of the current point based on a KD tree principle;
step 1.1.2: randomly selecting 3 of the k adjacent points to establish an initial fitting plane and obtain its plane equation, and calculating the distances from the remaining adjacent points to the fitting plane;
step 1.1.3: estimating the distance threshold T_d from the standard deviation of the point-to-plane distances, marking the adjacent points whose distance to the fitting plane is less than T_d as interior points, and counting the interior points that conform to the plane model;
step 1.1.4: repeating the step 1.1.2-step 1.1.3 for N times to obtain N plane equations, and selecting a fitting plane containing the largest number of interior points as a best fitting plane model of the points;
step 1.1.5: performing principal component analysis on the interior points contained in the best fitting plane model, with the covariance matrix C expressed as:
$$C = \begin{bmatrix} \operatorname{cov}(X,X) & \operatorname{cov}(X,Y) & \operatorname{cov}(X,Z) \\ \operatorname{cov}(Y,X) & \operatorname{cov}(Y,Y) & \operatorname{cov}(Y,Z) \\ \operatorname{cov}(Z,X) & \operatorname{cov}(Z,Y) & \operatorname{cov}(Z,Z) \end{bmatrix}$$
where X, Y and Z are the one-dimensional vectors of the x, y and z coordinates of all interior points obtained by the random sample consensus method from the k neighborhood points of the current point, and cov(·,·) denotes the covariance of two components;
step 1.1.6: and calculating the eigenvalue and the eigenvector according to the covariance matrix C, wherein the eigenvector corresponding to the minimum eigenvalue is the normal vector of the point, and the ratio of the minimum eigenvalue to the sum of all eigenvalues is defined as the curvature of the point.
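Steps 1.1.2 to 1.1.6 can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the authors' code: the function name `ransac_pca_normal` is hypothetical, the threshold T_d is assumed to equal the standard deviation of the point-to-plane distances for each candidate plane, and the inlier test uses "less than or equal" so that perfectly planar neighbourhoods (where the standard deviation is zero) still yield inliers.

```python
import numpy as np

def ransac_pca_normal(neighbors, n_iter=100, rng=None):
    """Estimate a unit normal and a curvature from a (k, 3) array of neighbours."""
    rng = np.random.default_rng(rng)
    best_inliers = None
    for _ in range(n_iter):
        # Fit a candidate plane through 3 randomly chosen neighbours.
        p0, p1, p2 = neighbors[rng.choice(len(neighbors), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:
            continue                      # collinear sample, no plane defined
        dist = np.abs((neighbors - p0) @ (normal / norm))
        t_d = dist.std()                  # assumption: T_d = std of the distances
        inliers = dist <= t_d
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers is None:
        raise ValueError("no non-degenerate plane sample was found")
    pts = neighbors[best_inliers]
    cov = np.cov(pts, rowvar=False)       # 3x3 covariance of the inlier coordinates
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    normal = eigvecs[:, 0]                # eigenvector of the smallest eigenvalue
    curvature = eigvals[0] / eigvals.sum()
    return normal, curvature
```

The sign of the returned normal is arbitrary, so any later angle comparison should use the absolute value of the dot product or orient the normals consistently first.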
The step 3 comprises the following steps:
step 3.1: training a random forest classifier by using the features extracted in the step 2;
step 3.2: testing the classification accuracy of the random forest classifier on the out-of-bag data, and at the same time obtaining the importance index of each feature variable;
step 3.3: arranging the characteristic variables from high to low according to the importance indexes, and deleting the characteristic with the minimum importance index to form a group of new characteristic combinations;
step 3.4: repeating the step 3.1 to the step 3.3 until the number of the residual characteristic variables is equal to a given threshold value, and ending the iteration;
step 3.5: and selecting the feature combination with the minimum random forest out-of-bag error as the optimal feature combination.
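The elimination loop of steps 3.1 to 3.5 can be sketched as follows. The names `select_features` and `fit_forest` are hypothetical: `fit_forest` stands for any routine that trains a random forest on the given features and returns its out-of-bag error together with a per-feature importance index. The sketch also assumes that the combination left when the threshold is reached is itself evaluated before the loop ends.

```python
def select_features(features, fit_forest, min_features):
    """Backward elimination driven by the out-of-bag error.

    fit_forest(names) is a hypothetical callback returning
    (oob_error, {name: importance}) for a forest trained on `names`.
    """
    current = list(features)
    best_error, best_combo = float("inf"), list(current)
    while True:
        oob_error, importance = fit_forest(current)
        if oob_error < best_error:
            # Remember the combination with the smallest out-of-bag error so far.
            best_error, best_combo = oob_error, list(current)
        if len(current) <= min_features:
            break
        # Delete the feature with the smallest importance index and retrain.
        current.remove(min(current, key=lambda name: importance[name]))
    return best_combo, best_error
```

Keeping the training routine behind a callback means the same loop works with any random forest implementation that exposes an out-of-bag error and feature importances.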
The step 4 comprises the following steps:
step 4.1: training a random forest classifier by using the optimal feature combination;
step 4.2: segmenting the test set data by using a region growing algorithm, and extracting the characteristics of a surface patch obtained by fitting each segmented unit;
step 4.3: inputting the result of step 4.2 into the trained random forest classifier; for each segmented object, every decision tree in the random forest gives a classification result, the results of all decision trees are counted, and the class with the most votes is taken as the final classification result.
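The voting rule of step 4.3 amounts to a majority vote over the trees. A minimal sketch, in which each tree is represented by a hypothetical callable that maps a patch's feature vector to a class label:

```python
from collections import Counter

def forest_vote(trees, patch_features):
    """Return the class predicted by the largest number of trees for one patch."""
    votes = Counter(tree(patch_features) for tree in trees)
    return votes.most_common(1)[0][0]
```

With `Counter.most_common`, a tie is resolved in favour of the class that was voted for first; the source does not say how ties should be handled.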
The step 5 comprises the following steps:
step 5.1: for each segmented patch containing fewer points than a given threshold T_n, searching for its adjacent patches and judging whether the patch and its adjacent patches have the same attribute; if the attribute of the patch differs from that of its adjacent patches, defining the patch as an island patch;
step 5.2: calculating the three-dimensional distance between the island patch and its adjacent patches; if the distance is less than a given distance threshold T_D, merging the island patch into the adjacent patch with the larger area and re-classifying it; otherwise keeping the original classification result of the patch.
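The island-patch rule of steps 5.1 and 5.2 can be sketched as follows. The data layout and the name `merge_islands` are hypothetical, and two points the source leaves open are filled with labelled assumptions: a patch counts as an island only if its class differs from every adjacent patch, and the three-dimensional distance between patches is measured between their centroids.

```python
import math

def merge_islands(patches, neighbors, t_n=50, t_d=5.0):
    """Relabel small island patches from their largest nearby neighbour.

    patches:   {id: {"n_points": int, "area": float, "label": str,
                     "centroid": (x, y, z)}}
    neighbors: {id: [ids of adjacent patches]}
    """
    new_labels = {pid: p["label"] for pid, p in patches.items()}
    for pid, patch in patches.items():
        adjacent = neighbors.get(pid, [])
        if patch["n_points"] >= t_n or not adjacent:
            continue
        # Assumption: an island disagrees with every adjacent patch.
        if any(patches[a]["label"] == patch["label"] for a in adjacent):
            continue
        # Assumption: patch-to-patch distance is the centroid distance.
        close = [a for a in adjacent
                 if math.dist(patch["centroid"], patches[a]["centroid"]) < t_d]
        if close:
            largest = max(close, key=lambda a: patches[a]["area"])
            new_labels[pid] = patches[largest]["label"]
    return new_labels
```

The function reads only the original labels while it builds the new ones, so the outcome does not depend on the order in which patches are visited.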
The step 3.1 comprises the following steps:
step 3.1.1: given the number N_t of decision trees in the random forest, randomly selecting w feature variables from all features as the candidate split nodes of each decision tree, and generating each decision tree by repeatedly splitting nodes;
step 3.1.2: assuming that segmentation of the training set yields M segmented patches, taking them as M samples, each containing l-dimensional features;
step 3.1.3: drawing h samples from the M samples by random sampling as the training sample set for building a single decision tree, the samples not drawn being regarded as that tree's out-of-bag data;
step 3.1.4: repeating step 3.1.3 to select N_t training sample sets, each used to train one decision tree, the N_t resulting decision trees forming the random forest classifier.
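The sampling in steps 3.1.3 and 3.1.4 can be sketched as follows. The function name `bootstrap_sets` is hypothetical, and sampling with replacement is an assumption, based on the usual meaning of bootstrap sampling.

```python
import random

def bootstrap_sets(m, h, n_trees, seed=0):
    """Draw n_trees bootstrap samples of size h from sample indices 0..m-1."""
    rng = random.Random(seed)
    sets = []
    for _ in range(n_trees):
        # Assumption: samples are drawn with replacement.
        in_bag = [rng.randrange(m) for _ in range(h)]
        # Samples never drawn for this tree are its out-of-bag data.
        out_of_bag = sorted(set(range(m)) - set(in_bag))
        sets.append((in_bag, out_of_bag))
    return sets
```

Each tree gets its own out-of-bag set, which is what lets the forest estimate its error without a separate validation set.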
The step 3.2 comprises:
step 3.2.1: testing a single decision tree on its out-of-bag samples and calculating the out-of-bag error of the tree, denoted err_1; then randomly adding noise to the feature variable l_x in the out-of-bag data and calculating the out-of-bag error again, denoted err_2; the importance of the feature variable l_x in that decision tree is then V = |err_1 − err_2|;
step 3.2.2: assuming that the feature variable l_x appears in r decision trees, taking the average of its importance over those decision trees as the importance index of the feature variable l_x.
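The importance index of steps 3.2.1 and 3.2.2 can be sketched as follows. The function names and the tuple layout for a tree are hypothetical, and the "noise" is assumed to be a random shuffle of the feature's values across the out-of-bag rows, which is a common way to disturb a single feature; the source does not specify the noise model.

```python
import random

def oob_error(predict, rows, labels):
    """Fraction of out-of-bag rows that one tree misclassifies."""
    wrong = sum(predict(r) != y for r, y in zip(rows, labels))
    return wrong / len(rows)

def feature_importance(trees, feature, seed=0):
    """Average |err_1 - err_2| over the trees that use `feature`.

    trees: list of (predict, used_features, oob_rows, oob_labels), where each
    row is a dict mapping feature name -> value (hypothetical layout).
    """
    rng = random.Random(seed)
    scores = []
    for predict, used, rows, labels in trees:
        if feature not in used:
            continue                       # only the r trees containing l_x count
        err1 = oob_error(predict, rows, labels)
        # Assumed noise model: shuffle the feature's values across the OOB rows.
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        noisy = [{**r, feature: v} for r, v in zip(rows, shuffled)]
        err2 = oob_error(predict, noisy, labels)
        scores.append(abs(err1 - err2))
    return sum(scores) / len(scores) if scores else 0.0
```

A feature that a tree never actually consults scores zero, because disturbing it cannot change that tree's predictions.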
The invention has the following beneficial effects:
(1) the invention improves on the traditional point-based random forest point cloud classification method: the original point cloud data is segmented and the segmented patches are used as the minimum units for computing features, so that the features carry more accurate semantic information.
(2) after the patch features are computed, a random-forest-based feature importance measure is introduced and the optimal feature combination is selected to train the random forest classifier, which effectively improves the classification accuracy of the classifier.
Drawings
FIG. 1 is a flow chart of a point cloud classification method combining region growing and random forests in an embodiment of the present invention;
FIG. 2 is a flow chart of a point cloud segmentation process in accordance with an embodiment of the present invention;
FIG. 3 is a schematic diagram of a random forest classifier constructed according to an embodiment of the present invention;
FIG. 4 is a plot of an experimental region of a training data set in accordance with an embodiment of the present invention;
FIG. 5 is a plot of an experimental region of a test data set in accordance with an embodiment of the present invention;
FIG. 6 is a graph of experimental results of a test data set in accordance with an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below with reference to the accompanying drawings and examples. To address the problems in the prior art, the invention provides a point cloud classification method combining region growing and random forests. Of two acquired groups of LiDAR point clouds, one is used as the training set and the other as the test set: LiDAR point cloud data with an existing reference classification is used as the training set to train a random forest classifier, and the trained classifier then classifies the LiDAR point cloud to be classified (the test set). Features determine the effectiveness and accuracy of machine learning and directly affect classification accuracy, so the invention strengthens feature computation and selection in two ways. First, the original point cloud data is segmented into a plurality of patches, and the segmented patches are used as the minimum units for computing the features, so that the features carry more accurate semantic information. Second, the random forest is used for feature selection, and the classifier is retrained with the optimal feature combination before classifying the test point cloud, which improves the point cloud classification accuracy.
A point cloud classification method combining region growing and random forests is shown in FIG. 1 and comprises the following steps:
Step 1: segmenting the LiDAR point cloud of the training set using a region growing algorithm, the principle of which is shown in FIG. 2;
step 1.1: estimating the normal vector and curvature of the training-set LiDAR point cloud data point by point, using random sample consensus (RANSAC) and principal component analysis (PCA);
step 1.1.1: selecting the k adjacent points of the current point using a KD tree (a k-dimensional index tree data structure), with k = 20 in this example;
step 1.1.2: randomly selecting 3 of the k adjacent points to establish an initial fitting plane and obtain its plane equation, and calculating the distances from the remaining adjacent points to the fitting plane;
step 1.1.3: estimating the distance threshold T_d from the standard deviation of the point-to-plane distances, marking the adjacent points whose distance to the fitting plane is less than T_d as interior points, and counting the interior points that conform to the plane model;
step 1.1.4: repeating the step 1.1.2 to the step 1.1.3 for N times to obtain N plane equations, and selecting a fitting plane containing the largest number of interior points as a best fitting plane model of the points, wherein in the example, N is 100;
step 1.1.5: performing principal component analysis on the interior points contained in the best fitting plane model, with the covariance matrix C expressed as:
$$C = \begin{bmatrix} \operatorname{cov}(X,X) & \operatorname{cov}(X,Y) & \operatorname{cov}(X,Z) \\ \operatorname{cov}(Y,X) & \operatorname{cov}(Y,Y) & \operatorname{cov}(Y,Z) \\ \operatorname{cov}(Z,X) & \operatorname{cov}(Z,Y) & \operatorname{cov}(Z,Z) \end{bmatrix}$$
where X, Y and Z are the one-dimensional vectors of the x, y and z coordinates of all interior points obtained by the random sample consensus method from the k neighborhood points of the current point, and cov(·,·) denotes the covariance of two components;
step 1.1.6: calculating eigenvalues and eigenvectors according to the covariance matrix C, wherein the eigenvector corresponding to the minimum eigenvalue is the normal vector of the point, and the ratio of the minimum eigenvalue to the sum of all eigenvalues is defined as the curvature of the point;
step 1.2: selecting a point with the minimum curvature as an initial seed point;
step 1.3: searching the k adjacent points of the seed point with a KD tree, and growing the region using the similarity of two features, the vertical distance and the normal vector angle, as the growth conditions; in this example the vertical distance threshold and the normal vector angle threshold are 0.5 and 10 respectively;
step 1.4: until no new adjacent point appears, the region growth is finished, and the seed point clustering result point set is separated from the original point cloud and stored as an independent unit;
step 1.5: repeating the step 1.2-step 1.4 until all the point clouds are segmented to obtain a plurality of segmentation units;
step 1.6: fitting a patch to each unit, computing the elevation and normal vector features of the segmented patches, and merging adjacent patches to obtain the final point cloud segmentation result. This is implemented as a region growing process over patches: the patch with the smallest curvature in the patch set is selected in turn as the seed patch, and the elevation difference and the normal vector angle between the seed patch and a candidate patch serve as the similarity criteria. In this example the elevation difference threshold between two patches is 0.5, and the normal vector angle between two patches must not exceed 30.
Step 2: extracting features of the patch fitted to each segmented unit. The features include elevation-related features, geometry-related features, eigenvalue and eigenvector related features, echo and reflection intensity features, and other features. The following features are extracted for each segmented patch:
(1) the elevation-related features of a segmented patch are of three types: the normalized average elevation, the elevation difference and the elevation variance of the patch:
the average elevation H_a of a segmented patch is the mean elevation of all points contained in the patch:
$$H_a = \frac{1}{N'} \sum_{i=1}^{N'} Z_i$$
where N′ is the number of points contained in the segmented patch and Z_i is the normalized elevation of the i-th point in the patch;
the elevation difference of the segmentation surface patches refers to the difference between the maximum elevation value and the minimum elevation value of the point cloud in the segmentation surface patches;
the elevation variance H_v of a segmented patch is the variance of the elevation values of all points in the patch:
$$H_v = \frac{1}{N'} \sum_{i=1}^{N'} \left(Z_i - H_a\right)^2$$
(2) the geometry-related features of a segmented patch are of five types:
plane fitting index S_n: the mean distance from all points in the patch to the fitted plane;
surface roughness S_r: the mean curvature of all laser footprints in the patch;
area S_m: the area of the patch projected onto the two-dimensional XOY plane;
rectangularity S_j: the ratio of the area of the polygon obtained by projecting the patch onto the two-dimensional XOY plane to the area of that polygon's minimum bounding rectangle;
narrowness S_x: the ratio of the short side to the long side of the minimum bounding rectangle of the polygon obtained by projecting the patch onto the two-dimensional XOY plane;
(3) the eigenvalue and eigenvector related features comprise the point feature λ_p of the segmented patch (computed from λ_3 and λ_2), the linear feature λ_l = (λ_1 − λ_2)/λ_1 and the planar feature λ_s = (λ_2 − λ_3)/λ_1, where λ_1, λ_2 and λ_3 are the three eigenvalues obtained by solving the covariance matrix of the point cloud data, in descending order of magnitude (λ_1 largest, λ_3 smallest);
(4) the echo and reflection intensity features comprise the multi-echo ratio, the average intensity and the intensity variance of the segmented patch:
the echo feature of a segmented patch is computed as a multi-echo ratio: if N_m points in a segmented patch have multiple echoes and the patch contains N′ points, the multi-echo ratio is N_I = N_m/N′;
two features related to reflection intensity are used, the intensity mean I_a and the intensity variance I_v; with I_i denoting the intensity of the i-th laser footprint in the segmented patch, they are computed as:
$$I_a = \frac{1}{N'} \sum_{i=1}^{N'} I_i, \qquad I_v = \frac{1}{N'} \sum_{i=1}^{N'} \left(I_i - I_a\right)^2$$
(5) the other feature is the number N′ of points contained in the segmented patch.
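The elevation, echo, intensity and point-count features of a patch can be computed with the Python standard library, as in the sketch below. The function name `patch_features`, the dictionary keys and the field `n_returns` (the number of echoes of the pulse that produced a point) are hypothetical, and the variances are assumed to be population variances, dividing by N′; the geometric and eigenvalue features are left out to keep the example short.

```python
from statistics import fmean, pvariance

def patch_features(points):
    """Compute simple per-patch features.

    points: list of dicts with keys "z" (normalized elevation),
    "intensity" and "n_returns" (hypothetical field names).
    """
    z = [p["z"] for p in points]
    intensity = [p["intensity"] for p in points]
    n = len(points)
    return {
        "H_a": fmean(z),                    # average elevation
        "H_d": max(z) - min(z),             # elevation difference
        "H_v": pvariance(z),                # elevation variance (divides by N')
        "N_I": sum(p["n_returns"] > 1 for p in points) / n,  # multi-echo ratio
        "I_a": fmean(intensity),            # intensity mean
        "I_v": pvariance(intensity),        # intensity variance (divides by N')
        "N": n,                             # number of points in the patch
    }
```

Vegetation tends to produce pulses with several returns, which is why a ratio such as N_I can help separate it from roofs and bare ground.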
Step 3: determining an optimal feature combination based on feature importance and the out-of-bag errors of different feature combinations, thereby realizing feature selection for the random forest classifier. Random-forest-based feature selection is performed on the segmented patch features computed in step 2; the construction principle of the random forest classifier is shown in FIG. 3, and the specific method is as follows:
step 3.1: training a random forest classifier by using the features extracted in the step 2;
step 3.1.1: given the number N_t of decision trees in the random forest, randomly selecting w feature variables from all feature variables as the candidate split nodes of each decision tree, and generating each decision tree by repeatedly splitting nodes; in this example the random forest contains 200 decision trees, i.e. N_t = 200;
step 3.1.2: assuming that segmentation of the training set yields M segmented patches, taking them as M samples, each containing l-dimensional features; a total of 15 features are computed in this example, so l = 15;
step 3.1.3: drawing h samples from the M samples by random sampling (Bootstrap) as the training sample set for building a single decision tree, the samples not drawn being regarded as that tree's out-of-bag data;
step 3.1.4: repeating step 3.1.3 to select N_t training sample sets, each used to train one decision tree, the N_t resulting decision trees forming the random forest classifier;
step 3.2: testing the classification accuracy of the random forest classifier on the out-of-bag data, and at the same time obtaining the importance index of each feature variable;
step 3.2.1: testing a single decision tree on its out-of-bag samples and calculating the out-of-bag error of the tree, denoted err_1; then randomly adding noise to the feature variable l_x in the out-of-bag data and calculating the out-of-bag error again, denoted err_2; the importance of the feature variable l_x in that decision tree is then V = |err_1 − err_2|;
step 3.2.2: assuming that the feature variable l_x appears in r decision trees, taking the average of its importance over those decision trees as the importance index of the feature variable l_x;
step 3.3: arranging the characteristic variables from high to low according to the importance indexes, and deleting the characteristic with the minimum importance index to form a group of new characteristic combinations;
step 3.4: repeating the step 3.1 to the step 3.3 until the number of the residual characteristic variables is equal to a given threshold value, and ending the iteration;
step 3.5: selecting the feature combination with the minimum random forest out-of-bag error as the optimal feature combination; in this example the out-of-bag error is smallest when the feature combination consists of the normalized average elevation, the elevation difference, the average curvature, the elevation variance, the intensity variance, the linear feature and the planar feature of the segmented patch, the plane fitting index, and the number of points contained in the segmented patch, so this combination is used as the optimal feature combination of the invention.
Step 4: training a random forest classifier with the optimal feature combination and classifying the test data set with the trained classifier, specifically as follows:
step 4.1: training a random forest classifier by using the optimal feature combination (the training method is the same as the step 3.1.1 to the step 3.1.4);
step 4.2: dividing the test set data by using a region growing algorithm, and extracting features of a patch obtained by fitting each divided unit, wherein the extracted features comprise elevation related features, geometric related features, eigenvalue and eigenvector related features, echo and reflection intensity features and other features (the dividing method is the same as the step 1.1-the step 1.6);
step 4.3: and (4) inputting the result obtained in the step (4.2) into a trained random forest classifier, giving a test result for each decision tree in the random forest for each segmentation object, counting the test results of all the decision trees, and taking the test class with the highest ticket number as a final classification result.
Step 5: Perform topology optimization on the classification result to obtain the final point cloud classification result, as follows:
Step 5.1: For each segmented patch containing fewer points than a given threshold T_n, search its adjacent patches and judge whether the patch and its adjacent patches have the same class attribute; if the attributes differ, the patch is defined as an "island patch". In this example, segmented patches containing fewer than 50 points are taken as the patches to be processed and a neighborhood search is performed;
Step 5.2: Calculate the three-dimensional distance between the "island patch" and its adjacent patches. If the distance is less than a given distance threshold T_D, the island patch is merged into the adjacent segmented patch with the larger area and reclassified; otherwise, the original classification result of the segmented patch is retained. In this example, T_D is set to 5.
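A minimal sketch of the island-patch post-processing in step 5 follows. Several details are assumptions not fixed by the text: the patch record layout, measuring the three-dimensional distance between patch centroids, treating a patch as an island only when none of its neighbours shares its class, and reclassifying a merged island with the class of the patch it merges into.

```python
import math


def relabel_islands(patches, adjacency, t_n=50, t_d=5.0):
    """Return {patch_id: label} after merging small island patches."""
    new_labels = {pid: p["label"] for pid, p in patches.items()}
    for pid, p in patches.items():
        if p["n_points"] >= t_n:                       # step 5.1: small patches only
            continue
        neighbours = adjacency.get(pid, [])
        if not neighbours or any(
                patches[n]["label"] == p["label"] for n in neighbours):
            continue                                   # not an island patch
        close = [n for n in neighbours                 # step 5.2: distance test
                 if math.dist(p["centroid"], patches[n]["centroid"]) < t_d]
        if close:
            target = max(close, key=lambda n: patches[n]["area"])
            new_labels[pid] = patches[target]["label"]  # assumed reclassification
    return new_labels


patches = {
    1: {"n_points": 20, "label": "tree", "area": 2.0, "centroid": (0.0, 0.0, 0.0)},
    2: {"n_points": 500, "label": "roof", "area": 100.0, "centroid": (3.0, 0.0, 0.0)},
    3: {"n_points": 300, "label": "ground", "area": 40.0, "centroid": (0.0, 4.0, 0.0)},
    4: {"n_points": 10, "label": "car", "area": 1.0, "centroid": (100.0, 0.0, 0.0)},
}
adjacency = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}
result = relabel_islands(patches, adjacency)
```

Patch 1 is absorbed by the larger roof patch, while patch 4 keeps its label because its only neighbour lies beyond the distance threshold.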
The experimental data set used by the invention is the benchmark test point cloud data of the Vaihingen area provided by ISPRS Commission III. The data cover the central area of the city and were collected in midsummer, so the vegetation is lush. With 30% forward overlap and 60% side overlap, the average point density over the whole area is 6.7 pts/m². The training set contains 753876 points and the test set contains 411722 points. The original point cloud data contain three-dimensional coordinates, intensity information and echo information. The training set and test set point clouds are shown in fig. 4 and fig. 5 respectively, colored by elevation.
The experiment was implemented in MATLAB 7.11.0 on a machine with a dual-core 3.30 GHz CPU, 4 GB of memory and Windows 7 Ultimate. A confusion matrix was built against the reference data to evaluate the point cloud classification accuracy of the invention. To verify the effectiveness of the algorithm, its results were compared with those of a classification method using the original elevation features and with a support vector machine (SVM) classification method. The comparison shows that the highest overall classification accuracy of the algorithm is 87%, with a Kappa coefficient of 0.7965. The classification accuracy of the original elevation feature method is only 52.88%, which confirms the importance of the normalized elevation features. The SVM-based point cloud classification accuracy is 85%, with a Kappa coefficient of 0.7683, which shows that the random forest classifier outperforms the support vector machine classifier in this example. Fig. 6 shows the final classification result of this example.
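For reference, the overall accuracy and the Kappa coefficient quoted above are both derived from the confusion matrix. The sketch below shows the standard computation; the function name and the 2x2 matrix are made up for illustration and are not the experimental results of the patent.

```python
def overall_accuracy_and_kappa(cm):
    """cm[i][j] = number of samples of reference class i predicted as class j."""
    n = sum(sum(row) for row in cm)
    k = len(cm)
    p_observed = sum(cm[i][i] for i in range(k)) / n
    # Chance agreement: product of row and column marginals, summed over classes.
    p_expected = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(k)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa


# Hypothetical two-class confusion matrix (not data from the patent).
accuracy, kappa = overall_accuracy_and_kappa([[40, 10], [5, 45]])
```

With this toy matrix the observed agreement is 0.85 and the chance agreement is 0.5, giving a Kappa of about 0.7.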

Claims (8)

1. A point cloud classification method combining region growing and random forests is characterized by comprising the following steps:
step 1: utilizing a region growing algorithm to carry out segmentation processing on LiDAR point cloud of a training set;
step 2: extracting the characteristics of the patches obtained by fitting each divided unit;
step 3: determining an optimal feature combination based on feature importance and the out-of-bag errors of different feature combinations, thereby realizing feature selection for the random forest classifier;
step 4: training a random forest classifier with the optimal feature combination, and classifying the test set with the trained classifier;
step 5: performing topology optimization on the classification result to obtain the final point cloud classification result.
2. The method for point cloud classification in combination with region growing and random forests as claimed in claim 1, wherein said step 1 comprises:
step 1.1: performing normal vector and curvature estimation point by point on the training set LiDAR point cloud using a random sample consensus (RANSAC) method and principal component analysis;
step 1.2: selecting a point with the minimum curvature as an initial seed point;
step 1.3: searching k adjacent points of the seed points by adopting a KD tree, and performing region growth by taking two characteristic similarities of a vertical distance and a normal vector included angle as growth conditions;
step 1.4: until no new adjacent point appears, the region growth is finished, and the seed point clustering result point set is separated from the original point cloud and stored as an independent unit;
step 1.5: repeating the step 1.2 to the step 1.4 until all the point clouds are segmented to obtain a plurality of segmentation units;
step 1.6: and performing surface patch fitting on each unit, calculating the elevation and normal vector characteristics of the segmented surface patches, and performing optimization integration on adjacent surface patches to obtain the final point cloud segmentation result.
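The growth loop of steps 1.2 to 1.5 can be sketched as below. This is an illustration under assumptions: normals are unit vectors computed beforehand, a brute-force nearest-neighbour search replaces the KD tree for brevity, the "vertical distance" is taken as the distance from a candidate point to the tangent plane of the current seed, and all function and parameter names are hypothetical. Patch fitting and merging (step 1.6) are not covered.

```python
import math


def knn(points, i, k):
    """Indices of the k nearest neighbours of point i (brute force, not a KD tree)."""
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]


def region_grow(points, normals, curvature, k, max_angle_deg, max_dist):
    unassigned = set(range(len(points)))
    cos_min = math.cos(math.radians(max_angle_deg))
    units = []
    while unassigned:                                           # step 1.5
        seed = min(unassigned, key=lambda i: curvature[i])      # step 1.2
        unassigned.remove(seed)
        region, queue = [seed], [seed]
        while queue:                                            # step 1.3
            s = queue.pop()
            for j in knn(points, s, k):
                if j not in unassigned:
                    continue
                cos_a = abs(sum(a * b for a, b in zip(normals[s], normals[j])))
                dist = abs(sum(n * (q - p) for n, p, q in
                               zip(normals[s], points[s], points[j])))
                if cos_a >= cos_min and dist <= max_dist:
                    unassigned.remove(j)
                    region.append(j)
                    queue.append(j)
        units.append(sorted(region))                            # step 1.4
    return units


# Two small perpendicular planes: z = 0 and x = 5.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0),
          (5.0, 0.0, 0.0), (5.0, 1.0, 0.0), (5.0, 0.0, 1.0), (5.0, 1.0, 1.0)]
normals = [(0.0, 0.0, 1.0)] * 4 + [(1.0, 0.0, 0.0)] * 4
units = region_grow(points, normals, [0.0] * 8, k=3,
                    max_angle_deg=10.0, max_dist=0.2)
```

The two planes end up in separate segmentation units because their normals differ by 90 degrees.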
3. A method of point cloud classification in conjunction with region growing and random forests as claimed in claim 2 wherein said step 1.1 comprises:
step 1.1.1: selecting k adjacent points of the current point based on a KD tree principle;
step 1.1.2: randomly selecting 3 points from the k adjacent points to establish an initial fitting plane and obtain a plane fitting equation, and calculating the distances from the remaining adjacent points to the fitting plane;
step 1.1.3: estimating the distance threshold T_d from the standard deviation of the point-to-plane distances, marking the adjacent points whose distance to the fitting plane is less than T_d as interior points, and counting the number of interior points that conform to the plane model;
step 1.1.4: repeating the step 1.1.2-step 1.1.3 for N times to obtain N plane equations, and selecting a fitting plane containing the largest number of interior points as a best fitting plane model of the points;
step 1.1.5: performing principal component analysis on the interior points contained in the best fitting plane model, where the covariance matrix C is expressed as:
$$C = \begin{bmatrix} \operatorname{cov}(X,X) & \operatorname{cov}(X,Y) & \operatorname{cov}(X,Z) \\ \operatorname{cov}(Y,X) & \operatorname{cov}(Y,Y) & \operatorname{cov}(Y,Z) \\ \operatorname{cov}(Z,X) & \operatorname{cov}(Z,Y) & \operatorname{cov}(Z,Z) \end{bmatrix}$$
where X, Y and Z respectively denote the one-dimensional vectors of the x, y and z coordinates of all interior points obtained by the random sample consensus method from the k neighborhood points of the current point, and cov(·,·) denotes the covariance of two components;
step 1.1.6: and calculating the eigenvalue and the eigenvector according to the covariance matrix C, wherein the eigenvector corresponding to the minimum eigenvalue is the normal vector of the point, and the ratio of the minimum eigenvalue to the sum of all eigenvalues is defined as the curvature of the point.
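The per-point estimation of steps 1.1.2 to 1.1.6 can be sketched in plain Python. This is a sketch under stated assumptions, not the patent's implementation: the function names are hypothetical, the threshold T_d is passed in rather than estimated from the standard deviation of the distances, the sample covariance (division by m - 1) is used, and power iteration on trace(C)·I - C stands in for a full eigen-decomposition. The curvature uses the fact that the eigenvalues of C sum to its trace.

```python
import random


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ransac_inliers(neigh, t_d, n_iter, rng):
    """Steps 1.1.2-1.1.4: keep the largest inlier set over n_iter random planes."""
    best = []
    for _ in range(n_iter):
        p0, p1, p2 = rng.sample(neigh, 3)
        n = cross(sub(p1, p0), sub(p2, p0))
        norm = dot(n, n) ** 0.5
        if norm < 1e-12:                 # collinear sample defines no plane
            continue
        inliers = [p for p in neigh if abs(dot(sub(p, p0), n)) / norm < t_d]
        if len(inliers) > len(best):
            best = inliers
    return best


def covariance(pts):
    """Step 1.1.5: 3x3 sample covariance matrix of the x, y, z coordinates."""
    m = len(pts)
    mean = [sum(p[i] for p in pts) / m for i in range(3)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pts) / (m - 1)
             for j in range(3)] for i in range(3)]


def normal_and_curvature(C, iters=200):
    """Step 1.1.6: eigenvector of the smallest eigenvalue, and curvature.

    Assumes trace(C) > 0 and a start vector not orthogonal to the normal.
    """
    tr = C[0][0] + C[1][1] + C[2][2]
    S = [[(tr if i == j else 0.0) - C[i][j] for j in range(3)] for i in range(3)]
    v = (0.2, 0.5, 0.8)
    for _ in range(iters):               # dominant eigenvector of S
        w = tuple(dot(S[i], v) for i in range(3))
        norm = dot(w, w) ** 0.5
        v = tuple(x / norm for x in w)
    lam_min = dot(v, tuple(dot(C[i], v) for i in range(3)))
    return v, lam_min / tr


# 16 points on the plane z = 0 plus one outlier above it.
neigh = [(float(x), float(y), 0.0) for x in range(4) for y in range(4)]
neigh.append((1.5, 1.5, 5.0))
inliers = ransac_inliers(neigh, t_d=0.1, n_iter=50, rng=random.Random(0))
normal, curvature = normal_and_curvature(covariance(inliers))
```

For this flat toy neighbourhood the outlier is rejected, the normal aligns with the z axis and the curvature is essentially zero.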
4. The method for point cloud classification in combination with region growing and random forests as claimed in claim 1, wherein said step 3 comprises:
step 3.1: training a random forest classifier by using the features extracted in the step 2;
step 3.2: testing the classification accuracy of the random forest classifier on the out-of-bag data, and at the same time obtaining the importance index of each feature variable;
step 3.3: arranging the characteristic variables from high to low according to the importance indexes, and deleting the characteristic with the minimum importance index to form a group of new characteristic combinations;
step 3.4: repeating the step 3.1 to the step 3.3 until the number of the residual characteristic variables is equal to a given threshold value, and ending the iteration;
step 3.5: and selecting the feature combination with the minimum random forest out-of-bag error as the optimal feature combination.
5. A method for point cloud classification in combination with region growing and random forests as claimed in claim 1 or 4 wherein said step 4 comprises:
step 4.1: training a random forest classifier by using the optimal feature combination;
step 4.2: segmenting the test set data by using a region growing algorithm, and extracting the characteristics of a surface patch obtained by fitting each segmented unit;
step 4.3: inputting the features obtained in step 4.2 into the trained random forest classifier, wherein for each segmented object every decision tree in the random forest gives a prediction, the predictions of all decision trees are counted, and the class with the most votes is taken as the final classification result.
6. The method for point cloud classification in combination with region growing and random forests as claimed in claim 1, wherein said step 5 comprises:
step 5.1: for each segmented patch containing fewer points than a given threshold T_n, searching its adjacent patches and judging whether the patch and its adjacent patches have the same attribute, and if the attributes differ, defining the patch as an island patch;
step 5.2: calculating the three-dimensional distance between the island patch and its adjacent patches; if the distance is less than a given distance threshold T_D, merging the island patch into the adjacent segmented patch with the larger area and reclassifying it; otherwise, retaining the original classification result of the segmented patch.
7. A method of point cloud classification combining region growing and random forests according to claim 4, wherein said step 3.1 comprises:
step 3.1.1: given the number N_t of decision trees contained in the random forest, randomly selecting w feature variables from all features as candidate split nodes of each decision tree, and generating the decision tree by continuously splitting nodes;
step 3.1.2: assuming that the training set is segmented into M segmented patches, taking the M segmented patches as M samples, each of which contains l-dimensional features;
step 3.1.3: drawing h samples from the M samples by random sampling as the training sample set for constructing a single decision tree, the samples that are not drawn being regarded as the corresponding out-of-bag data;
step 3.1.4: repeating step 3.1.3 to select N_t training sample sets, each used to train one of the N_t decision trees, the N_t generated decision trees forming the random forest classifier.
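The sampling in steps 3.1.3 and 3.1.4 can be sketched as follows. The text only says "random sampling"; this sketch assumes sampling with replacement, as in standard bagging, and the function name and parameters are hypothetical.

```python
import random


def bootstrap_split(n_samples, h, rng):
    """Draw h sample indices with replacement; undrawn indices are out-of-bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(h)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(1)
n_trees = 5                                   # N_t in the text
splits = [bootstrap_split(10, 10, rng) for _ in range(n_trees)]
in_bag, out_of_bag = splits[0]
```

Each decision tree is then trained on its own in-bag indices and later tested on its own out-of-bag indices.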
8. A method of point cloud classification combining region growing and random forests according to claim 4, wherein said step 3.2 comprises:
step 3.2.1: testing a single decision tree with its corresponding out-of-bag sample data and calculating the out-of-bag error of the decision tree, recorded as err_1; then randomly adding noise interference to the feature variable l_x in the out-of-bag data and calculating the out-of-bag error again, recorded as err_2; the importance of the feature variable l_x in this decision tree is then V = |err_1 - err_2|;
step 3.2.2: assuming that the feature variable l_x appears in r decision trees, taking the average of its importance values over those decision trees as the overall importance of l_x, which gives the importance index of the feature variable l_x.
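The importance index of steps 3.2.1 and 3.2.2 can be sketched as below. The claim speaks of randomly adding noise to a feature; this sketch assumes the common variant of shuffling that feature's values among the out-of-bag samples. The tree record layout, the function names and the toy data are hypothetical.

```python
import random


def oob_error(predict, samples, labels):
    """Fraction of out-of-bag samples the tree misclassifies."""
    wrong = sum(predict(s) != y for s, y in zip(samples, labels))
    return wrong / len(samples)


def feature_importance(trees, feature, rng):
    """Average of V = |err_1 - err_2| over the trees that use the feature."""
    scores = []
    for t in trees:
        if feature not in t["features"]:
            continue
        err1 = oob_error(t["predict"], t["oob_x"], t["oob_y"])
        values = [s[feature] for s in t["oob_x"]]
        rng.shuffle(values)                           # assumed form of noise
        noisy = [{**s, feature: v} for s, v in zip(t["oob_x"], values)]
        err2 = oob_error(t["predict"], noisy, t["oob_y"])
        scores.append(abs(err1 - err2))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical tree that only looks at height, with four out-of-bag samples.
tree = {
    "predict": lambda s: "high" if s["height"] > 1 else "low",
    "features": {"height", "intensity"},
    "oob_x": [{"height": 0.0, "intensity": 3.0}, {"height": 0.2, "intensity": 9.0},
              {"height": 2.0, "intensity": 1.0}, {"height": 2.5, "intensity": 7.0}],
    "oob_y": ["low", "low", "high", "high"],
}
imp_intensity = feature_importance([tree], "intensity", random.Random(0))
imp_height = feature_importance([tree], "height", random.Random(0))
```

Shuffling a feature the tree never consults leaves the out-of-bag error unchanged, so its importance is zero; the importance of a feature the tree relies on depends on the random shuffle.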

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111239501.1A CN113989535A (en) 2021-10-25 2021-10-25 Point cloud classification method combining region growing and random forest

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111239501.1A CN113989535A (en) 2021-10-25 2021-10-25 Point cloud classification method combining region growing and random forest

Publications (1)

Publication Number Publication Date
CN113989535A true CN113989535A (en) 2022-01-28

Family

ID=79740834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111239501.1A Pending CN113989535A (en) 2021-10-25 2021-10-25 Point cloud classification method combining region growing and random forest

Country Status (1)

Country Link
CN (1) CN113989535A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117541799A (en) * 2024-01-09 2024-02-09 四川大学 Large-scale point cloud semantic segmentation method based on online random forest model multiplexing
CN117541799B (en) * 2024-01-09 2024-03-08 四川大学 Large-scale point cloud semantic segmentation method based on online random forest model multiplexing

Similar Documents

Publication Publication Date Title
CN104091321B (en) It is applicable to the extracting method of the multi-level point set feature of ground laser radar point cloud classifications
CN109325960B (en) Infrared cloud chart cyclone analysis method and analysis system
Polewski et al. Detection of fallen trees in ALS point clouds using a Normalized Cut approach trained by simulation
CN112347894B (en) Single plant vegetation extraction method based on transfer learning and Gaussian mixture model separation
CN111985322A (en) Road environment element sensing method based on laser radar
Zhao et al. Automatic recognition of loess landforms using Random Forest method
CN109146889A (en) A kind of field boundary extracting method based on high-resolution remote sensing image
CN110992341A (en) Segmentation-based airborne LiDAR point cloud building extraction method
CN107341813B (en) SAR image segmentation method based on Structure learning and sketch characteristic inference network
CN106199557A (en) A kind of airborne laser radar data vegetation extracting method
CN113484875B (en) Laser radar point cloud target hierarchical identification method based on mixed Gaussian ordering
CN110348478B (en) Method for extracting trees in outdoor point cloud scene based on shape classification and combination
CN111860359B (en) Point cloud classification method based on improved random forest algorithm
Hui et al. Wood and leaf separation from terrestrial LiDAR point clouds based on mode points evolution
CN103136545A (en) High resolution remote sensing image analysis tree automatic extraction method based on space consistency
CN106529501A (en) Fingerprint and finger vein image fusion method based on weighted fusion and layered serial structure
CN113989535A (en) Point cloud classification method combining region growing and random forest
Patel et al. Adaboosted extra trees classifier for object-based multispectral image classification of urban fringe area
Bai et al. Semantic segmentation of sparse irregular point clouds for leaf/wood discrimination
CN108647719B (en) Non-surveillance clustering method for big data quantity spectral remote sensing image classification
CN117765006A (en) Multi-level dense crown segmentation method based on unmanned aerial vehicle image and laser point cloud
Hetti Arachchige Automatic tree stem detection–a geometric feature based approach for MLS point clouds
Diez et al. Comparison of algorithms for Tree-top detection in Drone image mosaics of Japanese Mixed Forests.
CN111311643B (en) Video target tracking method using dynamic search
Dong et al. Unsupervised semantic segmenting TLS data of individual tree based on smoothness constraint using open-source datasets

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination