CN112418363B - Complex background region landslide classification model establishing and identifying method and device - Google Patents

Info

Publication number
CN112418363B
Authority
CN
China
Prior art keywords
landslide
classification model
optimal
feature
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110093373.8A
Other languages
Chinese (zh)
Other versions
CN112418363A (en)
Inventor
李显巨
陈伟涛
王力哲
陈刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Geosciences
Original Assignee
China University of Geosciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Geosciences
Priority to CN202110093373.8A
Publication of CN112418363A
Application granted
Publication of CN112418363B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images

Abstract

The invention provides a method and a device for establishing a landslide classification model and identifying landslides in a complex background area, and relates to class imbalance model establishment and landslide identification. The method for establishing a landslide classification model in a complex background area comprises the following steps: acquiring laser radar data of a research area; constructing terrain objects from the laser radar data, and determining terrain object feature vectors from the terrain objects to determine a data set; jointly optimizing the classification model parameters and the balance coefficient on the data set to determine the collaborative optimal balance coefficient and the collaborative optimal classification model parameters; determining a class-balanced robust sensitive feature subset according to the collaborative optimal balance coefficient and the collaborative optimal classification model parameters; and establishing a landslide classification model according to the class-balanced robust sensitive feature subset. The technical scheme improves the accuracy of landslide remote sensing identification.

Description

Complex background region landslide classification model establishing and identifying method and device
Technical Field
The invention relates to the technical field of class imbalance model establishment and landslide identification, and in particular to a method and a device for establishing a landslide classification model and identifying landslides in a complex background area.
Background
In the prior art, the class imbalance problem has not yet received enough attention in the field of landslide remote sensing identification, and remote sensing class imbalance learning methods have the following two problems: (1) because an effective method for optimizing the balance coefficient is lacking, optimal class-balanced data cannot be acquired; (2) because a class-balanced robust sensitive feature subset is lacking, an optimized data set cannot be provided for subsequent classification. These problems restrict further improvement of the accuracy of landslide remote sensing identification.
Disclosure of Invention
The problem to be solved by the invention is how to improve the accuracy of landslide remote sensing identification.
In order to solve the above problem, the present invention provides a method for establishing a landslide classification model in a complex background area, comprising: acquiring laser radar data of a research area; constructing terrain objects from the laser radar data, and determining terrain object feature vectors from the terrain objects to determine a data set; jointly optimizing the classification model parameters and the balance coefficient on the data set to determine the collaborative optimal balance coefficient and the collaborative optimal classification model parameters; determining a class-balanced robust sensitive feature subset according to the collaborative optimal balance coefficient and the collaborative optimal classification model parameters; and establishing a landslide classification model according to the class-balanced robust sensitive feature subset.
According to the method for establishing the landslide classification model in the complex background area, the collaborative optimal balance coefficient and the collaborative optimal classification model parameters are determined by jointly optimizing the classification model parameters and the balance coefficient, and the class-balanced robust sensitive feature subset is determined from them. This provides optimal class-balanced data and an optimized data set for landslide classification, and thereby improves the accuracy of landslide remote sensing identification.
Optionally, the constructing a terrain object from the laser radar data and determining a terrain object feature vector from the terrain object to determine a data set comprises: extracting a pixel scale terrain feature vector, wherein the pixel scale terrain feature vector comprises terrain features, average texture features and slope texture features based on the terrain features and a gray level co-occurrence matrix, and filtering features based on the terrain features; determining an image input layer from the pixel scale terrain feature vector, and carrying out image segmentation on the image input layer using a multi-scale segmentation algorithm and an optimal scale parameter obtained by the plateau objective function (POF) method, so as to construct the terrain object; and determining the terrain object feature vector by an object feature extraction method based on the pixel scale terrain feature vector and the primitives of the terrain object, thereby determining the data set.
According to the method for establishing the landslide classification model in the complex background area, the data set is determined by constructing the terrain object and determining the terrain object feature vector, which improves the precision and effectiveness of the data set and thereby the accuracy of landslide remote sensing identification.
Optionally, the jointly optimizing the classification model parameters and the balance coefficients according to the data set to determine the co-optimal balance coefficients and the co-optimal classification model parameters includes: dividing a data set into a training set and a testing set by adopting ten-fold cross validation, wherein the training set comprises a landslide object set and a non-landslide object set; increasing landslide object samples in the landslide object set by utilizing an SMOTE algorithm to obtain a new landslide object set; adopting the new landslide object set and the non-landslide object set to construct a classification model, and classifying the test set; and performing combined optimization on the balance coefficient and the classification model parameter by adopting ten-fold cross validation to determine the collaborative optimal balance coefficient and the collaborative optimal classification model parameter.
According to the method for establishing the landslide classification model in the complex background area, the collaborative optimal balance coefficient and the collaborative optimal classification model parameter are determined by jointly optimizing the classification model parameter and the balance coefficient through the data set, the optimal class-balanced data and the optimized data set are provided for landslide classification, and therefore landslide remote sensing identification precision is improved.
Optionally, the determining the class-balanced robust sensitive feature subset according to the collaborative optimal balance coefficient and the collaborative optimal classification model parameter includes: acquiring an optimal class-balanced feature selection data set according to the collaborative optimal balance coefficient; determining a feature subset according to the class-balanced feature selection data set and the collaborative optimal classification model parameters; and determining the robust sensitive feature subset after class balancing according to the feature subset.
According to the method for establishing the landslide classification model in the complex background area, the optimal data after class balance and the optimized data set are provided for landslide classification through the collaborative optimal balance coefficient determined by the classification model parameter and balance coefficient combined optimization and the robust sensitive feature subset after class balance determined by the collaborative optimal classification model parameter, and the landslide remote sensing identification precision is further improved.
Optionally, the obtaining an optimal class-balanced feature selection data set according to the collaborative optimal balance coefficient includes: constructing a new landslide object set with the SMOTE algorithm according to the collaborative optimal balance coefficient, and determining a new training set from it, so as to obtain the optimal class-balanced feature selection data set.
According to the method for establishing the landslide classification model in the complex background area, the new landslide object set is constructed according to the collaborative optimal balance coefficient and a new training set is determined from it, which yields the optimal class-balanced feature selection data set, provides optimal class-balanced data and an optimized data set for landslide classification, and thereby improves the accuracy of landslide remote sensing identification.
Optionally, the determining a feature subset according to the class-balanced feature selection data set and the collaborative optimal classification model parameters includes: after the optimal class-balanced feature selection data set is obtained, determining the feature subset from the class-balanced feature selection data set and the collaborative optimal classification model parameters using the varSelRF algorithm, wherein the varSelRF algorithm is configured as follows: the parameter ntree of the varSelRF algorithm is preset to its default value; the parameter $ntree_0$ of the collaborative optimal classification model is assigned to the parameter ntreeIterat of the varSelRF algorithm; and the parameter mtryFactor of the varSelRF algorithm is determined from the parameter $mtry_0$ of the collaborative optimal classification model, so as to determine the feature subset.
According to the method for establishing the landslide classification model in the complex background area, the feature subset is determined by setting the parameters of the varSelRF algorithm, which provides optimal class-balanced data and an optimized data set for landslide classification and thereby improves the accuracy of landslide remote sensing identification.
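The parameter hand-off described above might be sketched as follows. The source does not state how mtryFactor is derived from $mtry_0$; this sketch assumes the varSelRF convention that mtry equals the square root of the number of variables multiplied by mtryFactor. The function name, the returned dictionary and the default of 5000 trees are illustrative assumptions, not part of the patent.

```python
import math


def varselrf_params(ntree0, mtry0, n_features, ntree_default=5000):
    """Map the jointly optimised RF parameters onto varSelRF-style arguments.

    Assumption of this sketch: mtry = sqrt(n_features) * mtryFactor, so the
    factor is recovered by dividing mtry0 by sqrt(n_features).
    """
    if n_features <= 0 or not 1 <= mtry0 <= n_features:
        raise ValueError("mtry0 must lie between 1 and n_features")
    return {
        "ntree": ntree_default,   # first forest keeps the (assumed) default size
        "ntreeIterat": ntree0,    # later forests use the optimised ntree0
        "mtryFactor": mtry0 / math.sqrt(n_features),
    }


# Hypothetical values: 64 object features, optimised ntree0 = 500, mtry0 = 8.
params = varselrf_params(ntree0=500, mtry0=8, n_features=64)
```

With 64 features and $mtry_0$ = 8 the factor is 1.0, which under the stated assumption reproduces the usual square-root rule.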
Optionally, the determining the class-balanced robust sensitive feature subset according to the feature subset includes: determining a first feature subset and a second feature subset according to the feature subsets and a feature selection frequency threshold; performing a union operation on a specified proportion of the most important features of the first feature subset and the second feature subset, and performing an intersection operation on the remaining features, to obtain a union and an intersection; and performing a union operation on the union and the intersection to determine the class-balanced robust sensitive feature subset.
According to the method for establishing the landslide classification model in the complex background area, a union operation is performed on the specified proportion of the most important features of the two feature subsets, an intersection operation is performed on the remaining features, and a union operation is then performed on the resulting union and intersection to determine the class-balanced robust sensitive feature subset. This provides optimal class-balanced data and an optimized data set for landslide classification and thereby improves the accuracy of landslide remote sensing identification.
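The set operations above can be illustrated with a small sketch. It assumes the two feature subsets have already been obtained with the frequency threshold and are ordered from most to least important; the function name, the feature names and the proportion of 0.5 are hypothetical.

```python
import math


def robust_feature_subset(ranked_a, ranked_b, top_fraction=0.5):
    """Merge two importance-ranked feature lists.

    The leading top_fraction of each list is merged by union, the remaining
    features are merged by intersection, and the two results are united.
    """
    def split(ranked):
        k = math.ceil(len(ranked) * top_fraction)
        return set(ranked[:k]), set(ranked[k:])

    top_a, rest_a = split(ranked_a)
    top_b, rest_b = split(ranked_b)
    union_part = top_a | top_b                # most important features: union
    intersection_part = rest_a & rest_b       # remaining features: intersection
    return union_part | intersection_part


# Hypothetical ranked subsets, most important feature first.
subset_1 = ["slope", "roughness", "entropy", "contrast"]
subset_2 = ["roughness", "homogeneity", "entropy", "dtm_std"]
robust = robust_feature_subset(subset_1, subset_2, top_fraction=0.5)
```

In this example the union of the top halves contributes slope, roughness and homogeneity, and the intersection of the remaining halves contributes entropy.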
The invention also provides a landslide identification method for the complex background area, which comprises the following steps: classifying landslides and non-landslides according to the landslide classification model established by the complex background region landslide classification model establishing method to determine landslide and non-landslide classification results of the research region; and determining a landslide boundary based on the landslide semi-automatic identification method and the classification result to realize landslide identification. Compared with the prior art, the complex background region landslide identification method and the complex background region landslide classification model establishment method have the same advantages, and are not repeated herein.
Optionally, the classifying landslide and non-landslide according to the landslide classification model includes: inputting a data set of a research area into the landslide classification model, wherein the data set is an object feature set of all terrain object primitives of the research area; and predicting the object characteristic set according to the landslide classification model to determine landslide and non-landslide classification results.
The invention also provides a device for identifying landslides in a complex background area, which comprises a computer readable storage medium and a processor, wherein the computer readable storage medium stores a computer program which, when read and run by the processor, implements the method for establishing the landslide classification model in the complex background area or the method for identifying landslides in the complex background area.
Drawings
FIG. 1 is a schematic diagram of a complex background region landslide classification model building method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a process for jointly optimizing classification model parameters and balance coefficients according to a data set according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating a process of determining a class-balanced robust sensitive feature subset according to a co-optimal balance coefficient and co-optimal classification model parameters according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the process of a semi-automatic landslide identification method according to an embodiment of the present invention.
Detailed Description
Landslides are widely distributed worldwide, are a hazard second only to earthquakes, and cause heavy casualties and economic losses. Quickly and accurately identifying landslides therefore has important theoretical and practical significance for landslide prediction and early warning, disaster prevention and mitigation, real-time disaster assessment and the like.
Remote sensing offers large-area synchronous observation, strong timeliness and dynamic observation, and has become a main means of landslide identification. Classification based on optical images, terrain data and machine learning algorithms is currently the most effective and most thoroughly studied landslide identification approach, but its identification accuracy is still not high. When classification is used to identify landslides, the landslide and non-landslide samples in the classification data set are usually class imbalanced, and class imbalance is one of the main factors hindering improvement of landslide identification accuracy.
In a complex geological background area, several factors lower the separability of landslides and non-landslides and make the class imbalance problem more prominent: (1) high vegetation coverage, finely dissected terrain and human engineering activity mask or weaken the morphological characteristics of landslides; (2) bedrock exposure gives the landslide area and the surrounding non-landslide area similar surface roughness. Compared with conventional photogrammetry, laser radar (LiDAR) can penetrate vegetation of a certain coverage, and its laser pulses are not easily affected by shadow or sun angle, so data quality is greatly improved and high-precision bare-earth terrain information can be acquired. However, the large number of terrain features derived from LiDAR forms a high-dimensional, strongly correlated feature set, which further aggravates the class imbalance problem.
In the fields of data mining and artificial intelligence, the class imbalance problem has attracted extensive attention and in-depth discussion, but in the field of landslide remote sensing identification it has not yet received sufficient attention. At present, class imbalance learning methods mainly comprise data-level methods (sampling and feature selection) and algorithm-level methods, such as improved classification algorithms, one-class classifiers, cost-sensitive learning and ensemble learning. Data-level methods are simple, effective and easy to apply, whereas algorithm-level methods are often complex and relatively difficult to apply. In remote sensing classification, these methods are mostly applied directly to class imbalance research, and data-level methods dominate. In general, remote sensing class imbalance learning methods have the following two problems: (1) because an effective method for optimizing the balance coefficient is lacking, optimal class-balanced data cannot be acquired; (2) because a class-balanced robust sensitive feature subset is lacking, an optimized data set cannot be provided for subsequent classification. These problems restrict further improvement of the accuracy of landslide remote sensing identification.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
As shown in FIG. 1, an embodiment of the present invention provides a method for establishing a landslide classification model in a complex background area, including: acquiring laser radar data of a research area; constructing terrain objects from the laser radar data, and determining terrain object feature vectors from the terrain objects to determine a data set; jointly optimizing the classification model parameters and the balance coefficient on the data set to determine the collaborative optimal balance coefficient and the collaborative optimal classification model parameters; determining a class-balanced robust sensitive feature subset according to the collaborative optimal balance coefficient and the collaborative optimal classification model parameters; and establishing a landslide classification model according to the class-balanced robust sensitive feature subset.
Specifically, in this embodiment, the method for establishing the landslide classification model in the complex background area includes: acquiring laser radar data (LiDAR DTM data) of the research area, where the collected LiDAR point cloud data is filtered, classified and otherwise processed to obtain LiDAR DTM data of the bare ground surface. The LiDAR DTM serves as the base data, with a spatial resolution of 3 m × 3 m. In addition, reference landslide boundary data is derived from collected landslide inventory maps, field surveys and visual interpretation.
In this embodiment, the Qinggan River basin in the Zigui section of the Three Gorges area is selected as the research area, mainly for the following reasons: (1) the geological background of the Three Gorges area is complex and the vegetation coverage is high; (2) the Qinggan River basin is a typical landslide-prone area. Research on class imbalance learning methods in this area is therefore representative for landslide identification research.
After the laser radar data of the research area is acquired, terrain objects are constructed from it. First, pixel scale terrain feature vectors are extracted from the LiDAR DTM data: terrain features, including DTM, slope and surface roughness; average texture features and slope texture features based on the GLCM and the terrain features, including contrast, correlation, angular second moment, entropy and homogeneity; and filtering features based on the terrain features, including moving average filtering and standard deviation filtering. The texture window and the filtering window are both 3 × 3 pixels.
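As an illustration of the two filtering features, the following sketch computes a moving average layer and a standard deviation layer in a 3 × 3 window over a small grid. The handling of border cells (using only the neighbours inside the grid) and the sample values are assumptions of the sketch; the source does not specify them.

```python
import math


def window_stats(grid, size=3):
    """Return (mean_layer, std_layer) computed in a size x size moving window.

    Border cells use only the neighbours that fall inside the grid; this edge
    rule is an assumption of the sketch.
    """
    rows, cols, half = len(grid), len(grid[0]), size // 2
    mean_layer = [[0.0] * cols for _ in range(rows)]
    std_layer = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [
                grid[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            mean_layer[r][c] = mean
            std_layer[r][c] = math.sqrt(variance)
    return mean_layer, std_layer


# Hypothetical 3 x 3 DTM patch (elevations in metres).
dtm = [
    [10.0, 10.0, 10.0],
    [10.0, 19.0, 10.0],
    [10.0, 10.0, 10.0],
]
mean_layer, std_layer = window_stats(dtm)
```

For the centre cell the window holds eight values of 10 and one of 19, giving a mean of 11 and a population variance of 8.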
Then, an image input layer is selected from the pixel scale terrain feature vectors, and image segmentation is carried out using a multi-scale segmentation algorithm and the optimal scale parameter obtained by the plateau objective function (POF) method to construct the terrain objects.
After the terrain object is constructed, a terrain object feature vector is calculated based on terrain object primitives, pixel scale terrain feature vectors and an object feature extraction method, wherein the terrain object feature vector comprises a maximum value, a minimum value, a mean value and a standard deviation. In addition, geometrical features such as aspect ratio and shape index are extracted. The data set is determined by constructing a terrain object and determining a feature vector of the terrain object.
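A minimal sketch of the object feature extraction step, assuming the segmentation result is available as a grid of object identifiers aligned with a pixel scale feature layer. The function name, the data layout and the sample values are hypothetical, and the geometric features are not covered.

```python
import math
from collections import defaultdict


def object_features(layer, segments):
    """Aggregate one pixel scale feature layer over each terrain object.

    layer and segments are equally sized 2D lists; segments holds the object
    identifier of every pixel.
    """
    pixels = defaultdict(list)
    for layer_row, segment_row in zip(layer, segments):
        for value, object_id in zip(layer_row, segment_row):
            pixels[object_id].append(value)

    features = {}
    for object_id, values in pixels.items():
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        features[object_id] = {
            "max": max(values),
            "min": min(values),
            "mean": mean,
            "std": math.sqrt(variance),
        }
    return features


# Hypothetical slope layer (degrees) and segmentation with two objects.
slope = [[5.0, 7.0, 30.0], [6.0, 6.0, 34.0]]
segments = [[1, 1, 2], [1, 1, 2]]
stats = object_features(slope, segments)
```

Repeating the aggregation for every pixel scale layer and concatenating the results gives one feature vector per terrain object.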
After the data set is determined, the classification model parameters and the balance coefficient are jointly optimized: landslide samples are added using SMOTE according to the balance coefficient, the balance coefficient and the RF model parameters are jointly optimized by ten-fold cross validation, and a new objective function is designed to obtain the collaborative optimal parameter combination, namely the collaborative optimal balance coefficient and the collaborative optimal classification model parameters.
After the collaborative optimal balance coefficient and the collaborative optimal classification model parameters are determined, the class-balanced robust sensitive feature subset is determined from them. First, the optimal class-balanced feature selection data set is obtained based on the collaborative optimal balance coefficient, and the varSelRF (Variable Selection using Random Forests) algorithm is applied with the collaborative optimal classification model parameters to determine feature subsets; in this embodiment, the classification model may be an RF (Random Forest) model. Then, the class-balanced robust sensitive feature subset is obtained from multiple class-balanced sensitive feature subsets using a feature selection frequency threshold method and an improved feature set union algorithm.
After the class-balanced robust sensitive feature subset is obtained, a landslide classification model is established from it. Specifically, landslide and non-landslide classification experiments are carried out based on the class-balanced robust sensitive feature subset and machine learning algorithms such as RF and SVM (Support Vector Machine); that is, a model with high classification accuracy is constructed from the class-balanced sensitive feature combination and part of the training data.
After the landslide classification model is established, its accuracy needs to be evaluated: the model is built and evaluated on a training set and a test set obtained by stratified random sampling; then two spatially independent test areas are used for cross training and classification to evaluate the robustness of the model.
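A sketch of class-stratified random sampling follows, so that landslide and non-landslide objects keep roughly the same share in the training and test sets. The test fraction of 0.3, the fixed seed and the label strings are assumptions; the source gives no split ratio.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split sample indices so each class keeps roughly the same share in
    the training and test sets."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical imbalanced label vector: 10 landslide, 90 non-landslide objects.
labels = ["landslide"] * 10 + ["non-landslide"] * 90
train_idx, test_idx = stratified_split(labels)
```

Sampling within each class separately prevents the rare landslide class from being absent from the test set by chance.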
The class imbalance learning evaluation indexes adopted include precision, recall, F-measure, G-mean and the receiver operating characteristic (ROC) curve.
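The scalar indexes can be computed from a confusion matrix as sketched below, with landslide as the positive class. The source does not give the formulas; this sketch assumes the common definitions, with the F-measure taken as the equally weighted harmonic mean of precision and recall and the G-mean as the geometric mean of recall and specificity. The counts are hypothetical.

```python
import math


def imbalance_metrics(tp, fp, fn, tn):
    """Precision, recall, F-measure and G-mean for the positive (landslide)
    class, computed from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    g_mean = math.sqrt(recall * specificity)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "g_mean": g_mean,
    }


# Hypothetical counts: 50 landslide objects and 150 non-landslide objects.
scores = imbalance_metrics(tp=40, fp=10, fn=10, tn=140)
```

Unlike overall accuracy, the G-mean falls to zero if either class is entirely misclassified, which is why it is often preferred for imbalanced data.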
In this embodiment, the collaborative optimal balance coefficient and the collaborative optimal classification model parameters are determined by jointly optimizing the classification model parameters and the balance coefficient, and the class-balanced robust sensitive feature subset is determined from them. This provides optimal class-balanced data and an optimized data set for landslide classification, and thereby improves the accuracy of landslide remote sensing identification.
Optionally, the constructing a terrain object from the laser radar data and determining a terrain object feature vector from the terrain object to determine a data set comprises: extracting a pixel scale terrain feature vector, wherein the pixel scale terrain feature vector comprises terrain features, average texture features and slope texture features based on the terrain features and a gray level co-occurrence matrix, and filtering features based on the terrain features; determining an image input layer from the pixel scale terrain feature vector, and carrying out image segmentation on the image input layer using a multi-scale segmentation algorithm and an optimal scale parameter obtained by the plateau objective function (POF) method, so as to construct the terrain object; and determining the terrain object feature vector by an object feature extraction method based on the pixel scale terrain feature vector and the primitives of the terrain object, thereby determining the data set.
Specifically, in this embodiment, constructing a terrain object from the lidar data, and determining a terrain object feature vector from the terrain object to determine the data set comprises:
(1) extracting a pixel scale terrain feature vector, wherein the pixel scale terrain feature vector comprises terrain features, average texture features and slope texture features based on the terrain features and a gray level co-occurrence matrix, and filtering features based on the terrain features; the terrain features comprise DTM, gradient, slope and surface roughness; the average texture features and slope texture features based on the GLCM and the terrain features comprise contrast, correlation, angular second moment, entropy and homogeneity; the filtering features based on the terrain features comprise moving average filtering and standard deviation filtering;
(2) determining an image input layer from the pixel scale terrain feature vector, and carrying out image segmentation on the image input layer using a multi-scale segmentation algorithm and an optimal scale parameter obtained by the plateau objective function (POF) method, so as to construct the terrain object;
(3) determining the terrain object feature vector by an object feature extraction method based on the pixel scale terrain feature vector and the primitives of the terrain object, thereby determining the data set from the terrain object and the terrain object feature vector.
In this embodiment, the data set is determined by constructing the terrain object and determining the terrain object feature vector, which improves the precision and effectiveness of the data set and thereby the accuracy of landslide remote sensing identification.
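The texture features named in step (1) can be illustrated with a small gray level co-occurrence matrix sketch. It assumes the terrain layer has already been quantised to integer gray levels and uses a single horizontal offset with a symmetric, normalised matrix; the quantisation, the offset and the natural logarithm in the entropy are assumptions, and correlation is omitted for brevity.

```python
import math
from collections import Counter


def glcm_features(levels, offset=(0, 1)):
    """Texture statistics from a symmetric, normalised GLCM.

    levels: 2D list of integer gray levels; offset: (row, col) displacement
    between the two pixels of each counted pair.
    """
    dr, dc = offset
    rows, cols = len(levels), len(levels[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                a, b = levels[r][c], levels[rr][cc]
                counts[(a, b)] += 1
                counts[(b, a)] += 1  # count both directions (symmetric GLCM)
    total = sum(counts.values())
    probs = {pair: n / total for pair, n in counts.items()}
    return {
        "contrast": sum(p * (i - j) ** 2 for (i, j), p in probs.items()),
        "asm": sum(p * p for p in probs.values()),
        "entropy": -sum(p * math.log(p) for p in probs.values()),
        "homogeneity": sum(
            p / (1 + (i - j) ** 2) for (i, j), p in probs.items()
        ),
    }


# Hypothetical 3 x 3 window quantised to two gray levels.
window = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
texture = glcm_features(window)
```

Here "asm" stands for the angular second moment. An average texture feature could be obtained by averaging such statistics over several offsets, although the source does not state which offsets are used.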
Optionally, the jointly optimizing the classification model parameters and the balance coefficients according to the data set to determine the co-optimal balance coefficients and the co-optimal classification model parameters includes: dividing the data set into a training set and a testing set by adopting ten-fold cross validation, wherein the training set comprises a landslide object set and a non-landslide object set; increasing landslide object samples in the landslide object set by utilizing an SMOTE algorithm to obtain a new landslide object set; adopting the new landslide object set and the non-landslide object set to construct a classification model, and classifying the test set; and performing combined optimization on the balance coefficient and the classification model parameter by adopting ten-fold cross validation to determine the collaborative optimal balance coefficient and the collaborative optimal classification model parameter.
Specifically, in this embodiment, jointly optimizing the classification model parameters and the balance coefficients according to the data set to determine the optimal balance coefficients and the optimal classification model parameters includes:
(1) dividing the data set used for parameter optimization into a training set and a test set by ten-fold cross validation, wherein the training set comprises a landslide object set and a non-landslide object set;
(2) adding landslide object samples by the SMOTE algorithm to obtain a new landslide object set; constructing a classification model from the new landslide object set and the non-landslide object set, and classifying the test set;
(3) jointly optimizing the balance coefficient and the classification model parameters by ten-fold cross validation to determine the collaborative optimal balance coefficient and the collaborative optimal classification model parameters.
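The oversampling in step (2) might look like the following simplified SMOTE sketch, in which each synthetic landslide sample is interpolated between a landslide sample and one of its nearest landslide neighbours. This is a simplification rather than a reference implementation; the neighbour count, the seed, the sample values and the way the balance coefficient is turned into a number of new samples are assumptions.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: squared_distance(base, s),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([x + gap * (y - x) for x, y in zip(base, neighbour)])
    return synthetic


# Hypothetical landslide object feature vectors and balance coefficient.
landslide_objects = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]]
balance_coefficient = 1.5
n_new = round((balance_coefficient - 1) * len(landslide_objects))
new_landslide_set = landslide_objects + smote(landslide_objects, n_new, k=2)
```

Within cross validation, the oversampling would be applied to the training fold only, so that the test fold contains no synthetic samples.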
With reference to fig. 2, let the data set be D. Ten-fold cross validation divides D into a training set Tr and a test set Te. In Tr, the landslide object set is O_LS and the non-landslide object set is O_NLS. The numbers of landslide pixels (objects) and non-landslide pixels (objects) in the study area are imbalanced, with landslide pixels (objects) being relatively few. The imbalance ratio UB of Tr is then
UB = |O_NLS| / |O_LS| (UB > 1)
where |·| denotes the number of elements in a set.
Let Be be the balance coefficient. The non-landslide object set O_NLS in the training set is first held fixed, and the SMOTE algorithm is used to add landslide object samples, giving a new landslide object set O_{LS-1} with
|O_{LS-1}| = Be × |O_LS|
where Be may be set as follows:
Be = 1 + 0.1n (n is a positive integer and Be ≤ UB)
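The imbalance ratio and the candidate balance coefficients can be illustrated with a short sketch. The function names and the sample counts (150 non-landslide objects, 100 landslide objects) are hypothetical and chosen only for illustration; rounding the oversampled size to a whole number of objects is also an assumption, since the text does not say how fractional sizes are handled.

```python
def imbalance_ratio(n_non_landslide: int, n_landslide: int) -> float:
    """UB = |O_NLS| / |O_LS|; expected to exceed 1 when landslides are the minority."""
    if n_landslide <= 0:
        raise ValueError("the landslide object set must not be empty")
    return n_non_landslide / n_landslide


def balance_coefficient_candidates(ub: float) -> list[float]:
    """Enumerate Be = 1 + 0.1 * n for positive integers n while Be <= UB."""
    candidates = []
    n = 1
    while True:
        be = round(1 + 0.1 * n, 1)  # round away floating-point noise
        if be > ub:
            break
        candidates.append(be)
        n += 1
    return candidates


def oversampled_size(be: float, n_landslide: int) -> int:
    """|O_LS-1| = Be * |O_LS|, rounded to a whole number of objects (assumption)."""
    return round(be * n_landslide)


# Hypothetical counts, for illustration only.
ub = imbalance_ratio(n_non_landslide=150, n_landslide=100)
grid = balance_coefficient_candidates(ub)
```

With these counts the largest candidate equals UB, at which point the oversampled landslide set is as large as the non-landslide set.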
An RF model is constructed using the new training set Tr_{-1} (containing O_{LS-1} and O_NLS), and the test set Te is classified. The balance coefficient Be and the RF model parameters ntree and mtry are jointly optimized by ten-fold cross validation, where the parameter ntree is the number of classification trees in the ensemble and the parameter mtry is the number of features considered at each split.
Wherein, the value range of the parameter ntree can be set as follows:
ntree = 100N (N is a positive integer and is 10 or less)
mtry is a positive integer and is less than or equal to the number of feature vectors of the terrain object.
A new parameter optimization objective function is thereby constructed:
Of = OA - |UA - PA|
where OA is the average overall accuracy obtained by ten-fold cross validation, UA is the average user's accuracy, PA is the average producer's accuracy, Of is the parameter optimization objective function, and |·| denotes the absolute value. When Of reaches its maximum, the corresponding optimal parameter combination is Be0, ntree0 and mtry0.
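The objective function and the search over (Be, ntree, mtry) can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes UA and PA are computed for the landslide class from per-fold confusion counts, and `evaluate` is a hypothetical callback standing in for "oversample with SMOTE, train the RF model, run ten-fold cross validation and return one (tp, fp, fn, tn) tuple per fold".

```python
from itertools import product
from statistics import mean


def fold_accuracies(tp, fp, fn, tn):
    """Return (OA, UA, PA) for one fold, treating landslide as the positive class."""
    oa = (tp + tn) / (tp + fp + fn + tn)
    ua = tp / (tp + fp) if tp + fp else 0.0  # user's accuracy
    pa = tp / (tp + fn) if tp + fn else 0.0  # producer's accuracy
    return oa, ua, pa


def objective(folds):
    """Of = OA - |UA - PA| using accuracies averaged over the folds."""
    oas, uas, pas = zip(*(fold_accuracies(*f) for f in folds))
    return mean(oas) - abs(mean(uas) - mean(pas))


def parameter_grid(ub, n_features):
    """Yield (Be, ntree, mtry) combinations within the ranges stated in the text."""
    n_be = int(round((ub - 1) * 10, 6))  # number of 0.1 steps with Be <= UB
    be_values = [round(1 + 0.1 * n, 1) for n in range(1, n_be + 1)]
    ntree_values = [100 * n for n in range(1, 11)]  # ntree = 100N, N <= 10
    mtry_values = range(1, n_features + 1)          # 1 <= mtry <= feature count
    return product(be_values, ntree_values, mtry_values)


def best_combination(ub, n_features, evaluate):
    """Pick the combination with the largest Of; `evaluate` returns per-fold counts."""
    return max(parameter_grid(ub, n_features),
               key=lambda combo: objective(evaluate(*combo)))
```

The penalty term |UA - PA| lowers the score of combinations whose commission and omission errors are lopsided, so the search favors settings that are both accurate overall and balanced between the two error types.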
As shown in fig. 2 and fig. 3, the symbols used for the number of classification trees in the ensemble include ntree, ntree0 and ntree1; the symbols used for the number of features considered at each split include mtry, mtry0, mtry1 and mtry2.
In the embodiment, the classification model parameters and the balance coefficients are jointly optimized through the data set to determine the collaborative optimal balance coefficients and the collaborative optimal classification model parameters, so that the optimal class-balanced data and the optimized data set are provided for landslide classification, and the landslide remote sensing identification precision is further improved.
Optionally, the determining the class-balanced robust sensitive feature subset according to the collaborative optimal balance coefficient and the collaborative optimal classification model parameter includes: acquiring an optimal class-balanced feature selection data set according to the collaborative optimal balance coefficient; determining a feature subset according to the class-balanced feature selection data set and the collaborative optimal classification model parameters; and determining the robust sensitive feature subset after class balancing according to the feature subset.
Specifically, in this embodiment, determining the class-balanced robust sensitive feature subset according to the collaborative optimal balance coefficient and the collaborative optimal classification model parameter includes:
(1) acquiring an optimal class-balanced feature selection data set according to the collaborative optimal balance coefficient;
(2) determining a feature subset from the class-balanced feature selection data set and the collaborative optimal classification model parameters by adopting the varSelRF algorithm;
(3) determining the class-balanced robust sensitive feature subset according to the feature subset.
The varSelRF algorithm randomly samples two-thirds of the samples from the data as a training set and uses the remaining one-third, the out-of-bag (OOB) samples, as a test set. An initial variable importance ranking is first obtained using a large RF model (ntree = 5000), and the least important features are then culled iteratively. In each iteration, the 20% least important features are culled, a new RF model (ntreeIterat = 2000) is trained on the remaining features, and the OOB samples are used to evaluate its error rate (OOB error). Finally, the feature set used by the RF model with the lowest OOB error is the selected feature subset.
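The elimination loop described above can be sketched in a few lines. This is not the varSelRF package itself (which is written in R) but a simplified illustration of the procedure as the text describes it. The `oob_error` callback is hypothetical: it stands in for training a forest on the given features and returning its OOB error rate. The sketch also assumes that the initial importance ranking is reused in every iteration and that at least two features are always kept.

```python
import math


def backward_elimination(ranked_features, oob_error, drop_fraction=0.2, min_features=2):
    """Iteratively drop the least important features and keep the lowest-error set.

    ranked_features: feature names ordered from most to least important
                     (the initial ranking from the large forest).
    oob_error:       hypothetical callback that fits a forest on the given
                     features and returns its out-of-bag error rate.
    """
    current = list(ranked_features)
    best_set, best_error = current, oob_error(current)
    while len(current) > min_features:
        n_drop = max(1, math.floor(drop_fraction * len(current)))
        n_keep = max(min_features, len(current) - n_drop)
        current = current[:n_keep]      # discard the tail of the ranking
        error = oob_error(current)
        if error < best_error:          # ties keep the earlier, larger set
            best_set, best_error = current, error
    return best_set, best_error
```

Starting from ten features, the loop evaluates sets of 10, 8, 7, 6, 5, 4, 3 and 2 features and returns whichever had the lowest error.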
In this embodiment, the class-balanced robust sensitive feature subset is determined from the collaborative optimal balance coefficient and the collaborative optimal classification model parameters obtained by jointly optimizing the classification model parameters and the balance coefficient. This provides optimal class-balanced data and an optimized data set for landslide classification and thereby improves the precision of landslide remote sensing identification.
Optionally, the obtaining an optimal class-balanced feature selection data set according to the collaborative optimal balance coefficient includes: and constructing a new landslide object set according to the collaborative optimal balance coefficient by adopting an SMOTE algorithm to determine a new training set so as to obtain the optimal feature selection data set after class balance.
Specifically, in this embodiment, obtaining the optimal feature selection data set after class balancing according to the collaborative optimal balancing coefficient includes: and constructing a new landslide object set according to the collaborative optimal balance coefficient by adopting an SMOTE algorithm to determine a new training set so as to obtain an optimal feature selection data set after class balance.
The feature selection dataset construction after class balance based on the collaborative optimal balance coefficient and the SMOTE algorithm can adopt the following form.
With reference to fig. 3, let the feature selection data set be D, with training set Tr and test set Te. In Tr, the landslide object set is O_LS and the non-landslide object set is O_NLS. Using the collaborative optimal balance coefficient Be0 and the SMOTE algorithm, a new landslide object set O_{LS-1} is constructed, giving a new training set Tr_{-1} (containing O_{LS-1} and O_NLS).
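How SMOTE enlarges the landslide object set to the size fixed by Be0 can be illustrated with a simplified pure-Python sketch. It keeps the core idea of SMOTE (a synthetic sample is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours), but the base sample is drawn at random for each synthetic sample, which is a simplification. The function names, k = 5 and the fixed seed are assumptions for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def balanced_landslide_set(o_ls, be0, k=5, seed=0):
    """Return a new landslide set whose size is Be0 * |O_LS| (rounded)."""
    n_new = round(be0 * len(o_ls)) - len(o_ls)
    return list(o_ls) + smote(o_ls, n_new, k=k, seed=seed)
```

The new training set is then the returned landslide set together with the unchanged non-landslide set.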
In this embodiment, a new landslide object set is constructed according to the collaborative optimal balance coefficient to determine a new training set, so as to obtain the optimal class-balanced feature selection data set. This provides optimal class-balanced data and an optimized data set for landslide classification and thereby improves the precision of landslide remote sensing identification.
Optionally, the determining a feature subset according to the class-balanced feature selection data set and the collaborative optimal classification model parameter includes: after the optimal class-balanced feature selection data set is obtained, determining the feature subset according to the class-balanced feature selection data set and the collaborative optimal classification model parameters by adopting a varSelRF algorithm, which comprises: presetting the parameter ntree of the varSelRF algorithm to its default value, assigning the collaborative optimal classification model parameter ntree0 to the parameter ntreeIterat of the varSelRF algorithm, and determining the parameter mtryFactor of the varSelRF algorithm according to the collaborative optimal classification model parameter mtry0, so as to determine the feature subset.
Specifically, in this embodiment, the determining the feature subset according to the class-balanced feature selection data set and the collaborative optimal classification model parameter includes: after the optimal class-balanced feature selection data set is obtained, presetting the parameter ntree of the varSelRF algorithm to its default value, assigning the collaborative optimal classification model parameter ntree0 to the parameter ntreeIterat of the varSelRF algorithm, and determining the parameter mtryFactor of the varSelRF algorithm according to the collaborative optimal classification model parameter mtry0, so as to determine the feature subset.
Referring to fig. 3, the varSelRF method based on the collaborative optimal classification model parameters may take the following form: after the class-balanced feature selection data set is acquired, the parameter ntree of varSelRF is preset to its default value, and the collaborative optimal classification model parameter ntree0 is assigned directly to the parameter ntreeIterat. The parameter mtryFactor is defined relative to a quantity that appears only as an equation image in the original (DEST_PATH_IMAGE002, not recoverable here), and it is determined from the collaborative optimal classification model parameter mtry0:
mtryFactor = mtry0 / [equation image DEST_PATH_IMAGE002]
Thus, the selected feature subset, namely the robust sensitive feature subset after class balancing, can be obtained.
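The parameter hand-over from the joint optimization to varSelRF can be sketched as below. The denominator of the mtryFactor formula survives only as an equation image in the source, so the sketch makes an explicit assumption: it uses the square root of the number of features, because the varSelRF R package documents mtryFactor as the factor that multiplies sqrt(number of variables) to give mtry. The function names and the dictionary layout are hypothetical.

```python
import math


def mtry_factor(mtry0: int, n_features: int) -> float:
    """mtryFactor such that mtryFactor * sqrt(n_features) equals mtry0.

    Assumption: the denominator is sqrt(n_features); the source shows it
    only as an equation image.
    """
    if not 1 <= mtry0 <= n_features:
        raise ValueError("mtry0 must lie between 1 and the number of features")
    return mtry0 / math.sqrt(n_features)


def varselrf_settings(ntree0: int, mtry0: int, n_features: int) -> dict:
    """Collect the settings handed over from the joint optimisation."""
    return {
        "ntree": 5000,            # default size of the initial forest, as stated in the text
        "ntreeIterat": ntree0,    # co-optimal ntree0 used in every iteration
        "mtryFactor": mtry_factor(mtry0, n_features),
    }
```

Under this assumption, 16 features and mtry0 = 4 give mtryFactor = 1, that is, the usual square-root rule.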
In the embodiment, the characteristic subset is determined by setting the parameters of the varSelRF algorithm, and the optimal class-balanced data and the optimized data set are provided for landslide classification, so that the precision of remote sensing identification of landslide is improved.
Optionally, the determining the class-balanced robust sensitive feature subset according to the feature subset includes: determining a first feature subset and a second feature subset according to the feature subsets and a feature selection count threshold; performing a union operation on the specified features, in a specified proportion, of the first feature subset and the second feature subset, and performing an intersection operation on the remaining features, to determine a union and an intersection; and performing a union operation on the union and the intersection to determine the class-balanced robust sensitive feature subset.
Specifically, in this embodiment, determining the class-balanced robust sensitive feature subset according to the feature subset includes: determining a first feature subset and a second feature subset according to the feature subsets and a feature selection count threshold; performing a union operation on the specified features, in a specified proportion, of the first feature subset and of the second feature subset, and performing an intersection operation on the remaining features, to determine a union and an intersection; and performing a union operation on the union and the intersection to determine the class-balanced robust sensitive feature subset.
With reference to fig. 3, 20 feature selection data sets are used and divided into two groups. The feature selection count threshold method is executed on each group with the threshold set to 8 (that is, a feature is retained if it is selected in 80% of the data sets of its group), giving two feature subsets. A modified feature set union algorithm is then applied to them: a union operation is performed on the specified features of the two subsets, where the specified proportion is 80%; an intersection operation is performed on the remaining features; and a final union operation on the resulting union and intersection determines the class-balanced robust sensitive feature subset.
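The selection count threshold and the modified union can be sketched with plain set operations. Several details are assumptions made for illustration: a feature passes the threshold when it is selected in at least 8 runs of its group, each subset is ordered so that its leading 80% counts as the specified features, and selection frequency is used as the ordering because the text does not state how features are ranked at this step. All function names are hypothetical.

```python
from collections import Counter


def frequent_features(runs, threshold=8):
    """Keep features chosen in at least `threshold` runs, most frequent first."""
    counts = Counter(f for run in runs for f in set(run))
    kept = [f for f, c in counts.items() if c >= threshold]
    return sorted(kept, key=lambda f: (-counts[f], f))


def split_top(ranked, proportion):
    """Split a ranked list into its leading `proportion` and the remainder."""
    n_top = round(proportion * len(ranked))
    return set(ranked[:n_top]), set(ranked[n_top:])


def modified_union(first, second, proportion=0.8):
    """Union of the top features of both subsets plus the intersection of the rest."""
    top_a, rest_a = split_top(first, proportion)
    top_b, rest_b = split_top(second, proportion)
    return (top_a | top_b) | (rest_a & rest_b)
```

A low-ranked feature therefore survives only if both groups agree on it, while a high-ranked feature survives if either group selects it.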
In this embodiment, a union operation is performed on the most important features, in the specified proportion, of the two feature subsets; an intersection operation is performed on the remaining features; and a union operation is then performed on the resulting union and intersection to determine the class-balanced robust sensitive feature subset. This provides optimal class-balanced data and an optimized data set for landslide classification and thereby improves the precision of landslide remote sensing identification.
Another embodiment of the present invention provides a method for identifying a landslide in a complex background area, including: classifying landslides and non-landslides according to the landslide classification model established by the complex background region landslide classification model establishing method to determine landslide and non-landslide classification results of the research region; and determining a landslide boundary based on the landslide semi-automatic identification method and the classification result to realize landslide identification.
Specifically, in this embodiment, first, an algorithm model with high classification accuracy is constructed by using the post-class-balance sensitive feature combination and part of training data; then obtaining classification results of landslides and non-landslides in a research area; and then obtaining a landslide boundary based on a landslide semi-automatic identification method, and evaluating an identification result by adopting a Position mismatch value (PM) and a reference landslide boundary.
With reference to fig. 4, the semi-automatic landslide identification method is specifically as follows: first, islands enclosed by landslide objects are filled; next, isolated small patches far from the Qinggan River and from the landslide body are removed; finally, the envelope of the landslide objects is delineated by manual digitization, keeping the boundary as smooth as possible, drawing the landslide flanks along the downslope direction, closing C-shaped semi-enclosed boundaries, and ignoring patches that protrude from the flanks.
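The first step, filling islands enclosed by landslide objects, can be illustrated on a raster mask. The sketch assumes the classification result has been rasterized to a grid of 0 (non-landslide) and 1 (landslide) and that cells are 4-connected; the patent does not specify the data structure, so both are assumptions. Removing distant small patches and digitizing the envelope are manual or distance-based steps and are not covered here.

```python
from collections import deque


def fill_enclosed_holes(mask):
    """Set to 1 every background cell that cannot reach the grid border.

    mask: list of rows with 1 for landslide objects and 0 for background.
    """
    rows, cols = len(mask), len(mask[0])
    outside = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the flood fill with every background cell on the border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and mask[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))
    # Spread through 4-connected background cells.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and mask[nr][nc] == 0 and not outside[nr][nc]):
                outside[nr][nc] = True
                queue.append((nr, nc))
    # Background that was never reached is enclosed, so it is filled.
    return [[1 if mask[r][c] == 1 or not outside[r][c] else 0 for c in range(cols)]
            for r in range(rows)]
```

Background cells connected to the grid border are left unchanged, so only fully enclosed islands are filled.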
Optionally, the classifying landslide and non-landslide according to the landslide classification model includes: inputting a data set of a research area into the landslide classification model, wherein the data set is an object feature set of all terrain object primitives of the research area; and predicting the object characteristic set according to the landslide classification model to determine landslide and non-landslide classification results.
Another embodiment of the present invention provides a device for identifying a landslide in a complex background region, including a computer-readable storage medium storing a computer program and a processor, where the computer program is read by the processor and executed to implement the method for establishing a landslide classification model in a complex background region or the method for identifying a landslide in a complex background region.
Although the present disclosure has been described above, the scope of the present disclosure is not limited thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present disclosure, and these changes and modifications are intended to be within the scope of the present disclosure.

Claims (9)

1. A method for establishing a landslide classification model in a complex background area is characterized by comprising the following steps:
acquiring laser radar data of a research area;
constructing a terrain object according to the laser radar data, and determining a feature vector of the terrain object according to the terrain object to determine a data set;
jointly optimizing the classification model parameters and the balance coefficient according to the data set to determine the collaborative optimal balance coefficient and the collaborative optimal classification model parameters, specifically comprising: dividing the data set into a training set and a testing set by adopting ten-fold cross validation, wherein the training set comprises a landslide object set and a non-landslide object set; increasing landslide object samples in the landslide object set by utilizing an SMOTE algorithm to obtain a new landslide object set; adopting the new landslide object set and the non-landslide object set to construct a classification model, and classifying the test set; performing joint optimization on the balance coefficient and the classification model parameters by adopting ten-fold cross validation to determine the collaborative optimal balance coefficient and the collaborative optimal classification model parameters;
determining a robust sensitive feature subset after class balance according to the collaborative optimal balance coefficient and the collaborative optimal classification model parameter;
and establishing a landslide classification model according to the class-balanced robust sensitive feature subset.
2. The method of claim 1, wherein the constructing a terrain object from the lidar data and determining a terrain object feature vector from the terrain object to determine a dataset comprises:
extracting a pixel scale terrain feature vector, wherein the pixel scale terrain feature vector comprises terrain features, average texture features and slope texture features based on the terrain features and a gray level co-occurrence matrix, and filtering features based on the terrain features;
determining an image input layer from the pixel scale terrain feature vector, and carrying out image segmentation on the image input layer through a multi-scale segmentation algorithm and an optimal scale parameter obtained based on the POF (plateau objective function) method so as to construct the terrain object;
and determining the feature vector of the terrain object by adopting an object feature extraction method based on the pixel scale terrain feature vector and the elements of the terrain object, thereby determining the data set.
3. The method for establishing the complex background region landslide classification model according to claim 1, wherein the determining the class-balanced robust sensitive feature subset according to the co-optimal balance coefficient and the co-optimal classification model parameters comprises:
acquiring an optimal class-balanced feature selection data set according to the collaborative optimal balance coefficient;
determining a feature subset according to the class-balanced feature selection data set and the collaborative optimal classification model parameters;
and determining the robust sensitive feature subset after class balancing according to the feature subset.
4. The method for establishing the complex background region landslide classification model according to claim 3, wherein the obtaining an optimal class-balanced feature selection data set according to the collaborative optimal balance coefficient comprises:
and constructing a new landslide object set according to the collaborative optimal balance coefficient by adopting an SMOTE algorithm to determine a new training set so as to obtain the optimal feature selection data set after class balance.
5. The method for building a complex background region landslide classification model according to claim 3, wherein the determining a feature subset according to the class-balanced feature selection dataset and the collaborative optimal classification model parameters comprises:
after the optimal class-balanced feature selection data set is obtained, determining the feature subset according to the class-balanced feature selection data set and the collaborative optimal classification model parameters by adopting a varSelRF algorithm, which comprises: presetting the parameter ntree of the varSelRF algorithm to its default value, assigning the collaborative optimal classification model parameter ntree0 to the parameter ntreeIterat of the varSelRF algorithm, and determining the parameter mtryFactor of the varSelRF algorithm according to the collaborative optimal classification model parameter mtry0, so as to determine the feature subset.
6. The method for building the complex background region landslide classification model according to claim 3, wherein the determining the class-balanced robust sensitive feature subset according to the feature subset comprises:
determining a first feature subset and a second feature subset according to the feature subsets and a feature selection count threshold;
performing a union operation on the specified features of the first feature subset and the second feature subset in specified proportion, and performing an intersection operation on the rest features to determine a union and intersection;
performing a union operation on the union and intersection to determine the class-balanced robust sensitive feature subset.
7. A landslide identification method for a complex background area is characterized by comprising the following steps:
the landslide classification model built according to the complex background region landslide classification model building method of any one of claims 1 to 6 is used for conducting landslide and non-landslide classification to determine landslide and non-landslide classification results of a research region;
and determining a landslide boundary based on the landslide semi-automatic identification method and the classification result to realize landslide identification.
8. The complex background region landslide identification method of claim 7 wherein classifying landslides and non-landslides according to the landslide classification model comprises:
inputting a data set of a research area into the landslide classification model, wherein the data set is an object feature set of all terrain object primitives of the research area;
and predicting the object characteristic set according to the landslide classification model to determine landslide and non-landslide classification results.
9. A complex background region landslide identification apparatus comprising a computer readable storage medium storing a computer program and a processor, wherein the computer program is read and executed by the processor to implement the complex background region landslide classification model building method of any one of claims 1 to 6 or the complex background region landslide identification method of any one of claims 7 and 8.
CN202110093373.8A 2021-01-25 2021-01-25 Complex background region landslide classification model establishing and identifying method and device Active CN112418363B (en)

Publications (2)

Publication Number Publication Date
CN112418363A CN112418363A (en) 2021-02-26
CN112418363B true CN112418363B (en) 2021-05-04

Family

ID=74782496

Country Status (1)

Country Link
CN (1) CN112418363B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102819023B (en) * 2012-07-27 2014-09-17 中国地质大学(武汉) Method and system of landslide recognition of complicated geological background area based on LiDAR
WO2016158800A1 (en) * 2015-03-31 2016-10-06 三菱重工業株式会社 Route planning system, route planning method, article arrangement planning system, article arrangement planning method, decision-making support system, computer program, and storage medium
US10599700B2 (en) * 2015-08-24 2020-03-24 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for narrative detection and frame detection using generalized concepts and relations
JP7153330B2 (en) * 2018-01-22 2022-10-14 国立大学法人京都大学 Sediment disaster prediction device, computer program, sediment disaster prediction method and map information
CN110781825B (en) * 2019-10-25 2023-05-23 云南电网有限责任公司电力科学研究院 Power grid landslide area identification system and method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007316709A (en) * 2006-05-23 2007-12-06 Sabo Frontier Foundation Method of imparting sediment disaster dangerous site identification code, and program therefor
JP4459190B2 (en) * 2006-05-23 2010-04-28 財団法人砂防フロンティア整備推進機構 Sediment-related disaster hazard identification code management method and program
CN104820826A (en) * 2015-04-27 2015-08-05 重庆大学 Digital elevation model-based slope form extraction and recognition method
CN106845498A (en) * 2017-01-19 2017-06-13 南京理工大学 With reference to the single width mountain range remote sensing images landslide detection method of elevation
CN107067012A (en) * 2017-04-25 2017-08-18 中国科学院深海科学与工程研究所 Submarine geomorphy cell edges intelligent identification Method based on image procossing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant