CN118296961A - Dynamic landslide susceptibility evaluation method - Google Patents

Info

Publication number
CN118296961A
Authority
CN
China
Prior art keywords
landslide
model
evaluation
evaluation index
susceptibility
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410468295.9A
Other languages
Chinese (zh)
Inventor
戴小军
黄伟逸
肖江鸿
高天翔
魏鹏
刘宇涛
胡启军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Petroleum University
Original Assignee
Southwest Petroleum University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Petroleum University
Priority to CN202410468295.9A
Publication of CN118296961A
Legal status: Pending

Landscapes

  • Pit Excavations, Shoring, Fill Or Stabilisation Of Slopes (AREA)

Abstract

The invention discloses a dynamic landslide susceptibility evaluation method that combines machine learning with SBAS-InSAR technology. Taking into account both the choice of interval division method for the evaluation indexes and the selection of non-landslide negative samples, the method performs hyperparameter optimization on an XGBoost model, an SVM model and an RF model, constructs a landslide susceptibility evaluation model, and completes the landslide susceptibility evaluation of the unit area region. Ground deformation information is extracted with SBAS-InSAR technology from ascending-orbit Sentinel-1A image data of the unit area region and divided into 5 grades. A dynamic evaluation matrix is then established, and the landslide susceptibility zoning map produced by the best-performing evaluation model, the DBO-RF model, is combined with the SBAS-InSAR deformation information to complete the dynamic landslide susceptibility evaluation of the unit area region, providing a theoretical reference for landslide prediction and prevention.

Description

Dynamic landslide susceptibility evaluation method
Technical Field
The invention relates to the field of dynamic landslide evaluation, in particular to a dynamic landslide susceptibility evaluation method.
Background
Landslides are a widespread and highly destructive geological hazard that seriously threatens life and property. In recent years, human activities and global change have markedly increased the frequency of landslide disasters, causing great damage to social and economic development. A landslide is the process and phenomenon of deformation and failure of a slope rock-soil mass under the action of gravity and external factors (such as rainfall, water level, earthquakes and human engineering activities). Driven jointly by gravity and external forces, the slope rock-soil mass fractures; as the internal potential sliding surface develops and becomes continuous, external macroscopic deformation appears, and the slope finally loses stability and fails, forming a landslide. Prediction and risk assessment of landslides are therefore critical to reducing loss of life and property. Landslide susceptibility assessment (LSA) aims to predict the spatial distribution and likelihood of landslides. It identifies landslide-prone areas and is the basic step in assessing landslide hazard and developing mitigation strategies; decision makers, scientists, engineers and the public can use the assessment results to avoid catastrophic landslides. Research on landslide susceptibility evaluation is therefore of great importance.
Landslide susceptibility evaluation is crucial to avoiding and mitigating regional landslide disasters and is a comprehensive measure of the probability of landslide occurrence. In short, landslide susceptibility modeling is a landslide risk evaluation technique that uses geographic information systems and computer technology to obtain detailed and accurate spatial information, that is, to study whether landslide disasters are likely to occur in a region; it is an important tool for avoiding landslide-related losses. The method can identify high-risk landslide areas, supports government decision making and disaster prevention and mitigation measures, and can indirectly reduce the casualties and economic losses caused by landslides.
In theory and technology, existing landslide evaluation still has the following problems. (1) In constructing the landslide susceptibility evaluation index system, the processing of evaluation indexes carries high uncertainty, and the interval division methods for numerical evaluation indexes are strongly subjective and not widely representative. (2) Non-landslide negative samples are important data in landslide susceptibility modeling; sample bias may arise when selecting them, and further research is needed to make negative sample selection more reasonable. (3) Machine learning modeling is a key part of landslide susceptibility evaluation, and the choice of model hyperparameters is of great importance; how to improve the accuracy of the susceptibility prediction through model optimization still needs further study. (4) Landslide occurrence is a dynamically changing geological phenomenon; how to evaluate landslide susceptibility dynamically by combining the temporal dynamics and spatial characteristics of the evaluation indexes remains an open problem.
Disclosure of Invention
In order to overcome the defects and shortcomings in the prior art, the invention provides a dynamic landslide susceptibility evaluation method.
The method comprises the following steps:
S1, obtaining landslide information; landslide inventory information of the unit area region is obtained by technical means and from statistical data, the landslide hazard-formation mechanism and hazard-formation environment of the unit area region are comprehensively considered, the landslide distribution characteristics of the unit area region are analyzed in terms of hazard type and distribution pattern, and 10 evaluation indexes in 5 categories are extracted;
S2, constructing an evaluation index system; the numerical evaluation indexes are divided into intervals using the equal interval method, the standard deviation method, the quantile method, the geometric interval method, the natural breaks method and the ChiMerge discretization method respectively; a geographic detector is used to calculate the q value of each evaluation index under each interval division method, and the interval division method with the maximum q value is selected, achieving reasonable interval division of the evaluation indexes; the evaluation indexes that pass the multicollinearity test are selected, the regional landslide susceptibility is preliminarily predicted with a weighted information quantity model, and non-landslide points equal in number to the landslide points are randomly selected in the very low susceptibility zone of the weighted information quantity model as non-landslide negative samples, so that negative sample selection satisfies both randomness and low susceptibility;
S3, constructing a machine learning model; the historical landslide positive samples and the non-landslide negative samples are integrated into a training sample set for landslide susceptibility modeling of the unit area region; the dung beetle optimizer (DBO) algorithm is introduced to optimize the parameters of an XGBoost model, a support vector machine (SVM) model and a random forest (RF) model, and the optimal parameters are selected to establish machine learning landslide susceptibility prediction models; the landslide susceptibility evaluation of the unit area region is completed, a landslide susceptibility zoning map is drawn, and zoning statistics and model accuracy evaluation are performed; the results of the XGBoost, support vector machine and random forest models are compared in 3 respects, namely the confusion matrix, statistical parameters and the receiver operating characteristic (ROC) curve, and the accuracy of the landslide susceptibility prediction models of the unit area region is analyzed;
S4, dynamically evaluating landslide susceptibility; based on ascending-orbit Sentinel-1A image data of the unit area region, surface deformation information is extracted using SBAS-InSAR technology, and the spatial distribution characteristics of landslide hazards in the unit area region are analyzed in combination with the deformation information; the application and performance of the landslide susceptibility prediction results of the DBO-XGBoost, DBO-SVM and DBO-RF models in the unit area region are compared, and a dynamic landslide susceptibility evaluation matrix for the unit area region is established by combining the landslide susceptibility result map obtained by the DBO-RF model with the SBAS-InSAR deformation information, so as to evaluate the landslide susceptibility of the unit area region dynamically.
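The three comparison criteria named in step S3 (confusion matrix, statistical parameters, ROC curve) can be illustrated with a minimal sketch. The choice of accuracy, precision and recall as the "statistical parameters", the 0.5 threshold, and the sample data are assumptions for illustration; the source does not specify them.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summary_metrics(tp, fp, tn, fn):
    """Accuracy, precision and recall derived from the confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def roc_auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = landslide) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
counts = confusion_counts(y_true, y_prob)
metrics = summary_metrics(*counts)
auc = roc_auc(y_true, y_prob)
```

Running the same three functions on the test-set predictions of each candidate model would give one comparable row per model.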
Further, the 5 categories are topography and geomorphology, geological structure, meteorology and hydrology, environmental conditions and human activity; the 10 evaluation indexes are elevation, slope, aspect, topographic relief, lithology, distance to river, land use, distance to road, POI kernel density and human footprint index.
Further, the equal interval method divides the value range of an evaluation index into several sub-ranges of equal width; the equal-interval division of the evaluation index raster layers is performed with the Reclassify tool in ArcGIS Pro, and the number of intervals is set to 6 based on subjective experience;
the natural breaks method classifies the intervals of the evaluation indexes automatically; the natural-breaks division of the evaluation index raster layers is performed with the Reclassify tool in ArcGIS Pro, and the number of intervals is set to 6 based on subjective experience;
the quantile method uses the Reclassify tool in ArcGIS Pro to perform the quantile division of the evaluation index raster layers, and the number of intervals is set to 6 based on subjective experience;
the geometric interval method uses the Reclassify tool in ArcGIS Pro to perform the geometric-interval division of the evaluation index raster layers, and the number of intervals is set to 6 based on subjective experience;
the standard deviation method uses the Reclassify tool in ArcGIS Pro to perform the standard-deviation division of the evaluation index raster layers, and the interval size is set to 1 standard deviation;
the ChiMerge discretization method divides the numerical evaluation indexes into intervals: in IBM SPSS Statistics 26.0.0, the continuous evaluation indexes of elevation, slope and topographic relief are discretized by ChiMerge using the chi-square binning tool, with the threshold set to 0.05.
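To make the difference between two of the interval division methods concrete, the following is a minimal pure-Python sketch of equal-interval and quantile classification into 6 classes. It is not the ArcGIS Pro implementation; the function names and the elevation samples are hypothetical.

```python
from bisect import bisect_right


def equal_interval_breaks(values, k=6):
    """Return k-1 interior break points splitting [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * j for j in range(1, k)]


def quantile_breaks(values, k=6):
    """Return k-1 interior break points so each bin holds roughly the same number of cells."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * j) // k] for j in range(1, k)]


def classify(values, breaks):
    """Map each value to a class label in 1..len(breaks)+1."""
    return [bisect_right(breaks, v) + 1 for v in values]


# Hypothetical elevation samples in metres (illustrative only).
elevation = [120, 180, 260, 340, 420, 510, 640, 780, 910, 1050, 1400, 2000]
eq_classes = classify(elevation, equal_interval_breaks(elevation))
qt_classes = classify(elevation, quantile_breaks(elevation))
```

On skewed data such as this, equal-interval classes crowd most cells into the lowest bins, while quantile classes stay balanced; this is why the choice of method can change how well an index explains landslide occurrence.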
Further, the geographic detector reveals driving forces by measuring the spatial stratified heterogeneity of the evaluation indexes; its differentiation and factor detector is selected to measure how well each evaluation index explains landslide occurrence, with the expression:
$$q = 1 - \frac{\sum_{i=1}^{m} N_i \sigma_i^2}{N \sigma^2} = 1 - \frac{SL}{ST}$$
where $i$ denotes the stratum (class) index of the landslide evaluation index, $i = 1, 2, \dots, m$; $N_i$ denotes the number of units in the $i$-th stratum of the landslide evaluation index; $N$ denotes the number of units in the whole region; $\sigma_i^2$ denotes the variance of the landslide $Y$ values in the $i$-th stratum; $\sigma^2$ denotes the variance of the landslide $Y$ values over the whole region; $SL = \sum_{i=1}^{m} N_i \sigma_i^2$ denotes the sum of the within-stratum variances; and $ST = N \sigma^2$ denotes the total variance of the whole region.
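As a minimal sketch of the factor detector, the q value can be computed from the stratum sizes and variances defined above, assuming the standard form q = 1 - sum(N_i * var_i) / (N * var). The use of population variance and the sample data are assumptions; geographic detector software may differ in detail.

```python
from statistics import pvariance


def q_statistic(strata, y):
    """Factor-detector q: 1 - sum(N_i * var_i) / (N * var), using population variance."""
    groups = {}
    for stratum, value in zip(strata, y):
        groups.setdefault(stratum, []).append(value)
    total_var = pvariance(y)
    if total_var == 0:
        raise ValueError("Y has zero variance; q is undefined")
    within = sum(len(g) * pvariance(g) for g in groups.values())
    return 1 - within / (len(y) * total_var)


# Hypothetical data: Y = 1 where a landslide occurred, strata = class of one index.
y = [0, 0, 0, 1, 1, 1]
q_perfect = q_statistic([1, 1, 1, 2, 2, 2], y)  # classes separate Y completely
q_weak = q_statistic([1, 2, 1, 2, 1, 2], y)     # classes barely explain Y
```

Computing q for the same index under each interval division method and keeping the method with the largest q is the selection rule the text describes.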
Further, the multicollinearity test is used to detect linear or nonlinear relationships between variables; when TOL > 0.1 and VIF < 10, there is no multicollinearity among the evaluation indexes. The calculation expressions are:
$$TOL = 1 - R_i^2, \qquad VIF = \frac{1}{TOL} = \frac{1}{1 - R_i^2}$$
where $R_i^2$ denotes the coefficient of determination, TOL denotes the tolerance, and VIF denotes the variance inflation factor;
The VIF and TOL of the landslide susceptibility evaluation indexes of the unit area region are calculated with the collinearity diagnostics tool in the linear regression analysis of IBM SPSS Statistics 26.0.0, and evaluation indexes with high collinearity are eliminated; the 3 nominal evaluation indexes and the 7 evaluation indexes discretized with the optimal interval division method obtained from the geographic detector model show no strong collinearity, and they are used as input variables for constructing the landslide susceptibility prediction model.
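The screening rule (TOL > 0.1 and VIF < 10) can be sketched as follows, using the standard relationships TOL = 1 - R² and VIF = 1 / TOL, where R² comes from regressing one index on all the others. The R² values and index names below are hypothetical; in the source they are produced by SPSS.

```python
def tol_vif(r_squared):
    """Return (tolerance, variance inflation factor) for one index."""
    tol = 1.0 - r_squared
    vif = float("inf") if tol == 0 else 1.0 / tol
    return tol, vif


def passes_collinearity_test(r_squared, tol_min=0.1, vif_max=10.0):
    """Apply the TOL > 0.1 and VIF < 10 screening rule."""
    tol, vif = tol_vif(r_squared)
    return tol > tol_min and vif < vif_max


# Hypothetical R^2 of each index regressed on the remaining indexes.
r2_by_index = {"elevation": 0.50, "slope": 0.35, "redundant_index": 0.95}
kept = [name for name, r2 in r2_by_index.items() if passes_collinearity_test(r2)]
```

Because VIF is the reciprocal of TOL, the two thresholds describe the same cut-off (R² below 0.9) and always agree.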
Further, the information quantity model is calculated as:
$$I(y, x_1, x_2, \dots, x_n) = \log \frac{P(y \mid x_1, x_2, \dots, x_n)}{P(y)} = I(y, x_1) + I_{x_1}(y, x_2) + \dots + I_{x_1, x_2, \dots, x_{n-1}}(y, x_n)$$
where $I(y, x_1, x_2, \dots, x_n)$ denotes the information quantity that the evaluation index combination $x_1, x_2, \dots, x_n$ provides for geological hazards; $P(y \mid x_1, x_2, \dots, x_n)$ denotes the probability of landslide hazard occurrence under the evaluation index combination $x_1, x_2, \dots, x_n$; $P(y)$ denotes the probability of landslide hazard occurrence; and $I_{x_1}(y, x_2)$ denotes the information quantity that evaluation index $x_2$ provides for the landslide susceptibility evaluation result when evaluation index $x_1$ is present;
when $I(y, x_1, x_2, \dots, x_n) > 0$, the combination of factors favors landslide hazard occurrence; conversely, when $I(y, x_1, x_2, \dots, x_n) < 0$, the combination of factors does not favor landslide hazard occurrence;
the information value that each state class of each evaluation index provides for landslide hazards is calculated as:
$$I(x_i, D) = \log \frac{P(x_i, D)}{P(x_i)}$$
where $P(x_i, D)$ denotes the probability of occurrence in a given state class of the evaluation index $x_i$, and $P(x_i)$ denotes the landslide hazard distribution probability;
the information quantity of each evaluation index is calculated as:
$$I(x_i, D) = \log \frac{N_i / N}{S_i / S}$$
where $N_i$ denotes the number of geological hazards that have developed in a given class state ($x_i$) of an evaluation index; $N$ denotes the total number of geological hazards in the whole unit area region; $S_i$ denotes the number of grid cells in a given class state ($x_i$) of an evaluation index; and $S$ denotes the number of grid cells in the whole unit area region;
the total information value of each grid cell is calculated as:
$$I = \sum_{i=1}^{n} I(x_i, D) = \sum_{i=1}^{n} \log \frac{N_i / N}{S_i / S}$$
where $n$ denotes the number of evaluation indexes involved in the landslide susceptibility evaluation.
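As a minimal sketch of the per-class and per-cell calculation, assume the commonly used form I = ln((N_i / N) / (S_i / S)) for one class and a plain sum over indexes for one grid cell. The natural logarithm, the unweighted sum and all counts below are assumptions for illustration; the weighted variant used for negative sampling would multiply each term by an index weight.

```python
import math


def information_value(n_i, n_total, s_i, s_total):
    """I = ln((N_i / N) / (S_i / S)); the natural log is an assumption."""
    if n_i == 0:
        return float("-inf")  # no landslides recorded in this class
    return math.log((n_i / n_total) / (s_i / s_total))


def total_information(cell_classes, info_tables):
    """Sum the per-index information values for one grid cell."""
    return sum(info_tables[name][cls] for name, cls in cell_classes.items())


# Hypothetical counts: 50 landslides over 10000 grid cells.
N, S = 50, 10000
info_tables = {
    "slope": {1: information_value(5, N, 4000, S), 2: information_value(45, N, 6000, S)},
    "lithology": {1: information_value(20, N, 5000, S), 2: information_value(30, N, 5000, S)},
}
cell_total = total_information({"slope": 2, "lithology": 2}, info_tables)
```

A class holding 90% of the landslides but only 60% of the cells gets a positive value, so cells in that class are pushed toward higher susceptibility.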
Further, the non-landslide negative samples are obtained as follows: the landslide susceptibility of the unit area region is preliminarily predicted with the weighted information quantity model, and non-landslide sample points equal in number to the landslide samples are randomly selected within the very low susceptibility zone of the prediction result; the landslide susceptibility evaluation of the unit area region is carried out with machine learning models based on grid units; using the susceptibility zoning result of the weighted information quantity model, the non-landslide points equal in number to the landslide points are randomly generated with the random point generation tool in ArcGIS Pro 3.0.
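The negative-sample rule (random draw, restricted to the very low susceptibility zone, same count as the landslide points) can be sketched in a few lines. The grid layout, class labels and function name are hypothetical; in the source this step is carried out in ArcGIS Pro.

```python
import random


def sample_negative_cells(susceptibility, landslide_cells, n_samples, seed=0):
    """Randomly draw non-landslide cells from the 'very low' susceptibility class."""
    candidates = [
        cell for cell, level in susceptibility.items()
        if level == "very low" and cell not in landslide_cells
    ]
    if len(candidates) < n_samples:
        raise ValueError("not enough very-low-susceptibility cells")
    return random.Random(seed).sample(candidates, n_samples)


# Hypothetical 10 x 10 grid with five susceptibility classes.
levels = ["very low", "low", "moderate", "high", "very high"]
susceptibility = {(r, c): levels[(r + c) % 5] for r in range(10) for c in range(10)}
landslide_cells = {(0, 4), (1, 3), (2, 2), (4, 4)}  # hypothetical positive samples
negatives = sample_negative_cells(susceptibility, landslide_cells, len(landslide_cells))
```

Fixing the seed keeps the draw reproducible while still being random with respect to location.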
Further, for the XGBoost model, a landslide susceptibility evaluation model based on XGBoost is established with randomly selected training set data, and its prediction accuracy is verified with test set data; the dung beetle algorithm is used to optimize the n_estimators and max_depth parameters of the model to improve its accuracy, where n_estimators is the number of trees and max_depth is the maximum depth of each tree; every tree in the XGBoost model has a depth limit, and limiting tree depth prevents overfitting; when optimizing the XGBoost model with the dung beetle algorithm, the number of iterations is set to 50, the model accuracy obtained at each iteration is calculated, and the XGBoost model under the optimal parameter combination is obtained after the iterations are complete.
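The optimization loop described here (50 iterations, accuracy evaluated at each one, best parameter pair kept) can be illustrated with a deliberately simplified stand-in. The sketch below uses plain random search, not the dung beetle algorithm, and `toy_accuracy` is a hypothetical objective standing in for "train the model with these parameters and return its test accuracy"; the search ranges are also assumptions.

```python
import random


def toy_accuracy(n_estimators, max_depth):
    """Hypothetical stand-in for training a model and returning test accuracy."""
    return 0.9 - 1e-6 * (n_estimators - 300) ** 2 - 0.002 * (max_depth - 6) ** 2


def search_hyperparameters(objective, n_iter=50, seed=0):
    """Random search (NOT the dung beetle algorithm): keep the best-scoring pair."""
    rng = random.Random(seed)
    best_params, best_score, history = None, float("-inf"), []
    for _ in range(n_iter):
        params = (rng.randint(50, 500), rng.randint(2, 12))  # assumed search ranges
        score = objective(*params)
        if score > best_score:
            best_params, best_score = params, score
        history.append(best_score)  # best accuracy seen so far
    return best_params, best_score, history


best_params, best_score, history = search_hyperparameters(toy_accuracy)
```

A population-based optimizer such as the dung beetle algorithm replaces the independent random draws with position updates that depend on the current best candidates, but the surrounding evaluate-and-keep-best loop has the same shape.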
Further, for the support vector machine model, a landslide susceptibility evaluation model based on the support vector machine is established with randomly selected training set data, and its prediction accuracy is verified with test set data; the dung beetle algorithm is used to optimize the C and gamma parameters of the model to improve its accuracy; C is the penalty coefficient and determines how tolerant the model is of samples that violate the margin: when C is small, the model penalizes misclassified points only lightly; when C is large, the model reduces the number of misclassified samples and the decision boundary has a smaller margin; the gamma parameter controls how the data are distributed after being mapped to the high-dimensional space: the fit of the model to the training data increases as gamma increases, and an excessively large gamma causes the model to overfit.
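The effect of gamma can be seen directly in the kernel function. The sketch below assumes a radial basis function (RBF) kernel, K(x, z) = exp(-gamma * ||x - z||^2), which is the usual kernel when a gamma parameter is tuned; the source does not name the kernel, and the sample points are hypothetical.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Two hypothetical feature vectors with squared distance 2.
x, z = (0.0, 0.0), (1.0, 1.0)
similarity_small_gamma = rbf_kernel(x, z, gamma=0.1)  # points still look similar
similarity_large_gamma = rbf_kernel(x, z, gamma=10.0)  # points look unrelated
```

With a large gamma each training sample influences only its immediate neighbourhood, so the decision boundary can wrap tightly around individual samples, which is the overfitting behaviour the text warns about.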
Further, for the random forest model, a landslide susceptibility evaluation model based on the random forest is established with randomly selected training set data, and its prediction accuracy is verified with test set data; the dung beetle algorithm is used to optimize the n_estimators and max_depth parameters of the model to improve its accuracy; when optimizing the random forest model with the dung beetle algorithm, the number of iterations is set to 50 and the model accuracy obtained at each iteration is calculated; after the iterations are complete, the random forest model under the optimal parameter combination is obtained and the model is rebuilt with the optimal parameters; after model training is complete, the full data set of the unit area region is input to the model, the landslide susceptibility evaluation of the unit area region is carried out, and the prediction result is output, finally giving the landslide occurrence probability of every grid cell in the unit area region; the closer the predicted value of a grid cell is to 0, the more stable the cell and the smaller the probability of landslide occurrence; conversely, the closer the predicted value is to 1, the more unstable the cell and the greater the probability of landslide occurrence.
The beneficial effects are that:
The invention provides a dynamic landslide susceptibility evaluation method that combines machine learning with SBAS-InSAR technology. Taking into account both the choice of interval division method for the evaluation indexes and the selection of non-landslide negative samples, it uses the dung beetle algorithm to optimize the hyperparameters of the machine learning models, constructs a landslide susceptibility evaluation model, completes the landslide susceptibility evaluation of the unit area region, establishes a dynamic evaluation matrix, and combines the SBAS-InSAR deformation information with the landslide susceptibility zoning map to complete the dynamic landslide susceptibility evaluation of the unit area region. By combining different data discretization methods (the equal interval method, standard deviation method, quantile method, geometric interval method, natural breaks method and ChiMerge discretization method) with the geographic detector model, the invention achieves optimal discretization of the evaluation indexes used in landslide susceptibility prediction and thereby improves the regional applicability of the interval division method. To address the strong subjectivity of discretizing numerical evaluation indexes, the q values of the grouping results produced by the different division methods are calculated with the geographic detector model. The invention also provides a new method for selecting non-landslide negative samples that ensures their randomness and low susceptibility, thereby improving the accuracy and reliability of the landslide susceptibility prediction model.
To address the problem that conventional negative sampling methods do not adequately consider the randomness and low susceptibility of the samples, the invention provides a negative sample selection method based on a weighted information quantity model: after a suitable interval division method for the evaluation indexes has been selected with the geographic detector, the information quantity weight of each index is calculated by combining the information quantity method with the analytic hierarchy process, a landslide susceptibility map of the unit area region is produced with the weighted information quantity model, and non-landslide negative sample points equal in number to the landslide samples are selected from the very low susceptibility zone of that map. The invention applies the dung beetle algorithm to machine learning landslide susceptibility evaluation, using it to optimize the hyperparameters of the XGBoost, SVM and RF models and thereby complete the construction of the landslide susceptibility evaluation model for the unit area region. Based on ascending-orbit Sentinel-1A image data of the unit area region, surface deformation information is extracted with SBAS-InSAR technology and divided into 5 grades; the landslide susceptibility result map obtained by the optimal prediction model, the DBO-RF model, is combined with the SBAS-InSAR deformation information, and the dynamic landslide susceptibility evaluation of the unit area region is completed on the basis of the dynamic evaluation matrix. This provides a theoretical reference for landslide hazard prediction and prevention; the method has high calculation accuracy and helps to prevent landslide accidents.
Drawings
FIG. 1 is a flow chart of the general steps of the present invention;
FIG. 2 is a flow chart of the analytic hierarchy process of the present invention;
FIG. 3 is a block diagram of an analytic hierarchy process model of the present invention;
FIG. 4 is a flow chart of the XGBoost model of the present invention;
FIG. 5 is a flow chart of the support vector machine model of the present invention;
FIG. 6 is a flow chart of the random forest model of the present invention.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other, and the present application will be further described in detail with reference to the drawings and the specific embodiments.
The dynamic landslide susceptibility evaluation method combines machine learning with SBAS-InSAR technology and addresses the construction of the landslide susceptibility prediction index system, the negative sample selection method and machine learning model optimization. (1) To address the problem that index interval division in landslide susceptibility evaluation of the unit area region is not widely representative, a geographic detector is used to achieve reasonable interval division of the evaluation indexes. (2) To address unreasonable extraction of non-landslide negative samples, a negative sample selection method based on a weighted information quantity model is proposed, providing sample data that meets the requirements of landslide susceptibility modeling. (3) To address hyperparameter selection for the machine learning models in landslide susceptibility evaluation, the dung beetle algorithm is used to search for the optimal parameters automatically, improving the prediction accuracy of the machine learning models.
As shown in fig. 1, a dynamic landslide susceptibility evaluation method includes:
S1, obtaining landslide information; landslide inventory information of the unit area region is obtained by technical means, from statistical data and by other means; the landslide hazard-formation mechanism and hazard-formation environment of the unit area region are comprehensively considered, the landslide distribution characteristics of the unit area region are analyzed in terms of hazard type and distribution pattern, and 10 evaluation indexes in 5 categories (topography and geomorphology, geological structure, meteorology and hydrology, environmental conditions and human activity) are extracted, namely elevation, slope, aspect, topographic relief, lithology, distance to river, land use, distance to road, POI kernel density and human footprint index.
S2, constructing an evaluation index system; to find an interval division method for landslide susceptibility evaluation indexes that suits the unit area region, the numerical evaluation indexes are divided into intervals using the equal interval method, the standard deviation method, the quantile method, the geometric interval method, the natural breaks method and the ChiMerge discretization method respectively (nominal evaluation indexes can be converted directly into discrete values and are not discussed); a geographic detector is used to calculate the q value of each evaluation index under each interval division method, and the interval division method with the maximum q value is selected, achieving reasonable interval division of the evaluation indexes; the evaluation indexes that pass the multicollinearity test are selected, the regional landslide susceptibility is preliminarily predicted with the weighted information quantity model, and non-landslide points equal in number to the landslide points are randomly selected in the very low susceptibility zone of the weighted information quantity model as non-landslide negative samples, so that negative sample selection satisfies both randomness and low susceptibility.
S3, constructing a machine learning model; the historical landslide positive samples and the non-landslide negative samples are integrated into a training sample set for landslide susceptibility modeling of the unit area region; the dung beetle algorithm is introduced to optimize the parameters of the XGBoost model, the support vector machine model and the random forest model, and the optimal parameters are selected to establish machine learning landslide susceptibility evaluation models; the landslide susceptibility evaluation of the unit area region is completed, a landslide susceptibility zoning map is drawn, and zoning statistics and model accuracy evaluation are performed; the results of the XGBoost, support vector machine and random forest models are compared in 3 respects, namely the confusion matrix, statistical parameters and the receiver operating characteristic (ROC) curve, and the accuracy of the landslide susceptibility evaluation models of the unit area region is analyzed.
S4, dynamically evaluating landslide susceptibility; based on ascending-orbit Sentinel-1A image data of the unit area region from January 2021 to December 2023, surface deformation information is extracted using SBAS-InSAR technology, and the spatial distribution characteristics of landslide hazards in the unit area region are analyzed in combination with the deformation information; the application and performance of the landslide susceptibility evaluation results of the DBO-XGBoost, DBO-SVM and DBO-RF models in the unit area region are compared, and a dynamic landslide susceptibility evaluation matrix for the unit area region is established by combining the landslide susceptibility result map obtained by the optimal evaluation model, the DBO-RF model, with the SBAS-InSAR deformation information, so as to carry out the dynamic landslide susceptibility evaluation of the unit area region and provide a theoretical reference for landslide hazard prediction and prevention.
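The idea of a dynamic evaluation matrix in step S4 is a lookup that combines a static susceptibility level with a deformation grade. The sketch below is purely illustrative: the text does not give the matrix entries or the deformation thresholds, so the 5 x 5 table here (which simply takes the higher of the two levels) is a hypothetical placeholder, as are all names.

```python
LEVELS = ["very low", "low", "moderate", "high", "very high"]

# HYPOTHETICAL matrix: rows are the static susceptibility level (0..4),
# columns are the SBAS-InSAR deformation grade (0..4). Entries are
# illustrative only and are not taken from the source.
DYNAMIC_MATRIX = [[max(s, d) for d in range(5)] for s in range(5)]


def dynamic_level(susceptibility_level, deformation_grade):
    """Look up the dynamic susceptibility class for one grid cell."""
    return LEVELS[DYNAMIC_MATRIX[susceptibility_level][deformation_grade]]


# A cell mapped as 'low' susceptibility but showing the strongest deformation grade.
upgraded = dynamic_level(1, 4)
# A cell mapped as 'high' susceptibility with negligible deformation.
unchanged = dynamic_level(3, 0)
```

Whatever entries are chosen, keeping the rule in an explicit table makes it easy to audit how deformation evidence changes the static map.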
The unit area region is Fengjie County in the Three Gorges Reservoir area. It has a subtropical warm and humid monsoon climate with abundant rainfall; the mean annual precipitation is 1132 mm. It lies at the junction of the eastern mountains of the Sichuan Basin, the front edge of the Daba Mountains and the western Hubei mountains. The surface water system in the area is well developed; years of fluvial erosion and deposition have formed multi-level river terraces and deeply incised canyons, and the strata are mainly Quaternary, Triassic and Jurassic. The unit area region is the heart of the Three Gorges Reservoir area of the Yangtze River, the Wushan county, the south-to-Yunyang county, the West Lian Mo state region, the Yichang city, the northwest, the Hubei province. It extends from 109°1′17″ to 109°45′58″ east longitude and from 30°29′19″ to 31°22′33″ north latitude, with an area of about 4087 square kilometers. The geological structure is complex, with densely developed synclines and anticlines. The terrain is high in the southeast and northeast and somewhat gentler in the central and western parts; the north is the southern foot of the Daba Mountains, and the east and south are the Wushan and Qiyao Mountains. The county seat lies in the center of the administrative area, wherein Du Fuzhen, Yongan town is the seat of the county government.
Selection of the mapping unit for the unit area region. The landslide susceptibility evaluation unit is the smallest indivisible unit in susceptibility evaluation; during evaluation each unit is abstracted into a feature vector that is input to the machine learning model, so the choice of evaluation unit directly affects the model's evaluation result. At present, the relatively mature landslide susceptibility evaluation units are mainly the regional unit, the uniform condition unit, the slope unit and the grid unit.
(1) The regional unit is an evaluation unit based on association and difference existing in natural geographic environment, and is determined by naturally formed regional boundary lines reflecting topography, geomorphology and geological properties. The regional units are commonly used for land resource investigation and land feature investigation, and the use of the regional units in landslide susceptibility assessment can more clearly reflect the relationship between land features and landslide occurrence. The regional units depend on a large amount of terrain and geological data and expert knowledge, and the arrangement is subjective and is more suitable for large-area macroscopic landslide susceptibility evaluation.
(2) The uniform condition unit does not depend on prior knowledge. It is built directly from the spatial overlay of the evaluation indexes, and each group produced by the overlay is taken as an evaluation unit. The division into uniform condition units keeps each evaluation index independent, and the final evaluation units are the full set of combinations of the different values of each evaluation index, so the method suits fine-scale landslide susceptibility evaluation of small areas with high data precision. However, because the division of each evaluation index is itself a choice, the uniform condition unit carries a certain subjectivity in practical use.
(3) The slope unit is created by extracting ridge lines and valley lines from high-precision DEM data or contour lines and dividing the area into individual slope faces accordingly. The slope unit is consistent with terrain and geological features and is suitable for deterministic modeling of small areas. However, it requires high-quality terrain data and manual correction, and its delineation cannot be fully automated.
(4) The grid unit is a regular grid, and the unit area is divided into regular grids with equal size. The grid unit has the advantages of simple data structure, flexible sampling, high calculation speed and high data availability. However, compared with the first three methods, the data generated by dividing the grid units is more, and is insufficient to intuitively reflect the overall situation of the terrain and the topography.
The landslide evaluation index space database is constructed by collecting evaluation index data related to landslide occurrence such as topography, geological structure, meteorological hydrology, environmental conditions, human activities and the like of a unit area region. The topographic and geomorphic data comprises elevation, gradient, slope direction and topographic relief indexes; the geological structure comprises lithology indexes; the meteorological hydrology comprises a distance index from the river; environmental conditions include land use indicators; human activities include distance from the road, POI kernel density, and human footprint index.
ArcGIS Pro 3.0 software is used to extract and preprocess the data and obtain the evaluation index raster layers. All layers are converted to the WGS-84 UTM zone 49N projected coordinate system in ArcGIS Pro 3.0, and the evaluation index layers are overlaid with the landslide inventory raster layer so that historical landslide disasters and evaluation indexes can be queried in both directions.
The index factors are graded according to a given standard; grading an index factor means dividing a single evaluation index into several state classes. The evaluation indexes comprise nominal (noun-type) evaluation indexes and numerical evaluation indexes. A nominal evaluation index takes discrete, non-continuous values, as the land use type does; the nominal evaluation indexes of the invention are lithology, slope direction, and land use. A numerical evaluation index is represented by continuous values, such as continuously varying elevation data; the numerical evaluation indexes of the invention are elevation, gradient, topographic relief, distance from the river, distance from the road, POI kernel density, and human footprint index.
Descriptive text cannot be used directly in a landslide susceptibility evaluation model, so the nominal evaluation indexes must be converted into discrete numerical values. The invention standardizes the nominal evaluation indexes by representing each descriptive category with a number.
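The conversion of descriptive categories into numbers can be sketched as follows. This is a minimal illustration only: the function name `encode_nominal` and the land use category names are hypothetical, since the source does not list the actual classes or codes.

```python
def encode_nominal(values):
    """Map each distinct descriptive label to an integer code (1, 2, 3, ...).

    Codes are assigned in order of first appearance; the actual coding
    scheme used in the source is not stated, so this order is an assumption.
    """
    codes = {}
    encoded = []
    for label in values:
        if label not in codes:
            codes[label] = len(codes) + 1
        encoded.append(codes[label])
    return encoded, codes


# Hypothetical land use labels for five grid cells.
land_use = ["forest", "cropland", "forest", "water", "cropland"]
encoded, mapping = encode_nominal(land_use)
print(encoded)
print(mapping)
```

The returned mapping should be stored with the layer so that the numeric classes can be traced back to their descriptive names.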
Establishing the evaluation index system requires discretization (interval division) of the numerical evaluation indexes such as elevation, gradient, and topographic relief. Discretizing the numerical evaluation indexes has the following advantages: ① many machine learning models (such as XGBoost and random forests) cannot directly process continuous variables, and discretizing continuous data reduces this limitation in the modeling process; ② discretizing continuous data improves the discriminating power of the machine learning model and thereby its prediction capability; ③ discretization simplifies the data structure of the evaluation indexes, reduces the influence of outliers, reduces dependence on data distribution assumptions, and improves calculation efficiency.
The discretization treatment is an important step in constructing the evaluation index system and directly influences machine-learning landslide susceptibility evaluation. At present, most studies discretize continuous evaluation indexes by interval division with the equidistant method or the natural break point method. This is simple and convenient but is not necessarily suitable for every unit area. To address this problem, the method divides the 7 continuous evaluation indexes into intervals using the classification methods in ArcGIS Pro 3.0 software (equidistant classification method, standard deviation method, quantile method, geometric interval method, and natural break point method) and the ChiMerge discretization method. Slope direction, lithology, and land use keep the same data under every classification method and are therefore not discussed separately. The interval division results are loaded into the geographic detector model, the q values are tabulated, and for each evaluation index the classification method with the largest q value is selected for interval division and landslide susceptibility evaluation modeling.
The equidistant classification method divides the value range of an evaluation index into sub-ranges of equal width. The reclassification tool in ArcGIS Pro software is used to divide the evaluation index raster layer by the equidistant classification method, and the number of intervals is set to 6 based on subjective experience.
The natural break point method first sorts all the data of an evaluation index and uses a statistical approach to find the "break points" between samples. Classification is completed according to the between-class and within-class errors, so that attribute differences between intervals are maximized and differences within each interval are minimized: the variance of each interval is calculated, the variances are summed, and the classification with the smallest sum of variances is taken as optimal. The intervals of the evaluation index are classified automatically in ArcGIS Pro software. The reclassification tool in ArcGIS Pro software is used to divide the evaluation index raster layer by the natural break point method, and the number of intervals is set to 6 based on subjective experience.
The quantile method places approximately the same number of elements in each class, so the classification result of each evaluation index is evenly distributed, that is, each interval contains an equal number of elements. The reclassification tool in ArcGIS Pro software is used to divide the evaluation index raster layer by the quantile method, and the number of intervals is set to 6 based on subjective experience.
The geometric interval method establishes class breaks based on a geometric series. Its principle is to minimize the sum of squares of the number of elements in each interval, so that each class holds approximately the same number of values and the change between intervals stays consistent. The reclassification tool in ArcGIS Pro software is used to divide the evaluation index raster layer by the geometric interval method, and the number of intervals is set to 6 based on subjective experience.
The standard deviation method calculates the mean and standard deviation of each evaluation index and creates class breaks at equal multiples of the standard deviation above and below the mean. The reclassification tool in ArcGIS Pro software is used to divide the evaluation index raster layer by the standard deviation method, with the interval size set to 1 standard deviation.
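Two of the interval division rules above, equidistant and quantile, can be sketched in a few lines. This is an illustrative approximation rather than the ArcGIS Pro reclassification tool itself; the function names and the sample values are assumptions, and the quantile breaks use a simple rank rule that may differ slightly from the software's.

```python
import bisect


def equal_interval_breaks(values, k=6):
    """Upper bounds of the first k-1 classes, with equal width over [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]


def quantile_breaks(values, k=6):
    """Upper bounds chosen so each class holds roughly the same number of values."""
    s = sorted(values)
    n = len(s)
    # -(-n * i // k) is ceil(n * i / k); minus 1 gives the last index of class i.
    return [s[-(-n * i // k) - 1] for i in range(1, k)]


def classify(value, breaks):
    """Return the 1-based class of a value; a value equal to a break stays in the lower class."""
    return bisect.bisect_left(breaks, value) + 1


# Hypothetical index values for twelve grid cells.
cells = list(range(1, 13))
print(equal_interval_breaks(cells))
print(quantile_breaks(cells))
print([classify(v, quantile_breaks(cells)) for v in cells])
```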
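The ChiMerge discretization method discretizes a continuous variable using the χ² (chi-square) statistic, which determines whether the current break point should be deleted so that the adjacent intervals are merged. The ChiMerge method is implemented in the following steps: ① sort the variable values; ② define the initial intervals; ③ count the samples and generate a frequency table; ④ calculate the chi-square value of each pair of adjacent intervals and decide from the result whether to merge them; ⑤ repeat step ④ on the new interval distribution. The calculation expressions are:
$$\chi^2=\sum_{i=1}^{2}\sum_{j=1}^{k}\frac{(A_{ij}-E_{ij})^2}{E_{ij}}$$
$$E_{ij}=\frac{R_i\times C_j}{N}$$
wherein χ² reflects how uniformly the samples of the k classes are distributed across the two intervals adjacent to a break point; $A_{ij}$ represents the number of samples of class j in interval i; $E_{ij}$ represents the expected frequency of $A_{ij}$; $R_i$ represents the number of samples in interval i; $C_j$ represents the number of samples of class j in the two intervals; N represents the total number of samples in the two intervals.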
Chi-square binning stops when the number of bins equals the specified number of bins or when the minimum chi-square value exceeds the chi-square threshold.
The invention uses the ChiMerge discretization method to divide the numerical evaluation indexes into intervals. ChiMerge discretization of continuous evaluation indexes such as elevation, gradient, and topographic relief is carried out with the chi-square binning tool in IBM SPSS Statistics 26.0.0 software, with the threshold set to 0.05.
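The ChiMerge steps and stopping rule can be sketched as below. This is not the IBM SPSS binning tool; it is a minimal pure-Python illustration with hypothetical function names. It assumes the 0.05 threshold is a significance level, so for two classes (landslide and non-landslide, one degree of freedom) the default chi-square critical value is 3.841.

```python
def chi2_adjacent(a, b):
    """Chi-square statistic of two adjacent intervals given per-class counts a and b."""
    n = sum(a) + sum(b)
    chi2 = 0.0
    for row in (a, b):
        r = sum(row)                      # R_i: samples in this interval
        for j in range(len(a)):
            c = a[j] + b[j]               # C_j: samples of class j in both intervals
            e = r * c / n                 # E_ij = R_i * C_j / N
            if e > 0:
                chi2 += (row[j] - e) ** 2 / e
    return chi2


def chimerge(values, labels, max_intervals=6, threshold=3.841):
    """Return the lower bounds of the merged intervals.

    labels are integer class ids starting at 0 (here 0 = non-landslide, 1 = landslide).
    """
    n_classes = max(labels) + 1
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, [0] * n_classes)[y] += 1
    # Steps 1-3: sort values, start with one interval per distinct value.
    intervals = [[v, counts[v]] for v in sorted(counts)]
    while len(intervals) > max_intervals:
        # Step 4: chi-square of every adjacent pair; merge the most similar pair.
        chis = [chi2_adjacent(intervals[i][1], intervals[i + 1][1])
                for i in range(len(intervals) - 1)]
        i = min(range(len(chis)), key=chis.__getitem__)
        if chis[i] > threshold:
            break                         # all remaining break points are significant
        intervals[i][1] = [x + y for x, y in zip(intervals[i][1], intervals[i + 1][1])]
        del intervals[i + 1]
    return [iv[0] for iv in intervals]


# Hypothetical slope values with landslide labels.
print(chimerge([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 0, 0, 1, 1, 1, 1], max_intervals=2))
```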
The geographic detector reveals the driving forces behind a geographic phenomenon by measuring the spatial heterogeneity of the evaluation indexes. It comprises a differentiation and factor detector, an interaction detector, a risk zone detector, and an ecological detector. The invention selects the differentiation and factor detector, which measures how well a landslide evaluation index explains landslide occurrence. Let Y denote the landslide and X denote an evaluation index affecting the landslide. The influence of the landslide evaluation index on the landslide is measured by q (value range 0 to 1), expressed as:
$$q=1-\frac{\sum_{i=1}^{m}N_i\sigma_i^2}{N\sigma^2}=1-\frac{SL}{ST}$$
$$SL=\sum_{i=1}^{m}N_i\sigma_i^2,\qquad ST=N\sigma^2$$
wherein i represents the layer (class or interval) of the landslide evaluation index, i = 1, 2, …, m; $N_i$ represents the number of units in the i-th layer of the landslide evaluation index; N represents the number of units in the whole region (all layers) of the evaluation index; $\sigma_i^2$ represents the variance of the landslide Y values in the i-th layer of the landslide evaluation index; $\sigma^2$ represents the variance of the landslide Y values in the whole region (all layers) of the evaluation index; SL represents the sum of the within-layer variances; ST represents the total variance of the whole region.
The q value obtained from the geographic detector model varies with the interval division method and the number of classes used for a numerical evaluation index. Current methods for discretizing the numerical evaluation indexes in landslide evaluation include the equidistant classification method, the natural break point classification method, and the subjective experience method (the subjective experience method is not included in the study of the invention because it introduces too much uncertainty into the interval division). The invention divides the 7 numerical evaluation indexes into intervals with the reclassification tool in ArcGIS Pro 3.0 software (equidistant, natural break point, quantile, geometric interval, and standard deviation classification methods) and with IBM SPSS Statistics 26.0.0 software (ChiMerge discretization method). Slope direction, lithology, and land use are ranked for importance with the same geographic detector model. The q values are tabulated, and for each evaluation index the classification method with the largest q value is selected for interval division.
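The factor detector's q value and the selection of the interval division method with the largest q can be sketched as follows. The function names and the toy data are assumptions for illustration; variances are population variances, matching the $N_i\sigma_i^2$ form of the statistic, and Y is assumed not to be constant.

```python
def q_statistic(classes, y):
    """q = 1 - sum(N_i * var_i) / (N * var), with classes giving each unit's layer."""
    def pop_var(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / len(xs)

    layers = {}
    for c, v in zip(classes, y):
        layers.setdefault(c, []).append(v)
    within = sum(len(vals) * pop_var(vals) for vals in layers.values())  # SL
    total = len(y) * pop_var(y)                                          # ST
    return 1.0 - within / total


def best_method(candidates, y):
    """Pick the interval division whose classes give the largest q."""
    return max(candidates, key=lambda name: q_statistic(candidates[name], y))


# Hypothetical example: Y is 1 for landslide units and 0 otherwise.
y = [0, 0, 1, 1]
candidates = {
    "equidistant": [1, 2, 1, 2],   # classes unrelated to Y, q = 0
    "quantile": [1, 1, 2, 2],      # classes separate Y perfectly, q = 1
}
print({name: q_statistic(cls, y) for name, cls in candidates.items()})
print(best_method(candidates, y))
```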
When establishing the landslide susceptibility evaluation model, the combined influence of the evaluation indexes must be considered and analyzed. In theory, the accuracy and the complexity of the landslide susceptibility evaluation model are positively correlated with the number of evaluation indexes, and as the number of evaluation indexes increases, the probability of multicollinearity among them also increases. If the evaluation indexes are linearly correlated, the stability and reliability of the landslide susceptibility prediction model are negatively affected. Multicollinearity among the evaluation indexes is checked with the variance inflation factor (VIF) and the tolerance (TOL); the main function of the multicollinearity check is to detect linear dependence among the variables. When TOL > 0.1 and VIF < 10, there is no multicollinearity among the evaluation indexes. The calculation expressions are:
$$TOL=1-R_i^2,\qquad VIF=\frac{1}{TOL}=\frac{1}{1-R_i^2}$$
wherein $R_i^2$ represents the coefficient of determination obtained by regressing the i-th evaluation index on the other evaluation indexes.
The invention uses the collinearity diagnostics tool in the linear regression analysis of IBM SPSS Statistics 26.0.0 software to calculate the VIF and TOL of the landslide susceptibility evaluation indexes of the unit area and to exclude evaluation indexes with high collinearity. The TOL values of all evaluation indexes are greater than 0.3 and the VIF values are less than 4, satisfying the critical values TOL > 0.1 and VIF < 10. The evaluation index elevation has the smallest TOL and the largest VIF, 0.31 and 3.27 respectively, which do not exceed the collinearity diagnosis thresholds, so there is no collinearity between elevation and the other evaluation indexes. Therefore, the 3 nominal evaluation indexes selected by the invention and the 7 numerical evaluation indexes divided with the optimal interval division method obtained from the geographic detector model show no strong collinearity, and all of them can be used as input variables for constructing the landslide susceptibility evaluation model.
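The VIF and TOL check can be reproduced outside SPSS. The sketch below is a pure-Python illustration with hypothetical function names and toy data. It relies on the standard identity that the VIF of the i-th variable equals the i-th diagonal element of the inverse of the correlation matrix, which is equivalent to $1/(1-R_i^2)$.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    centered = [[v - sum(col) / n for v in col] for col in columns]
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    p = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_and_tol(columns):
    """Return (VIF list, TOL list), one entry per evaluation index."""
    inv = invert(correlation_matrix(columns))
    vif = [inv[i][i] for i in range(len(columns))]
    tol = [1.0 / v for v in vif]
    return vif, tol


# Hypothetical index values: the two columns are nearly collinear.
vif, tol = vif_and_tol([[1, 2, 3, 4], [1, 2, 3, 5]])
flagged = [i for i in range(len(vif)) if vif[i] >= 10 or tol[i] <= 0.1]
print(vif, tol, flagged)
```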
The calculation formula of the information quantity model is:
$$I(y, x_1, x_2, \ldots, x_n)=\ln\frac{P(y\mid x_1, x_2, \ldots, x_n)}{P(y)}$$
$$I(y, x_1, x_2, \ldots, x_n)=I(y, x_1)+I_{x_1}(y, x_2)+\cdots+I_{x_1 x_2\cdots x_{n-1}}(y, x_n)$$
wherein $I(y, x_1, x_2, \ldots, x_n)$ represents the information quantity provided by the evaluation index combination $x_1, x_2, \ldots, x_n$ for geological disasters; $P(y\mid x_1, x_2, \ldots, x_n)$ represents the probability of occurrence of a landslide disaster under the combination of evaluation indexes $x_1, x_2, \ldots, x_n$; $P(y)$ represents the probability of occurrence of a landslide disaster; $I_{x_1}(y, x_2)$ represents the information quantity provided by the evaluation index $x_2$ to the landslide susceptibility evaluation result in the presence of the evaluation index $x_1$.
When $I(y, x_1, x_2, \ldots, x_n)>0$, the combination of factors is favorable for the occurrence of landslide disasters; conversely, $I(y, x_1, x_2, \ldots, x_n)<0$ indicates that the combination of factors is unfavorable for the occurrence of landslide disasters.
The information quantity provided by each state class of each evaluation index for landslide disasters is calculated as:
$$I(x_i, D)=\ln\frac{P(x_i, D)}{P(x_i)}$$
wherein $P(x_i, D)$ represents the probability that state class $x_i$ of the evaluation index occurs where landslide disasters occur; $P(x_i)$ represents the probability that state class $x_i$ occurs in the unit area.
In practice, the information quantity of each evaluation index can be calculated with the formula:
$$I(x_i, D)=\ln\frac{N_i/N}{S_i/S}$$
wherein $N_i$ represents the number of geological disasters that developed in a given state class $x_i$ of an evaluation index; N represents the total number of geological disasters in the whole unit area; $S_i$ represents the number of grid cells in state class $x_i$ of the evaluation index; S represents the number of grid cells in the whole unit area.
The total information quantity within each grid cell can be calculated with the formula:
$$I=\sum_{i=1}^{n}I(x_i, D)=\sum_{i=1}^{n}\ln\frac{N_i/N}{S_i/S}$$
wherein n represents the number of evaluation indexes involved in the landslide susceptibility evaluation.
The probability of occurrence of landslide hazard in the evaluation unit increases with the information amount. After the information magnitude is determined, the landslide hazard susceptibility areas can be divided for the unit area.
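The information quantity of one state class and the total for a grid cell can be computed as in the sketch below. The counts are hypothetical, the function names are illustrative, and the natural logarithm is an assumption; a different logarithm base would rescale the values without changing their sign or ranking.

```python
import math


def information_value(n_i, n, s_i, s):
    """I = ln((N_i / N) / (S_i / S)) for one state class; requires n_i > 0."""
    return math.log((n_i / n) / (s_i / s))


def total_information(class_values):
    """Sum the information values of the state classes a grid cell falls into."""
    return sum(class_values)


# Hypothetical counts: 50 of 100 landslides fall in a class covering 1000 of 4000 cells.
favourable = information_value(50, 100, 1000, 4000)     # positive: landslide-prone class
unfavourable = information_value(10, 100, 1000, 4000)   # negative: landslides under-represented
print(favourable, unfavourable, total_information([favourable, unfavourable]))
```

A class with no landslides would give ln 0, so such classes need a separate rule, for example assigning the lowest observed value; the source does not state how this case is handled.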
The analytic hierarchy process (AHP) is a structured decision procedure that solves complex decision problems with a simple mathematical model, assigning weights to the input factors and producing a suitable result (such as a landslide susceptibility evaluation). The analytic hierarchy process relies on pairwise judgments of the relative importance of the factors and, thanks to its consistency check, can assign reasonable weights to the evaluation indexes.
As shown in fig. 2, the flow of the analytic hierarchy process is as follows: establish the hierarchical structure; construct a judgment matrix for each group of factors according to the importance scale with respect to the target; perform the single-level ranking and its consistency check; perform the overall ranking and its consistency check; determine whether the relative importance assigned to the factors is appropriate; and finally make the decision.
The method constructs a weighted information quantity model by combining the analytic hierarchy process with the information quantity method. On the basis of the information quantity values of the evaluation indexes, the weight of each evaluation index is calculated with the analytic hierarchy process; the information quantity value of each evaluation index interval is multiplied by the corresponding weight, and the products are summed to obtain the weighted information quantity, as shown in the following formula:
$$I=\sum_{i=1}^{n}W_iI_i=\sum_{i=1}^{n}W_i\ln\frac{N_i/N}{S_i/S}$$
wherein I represents the weighted information quantity of a unit in the evaluation area; $W_i$ represents the weight of the i-th evaluation index; $I_i$ represents the information quantity of the i-th evaluation index; S represents the total number of evaluation units in the unit area; $S_i$ represents the number of units in the unit area that fall in the given class of the evaluation index $X_i$; N represents the total number of units with disasters in the unit area; $N_i$ represents the number of units with disasters that fall in the given class of the evaluation index $X_i$.
As shown in fig. 3, the specific steps of carrying out preliminary prediction of landslide susceptibility in a unit area by applying the weighted information amount model are as follows:
(1) According to the relation among the elements in the evaluation system, a hierarchical structure is established, the hierarchical structure is divided into three layers, and landslide susceptibility evaluation (A) is taken as a target layer; the evaluation index type (B) is a criterion layer; the evaluation index (C) is a scheme layer.
(2) Construct the pairwise judgment matrices of each layer. The matrices are built in the order of the hierarchy, from the criterion layer to the scheme layer. For the different indexes of the same layer, pairwise judgment matrices are constructed with the 1-9 scale method, the normalized relative importance weight vector W of each element with respect to its upper-layer element is obtained with the root method, and the calculation is completed in an Excel table. B2 and B3 each contain only one influencing factor and therefore do not meet the condition for constructing a judgment matrix.
(3) Check the consistency of the judgment matrices, testing whether every judgment matrix has satisfactory consistency.
(4) Calculate the combination weight of each evaluation index of layer C: the weight of each layer-B factor is multiplied by the weights of the layer-C factors that belong to it, which finally gives the combination weight of each factor of the scheme layer C of the susceptibility evaluation.
(5) Calculate the weighted information quantity of each evaluation index. The weighted information quantity of a single evaluation index is calculated as:
$$W_iI_i=W_i\ln\frac{N_i/N}{S_i/S}$$
wherein $W_i$ represents the combination weight of each evaluation index; $N_i$ represents the number of landslide disasters distributed within the given class of the evaluation index $X_i$; N represents the number of landslide disasters in the unit area; $S_i$ represents the area occupied by the given class of the evaluation index $X_i$ in the unit area; S represents the total area of the unit area.
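Steps (2) to (5) can be sketched in code as follows. The judgment matrix is hypothetical and the function names are illustrative. The consistency ratio CR = CI / RI with CI = (λmax − n)/(n − 1), Saaty's random index values, and the usual acceptance rule CR < 0.1 are the standard AHP test, which the source does not spell out.

```python
import math

# Saaty's random consistency index RI for matrix orders 1 to 9.
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Root-method weights and consistency ratio of a pairwise judgment matrix."""
    n = len(matrix)
    roots = [math.prod(row) ** (1.0 / n) for row in matrix]   # geometric mean of each row
    total = sum(roots)
    weights = [r / total for r in roots]                      # normalized weight vector W
    aw = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    cr = ci / RI[n] if RI[n] > 0 else 0.0
    return weights, cr


def combination_weights(criterion_weight, scheme_weights):
    """Layer-C combination weights: layer-B weight times each layer-C weight."""
    return [criterion_weight * w for w in scheme_weights]


def weighted_information(weights, info_values):
    """Sum of W_i * I_i over the evaluation indexes of one grid cell."""
    return sum(w * i for w, i in zip(weights, info_values))


# Hypothetical, perfectly consistent 3 x 3 judgment matrix on the 1-9 scale.
judgment = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
w, cr = ahp_weights(judgment)
print(w, cr, cr < 0.1)
```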
Landslide susceptibility evaluation is the process of calculating a landslide susceptibility index by constructing the relationship between landslides and the evaluation indexes. To reduce the influence of uncertainty in non-landslide sample selection on the evaluation results, the invention provides a new method for selecting non-landslide negative samples: landslide susceptibility of the unit area is first predicted with the weighted information quantity model, non-landslide sample points equal in number to the landslide samples are then randomly selected from the very low susceptibility zone of that prediction, and the landslide susceptibility evaluation of the unit area is completed with a machine learning model based on grid units. Specifically, based on the susceptibility zoning result of the weighted information quantity model, the random point generation tool in ArcGIS Pro 3.0 software is used to randomly select 1437 non-landslide points, the same number as the landslide points.
The dung beetle optimizer (DBO) is a new swarm intelligence optimization algorithm that simulates the habits of dung beetles in nature and builds its search framework on models of ball rolling, dancing, foraging, stealing, and breeding. Because it balances global exploration and local exploitation and offers strong optimization capability and fast convergence, the method uses it to optimize the parameters of the landslide susceptibility evaluation model.
The XGBoost (Extreme Gradient Boosting) model uses the gradient boosting algorithm: it iteratively trains a series of decision trees and combines them into a strong ensemble model. Proposed by Chen and Guestrin in 2016, it is an ensemble model guided by the Boosting idea, in which each tree consists of decision nodes and leaf nodes that define its structure and decision paths. Compared with the GBDT model, the XGBoost model optimizes the loss function with a second-order Taylor expansion, supports multi-threaded parallel computation on the CPU, adds a regularization term to the loss function, predicts better, processes large-scale data faster, and supports user-defined loss functions. The invention therefore selects this algorithm to evaluate landslide susceptibility in the unit area.
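The negative sample selection can be sketched as follows. This is an illustration, not the ArcGIS Pro random point tool: the function name, the cell identifiers, the zone label "very low", and the explicit exclusion of known landslide cells are all assumptions made for the example.

```python
import random


def sample_non_landslide(zone_by_cell, landslide_cells, n_samples, seed=0):
    """Randomly pick n_samples cells from the very low susceptibility zone.

    zone_by_cell maps a cell id to its susceptibility zone from the
    weighted information quantity model; landslide cells are never selected.
    """
    candidates = [cell for cell, zone in zone_by_cell.items()
                  if zone == "very low" and cell not in landslide_cells]
    if len(candidates) < n_samples:
        raise ValueError("not enough very low susceptibility cells to sample from")
    return random.Random(seed).sample(candidates, n_samples)


# Hypothetical zoning of 20 cells; even-numbered cells are in the very low zone.
zones = {i: ("very low" if i % 2 == 0 else "high") for i in range(20)}
landslides = {1, 3, 5, 7, 9}
negatives = sample_non_landslide(zones, landslides, n_samples=len(landslides), seed=1)
print(sorted(negatives))
```

Drawing as many negatives as there are landslide cells keeps the 1:1 ratio described above; in the source the count is 1437.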
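XGBoost modeling process: in each iteration XGBoost learns one decision tree, using the gradient boosting algorithm to fit the residual between the prediction of the existing trees and the true value, and thereby builds the final prediction model. The sample prediction of XGBoost is expressed as:
$$\hat{y}_i=\sum_{k=1}^{K}f_k(x_i)$$
wherein $f_k$ represents the k-th decision tree; $x_i$ represents the feature vector of sample i; K represents the total number of decision trees; $\hat{y}_i$ represents the prediction for sample $x_i$.
The XGBoost objective function is expressed as:
$$Obj=\sum_{i=1}^{m}l(y_i,\hat{y}_i)+\sum_{k=1}^{K}\Omega(f_k),\qquad \Omega(f_k)=\gamma T+\frac{1}{2}\lambda\sum_{j=1}^{T}\omega_j^2$$
wherein i represents the i-th sample in the dataset; m represents the number of samples; T represents the number of leaf nodes; γ represents the penalty incurred each time a leaf is added and acts as the regularization parameter of T; $\omega_j$ represents the leaf weight of the j-th leaf node; λ is the regularization parameter of the L2 norm; $l(y_i,\hat{y}_i)$ represents the loss function; $\Omega(f_k)$ represents the complexity of the model.
Here $\sum_{j=1}^{T}\omega_j^2=\lVert\omega\rVert^2$ is the squared L2 norm of the leaf weight vector. Model building is essentially a step-by-step iteration over tree models. The result after the first k−1 trees can be written as $\hat{y}_i^{(k-1)}$, so in the k-th iteration $\hat{y}_i^{(k)}$ can be replaced by $\hat{y}_i^{(k-1)}+f_k(x_i)$. The converted objective function is:
$$Obj^{(k)}=\sum_{i=1}^{m}l\left(y_i,\hat{y}_i^{(k-1)}+f_k(x_i)\right)+\Omega(f_k)+C$$
wherein C collects the complexity of the first k−1 trees, which is fixed in the k-th iteration.
Approximating the loss with a second-order Taylor expansion gives:
$$Obj^{(k)}\approx\sum_{i=1}^{m}\left[l\left(y_i,\hat{y}_i^{(k-1)}\right)+g_if_k(x_i)+\frac{1}{2}h_if_k^2(x_i)\right]+\Omega(f_k)+C$$
wherein $g_i=\partial l\left(y_i,\hat{y}_i^{(k-1)}\right)/\partial\hat{y}_i^{(k-1)}$ and $h_i=\partial^2 l\left(y_i,\hat{y}_i^{(k-1)}\right)/\partial\left(\hat{y}_i^{(k-1)}\right)^2$ are the first and second derivatives of the loss; $l\left(y_i,\hat{y}_i^{(k-1)}\right)$ and C are constant terms.
Because the constant terms are independent of the k-th iteration, they are removed, and the new objective function is:
$$Obj^{(k)}=\sum_{i=1}^{m}\left[g_if_k(x_i)+\frac{1}{2}h_if_k^2(x_i)\right]+\gamma T+\frac{1}{2}\lambda\sum_{j=1}^{T}\omega_j^2$$
Solving for ω and T gives the optimal tree structure. Define $I_j$ as the set of samples that fall on the leaf with index j, so that $f_k(x_i)=\omega_j$ for every $i\in I_j$. The objective function then becomes:
$$Obj^{(k)}=\sum_{j=1}^{T}\left[\left(\sum_{i\in I_j}g_i\right)\omega_j+\frac{1}{2}\left(\sum_{i\in I_j}h_i+\lambda\right)\omega_j^2\right]+\gamma T$$
For convenience of description, define $G_j=\sum_{i\in I_j}g_i$ and $H_j=\sum_{i\in I_j}h_i$. Substituting $G_j$ and $H_j$ into the objective function gives:
$$Obj^{(k)}=\sum_{j=1}^{T}\left[G_j\omega_j+\frac{1}{2}\left(H_j+\lambda\right)\omega_j^2\right]+\gamma T$$
wherein $G_j$ represents the sum of the first derivatives of the samples contained in leaf node j; $H_j$ represents the sum of the second derivatives of the samples contained in leaf node j.
To minimize Obj, take the first derivative with respect to $\omega_j$ and set it to 0, which yields:
$$\omega_j^{*}=-\frac{G_j}{H_j+\lambda},\qquad Obj^{*}=-\frac{1}{2}\sum_{j=1}^{T}\frac{G_j^2}{H_j+\lambda}+\gamma T$$
The objective function gives an Obj value for the structure of each tree, and the optimal XGBoost model is obtained by minimizing the Obj value.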
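The closing result of the derivation, the optimal leaf weight −G_j/(H_j + λ) and the corresponding structure score, can be checked numerically. The sketch below is not the XGBoost library; the function names and the gradient values are hypothetical (they correspond to logistic loss with an initial prediction of 0.5 for two non-landslide samples).

```python
def optimal_leaf_weight(grads, hessians, lam):
    """w* = -G / (H + lambda), with G and H summed over the samples on the leaf."""
    return -sum(grads) / (sum(hessians) + lam)


def structure_score(leaves, lam, gamma):
    """Obj* = -1/2 * sum_j G_j^2 / (H_j + lambda) + gamma * T.

    leaves is a list of (grads, hessians) pairs, one pair per leaf node.
    """
    gain = 0.0
    for grads, hessians in leaves:
        g_sum, h_sum = sum(grads), sum(hessians)
        gain += g_sum * g_sum / (h_sum + lam)
    return -0.5 * gain + gamma * len(leaves)


# Logistic loss at p = 0.5 with true label 0: g = p - y = 0.5, h = p * (1 - p) = 0.25.
leaf = ([0.5, 0.5], [0.25, 0.25])
print(optimal_leaf_weight(*leaf, lam=1.0))
print(structure_score([leaf], lam=1.0, gamma=0.0))
```

A lower structure score means a better tree, so candidate splits are compared by how much they reduce this value after paying the γ penalty for the extra leaf.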
The support vector machine is a supervised learning model that can classify both linear and nonlinear data. The algorithm uses a nonlinear mapping to map the training data into a high-dimensional feature space, finds the optimal separating hyperplane in the new feature space, and constructs the maximum-margin linear classifier there, so that the classification margin of the training data is maximized. The invention constructs a support vector machine model that maps the landslide evaluation indexes into a high-dimensional feature space with a nonlinear mapping and searches the new space for the optimal separating hyperplane that distinguishes landslide samples from non-landslide samples.
For nonlinearly separable data $(x_i, y_i)$, where $x_i\in R^d$ is the evaluation index vector, $y_i$ indicates whether a landslide occurs, and i = 1, 2, …, n indexes the samples, a nonlinear mapping $\varphi$ maps the data into a new high-dimensional feature space. With $\omega\cdot\varphi(x)+b=0$ as the hyperplane equation, the classification margin equals $2/\lVert\omega\rVert$. Maximizing $2/\lVert\omega\rVert$ is equivalent to minimizing $\lVert\omega\rVert$ subject to the classification constraint:
$$y_i\left(\omega\cdot\varphi(x_i)+b\right)\ge 1-\varepsilon_i,\quad \varepsilon_i\ge 0$$
wherein $\varepsilon_i$ represents the slack variable.
The smaller the values of $\varepsilon_i$, the better the classification hyperplane that is found, so the original problem is converted into the quadratic programming problem of minimizing $\frac{1}{2}\lVert\omega\rVert^2+C\sum_{i=1}^{n}\varepsilon_i$ under the above constraint, where C is the penalty factor. Solving it gives the decision function:
$$f(x)=\operatorname{sgn}\left(\sum_{i=1}^{n}\alpha_iy_iK(x_i,x)+b\right)$$
wherein $\alpha_i$ represents the Lagrange multipliers and $K(x_i,x)=\varphi(x_i)\cdot\varphi(x)$ is the kernel function.
Common kernel functions for support vector machine models are: linear kernel functions, polynomial kernel functions, radial basis kernel functions, neural network kernel functions, and hyperbolic tangent kernel functions.
The invention selects the Gaussian radial basis kernel function because it can capture complex relationships in the data and performs well on both large and small samples.
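For reference, the Gaussian radial basis kernel can be written as K(x, z) = exp(−γ‖x − z‖²). The sketch below evaluates it for two feature vectors; the parameter name `gamma`, its default value, and the sample vectors are assumptions, since the source does not give the kernel parameters (in the described method they would be tuned by the optimizer).

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian radial basis kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


# Hypothetical two-index feature vectors (for example, classed slope and elevation).
print(rbf_kernel([1, 2], [1, 2]))   # identical samples give the maximum similarity of 1
print(rbf_kernel([0, 0], [1, 1]))   # similarity decays with squared distance
```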
Random forest (RF) is an ensemble algorithm whose core principle comes from ensemble learning and Bagging (bootstrap aggregating). It combines multiple decision trees through the idea of ensemble learning, which avoids the limitations of a single model or a single set of parameters, and thereby improves the accuracy and stability of the model, reduces sensitivity to noise and outliers, and limits overfitting. Its basic unit is the decision tree. A decision tree is a tree-like structure in which each internal node represents a feature and each leaf node represents a class or a value. Learning is recursive: the data are divided into subsets according to the selected features until a stopping condition is reached. In landslide susceptibility evaluation, the random forest exploits the high variance among the individual decision trees: each tree votes on the class of a sample, and the class is assigned by majority vote.
In landslide susceptibility modeling, it is essential to evaluate the performance of the model. The confusion matrix (CM) is a principal evaluation tool in the field of machine learning. It measures the prediction quality of a classification model and is an effective way to display algorithm performance. Compared with other methods, the confusion matrix makes it easier to see whether a model confuses two categories, that is, whether one category is misclassified as the other.
The confusion matrix is applied to the evaluation of the landslide susceptibility model of the invention as follows. True Positive (TP): the number of samples correctly classified as landslide (the true class of the sample is landslide, and the model also identifies it as landslide). False Negative (FN): the number of samples wrongly classified as non-landslide (the true class of the sample is landslide, but the model identifies it as non-landslide). False Positive (FP): the number of samples wrongly classified as landslide (the true class of the sample is non-landslide, but the model identifies it as landslide). True Negative (TN): the number of samples correctly classified as non-landslide (the true class of the sample is non-landslide, and the model also identifies it as non-landslide).
The invention adopts the accuracy (Accuracy), recall (Recall), precision (Precision), and F1-Score derived from the confusion matrix as auxiliary evaluation indexes of model performance. The calculation expressions are:
$$Accuracy=\frac{TP+TN}{TP+TN+FP+FN}$$
$$Recall=\frac{TP}{TP+FN}$$
$$Precision=\frac{TP}{TP+FP}$$
$$F1\text{-}Score=\frac{2\times Precision\times Recall}{Precision+Recall}$$
The range of values of accuracy, precision, recall, and F1-Score is [0, 1]. The closer the accuracy is to 1, the higher the overall correctness of the landslide susceptibility evaluation model; the closer the precision is to 1, the smaller the probability of misjudgment among the samples the model predicts as landslide; the closer the recall is to 1, the stronger the model's ability to identify landslides; F1-Score combines precision and recall, and the larger its value, the better the prediction performance of the landslide susceptibility evaluation model.
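The four auxiliary metrics follow directly from the confusion matrix counts. The sketch below uses hypothetical counts and an illustrative function name.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, recall, precision, and F1-Score from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    recall = tp / (tp + fn)          # share of true landslides that were identified
    precision = tp / (tp + fp)       # share of predicted landslides that are real
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Hypothetical test set: 50 landslide and 50 non-landslide samples.
metrics = classification_metrics(tp=40, fn=10, fp=5, tn=45)
print(metrics)
```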
To better evaluate the accuracy of the landslide susceptibility evaluation model, the invention uses the receiver operating characteristic (ROC) curve, which is widely used in current research, to evaluate model performance. The ROC curve describes the trade-off between the sensitivity and the specificity of a classifier and shows directly how well the classification model recognizes samples at a given threshold. The ROC curve takes sensitivity as the Y axis and 1 − specificity as the X axis. The area enclosed by the ROC curve, the X axis, and the line x = 1 is called the AUC. The AUC calculation expression is:
Wherein P represents all landslide samples; n represents all non-landslide samples.
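As a non-limiting illustration, the area under the ROC curve can be estimated from predicted probabilities by using a well-known equivalence: the AUC equals the probability that a randomly chosen landslide sample receives a higher score than a randomly chosen non-landslide sample, with ties counted as one half. This sketch uses that rank-based form rather than the expression of the invention; the function name and the sample scores are hypothetical.

```python
def auc_from_scores(labels, scores):
    """Rank-based AUC: fraction of (landslide, non-landslide) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]  # landslide samples (P)
    neg = [s for l, s in zip(labels, scores) if l == 0]  # non-landslide samples (N)
    if not pos or not neg:
        raise ValueError("need at least one landslide and one non-landslide sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical predicted landslide probabilities (illustrative only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = auc_from_scores(labels, scores)
print(round(auc, 3))
```

The pairwise loop costs on the order of P × N comparisons, which is acceptable for a sample database of a few thousand points; a sort-based variant would be preferable for much larger sets.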
The overall prediction accuracy (Overall Prediction Accuracy, OPA) is obtained by overlaying the landslide susceptibility zoning map layer with the landslide inventory data and comparing how well the number of landslides in each susceptibility zone agrees with the level of that zone, that is, whether the number of landslides increases as the susceptibility level rises. The specific expression is as follows:
Wherein $P_L$ represents the landslide density in each zone, namely the proportion of the number of landslides in the zone to the total number of landslides; $P_A$ represents the proportion of the zone area to the total area of the unit area region.
Applying machine learning to landslide susceptibility modeling is a binary classification problem with positive and negative samples, i.e., landslide and non-landslide. Considering that machine learning models are more sensitive to data in the interval [0, 1], the landslide attribute is set as follows: landslide samples are assigned the label "1", and non-landslide samples are assigned the label "0". To avoid imbalance in model training caused by too many or too few samples of one class, a ratio of 1:1 is used: 1437 non-landslide negative sample points, equal to the number of landslide points, are randomly selected in the extremely low susceptibility zone of the landslide susceptibility result preliminarily predicted by the weighted information quantity model, so that the negative samples have low susceptibility. The Spatial Analyst tools in ArcGIS Pro software are used to obtain the evaluation index attribute values of the landslide points and non-landslide points and to construct a sample database. The evaluation indexes obtained after screening the interval division methods with the geographic detector are taken as the input data of the model, and the susceptibility label (landslide: 1, non-landslide: 0) is taken as the output of model prediction for susceptibility modeling.
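As a non-limiting illustration, the 1:1 selection of negative samples can be sketched as random sampling without replacement from the grid cells of the extremely low susceptibility zone. The cell identifiers, function names and fixed seed below are hypothetical; in the description the selection is carried out in ArcGIS Pro.

```python
import random


def sample_negative_cells(very_low_cells, n_landslides, seed=0):
    """Randomly draw as many non-landslide cells as there are landslide points."""
    if len(very_low_cells) < n_landslides:
        raise ValueError("not enough cells in the extremely low susceptibility zone")
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return rng.sample(very_low_cells, n_landslides)


def build_sample_database(landslide_cells, negative_cells):
    """Label landslide cells 1 and non-landslide cells 0."""
    return ([(cell, 1) for cell in landslide_cells]
            + [(cell, 0) for cell in negative_cells])


# Hypothetical cell identifiers (illustrative only).
landslide_cells = list(range(1437))
very_low_cells = list(range(10_000, 20_000))

negatives = sample_negative_cells(very_low_cells, len(landslide_cells))
samples = build_sample_database(landslide_cells, negatives)
print(len(negatives), len(samples))
```

Sampling without replacement guarantees that no cell is used twice as a negative sample, which keeps the two classes exactly balanced.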
The model training and test data sets are divided at the commonly used ratio of 7:3: 70% of the landslide and non-landslide data are randomly selected from the established sample database as the training set for constructing the model, and the remaining 30% are used as the test set for validating the model. To further verify the accuracy of the model, all training data and test data are pooled as a verification set. The unit area is divided into grid units of 30 m × 30 m using the Create Fishnet tool in ArcGIS Pro software, and the attribute values of all grid points are extracted to obtain the whole data set.
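As a non-limiting illustration, the 7:3 division can be sketched as a shuffle followed by a cut at 70% of the sample count. This is a plain random split over the pooled database; whether the split is additionally stratified by class is not stated in the description, so no stratification is assumed here. The function name, seed and sample tuples are hypothetical.

```python
import random


def split_train_test(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and cut it into training and test sets."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical sample database: (cell id, label) with 1437 samples per class.
samples = ([(i, 1) for i in range(1437)]
           + [(i, 0) for i in range(1437, 2874)])

train_set, test_set = split_train_test(samples)
verification_set = train_set + test_set  # pooled set, as in the description
print(len(train_set), len(test_set), len(verification_set))
```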
A landslide susceptibility evaluation model based on the XGBoost model is established using the randomly selected training set data, and the prediction accuracy of the model is verified using the test set data. The n_estimators and max_depth parameters of the model are optimized with the dung beetle algorithm to improve the accuracy of the model. n_estimators is the number of trees: too many trees may make the model overly complex so that it fits noise in the training data, while too few trees may fail to capture complex relationships in the data. Selecting an appropriate number of trees balances overfitting against model performance. max_depth is the maximum depth of a tree; every tree in the XGBoost model is depth-limited, and limiting the depth of the trees effectively prevents overfitting. The XGBoost model building flow is shown in FIG. 4.
In the process of optimizing the XGBoost model with the dung beetle algorithm, the number of iterations of the algorithm is set to 50, and the accuracy of the model obtained in each iteration is calculated. After the iterations are completed, the XGBoost model under the optimal parameter combination is obtained.
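As a non-limiting illustration, the outer loop of such a search (50 iterations, one accuracy value per candidate, best combination retained) can be sketched as follows. The sketch deliberately substitutes plain random search for the dung beetle algorithm, whose update rules are not reproduced here. The parameter bounds and the toy accuracy function are hypothetical stand-ins for training and scoring an actual model.

```python
import random


def search_hyperparameters(evaluate, bounds, n_iterations=50, seed=0):
    """Random search stand-in (NOT the dung beetle algorithm): keep the best candidate."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    history = []  # best accuracy seen after each iteration
    for _ in range(n_iterations):
        candidate = {name: rng.randint(lo, hi) for name, (lo, hi) in bounds.items()}
        score = evaluate(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
        history.append(best_score)
    return best_params, best_score, history


def toy_accuracy(params):
    """Hypothetical objective; a real run would train a model and score it on test data."""
    return (1.0
            - abs(params["n_estimators"] - 200) / 1000
            - abs(params["max_depth"] - 6) / 100)


# Hypothetical integer search ranges for the two tuned parameters.
bounds = {"n_estimators": (50, 500), "max_depth": (2, 12)}
best_params, best_score, history = search_hyperparameters(toy_accuracy, bounds)
print(best_params, round(best_score, 3))
```

Replacing `toy_accuracy` with a function that trains a model on the training set and returns its accuracy, and replacing the candidate generation with the dung beetle update rules, would give the loop described above.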
The model is rebuilt with the obtained optimal parameters. After the model has been evaluated, the whole data set of the unit area is input into the model to evaluate the landslide susceptibility of the unit area and output the prediction result, finally yielding the landslide occurrence probability of all grid units of the whole unit area. The closer the predicted value of a grid unit is to 0, the more stable the grid unit and the smaller the probability of landslide occurrence; conversely, the closer the predicted value is to 1, the more unstable the grid unit and the greater the probability of landslide occurrence.
The DBO-SVM model construction flow is shown in FIG. 5. A landslide susceptibility evaluation model based on the support vector machine (SVM) model is established using the randomly selected training set data, and the prediction accuracy of the model is verified using the test set data. The C and gamma parameters of the model are optimized with the dung beetle algorithm to improve the accuracy of the model. C is the penalty coefficient, which determines the tolerance for sample data that violate the margin. When the C value is small, the model penalizes misclassified points lightly; when the C value is large, the model tries to minimize the number of misclassified samples, which results in a narrower margin around the decision boundary. The gamma parameter controls the distribution of the data after mapping to the high-dimensional space; the degree to which the model fits the training data increases with gamma, and an excessively large gamma can cause overfitting of the model.
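As a non-limiting illustration of why a large gamma promotes overfitting, the sketch below evaluates a radial basis function (RBF) kernel for one pair of points at several gamma values. The use of an RBF kernel is an assumption made for this example, since the description does not name the kernel; the points and gamma values are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2): similarity of two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Two hypothetical feature vectors with squared distance 2.
a = (0.0, 0.0)
b = (1.0, 1.0)

# Similarity of the same pair of points under increasing gamma.
similarities = {gamma: rbf_kernel(a, b, gamma) for gamma in (0.01, 0.1, 1.0, 10.0)}
for gamma, value in similarities.items():
    print(gamma, value)
```

As gamma grows, the similarity between the two points collapses towards zero, so each training sample influences only its immediate neighbourhood and the decision boundary can wrap tightly around individual samples.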
In the process of optimizing the support vector machine model with the dung beetle algorithm, the number of iterations of the algorithm is set to 50, and the accuracy of the model obtained in each iteration is calculated. After the iterations are completed, the support vector machine model under the optimal parameter combination is obtained.
The model is rebuilt with the obtained optimal parameters. After the model has been evaluated, the whole data set of the unit area is input into the model to evaluate the landslide susceptibility of the unit area and output the prediction result, finally yielding the landslide occurrence probability of all grid units of the whole unit area. The closer the predicted value of a grid unit is to 0, the more stable the grid unit and the smaller the probability of landslide occurrence; conversely, the closer the predicted value is to 1, the more unstable the grid unit and the greater the probability of landslide occurrence.
The DBO-RF model construction flow is shown in FIG. 6. A landslide susceptibility evaluation model based on the random forest (RF) model is established using the randomly selected training set data, and the prediction accuracy of the model is verified using the test set data. The n_estimators and max_depth parameters of the model are optimized with the dung beetle algorithm to improve the accuracy of the model. n_estimators is the number of trees, and selecting an appropriate number of trees balances overfitting against model performance. max_depth is the maximum depth of a tree: a tree that is too shallow may underfit the data, while the deeper the tree, the more complex the model and the better it captures details of the training data, but also the more easily it overfits.
In the process of optimizing the random forest model with the dung beetle algorithm, the number of iterations of the algorithm is set to 50, and the accuracy of the model obtained in each iteration is calculated. After the iterations are completed, the random forest model under the optimal parameter combination is obtained.
The model is rebuilt with the obtained optimal parameters. After the model has been evaluated, the whole data set of the unit area is input into the model to evaluate the landslide susceptibility of the unit area and output the prediction result, finally yielding the landslide occurrence probability of all grid units of the whole unit area. The closer the predicted value of a grid unit is to 0, the more stable the grid unit and the smaller the probability of landslide occurrence; conversely, the closer the predicted value is to 1, the more unstable the grid unit and the greater the probability of landslide occurrence.
Advances in InSAR technology have promoted the assessment of landslide hazard. To better assess the landslide hazard of the research area, the invention combines the deformation results derived from ascending-orbit synthetic aperture radar data of the unit area with the landslide susceptibility zoning result based on the DBO-RF model. With reference to the landslide susceptibility dynamic evaluation matrix, the deformation information is combined with the landslide susceptibility zoning map to draw a landslide susceptibility dynamic evaluation map of the unit area that takes deformation factors into account. A landslide is a dynamic process whose degree of hazard changes over time; incorporating SBAS-InSAR deformation information into the landslide susceptibility evaluation as a dynamic index can effectively reduce the interference of dynamic changes with the evaluation result. A method that considers only dynamic evaluation indexes, however, can lead to exaggerated and inaccurate evaluation results for regional landslide disasters. Combining the deformation information obtained by the SBAS-InSAR technology with the landslide susceptibility evaluation allows the landslide hazard to be evaluated dynamically.
Landslide susceptibility evaluation assesses the possibility of landslide disasters in a unit area based on sample data and provides a scientific basis for decisions such as the early identification, prevention and control of landslide disasters. Its theoretical essence is to compile statistics on the actual conditions under which past landslides occurred and, on the basis of the similarity between the conditions of current and past landslide occurrence, to predict the probability of landslide occurrence in the future.
Landslide evolution is a dynamic process and landslide risk differs between regions; SBAS-InSAR deformation information can provide important data for dynamic landslide hazard assessment. Displacement monitoring data of the unit area region from January 2023 to December 2023 are acquired based on Sentinel-1A radar images. The surface deformation of the unit area region is obtained through interpolation analysis and resampling. Grading experiments are carried out based on the subjective experience method, and the best result is selected to divide the deformation rate into 5 grades: -1 to 1 mm/y (stable); -3 to -1 mm/y and 1 to 3 mm/y (basically stable); -5 to -3 mm/y and 3 to 5 mm/y (relatively unstable); -7 to -5 mm/y and 5 to 7 mm/y (unstable); and < -7 mm/y and > 7 mm/y (extremely unstable).
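As a non-limiting illustration, the five deformation grades can be read as symmetric bands of the absolute deformation rate with limits at 1, 3, 5 and 7 mm/y. The function name is hypothetical, and assigning a rate that falls exactly on a limit to the more stable grade is an assumption of this sketch, since adjacent intervals in the description share their end points.

```python
# Upper limits (mm/y, absolute value) of the first four grades.
GRADE_LIMITS = [
    (1.0, "stable"),
    (3.0, "basically stable"),
    (5.0, "relatively unstable"),
    (7.0, "unstable"),
]


def deformation_grade(rate_mm_per_year):
    """Map a signed deformation rate to (grade number, grade name)."""
    magnitude = abs(rate_mm_per_year)  # bands are symmetric about zero
    for grade, (limit, name) in enumerate(GRADE_LIMITS, start=1):
        if magnitude <= limit:  # assumption: a value on the limit takes the more stable grade
            return grade, name
    return 5, "extremely unstable"


# Hypothetical deformation rates for a few grid units.
for rate in (0.5, -2.0, 4.2, -6.0, 9.1):
    print(rate, deformation_grade(rate))
```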
A landslide susceptibility dynamic evaluation matrix of the research area is established by combining the SBAS-InSAR deformation information with the landslide susceptibility zoning map obtained based on the DBO-RF model, as shown in Table 1.
Table 1 Dynamic evaluation matrix for landslide susceptibility
According to the dynamic evaluation matrix, the landslide susceptibility zoning map and the surface deformation information are combined to obtain the final landslide susceptibility dynamic evaluation result of the research area.
In the description of the present invention, it should be noted that, unless explicitly specified and limited otherwise, the terms "disposed," "mounted," "connected," and "fixed" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art in a specific case.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various equivalent changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A dynamic evaluation method for landslide susceptibility is characterized by comprising the following steps:
S1, obtaining landslide information; acquiring landslide inventory information of a unit area by technical means and from statistical data, comprehensively considering the landslide hazard-forming mechanism and hazard-forming environment of the unit area, analyzing the landslide distribution characteristics of the unit area from the two aspects of disaster type and distribution pattern, and extracting 10 evaluation indexes in 5 categories;
S2, constructing an evaluation index system; dividing the numerical evaluation indexes into intervals using the equidistant classification method, the standard deviation method, the quantile method, the geometric interval method, the natural break point method and the ChiMerge discretization method respectively, calculating with a geographic detector the q value of each evaluation index under the different interval division methods, and selecting for each evaluation index the interval division method with the maximum q value, so as to achieve a reasonable interval division of the evaluation index; selecting the evaluation indexes that pass the multicollinearity test, carrying out a preliminary prediction of the regional landslide susceptibility based on a weighted information quantity model, and randomly selecting, in the very low susceptibility zone of the weighted information quantity model, non-landslide points equal in number to the landslide points as non-landslide negative samples, the selection of the negative samples meeting the requirements of randomness and low susceptibility;
S3, constructing a machine learning model; integrating the historical landslide positive samples and the non-landslide negative samples to form a training sample set for landslide susceptibility modeling of the unit area region, introducing a dung beetle algorithm to optimize the parameters of the XGBoost model, the support vector machine model and the random forest model, selecting the optimal parameters to establish machine learning landslide susceptibility prediction models, completing the landslide susceptibility evaluation of the unit area region, drawing a landslide susceptibility zoning map, performing zonal statistics and machine learning model accuracy evaluation, comparing the results of the XGBoost model, the support vector machine model and the random forest model from 3 aspects, namely the confusion matrix, statistical parameters and the receiver operating characteristic (ROC) curve, and analyzing the accuracy of the landslide susceptibility prediction models of the unit area region;
S4, dynamically evaluating landslide susceptibility; based on the ascending-orbit Sentinel-1A image data of the unit area region, extracting surface deformation information using the SBAS-InSAR technology, and analyzing the spatial distribution characteristics of landslide disasters of the unit area region in combination with the deformation information; and comparing the application and performance of the landslide susceptibility prediction results of the DBO-XGBoost model, the DBO-SVM model and the DBO-RF model in the unit area region, and establishing a landslide susceptibility dynamic evaluation matrix of the unit area region by combining the landslide susceptibility result map obtained by the DBO-RF model with the SBAS-InSAR deformation information, so as to dynamically evaluate the landslide susceptibility of the unit area region.
2. The dynamic landslide susceptibility evaluation method of claim 1, wherein the 5 categories are topography, geological structure, meteorology and hydrology, environmental conditions and human activities respectively; the 10 evaluation indexes are elevation, slope gradient, slope aspect, topographic relief, lithology, distance to rivers, land use, distance to roads, POI kernel density and human footprint index respectively.
3. A dynamic evaluation method for landslide susceptibility according to claim 1, wherein,
The equidistant classification method comprises the steps of dividing the range of an evaluation index into a plurality of sub-ranges with equal size by using equal intervals, dividing the equidistant classification method interval of an evaluation index grid layer by using a reclassification tool in ArcGIS Pro software, and setting the interval number to be 6 based on a subjective experience method;
the natural break point method automatically classifies the intervals of the evaluation indexes in ArcGIS Pro software, the reclassification tool in the ArcGIS Pro software is used for classifying the intervals of the natural break point classification method of the evaluation index grid layer, and the interval number is set to be 6 based on a subjective experience method;
The quantile method utilizes a reclassification tool in ArcGIS Pro software to carry out the quantile classification interval division of the evaluation index grid layer, and the interval number is set to be 6 based on a subjective experience method;
The geometric interval method utilizes a reclassification tool in ArcGIS Pro software to carry out interval division of the geometric interval classification method of the evaluation index grid layer, and the interval number is set to be 6 based on a subjective experience method;
The standard deviation method utilizes a reclassification tool in ArcGIS Pro software to carry out the standard deviation classification interval division of the evaluation index grid layer, and the interval size is set to be 1 standard deviation;
the ChiMerge discretization method divides the intervals of the numerical evaluation indexes: the chi-square binning tool in IBM SPSS Statistics 26.0.0 software is used to perform ChiMerge discretization of the continuous evaluation indexes of elevation, slope gradient and topographic relief, and the threshold value is set to 0.05.
4. The dynamic landslide susceptibility evaluation method of claim 1, wherein the geographic detector reveals driving forces by calculating the spatial heterogeneity of the evaluation index, and the differentiation and factor detector is selected to detect the degree to which each evaluation index explains landslides, with the following expression:
$$q = 1 - \frac{\sum_{i=1}^{m} N_i \sigma_i^2}{N \sigma^2} = 1 - \frac{SL}{ST}$$
Wherein i represents the stratum number of the landslide evaluation index, i = 1, 2, …, m; $N_i$ represents the number of units of the ith stratum of the landslide evaluation index; $N$ represents the number of units of the whole region of the evaluation index; $\sigma_i^2$ represents the variance of the landslide Y value corresponding to the ith stratum of the landslide evaluation index; $\sigma^2$ represents the variance of the landslide Y value corresponding to the whole region of the evaluation index; SL represents the sum of the within-stratum variances; ST represents the total variance of the whole region.
5. The dynamic evaluation method of landslide susceptibility according to claim 1, wherein the multicollinearity test is used to find linear or nonlinear relationships between variables, and when TOL > 0.1 and VIF < 10, there is no multicollinearity between the evaluation indexes, and the calculation expression is:
$$TOL = 1 - R^2, \qquad VIF = \frac{1}{TOL} = \frac{1}{1 - R^2}$$
Wherein $R^2$ represents the coefficient of determination, TOL represents the tolerance, and VIF represents the variance inflation factor;
The VIF and TOL of the landslide susceptibility evaluation indexes of the unit area are calculated using the collinearity diagnostics tool in the linear regression analysis of IBM SPSS Statistics 26.0.0 software, and the evaluation indexes with high collinearity are eliminated; the 3 selected nominal evaluation indexes and the 7 numerical evaluation indexes, for which the optimal interval division method is obtained by the geographic detector model, do not show strong collinearity and are used as input variables for constructing the landslide susceptibility prediction model.
6. The dynamic landslide susceptibility evaluation method of claim 1, wherein the information quantity model has the calculation formula:
Wherein $I(y, x_1, x_2, \ldots, x_n)$ represents the information quantity provided by the evaluation index combination $x_1, x_2, \ldots, x_n$ for geological disasters; $P(y \mid x_1, x_2, \ldots, x_n)$ represents the probability of occurrence of landslide hazard under the condition of the evaluation index combination $x_1, x_2, \ldots, x_n$; $P(y)$ represents the probability of occurrence of landslide hazard; $I_{x_1}(y, x_2)$ represents the information quantity provided by the evaluation index $x_2$ for the landslide susceptibility evaluation result when the evaluation index $x_1$ is present;
When $I(y, x_1, x_2, \ldots, x_n) > 0$, the combination of factors is favorable to the occurrence of landslide hazard; otherwise, when $I(y, x_1, x_2, \ldots, x_n) < 0$, the combination of factors is unfavorable to the occurrence of landslide hazard;
calculating the information quantity values provided for landslide disasters by the different grade states of each evaluation index, wherein the expression is as follows:
Wherein $P(x_i, D)$ represents the probability of occurrence in a certain state grade of the evaluation index $x_i$; $P(x_i)$ represents the landslide hazard distribution probability;
Calculating the information quantity of each evaluation index, wherein the expression is as follows:
Wherein $n_i$ represents the number of geological disasters developed under a certain grade state ($x_i$) of a certain evaluation index; $n$ represents the total number of geological disasters occurring in the whole unit area; $s_i$ represents the number of grid units in a certain grade state ($x_i$) of a certain evaluation index; $s$ represents the number of grid units of the whole unit area;
the total information quantity value in each grid unit is calculated by the following expression:
wherein n represents the number of evaluation indexes involved in landslide susceptibility evaluation.
7. The dynamic landslide susceptibility evaluation method of claim 1, wherein, for the non-landslide negative samples, a preliminary prediction of the landslide susceptibility of the unit area region is made using the weighted information quantity model, and non-landslide sample points equal in number to the landslide samples are randomly selected in the extremely low susceptibility zone of the prediction result; the landslide susceptibility evaluation of the unit area region is completed using machine learning models based on grid units; and the non-landslide points equal in number to the landslide points are randomly selected from the landslide susceptibility zoning result of the weighted information quantity model using the Create Random Points tool in ArcGIS Pro 3.0 software.
8. The dynamic landslide susceptibility evaluation method of claim 1, wherein, for the XGBoost model, a landslide susceptibility evaluation model based on the XGBoost model is established using randomly selected training set data, and the prediction accuracy of the model is verified using test set data; the n_estimators and max_depth parameters of the model are optimized with the dung beetle algorithm to improve the accuracy of the model, n_estimators representing the number of trees and max_depth representing the maximum depth of a tree; each tree in the XGBoost model is limited in depth, and overfitting is prevented by limiting the depth of the trees; in the process of optimizing the XGBoost model with the dung beetle algorithm, the number of iterations of the algorithm is set to 50 and the model accuracy obtained in each iteration is calculated; and after the iterations are completed, the XGBoost model under the optimal parameter combination is obtained.
9. The dynamic landslide susceptibility evaluation method of claim 1, wherein, for the support vector machine model, a landslide susceptibility evaluation model based on the support vector machine model is established using randomly selected training set data, and the prediction accuracy of the model is verified using test set data; the C and gamma parameters of the model are optimized with the dung beetle algorithm to improve the accuracy of the model; C represents the penalty coefficient and determines the tolerance for sample data that violate the margin: when the C value is small, the model penalizes misclassified points lightly; when the C value is large, the model reduces the number of misclassified samples and the decision boundary has a narrower margin; the gamma parameter controls the distribution of the data after mapping to the high-dimensional space, the degree to which the model fits the training data increases with gamma, and an excessively large gamma causes overfitting of the model.
10. The dynamic landslide susceptibility evaluation method of claim 1, wherein, for the random forest model, a landslide susceptibility evaluation model based on the random forest model is established using randomly selected training set data, and the prediction accuracy of the model is verified using test set data; the n_estimators and max_depth parameters of the model are optimized with the dung beetle algorithm to improve the accuracy of the model; in the process of optimizing the random forest model with the dung beetle algorithm, the number of iterations of the algorithm is set to 50 and the model accuracy obtained in each iteration is calculated; after the iterations are completed, the random forest model under the optimal parameter combination is obtained and the model is rebuilt with the obtained optimal parameters; after the model training is completed, the whole data set of the unit area is input into the model to evaluate the landslide susceptibility of the unit area and output the prediction result, finally yielding the landslide occurrence probability of all grid units of the whole unit area; the closer the predicted value of a grid unit is to 0, the more stable the grid unit and the smaller the probability of landslide occurrence; conversely, the closer the predicted value is to 1, the more unstable the grid unit and the greater the probability of landslide occurrence.
CN202410468295.9A 2024-04-18 2024-04-18 Dynamic landslide susceptibility evaluation method Pending CN118296961A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410468295.9A CN118296961A (en) 2024-04-18 2024-04-18 Dynamic landslide susceptibility evaluation method

Publications (1)

Publication Number Publication Date
CN118296961A true CN118296961A (en) 2024-07-05

Family

ID=91676391

Country Status (1)

Country Link
CN (1) CN118296961A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination