CN115905818A

CN115905818A - Landslide early warning method based on data mining

Info

Publication number: CN115905818A
Application number: CN202310142543.6A
Authority: CN
Inventors: 王海英; 李智; 陶建宇; 王晨光; 敖杨
Original assignee: Changan University
Current assignee: Changan University
Priority date: 2023-02-21
Filing date: 2023-02-21
Publication date: 2023-04-04

Abstract

The invention discloses a landslide early warning method based on data mining, comprising the following steps: acquiring historical monitoring data of a target position and storing it in a database; performing variational mode decomposition (VMD) on the acquired surface absolute displacement data to obtain a finite number of intrinsic mode functions (IMFs); performing wavelet threshold denoising on each IMF and reconstructing the denoised components to obtain denoised, reconstructed displacement data; predicting displacement from the denoised data using a long short-term memory (LSTM) neural network; and constructing a deep belief network (DBN) model that uses the landslide displacement prediction data to warn of whether a landslide will occur. The method addresses the poor prediction accuracy of prior-art landslide displacement early warning and thereby provides a more comprehensive and scientific early warning method.

Description

Landslide early warning method based on data mining

Technical Field

The invention relates to a geological disaster monitoring and early warning technology, in particular to a landslide early warning method based on data mining.

Background

At present, landslide disasters cause great harm to public infrastructure. Landslides are typically concealed, highly destructive and strongly random, and traditional investigation and supervision rely mainly on manual surveys. With the development of science and technology, a computer system can learn from historical monitoring data of a landslide body to estimate its displacement after a period of time, so that the location of a future landslide can be reasonably predicted and an early warning issued; a response plan can then be made as early as possible to protect lives and property.

Regarding data monitoring: in engineering practice, cost and manpower constraints make it difficult to deploy many kinds of sensors at scale, so displacement monitoring is the most widely used monitoring means. Field conditions are complex and changeable, so monitored data usually contain various interference signals; denoising the signal improves the accuracy with which information can be identified and provides a sounder basis for subsequent signal processing. Signal denoising has developed from the classical Fourier transform, which analyses a signal only globally, through the windowed Fourier transform, to wavelet methods, which overcome the shortcomings of the Fourier transform when processing non-stationary and non-linear signals. Building on wavelet denoising, the invention provides a denoising method that combines VMD with wavelet threshold denoising, which can remove much of the interference of Gaussian white noise.

Regarding landslide displacement time-series prediction: early landslide deformation prediction models were mostly built with machine learning and deep learning, for example predicting the deformation trend with a convolutional neural network or a BP neural network, but the results were not ideal. Recurrent neural networks (RNNs), which can process sequences of arbitrary length, were then applied. In practice, however, RNNs suffer from the vanishing gradient problem. To overcome this shortcoming, a special RNN called the long short-term memory (LSTM) network was proposed. Unlike a traditional RNN, the basic unit of the LSTM hidden layer is a memory block. The memory block comprises a memory cell and three gates (a forget gate, an input gate and an output gate), which regulate the flow of information into and out of the memory cell. The input gate controls how much of the input vector enters the memory cell. The forget gate controls whether information from the previous time step is retained or discarded, so that useful information is kept and useless information is filtered out. The output gate controls how much of the cell state is passed on as output. With these three gates, the LSTM handles time-series signals well.
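The gate mechanism described above can be sketched as a single LSTM time step in NumPy. This is a minimal illustration of the standard LSTM equations, not the patent's implementation; the function name `lstm_step`, the stacked weight layout and the gate order are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative layout, not from the patent).

    W: (4*H, D) input weights, U: (4*H, H) recurrent weights, b: (4*H,) biases.
    Assumed gate order in the stacked arrays: forget, input, candidate, output.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0:H])            # forget gate: how much of the old cell state to keep
    i = sigmoid(z[H:2 * H])        # input gate: how much new information to admit
    g = np.tanh(z[2 * H:3 * H])    # candidate values for the cell state
    o = sigmoid(z[3 * H:4 * H])    # output gate: how much of the cell state to expose
    c_t = f * c_prev + i * g       # updated memory cell
    h_t = o * np.tanh(c_t)         # hidden state passed to the next step
    return h_t, c_t

# Run three displacement readings (made-up values) through one cell.
rng = np.random.default_rng(0)
D, H = 1, 4
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in [0.1, 0.2, 0.3]:
    h, c = lstm_step(np.array([x]), h, c, W, U, b)
```

Because the cell state is updated additively (`f * c_prev + i * g`), gradients can flow across many time steps, which is why the LSTM is less prone to vanishing gradients than a plain RNN.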

Regarding landslide early warning: intelligent early warning methods are more accurate than traditional landslide warning methods, but each still has its own problems and limitations. For example, the expert knowledge required by an expert system is difficult to obtain; an artificial neural network easily falls into a local optimum; and a support vector machine is inherently a binary classifier, so when it is applied to a multi-class problem some samples fall into overlapping or unclassifiable regions and classification efficiency is low. Compared with shallow machine learning methods, deep learning methods such as the deep belief network (DBN) and the convolutional neural network (CNN) have stronger feature extraction capability and fault tolerance, achieve better classification results, and have broad application prospects in landslide early warning.

Disclosure of Invention

The main object of the invention is to provide a landslide early warning method based on data mining that addresses the poor prediction accuracy of prior-art landslide displacement early warning. A deep belief network issues the warning by combining historical landslide monitoring data with real-time landslide displacement prediction data, thereby providing a more comprehensive and scientific early warning method.

The technical scheme adopted by the invention is as follows: a landslide early warning method based on data mining comprises the following steps:

step 1, acquiring historical monitoring data of a target position, and storing the acquired data in a database;

step 2, performing variational mode decomposition (VMD) on the acquired surface absolute displacement data to obtain a finite number of intrinsic mode functions (IMFs);

step 3, performing wavelet threshold denoising on each IMF obtained by the decomposition, and reconstructing the denoised components to obtain denoised, reconstructed displacement data;

step 4, predicting displacement from the denoised displacement data using a long short-term memory (LSTM) neural network;

step 5, constructing a deep belief network (DBN) model, and using the landslide displacement prediction data to warn of whether a landslide will occur.

Further, in step 1, acquiring historical monitoring data of the target position and storing it in a database comprises: taking six disaster-causing factors as research data, namely slope gradient, slope shape, 1-hour rainfall, 24-hour rainfall, surface absolute displacement monitoring data and crack-meter displacement; the acquired data is transmitted to a remote server over a wireless network and stored in a database.

Furthermore, in step 2, variational mode decomposition (VMD) is performed on the acquired landslide displacement time series to obtain a finite number of intrinsic mode functions (IMFs); the key to decomposing the displacement signal is determining the number of modes K, as follows:

step 21: residual analysis. For each candidate value of K, the landslide monitoring data is taken as the original input sequence and VMD decomposition and reconstruction are performed; the deviation between each reconstructed sequence and the original input sequence is measured by their root mean square error (RMSE):

$$\mathrm{RMSE}=\sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(x_t-\hat{x}_t\right)^2}$$

where $x_t$ is the original input sequence, $\hat{x}_t$ is the reconstructed sequence and $N$ is the sequence length;

step 22: ADF stationarity analysis. According to the ADF values of the analysed sequences, the K value that minimizes the likelihood of a non-stationary component or sequence is selected;

step 23: correlation analysis. The correlation coefficients between the original input sequence and each IMF and the residual (Re) are analysed for the different K values, and the value of K is finally determined; the displacement data is then input to the VMD model to obtain K IMFs.
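The residual analysis (step 21) and correlation analysis (step 23) can be sketched as follows. This is an illustrative sketch only: it assumes the IMFs for a given K have already been produced by some VMD implementation (not shown) as an array `modes` of shape (K, N), and the function names are hypothetical.

```python
import numpy as np

def reconstruction_rmse(original, modes):
    """RMSE between the original sequence and the sum of its modes."""
    recon = np.sum(modes, axis=0)
    return float(np.sqrt(np.mean((original - recon) ** 2)))

def mode_correlations(original, modes):
    """Pearson correlation of each mode, then of the residual, with the original."""
    residual = original - np.sum(modes, axis=0)
    out = []
    for row in list(modes) + [residual]:
        if np.std(row) == 0.0:          # constant row: correlation undefined, report 0
            out.append(0.0)
        else:
            out.append(float(np.corrcoef(original, row)[0, 1]))
    return out

# Synthetic stand-in for a K=2 decomposition: a slow trend plus a periodic term.
t = np.linspace(0, 10, 200)
trend = 0.05 * t
seasonal = np.sin(2 * np.pi * t)
rng = np.random.default_rng(0)
noise = 0.01 * rng.normal(size=t.size)
signal = trend + seasonal + noise
modes = np.vstack([trend, seasonal])

rmse = reconstruction_rmse(signal, modes)
corrs = mode_correlations(signal, modes)   # [corr(IMF1), corr(IMF2), corr(residual)]
```

Repeating this for K = 2, 3, ... gives one RMSE and one list of correlations per candidate K, which can then be compared alongside the ADF results from step 22.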

Furthermore, in step 3, wavelet threshold denoising is performed on each IMF obtained by the decomposition, and the denoised components are reconstructed to obtain reconstructed displacement data, as follows:

step 31, for landslide displacement monitoring data, a compactly supported Daubechies wavelet (dbN) is preferred as the wavelet basis function, and the processing effect is best with 3 decomposition levels; denoising is performed with the wavelet soft threshold method, and Rigorous SURE is selected as the threshold rule;

step 32, performing a wavelet transform on the IMF component using the db3 wavelet to obtain a set of wavelet coefficients for each decomposition level;

step 33, comparing the wavelet coefficients obtained by the decomposition with the threshold given by the selected rule: a coefficient greater than the threshold is considered to consist mainly of useful signal and is retained; otherwise it is considered to consist mainly of noise and is discarded;

step 34, reconstructing the retained coefficients to obtain the denoised IMF component;

step 35, repeating steps 32 to 34 for each IMF component until all IMF components have undergone wavelet threshold denoising;

step 36, reconstructing the denoised IMF components to obtain the denoised, reconstructed landslide displacement signal.
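The thresholding part of these steps can be sketched in NumPy. The sketch below assumes the wavelet coefficients of one IMF are already available (the db3, 3-level transform itself would come from a wavelet library and is not shown) and that they have been scaled so the noise has unit variance. The threshold selection follows the usual Stein's unbiased risk estimate (SURE) recipe; the function names are hypothetical.

```python
import numpy as np

def sure_threshold(coeffs):
    """Threshold that minimises Stein's unbiased risk estimate.

    Assumes the coefficients are scaled so that the noise has unit variance.
    """
    x2 = np.sort(np.abs(coeffs)) ** 2          # squared magnitudes, ascending
    n = x2.size
    k = np.arange(1, n + 1)
    # Estimated risk if the k-th smallest magnitude is used as the threshold.
    risks = (n - 2 * k + np.cumsum(x2) + (n - k) * x2) / n
    return float(np.sqrt(x2[np.argmin(risks)]))

def soft_threshold(coeffs, thr):
    """Zero coefficients at or below thr and shrink the rest toward zero by thr."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)

detail = np.array([0.1, -0.2, 5.0])            # made-up detail coefficients
thr = sure_threshold(detail)
denoised = soft_threshold(detail, thr)
```

Note that soft thresholding does slightly more than the keep-or-discard rule in step 33: coefficients above the threshold are retained but also reduced in magnitude by the threshold, which avoids the discontinuity of hard thresholding.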

Furthermore, in step 4, displacement is predicted from the denoised displacement data using a long short-term memory (LSTM) neural network, as follows:

step 41, establishing an LSTM landslide displacement time-series prediction model consisting of an input layer, two hidden layers and an output layer;

step 42, data segmentation: dividing the denoised, reconstructed displacement data into an 80% training set and a 20% validation set, and importing the data into the LSTM prediction model;

step 43, determination of hyper-parameters: the number of hidden-layer nodes of the neural network is determined by an empirical equation:

$$N_h=\frac{N_s}{\alpha\,(N_i+N_o)}$$

where $N_i$ is the number of input-layer neurons; $N_o$ is the number of output-layer neurons; $N_s$ is the number of samples in the training set; and $\alpha$ is an integer between 2 and 10;

step 44, the number of neurons given by the empirical formula is only a starting point, and a more accurate result is obtained only through repeated testing; to compare model prediction results under different parameters directly, the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the root mean square error (RMSE) are introduced:

$$\mathrm{MAE}=\frac{1}{L}\sum_{i=1}^{L}\left|\hat{y}_i-y_i\right|,\qquad \mathrm{MAPE}=\frac{100\%}{L}\sum_{i=1}^{L}\left|\frac{\hat{y}_i-y_i}{y_i}\right|,\qquad \mathrm{RMSE}=\sqrt{\frac{1}{L}\sum_{i=1}^{L}\left(\hat{y}_i-y_i\right)^2}$$

where $\hat{y}_i$ is the predicted value for the i-th group, $y_i$ is the true value at the (m+1)-th time step in the i-th group of samples, and L is the number of samples used in one iteration;

step 45, adjusting the parameters repeatedly to obtain the optimal LSTM model parameters.
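Steps 43 and 44 can be sketched as small helper functions. The hidden-node formula here is an assumption: it uses the common rule of thumb N_h = N_s / (alpha * (N_i + N_o)), which matches the four quantities the text defines but is not confirmed by the source. The three error metrics follow their standard definitions; all function names are hypothetical.

```python
import numpy as np

def hidden_nodes(n_samples, n_in, n_out, alpha):
    """Rule-of-thumb hidden size (assumed form): N_s / (alpha * (N_i + N_o))."""
    return max(1, round(n_samples / (alpha * (n_in + n_out))))

def mae(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_pred - y_true)))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; y_true must not contain zeros."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_pred - y_true) / y_true)) * 100.0)

def rmse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Starting points for the hidden size with 248 training samples, 1 input, 1 output.
candidates = {alpha: hidden_nodes(248, 1, 1, alpha) for alpha in range(2, 11)}

# Made-up true and predicted displacements for illustration.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.0, 3.0, 2.0]
scores = (mae(y_true, y_pred), mape(y_true, y_pred), rmse(y_true, y_pred))
```

The candidate sizes only bound the search; each candidate would still be trained and compared on the validation set with the three metrics, as step 44 describes.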

Furthermore, in step 5, a deep belief network (DBN) model is constructed from the historical monitoring data, and the landslide displacement prediction data is then input to the model to warn of whether a landslide will occur; the model structure is as follows:

the structure of the DBN model is: 3 RBM networks, 1 BP network and 1 Softmax classifier connected in sequence;

the input is: 1-hour and 24-hour rainfall, the gradient and shape of the slope, crack-meter displacement, surface absolute displacement, and the corresponding warning category;

the output is the warning category given by the Softmax classifier, one of 4 states: level-1 warning, level-2 warning, level-3 warning and no warning;

a Dropout algorithm is introduced in the unsupervised pre-training stage of the DBN for regularization: during pre-training, with the input and output of the neural network unchanged, hidden-layer nodes are randomly dropped with a given probability, so that in each update some neurons do not take part in forward propagation; the Dropout probability is set to 50%.
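The Dropout step can be sketched as a random binary mask applied to a layer's activations. This is a minimal illustration with a hypothetical function name; the source does not say whether the surviving activations are rescaled, so no rescaling is applied here.

```python
import numpy as np

def dropout(activations, p_drop, rng):
    """Zero each unit independently with probability p_drop.

    Returns the masked activations and the boolean keep-mask.
    No rescaling of surviving units is applied (an assumption of this sketch).
    """
    keep = rng.random(activations.shape) >= p_drop
    return activations * keep, keep

rng = np.random.default_rng(0)
hidden = np.arange(1.0, 9.0)                 # made-up hidden activations
dropped, keep = dropout(hidden, 0.5, rng)    # 50% Dropout, as in the text
```

A fresh mask is drawn at every update, so a different subset of neurons sits out each forward pass and no single neuron can rely on a fixed set of partners.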

Furthermore, in step 5, the deep belief network (DBN) combines historical landslide monitoring data, slope support information and landslide displacement prediction data to warn of whether a landslide will occur; the DBN is implemented as follows:

step 51, calculating a coefficient for each influencing factor of the landslide disaster factors using the certainty factor (CF), and normalizing the landslide data to the interval [0,1] using Mapminmax normalization;

step 52, dividing the collected and processed groups of historical data into training samples and test samples;

step 53, inputting the data to the model, initializing the connection weights and bias values between the layers of the restricted Boltzmann machine (RBM) network with an unsupervised greedy algorithm, and pre-training the deep belief network; to prevent co-adaptation ("laziness") among neural nodes and overfitting, 50% dropout is introduced in the visible layer;

step 54, selecting an activation function for the hidden layer, and running Gibbs sampling and the contrastive divergence algorithm to pre-train each layer over multiple iterations and update the model parameters, yielding the deep belief network model;

step 55, fine-tuning and optimizing the parameters of the pre-trained deep belief network from top to bottom using the BP neural network algorithm and gradient descent;

step 56, passing the output of the DBN through the Softmax classifier to determine the warning category, and validating the trained model on the test samples to obtain the optimal early warning model;

step 57, inputting the landslide displacement prediction data to the optimal model and obtaining the landslide warning level from the Softmax classifier.
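Steps 51 and 54 can be sketched together: min-max scaling of the inputs to [0,1], followed by one contrastive divergence (CD-1) update of a single RBM. This is a simplified sketch under stated assumptions, not the patent's implementation: it uses a Bernoulli RBM with sigmoid units, one Gibbs step, made-up random data in place of the six monitoring factors, and hypothetical function names.

```python
import numpy as np

def minmax_01(X):
    """Column-wise min-max scaling to [0, 1]; constant columns map to 0."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b_vis, b_hid, lr, rng):
    """One CD-1 parameter update for an RBM. v0: (n, V) batch, W: (V, H)."""
    # Positive phase: hidden probabilities given the data, then a binary sample.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # One Gibbs step: reconstruct the visible layer, then recompute the hidden layer.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)
    # Gradient estimate: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    W = W + lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    b_vis = b_vis + lr * np.mean(v0 - p_v1, axis=0)
    b_hid = b_hid + lr * np.mean(p_h0 - p_h1, axis=0)
    return W, b_vis, b_hid

rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 50.0, size=(20, 6))   # placeholder for 20 records of 6 factors
V = minmax_01(raw)
W = rng.normal(scale=0.01, size=(6, 4))      # 6 visible units, 4 hidden units (assumed)
b_vis, b_hid = np.zeros(6), np.zeros(4)
W, b_vis, b_hid = cd1_update(V, W, b_vis, b_hid, lr=0.1, rng=rng)
```

In a stacked DBN, this update would be repeated for many iterations on the first RBM, whose hidden probabilities then serve as the visible data for the next RBM, before the supervised fine-tuning of step 55.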

The invention has the advantages that:

the landslide early warning method based on data mining addresses the poor prediction accuracy of prior-art landslide displacement early warning; a deep belief network issues the warning by combining historical landslide monitoring data with real-time landslide displacement prediction data, thereby providing a more comprehensive and scientific early warning method.

In addition to the above-described objects, features and advantages, the present invention has other objects, features and advantages. The present invention will be described in further detail below with reference to the drawings.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention.

FIG. 1 is a flow chart of the variational mode decomposition and wavelet threshold denoising of the present invention;

FIG. 2 is a flow chart of the operation of the long short-term memory neural network of the present invention;

FIG. 3 is a flow chart of the operation of the deep belief network of the present invention;

FIG. 4 is a displacement-time curve after VMD-wavelet denoising according to the present invention;

FIG. 5 is a prediction graph of the LSTM model of the present invention;

FIG. 6 is a graph of the error rate on the DBN test set of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Referring to FIGS. 1 to 6, the landslide early warning method based on data mining comprises steps 1 to 5 and their sub-steps (steps 21 to 23, 31 to 36, 41 to 45 and 51 to 57) as set out in the Disclosure of Invention above. Dropout is introduced in the unsupervised pre-training stage of the DBN in order to counter overfitting during network training. A worked embodiment follows.

Overview of the invention and data sources

Area of investigation

The study is based on the HP21 landslide body (chainage K60+806 to K61+449) in the Tao Yuhe south bank slope zone of contract section LJ-15 of the southern section of the Xi'an Outer Ring Expressway; the contract section has a total length of 3.393 km. The landslide developed in middle-to-late Quaternary loess. Under rainstorm or continuous rainfall, rainwater seeps along the loess pores to the top surface of the mudstone; the Tertiary mudstone is dense and forms an aquiclude, the shear strength of the saturated soil attenuates rapidly, and a landslide forms under continuous rainfall. According to investigation, since the Wenchuan earthquake in 2008 the secondary landslides HP21-1 and HP21-2 have slid to varying degrees, causing houses to crack and tilt, and analysis shows that HP21-1 and HP21-2 are currently in a marginally stable state. The shear outlet at the front edge of the main landslide lies 5 to 10 m below the modern river channel, there is no free face for sliding, and the landslide as a whole is in a stable state.

Data acquisition

Landslides are caused by multiple disaster-causing factors, so the invention selects the data to collect with regard to both the external geographic environment and the cost of collection: six disaster-causing factors are used as research data, namely slope gradient, slope shape, 1-hour rainfall, 24-hour rainfall, surface absolute displacement monitoring data and crack-meter displacement. The acquired data is transmitted to a remote server over a wireless network and stored in a database.
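One possible way to store the six factors is sketched below with Python's built-in `sqlite3` module. The source does not specify the database, schema, column names or units, so all of these are assumptions for illustration, and the inserted row contains made-up placeholder values rather than monitoring data.

```python
import sqlite3

# In-memory database for the sketch; a deployment would use a server-side database.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE monitoring (
        recorded_at             TEXT PRIMARY KEY,  -- hourly timestamp
        slope_gradient_deg      REAL,              -- slope gradient (unit assumed)
        slope_shape             TEXT,              -- slope shape class (labels assumed)
        rain_1h_mm              REAL,              -- 1-hour rainfall
        rain_24h_mm             REAL,              -- 24-hour rainfall
        surface_displacement_mm REAL,              -- surface absolute displacement
        crack_displacement_mm   REAL               -- crack-meter displacement
    )
    """
)

# Placeholder record (made-up values) standing in for one hourly upload.
record = ("2022-08-01T00:00", 35.0, "convex", 0.0, 1.2, 0.4, 0.1)
conn.execute("INSERT INTO monitoring VALUES (?, ?, ?, ?, ?, ?, ?)", record)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM monitoring").fetchone()[0]
series = [row[0] for row in conn.execute(
    "SELECT surface_displacement_mm FROM monitoring ORDER BY recorded_at")]
```

Keying rows by timestamp keeps the hourly series ordered, so the surface displacement column can be read back directly as the time series used for denoising and prediction.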

Variational modal decomposition

In the landslide early warning model based on the deep belief network, the acquired surface absolute displacement data is first used for prediction, an early warning model is then constructed from the acquired monitoring data, and finally the predicted displacement data is fed to the optimal early warning model to warn in real time of whether a landslide will occur. Before the monitoring data is used for displacement prediction, abnormal values must be removed and the data must be denoised; denoising is performed by variational mode decomposition followed by wavelet threshold denoising. The data of the invention was recorded once every hour from August 1, 2022 to August 14, 2022, giving 311 groups of data in total; the monitoring data of the first two days is listed in Table 1:

[Table 1: monitoring data of the first two days]

Variational mode decomposition (VMD) is a non-recursive signal processing method. By iteratively searching for the optimal solution of a variational model, it decomposes time-series data into a series of band-limited intrinsic mode functions (IMFs) and adaptively updates the optimal center frequency and bandwidth of each IMF. VMD also has good noise robustness and can overcome the problem of frequency aliasing. Variational mode decomposition is performed on the acquired landslide displacement data to obtain a finite number of IMFs. The input parameters of VMD are f, alpha, tau, K, DC, init and tol, where f is the time-series signal to be decomposed; alpha is the mode constraint strength, i.e. the bandwidth limit applied when the signal is decomposed into different IMFs; tau is the noise tolerance; K is the number of modes to decompose into; DC indicates whether a mode is a direct-current component; init is the initialization setting for the center frequency of each IMF; and tol is the convergence tolerance. The number of modes K must be chosen with care, because it determines whether under-decomposition or mode overlap occurs during the decomposition. The optimal K is determined by residual analysis, ADF stationarity analysis and correlation analysis over different K values, as follows:

step 1: residual analysis. Different values of K are assumed (K = 2, 3, ...); the landslide displacement data is taken as the original input sequence, and VMD decomposition and reconstruction are performed. The deviation between each reconstructed sequence and the original input sequence is measured by their root mean square error (RMSE):

$$\mathrm{RMSE}=\sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(x_t-\hat{x}_t\right)^2}$$

where $x_t$ is the original input sequence, $\hat{x}_t$ is the reconstructed sequence and $N$ is the sequence length.

step 2: ADF stationarity analysis. The ADF test is a classical unit-root test for the stationarity of a time series. In general, the smaller (more negative) the ADF test statistic, the more strongly the null hypothesis of a unit root is rejected and the less likely the sequence is non-stationary; the K value that minimizes the likelihood of a non-stationary component or sequence is therefore selected according to the ADF values of the analysed sequences.

step 3: correlation analysis. The correlation coefficients between the original input sequence and each intrinsic mode function (IMF) and the residual (Re) are analysed for the different K values, and the value of K is finally determined. The displacement data is then input to the VMD model to obtain K IMFs; the experimental parameters are listed in Table 2.

[Table 2: experimental parameters]
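The idea behind the stationarity check in step 2 can be sketched with a plain Dickey-Fuller statistic, which is the ADF test with zero lagged difference terms. This is a simplified stand-in rather than the full ADF test used in the text: a complete implementation adds lagged differences and compares the statistic against Dickey-Fuller critical values. The function name is hypothetical and the data is synthetic.

```python
import numpy as np

def dickey_fuller_stat(y):
    """t-statistic of rho in  dy_t = c + rho * y_{t-1} + e_t  (ADF with zero lags)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(dy.size), y[:-1]])   # constant and lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (dy.size - 2)            # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # OLS covariance of beta
    return float(beta[1] / np.sqrt(cov[1, 1]))

rng = np.random.default_rng(1)
noise = rng.normal(size=300)
stationary_stat = dickey_fuller_stat(noise)              # white noise: stationary
random_walk_stat = dickey_fuller_stat(np.cumsum(noise))  # random walk: unit root
```

The stationary series yields a far more negative statistic than the random walk, which is the property step 2 relies on when comparing the components obtained for different K values.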

Wavelet threshold denoising is performed on each IMF obtained by the decomposition, and the denoised components are reconstructed to obtain the reconstructed surface absolute displacement data:

step 1, based on repeated practical application, a compactly supported Daubechies wavelet (dbN) is preferred as the wavelet basis function for landslide displacement monitoring data, and the processing effect is best with 3 decomposition levels; denoising is performed with the wavelet soft threshold method, and Rigorous SURE is selected as the threshold rule;

step 2, performing a wavelet transform on each intrinsic mode function (IMF) obtained by variational mode decomposition (VMD) using the db3 wavelet to obtain a set of wavelet coefficients for each decomposition level;

step 3, comparing the wavelet coefficients obtained by the decomposition with the selected threshold: a coefficient greater than the threshold is considered to consist mainly of useful signal and is retained; otherwise it is considered to consist mainly of noise and is discarded;

step 4, reconstructing the retained coefficients to obtain the denoised IMF component;

step 5, repeating steps 2 to 4 for each IMF component until all IMF components have undergone wavelet threshold denoising;

step 6, finally reconstructing the denoised IMF components to obtain the denoised landslide displacement signal as their sum:

$$\hat{x}(t)=\sum_{k=1}^{K}\tilde{u}_k(t)$$

where $\tilde{u}_k(t)$ is the k-th denoised IMF and K is the number of modes.

FIG. 4 shows the original data after variational mode decomposition and wavelet threshold denoising. After processing, the input displacement time series retains the peaks and abrupt changes of the original displacement-time curve, while the non-stationarity present in the series is mitigated.

A long short-term memory (LSTM) neural network model is then established with optimized parameters; the original monitoring data and the VMD-wavelet denoised data are each input to the LSTM model for prediction, and the quality of the prediction results is compared.

The displacement data of the target area is predicted with the LSTM model, as follows:

step 1, establishing an LSTM landslide displacement time-series prediction model consisting of an input layer, two hidden layers and an output layer;

step 2, data segmentation: dividing the denoised, reconstructed displacement data into an 80% training set and a 20% validation set, and importing the data into the LSTM prediction model;

step 3, determining the hyper-parameters: when an LSTM prediction model is built, a number of network parameters must be set in advance. These are called hyper-parameters and include the number of hidden layers, the number of nodes in each layer, the neuron dropout rate, the loss function, the activation parameters, the number of iterations and so on. There is currently no fully satisfactory method for selecting LSTM hyper-parameters, so they must be tuned according to test results and tuning experience; the number of hidden-layer nodes is determined with the help of an empirical equation:

$$N_h=\frac{N_s}{\alpha\,(N_i+N_o)}$$

where $N_i$ is the number of input-layer neurons; $N_o$ is the number of output-layer neurons; $N_s$ is the number of samples in the training set; and $\alpha$ is an integer between 2 and 10;

and 4, obtaining the number of the neurons by an empirical formula, and obtaining a more accurate effect only by continuous tests, and introducing an average error (MAE), an average error percentage (MAPE) and a Root Mean Square Error (RMSE) to compare different results in order to visually compare the quality of a model prediction result under different parameters:

；

in the formula

Indicates the ith group predictor, <' > is selected>

And the real value of the m +1 th time in the ith group of samples is shown, and L is the number of samples used in one iteration.

step 5, continuously adjusting the parameters to obtain the optimal LSTM model.
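
Steps 2 and 3 above can be illustrated with a minimal sketch. It assumes that the 80%/20% split is made in time order and that the empirical equation takes the commonly used form N_h = N_s / (alpha * (N_i + N_o)); the function names, the synthetic series and the layer sizes are hypothetical and are not taken from the patent.

```python
def split_series(values, train_fraction=0.8):
    """Split a time series in time order into training and validation parts."""
    cut = int(len(values) * train_fraction)
    return values[:cut], values[cut:]


def hidden_node_candidates(n_train, n_in, n_out, alphas=range(2, 11)):
    """Candidate hidden-layer sizes, assuming N_h = N_s / (alpha * (N_i + N_o))."""
    return {a: max(1, round(n_train / (a * (n_in + n_out)))) for a in alphas}


# Hypothetical denoised displacement series (100 monitoring epochs).
series = [0.1 * t for t in range(100)]
train, val = split_series(series)

# Assumed network shape: 4 inputs (past displacements), 1 output (next value).
candidates = hidden_node_candidates(len(train), n_in=4, n_out=1)
for alpha, n_hidden in candidates.items():
    print(f"alpha={alpha}: about {n_hidden} hidden nodes")
```

Each candidate size is only a starting point for the trial-and-error tuning described in steps 4 and 5.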

In order to analyze the merits of the model, the original displacement data after only simple processing are input into the model as model 1, the data after VMD-wavelet denoising are input into the model as model 2, and displacement prediction is carried out with each. As shown in FIG. 5, the prediction result obtained after the data processing adopted by the invention has obvious advantages and is more consistent with the actual monitoring data.
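
The comparison between model 1 and model 2 relies on the three error measures introduced in step 4. A minimal sketch of these measures in their standard textbook form is given below; the observed and predicted values are made-up numbers for illustration only and are not results from the patent.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (true values must be non-zero)."""
    return 100.0 * sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up displacement values (mm), for illustration only.
observed = [10.0, 20.0, 40.0]
model_1 = [12.0, 16.0, 44.0]   # hypothetical predictions without denoising
model_2 = [10.5, 19.0, 41.0]   # hypothetical predictions with VMD-wavelet denoising

for name, pred in (("model 1", model_1), ("model 2", model_2)):
    print(name, mae(observed, pred), mape(observed, pred), rmse(observed, pred))
```

Lower values of all three measures indicate predictions closer to the monitoring data.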

Model training is carried out on a Deep Belief Network (DBN) by utilizing historical monitoring data of landslide points, and then prediction data obtained by utilizing an LSTM model is input into the trained DBN model to carry out landslide early warning, so that a landslide early warning model is constructed. How to construct the DBN model will be explained in detail below:

the structure of the constructed DBN model is as follows: it comprises 3 restricted Boltzmann machine (RBM) networks, 1 BP network and 1 Softmax classifier, which are connected in sequence;

the output is the early warning category given by the Softmax classifier, which has 4 coded states: 0 (no early warning), 1 (three-level early warning), 2 (two-level early warning) and 3 (one-level early warning); the output codes corresponding to the states are shown in the following table 3:

；

Data preprocessing

The factors influencing geological disasters are numerous and complex, and most of them are described qualitatively and in non-uniform ways, so they cannot be used directly in mathematical calculation. In order to fully evaluate the sensitivity of the different disaster factors, the method uses the certainty factor (CF) to calculate a coefficient for each influencing factor of the landslide disaster.

Firstly, each disaster factor is divided into different subsets according to commonly used classification methods, and then the sensitivity of each disaster factor subset is calculated with the CF formula. The specific expression of the CF function is as follows:

$$CF = \begin{cases} \dfrac{PP_a - PP_s}{PP_a\,(1 - PP_s)}, & PP_a \ge PP_s \\[2ex] \dfrac{PP_a - PP_s}{PP_s\,(1 - PP_a)}, & PP_a < PP_s \end{cases}$$

in the formula, $PP_a$ is the probability of a landslide disaster occurring in class a, and $PP_s$ is the probability of a landslide disaster occurring in the whole research area. The range of CF is [-1, 1]. Positive values represent a high probability of landslide occurrence and poor geological environmental conditions; negative values represent a low probability of landslide occurrence and a good geological environment. A CF value of 0 represents uncertainty as to whether or not a landslide will occur. The sensitivity coefficients of the subsets of disaster factors calculated by means of the certainty factor are shown in table 4 below:

；
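
The certainty-factor calculation described above can be sketched as follows. The sketch assumes the piecewise form of the CF function commonly used in landslide susceptibility studies, with `pp_a` the landslide probability within one factor class and `pp_s` the landslide probability over the whole research area; the function name and the example probabilities are hypothetical.

```python
def certainty_factor(pp_a, pp_s):
    """Certainty factor of one factor class, in the range [-1, 1].

    pp_a: landslide probability within the class (assumed 0 <= pp_a <= 1)
    pp_s: landslide probability over the whole area (assumed 0 < pp_s < 1)
    """
    if not 0.0 < pp_s < 1.0:
        raise ValueError("pp_s must lie strictly between 0 and 1")
    if pp_a >= pp_s:
        return (pp_a - pp_s) / (pp_a * (1.0 - pp_s))
    return (pp_a - pp_s) / (pp_s * (1.0 - pp_a))


# Hypothetical example: landslides are twice as likely in this slope class
# as in the research area as a whole.
cf_steep = certainty_factor(pp_a=0.2, pp_s=0.1)
cf_flat = certainty_factor(pp_a=0.02, pp_s=0.1)
print(cf_steep, cf_flat)
```

A positive result marks a class that favors landslides and a negative result marks a class that does not, matching the interpretation given in the text.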

The data are then normalized: the landslide data and the slope geometric parameters are normalized to the [0,1] interval by the Mapminmax normalization method:

$$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

wherein $y$ is the normalized value, $x$ is the raw data value, and $x_{\max}$ and $x_{\min}$ represent the maximum and minimum values of the data set, respectively.
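
A minimal sketch of min-max normalization to the [0,1] interval is given below. It is a plain Python stand-in rather than a call to the Mapminmax routine itself; the function name and the sample rainfall values are hypothetical, and returning zeros for a constant series is a convention chosen here.

```python
def min_max_normalize(values):
    """Scale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no spread; map it to zeros by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical 24-hour rainfall readings (mm).
rainfall = [10.0, 20.0, 30.0]
normalized = min_max_normalize(rainfall)
print(normalized)
```

Note that MATLAB's `mapminmax` maps to [-1, 1] by default, so the [0,1] range stated in the text would have to be requested explicitly there.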

Building a deep belief network model

In a machine learning model, if the model has too many parameters and too few training samples, the trained model is prone to overfitting. In order to prevent overfitting, a Dropout algorithm is introduced in the unsupervised pre-training stage of the DBN and used for regularization: in the pre-training stage, with the input and output of the neural network kept unchanged, the weights of the hidden layer nodes are randomly sampled with a certain probability, so that at each adjustment some of the neurons do not take part in the forward-propagation training process; the Dropout probability is specifically set to 50%.
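
The Dropout mechanism described above can be sketched as a random binary mask applied to node activations. This is a generic illustration and not the patent's implementation; the function names and the example activations are hypothetical. The mask only silences nodes for one pass, so their weights are left untouched and are available again on the next pass.

```python
import random


def dropout_mask(n_nodes, drop_prob=0.5, rng=random):
    """Return a 0/1 mask; each node is dropped (0) with probability drop_prob."""
    return [0 if rng.random() < drop_prob else 1 for _ in range(n_nodes)]


def apply_dropout(activations, mask):
    """Zero the activations of dropped nodes for the current training pass."""
    return [a * m for a, m in zip(activations, mask)]


rng = random.Random(42)                 # fixed seed so the sketch is repeatable
activations = [0.3, 0.8, 0.5, 0.1, 0.9, 0.7]
mask = dropout_mask(len(activations), drop_prob=0.5, rng=rng)
dropped = apply_dropout(activations, mask)

# A new mask is drawn for every training pass.
next_mask = dropout_mask(len(activations), drop_prob=0.5, rng=rng)
```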

The specific implementation process of training the Deep Belief Network (DBN) comprises the following steps:

step 1, dividing the collected and processed groups of historical data into training samples and test samples;

step 2, inputting data into the model, initializing the connection weights $W$ and the bias values between the layers of the restricted Boltzmann machine network model by an unsupervised greedy algorithm, and pre-training the deep belief network. In order to prevent excessive reliance among neural nodes and overfitting, 50% dropout is introduced in the visible layer: 50% of the nodes are randomly frozen during a training pass while their weights are retained; in the next training pass the frozen nodes are restored with their retained weights, and another subset of nodes is randomly selected to repeat the process. A structural schematic diagram of the deep belief network in the landslide early warning method based on data mining is shown in FIG. 3.

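step 3, selecting an activation function for the hidden layer, and performing multiple iterations of pre-training on each layer with Gibbs sampling and the contrastive divergence algorithm to update the model parameters and obtain the deep belief network model, wherein the parameter update formulas are as follows:

$$\Delta W_{ij} = \varepsilon\left(\langle v_i h_j\rangle_{\mathrm{data}} - \langle v_i h_j\rangle_{\mathrm{recon}}\right),\qquad \Delta a_i = \varepsilon\left(\langle v_i\rangle_{\mathrm{data}} - \langle v_i\rangle_{\mathrm{recon}}\right),\qquad \Delta b_j = \varepsilon\left(\langle h_j\rangle_{\mathrm{data}} - \langle h_j\rangle_{\mathrm{recon}}\right)$$

in the formula, $\Delta W_{ij}$ represents the update value of the weight matrix, $\Delta a_i$ represents the update value of the i-th input layer bias, $\Delta b_j$ represents the update value of the j-th hidden layer bias, $\langle\cdot\rangle_{\mathrm{data}}$ represents the expectation over the visible layer (hidden layer) neurons under the data, and $\langle\cdot\rangle_{\mathrm{recon}}$ represents the corresponding expectation under the reconstruction. The learning rate $\varepsilon$ is 0.0001 to 0.5.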

step 4, the model parameters of the pre-trained deep belief network are further fine-tuned and optimized from top to bottom by the BP neural network algorithm and the gradient descent method. In addition, since the dropout technique is used during training, in the fine-tuning process the error of a node is set to zero with a probability of 50% when the network calculates the node errors. The error of the layer preceding the output layer is estimated from the output error, the error estimates of the remaining layers are obtained layer by layer through back-propagation learning, and the weight of each node is calculated and updated by the gradient descent method, minimizing the reconstruction error layer by layer.

step 5, the output of the DBN is passed through the Softmax classifier to determine the early warning category; the test samples are fed into the trained model for verification, and the optimal early warning model is finally obtained;

step 6, the environment monitoring data and the landslide displacement prediction data are input into the optimal model, and the landslide early warning level is obtained through the Softmax classifier.
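
The pre-training in step 3 can be illustrated with a minimal single-step contrastive divergence (CD-1) update for one restricted Boltzmann machine. This is a generic textbook sketch, not the patent's implementation: the sigmoid activation, the layer sizes, the learning rate of 0.1 (inside the stated range of 0.0001 to 0.5), the variable names and the input vector are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b):
    """P(h_j = 1 | v); W[i][j] links visible unit i to hidden unit j."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


def visible_probs(h, W, a):
    """P(v_i = 1 | h), used here as the reconstruction of the visible layer."""
    return [sigmoid(a[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(a))]


def cd1_step(v0, W, a, b, lr=0.1, rng=random):
    """One CD-1 update of W, a, b in place; returns the reconstruction error."""
    ph0 = hidden_probs(v0, W, b)                              # data-driven phase
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]      # Gibbs sample of h
    v1 = visible_probs(h0, W, a)                              # reconstruction
    ph1 = hidden_probs(v1, W, b)                              # reconstruction phase
    for i in range(len(a)):
        for j in range(len(b)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        a[i] += lr * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += lr * (ph0[j] - ph1[j])
    return sum((x - y) ** 2 for x, y in zip(v0, v1))


rng = random.Random(0)
n_visible, n_hidden = 6, 4           # assumed sizes: 6 normalized input factors
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
a = [0.0] * n_visible                # visible (input layer) biases
b = [0.0] * n_hidden                 # hidden layer biases
sample_input = [0.2, 0.5, 0.1, 0.9, 0.4, 0.3]   # made-up normalized factors
errors = [cd1_step(sample_input, W, a, b, lr=0.1, rng=rng) for _ in range(200)]
```

In a stacked network the hidden probabilities of one trained machine would serve as the visible input of the next, which is the greedy layer-by-layer scheme the text refers to.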

According to the method, 6 disaster-causing factors, namely gradient, slope shape, 1-hour rainfall, 24-hour rainfall, surface absolute displacement monitoring data and crack-meter displacement monitoring data, are selected as the input of the neural network; the weights of the hidden layer nodes are randomly sampled with a certain probability by the Dropout mechanism to avoid overfitting; and finally the parameter values of each layer are fine-tuned and optimized by the BP algorithm. As shown in FIG. 6, the DBN model with the Dropout mechanism has a lower early warning error rate and a better effect.
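
The final classification stage can be sketched as a Softmax over four output scores followed by a lookup of the warning state, using the state numbering given in the description (0 to 3). The function names and the example scores are hypothetical; in the method itself the scores would come from the trained DBN.

```python
import math

# State numbering as given in the description.
WARNING_LEVELS = {
    0: "no early warning",
    1: "three-level early warning",
    2: "two-level early warning",
    3: "one-level early warning",
}


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)                       # subtract the maximum for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return (state index, state name, probabilities) for one set of scores."""
    probs = softmax(scores)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, WARNING_LEVELS[label], probs


# Hypothetical output scores of the last network layer.
label, name, probs = classify([0.1, 0.2, 3.0, 0.5])
print(label, name)
```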

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A landslide early warning method based on data mining is characterized by comprising the following steps:

step 1, acquiring historical monitoring data of a target position, and storing the acquired data in a database;

step 2, performing variational mode decomposition (VMD) on the obtained surface absolute displacement data to obtain a limited number of intrinsic mode components;

step 3, performing wavelet threshold denoising on each intrinsic mode component obtained by decomposition, and finally reconstructing the denoised components to obtain denoised and reconstructed displacement data;

step 4, performing displacement prediction on the denoised displacement data by using a long short-term memory (LSTM) neural network method;

step 5, constructing a deep belief network model, and using the landslide displacement prediction data to give early warning of whether a landslide will occur.

2. The data mining-based landslide early warning method according to claim 1, wherein in step 1, acquiring historical monitoring data of a target position and storing the acquired data in a database comprises: taking 6 disaster-causing factors, namely gradient, slope shape, 1-hour rainfall, 24-hour rainfall, surface absolute displacement monitoring data and crack-meter displacement monitoring data, as research data; and transmitting the acquired data to a remote server through wireless network technology and storing them in a database.

3. The data mining-based landslide early warning method according to claim 1, wherein in step 2, variational mode decomposition (VMD) is performed on the obtained time-series landslide displacement data to obtain a limited number of intrinsic mode components; the key to decomposing the displacement signal is determining the number of modes K, and the process is as follows:

step 21 residual error analysis

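Assuming different K values, the landslide monitoring data are input as the original input sequence, and VMD decomposition and reconstruction are performed; the deviation between the reconstructed sequence and the original sequence is measured by the root mean square error between each reconstructed sequence and the original input sequence:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(x(t) - \hat{x}(t)\right)^2}$$

where $x(t)$ is the original input sequence, $\hat{x}(t)$ is the reconstructed sequence and $N$ is the sequence length;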

step 23: analyzing the inherent modal components and the correlation coefficient between the residual error and the original input sequence under different K values, and finally determining the size of the K value; and inputting the displacement data into the VMD model to obtain K intrinsic mode component data.

4. The data mining-based landslide early warning method according to claim 1, wherein in step 3, wavelet threshold denoising is performed on each IMF obtained by decomposition, and finally the denoised components are reconstructed to obtain the reconstructed displacement data, and the specific process is as follows:

step 31, for landslide displacement monitoring data, a compactly supported biorthogonal wavelet is preferably selected as the wavelet basis function, the best processing effect being achieved when the number of decomposition layers is 3; denoising is then performed by the wavelet soft threshold method, with Rigorous SURE selected as the threshold rule;

5. The landslide early warning method based on data mining according to claim 1, wherein in step 4, displacement prediction is performed on the denoised displacement data by using a long short-term memory (LSTM) neural network method, and the specific process is as follows:

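$$N_h = \frac{N_s}{\alpha\,(N_i + N_o)},$$

in the formula, $N_h$ is the number of hidden layer nodes; $N_i$ is the number of input layer neurons; $N_o$ is the number of neurons in the output layer; $N_s$ is the number of samples in the training set; and $\alpha$ is an integer between 2 and 10;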

step 44, the number of neurons given by the empirical formula is refined through repeated tests to obtain a more accurate result, and in order to compare intuitively the quality of the model prediction results under different parameters, the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the root mean square error (RMSE) are introduced:

$$\mathrm{MAE} = \frac{1}{L}\sum_{i=1}^{L}\left|\hat{y}_i - y_i\right|,\qquad \mathrm{MAPE} = \frac{100\%}{L}\sum_{i=1}^{L}\left|\frac{\hat{y}_i - y_i}{y_i}\right|,\qquad \mathrm{RMSE} = \sqrt{\frac{1}{L}\sum_{i=1}^{L}\left(\hat{y}_i - y_i\right)^2}$$

in the formula, $\hat{y}_i$ represents the predicted value of the i-th group, $y_i$ represents the true value at the (m+1)-th time in the i-th group of samples, and $L$ is the number of samples used in one iteration;

6. The landslide early warning method based on data mining according to claim 1, wherein in step 5, a deep belief network model is constructed by using historical monitoring data, and landslide displacement prediction data are input into the model to give early warning of whether a landslide will occur, wherein the model structure is as follows:

the output is the early warning category given by the Softmax classifier, which has 4 states: one-level early warning, two-level early warning, three-level early warning and no early warning;

a Dropout algorithm is introduced in the unsupervised pre-training stage of the DBN and used for regularization: in the pre-training stage, with the input and output of the neural network kept unchanged, the weights of the hidden layer nodes are randomly sampled with a certain probability, so that at each adjustment some of the neurons do not take part in the forward-propagation training process; the Dropout probability is specifically set to 50%.

7. The landslide early warning method based on data mining according to claim 1, wherein in step 5, the deep belief network is used to combine landslide historical monitoring data, slope support information data and landslide displacement prediction data to give early warning of whether a landslide will occur, and the specific implementation process of the deep belief network is as follows:

step 51, calculating the coefficient of each influencing factor of the landslide disaster by using the certainty factor, and normalizing the landslide data to the [0,1] interval by the Mapminmax normalization method;

step 53, inputting data into the model, and initializing the connection weights $W$ and the bias values between the layers of the restricted Boltzmann machine network model by an unsupervised greedy algorithm;

step 54, selecting an activation function for the hidden layer, and performing multiple iterations of pre-training on each layer with Gibbs sampling and the contrastive divergence algorithm to update the model parameters and obtain the deep belief network model;