CN114358192A

CN114358192A - Multi-source heterogeneous landslide data monitoring and fusing method

Info

Publication number: CN114358192A
Application number: CN202210013094.0A
Authority: CN
Inventors: 王利; 张懿恺; 许豪; 赵超英; 刘万林; 成伟
Original assignee: Changan University
Current assignee: Changan University
Priority date: 2022-01-06
Filing date: 2022-01-06
Publication date: 2022-04-15
Anticipated expiration: 2042-01-06
Also published as: CN114358192B

Abstract

The invention discloses a multi-source heterogeneous landslide data monitoring and fusion method, comprising the following steps: a weighted correlation degree is obtained by combining the maximum mutual information coefficient (MIC) method with grey correlation analysis (GRA), reflecting both the influence of the multi-source heterogeneous landslide monitoring variables on landslide deformation displacement and their common trend of change; characteristic factors are then selected according to the weighted correlation degree; stepwise regression fitting analysis is performed on the selected characteristic factors to obtain the corresponding regression coefficients, from which the final regression equation and the fusion result are calculated; and the reliability and effectiveness of the data fusion result are evaluated by landslide stage discrimination and trend prediction. By analyzing and processing multivariate data through multi-source heterogeneous data fusion, the method obtains a more reliable and accurate fusion result and provides valuable reference information for landslide prediction, thereby improving the precision of landslide prediction.

Description

Multi-source heterogeneous landslide data monitoring and fusing method

Technical Field

The invention relates to the technical field of data fusion, in particular to a multi-source heterogeneous landslide data monitoring fusion method.

Background

The multi-source data fusion technology is used as an emerging interdisciplinary subject in multiple fields, and has been widely applied to landslide deformation monitoring after more than ten years of exploration and development. With the appearance of a plurality of sensors, how to extract comprehensive information of multi-source heterogeneous sensor information and perform effective fusion processing is a current research difficulty. In landslide monitoring, single sensor information cannot comprehensively reflect the deformation characteristics of landslide, and the obtained prediction result is not high in reliability, so that effective feature extraction and comprehensive analysis processing need to be carried out by combining multi-source heterogeneous sensor information to eliminate redundancy and mutual exclusion among data, and a more reliable and accurate prediction result can be obtained.

At present, effective feature extraction and comprehensive analysis of multi-source heterogeneous sensor information remain insufficient, and predictions based on a single type of landslide monitoring information are inaccurate because that information is one-sided and unreliable.

Disclosure of Invention

The embodiment of the invention provides a multi-source heterogeneous landslide data monitoring and fusing method, which comprises the following steps:

acquiring multi-source heterogeneous monitoring variable data;

dividing multi-source heterogeneous monitoring variables into dependent variables and characteristic variables;

calculating the maximum mutual information coefficient MIC of every two multisource heterogeneous landslide monitoring variables, and screening out the characteristic variables which influence the landslide to the maximum extent;

determining a single-point displacement sequence reflecting landslide deformation characteristics as a reference column, and determining a data sequence consisting of factors influencing landslide deformation as a comparison column;

calculating the grey correlation coefficient and grey correlation degree of the reference number sequence and the comparison number sequence;

calculating a weighted correlation degree according to the maximum mutual information coefficient MIC and the grey correlation degree;

performing characteristic optimization according to the weighted relevance degree, and screening out final characteristic variables;

constructing a feature optimization-stepwise regression feature level data fusion model based on the weighted relevance;

and performing multi-source heterogeneous information fusion by using a feature optimization-stepwise regression feature level data fusion model based on the weighted relevance, and providing effective auxiliary information for landslide prediction.

Further, the multi-source heterogeneous monitoring variable data are preprocessed by:

removing abnormal values, complementing missing values and smoothly denoising data.
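The three preprocessing operations named above can be sketched as follows. The text does not say which outlier test, interpolation scheme or smoothing filter is used, so the choices here (a robust z-score based on the median absolute deviation, linear interpolation, and a centered moving average) and the function name `preprocess` are illustrative assumptions only.

```python
import numpy as np

def preprocess(series, z_thresh=3.5, window=5):
    """Hypothetical preprocessing: outlier removal, gap filling, smoothing.

    `window` is assumed to be an odd number of samples.
    """
    x = np.asarray(series, dtype=float).copy()

    # 1) Remove abnormal values: flag points whose robust z-score
    #    (based on the median absolute deviation) exceeds z_thresh.
    med = np.nanmedian(x)
    mad = np.nanmedian(np.abs(x - med))
    if mad > 0:
        robust_z = 0.6745 * (x - med) / mad
        x[np.abs(robust_z) > z_thresh] = np.nan

    # 2) Complete missing values (original gaps and removed outliers)
    #    by linear interpolation over the sample index.
    idx = np.arange(x.size)
    ok = ~np.isnan(x)
    x = np.interp(idx, idx[ok], x[ok])

    # 3) Smooth and denoise with a centered moving average; the first
    #    and last values are repeated as padding so the length is kept.
    pad = window // 2
    padded = np.pad(x, pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")
```

For strongly trending series such as cumulative displacement, an outlier test on day-to-day differences may be more appropriate than one on raw values; that variation is not shown here.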

Further, calculating the maximum mutual information coefficient MIC of every two multi-source heterogeneous landslide monitoring variables comprises:

given variables i and j, carrying out i-column and j-row meshing on a scatter diagram formed by the two variables, and solving the maximum mutual information value;

carrying out normalization processing on the maximum mutual information value;

selecting the maximum value of mutual information under different scales as an MIC value;

and obtaining the characteristic variable with the highest correlation degree with the dependent variable.

Further, the grey correlation coefficient is calculated by the formula:

$$\xi_i(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}$$

where ρ is the resolution coefficient, 0 < ρ < 1; the smaller ρ is, the larger the difference between the correlation coefficients and the stronger the distinguishing capability, and ρ is usually taken as 0.5; $|x_0(k) - x_i(k)|$ is the absolute difference between corresponding elements of each comparison sequence and the reference sequence; and $\min_i \min_k |x_0(k) - x_i(k)|$ and $\max_i \max_k |x_0(k) - x_i(k)|$ are the two-level minimum difference and the two-level maximum difference, respectively.

Further, the degree of association is weighted, and the calculation formula comprises:

where n is the total number of feature variables to be selected, and MIC(A, B_i) represents the maximum mutual information coefficient MIC between characteristic variable A and characteristic variable B_i.

Further, the feature optimization comprises:

sorting the calculated weighted correlation degrees in descending order;

screening the characteristic variables in that order;

calculating the preferred feature weight of each sorted feature;

stopping the screening when the preferred feature weight satisfies ω_j < α, thereby obtaining the final characteristic variables;

wherein J_S is the sum of the weighted correlation degrees of the characteristic variables, J_j is the weighted correlation degree of the j-th characteristic variable to be screened, ω_j is the j-th preferred feature weight, and α is a given threshold.

Further, the method comprises the following steps: analyzing and comparing the feature optimization-stepwise regression fusion result based on the weighted correlation degree with the BP neural network fusion result;

establishing a BP neural network fusion model, taking an independent variable as a system input variable and a dependent variable as a system output variable;

establishing a multi-input single-output BP neural network fusion model containing two hidden layers;

performing stage evaluation on a feature optimization-stepwise regression fusion and BP neural network data fusion model based on the weighted correlation degree by adopting two indexes of an improved tangent angle and a deformation rate;

and performing prediction comparison analysis, using the long short-term memory (LSTM) artificial neural network, on the weighted-correlation feature optimization-stepwise regression fusion data and the BP neural network fusion data respectively.
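As a rough illustration of the multi-input single-output BP network with two hidden layers mentioned above, the following NumPy sketch trains such a network by plain gradient descent on the mean squared error. The layer widths, tanh activation, learning rate, initialization and the name `train_bp` are assumptions made for this example; the text does not specify them.

```python
import numpy as np

def train_bp(X, y, hidden=(8, 8), lr=0.02, epochs=1000, seed=0):
    """Train a two-hidden-layer BP network; returns weights, biases, loss history."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    sizes = [X.shape[1], *hidden, 1]
    weights = [rng.normal(0.0, 0.5, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(b) for b in sizes[1:]]
    losses = []
    for _ in range(epochs):
        # Forward stage: tanh hidden layers, linear output node.
        acts = [X]
        for k, (W, b) in enumerate(zip(weights, biases)):
            z = acts[-1] @ W + b
            acts.append(z if k == len(weights) - 1 else np.tanh(z))
        err = acts[-1] - y
        losses.append(float(np.mean(err ** 2)))
        # Backward stage: propagate the error and adjust the weights
        # along the negative gradient of the mean squared error.
        delta = 2.0 * err / len(X)
        for k in reversed(range(len(weights))):
            grad_w = acts[k].T @ delta
            grad_b = delta.sum(axis=0)
            if k > 0:
                # Error for the layer below, using the tanh derivative.
                delta = (delta @ weights[k].T) * (1.0 - acts[k] ** 2)
            weights[k] -= lr * grad_w
            biases[k] -= lr * grad_b
    return weights, biases, losses
```

In the fusion setting described here, the columns of `X` would hold the selected monitoring variables and `y` the displacement series of the reference monitoring point.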

The embodiment of the invention provides a multi-source heterogeneous landslide data monitoring and fusing method, which has the following beneficial effects compared with the prior art:

1. MIC measures the degree of correlation between two variables; compared with other correlation analysis methods, it applies to both linear and nonlinear data, offers generality, equitability and symmetry, and has high accuracy.

2. Combining the mutual information weight and the grey correlation degree, adopting the weighted correlation degree to measure the importance degree of the characteristic factors to the landslide deformation, calculating the optimal characteristic weight, screening out the characteristic factors according to a threshold value, and combining the characteristics of the mutual information and the grey correlation to carry out characteristic optimization, so that the characteristic optimization result is more reliable.

Drawings

FIG. 1 is a flow chart of a feature optimization based on weighted relevance in the fusion method of the present invention;

FIG. 2 is a diagram of an RNN model in the evaluation analysis of the present invention;

FIG. 3 is a schematic diagram of the hidden-layer cell structure of the RNN model in the evaluation analysis of the present invention;

FIG. 4 is a schematic diagram of the cell structure of the hidden layer of the LSTM model in the evaluation analysis of the present invention;

FIG. 5 is a distribution diagram of monitoring points in an experimental study area according to the present invention;

FIG. 6 shows the weighted-correlation feature optimization-stepwise regression fusion result in the experimental part of the present invention;

FIG. 7 shows the fusion result of BP neural network model in the experimental part of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to fig. 1 to 7, an embodiment of the invention provides a multi-source heterogeneous landslide data monitoring and fusion method, which includes:

S1, calculating the maximum mutual information coefficient. The MIC algorithm is implemented with the minepy class library in Python and mainly comprises the following steps: 1) given i and j, gridding the scatter diagram formed by the two variables into i columns and j rows and finding the maximum mutual information value; 2) normalizing the maximum mutual information value; 3) selecting the maximum normalized mutual information value over the different grid scales as the MIC value; 4) obtaining the characteristic variables most strongly correlated with the dependent variable for subsequent regression prediction. MIC measures the degree of correlation between two variables; compared with other correlation analysis methods, it applies to both linear and nonlinear data, offers generality, equitability and symmetry, and has high accuracy.

And S2, calculating the grey correlation degree. Determining a single-point displacement sequence capable of reflecting landslide deformation characteristics as a reference column, determining a data sequence composed of factors influencing landslide deformation as a comparison column, carrying out non-dimensionalization processing on the reference column and the comparison column, and then obtaining a gray correlation coefficient and a gray correlation degree of the reference column and the comparison column.

S3, feature optimization by weighted correlation degree. The mutual information value measures the influence of a feature on landslide deformation, its weight reflects the effectiveness of the feature, and the grey correlation degree quantifies the consistency between the feature and the landslide deformation. The mutual information weight is combined with the grey correlation degree; the resulting weighted correlation degree measures the importance of each characteristic factor to landslide deformation, the preferred feature weight is calculated, and the characteristic factors are screened according to a threshold. Combining the properties of mutual information and grey correlation makes the feature optimization result more reliable.

S4, stepwise regression analysis. The characteristic factors are introduced into the model one by one. After each explanatory variable is introduced, an F test is performed, and the explanatory variables already selected are t-tested one by one; an explanatory variable introduced earlier is deleted when it is no longer significant after the introduction of later variables. This ensures that the regression equation contains only significant variables before each new variable is introduced. The process is repeated until no significant explanatory variable remains to be selected into the regression equation and no insignificant explanatory variable remains to be removed from it.
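The introduce-and-remove loop described above can be sketched as follows. This is a simplified illustration, not the procedure of the invention: it refits ordinary least squares with NumPy and uses partial F statistics for both entry and removal, whereas the detailed description works with elimination transformations on the correlation matrix and also mentions t tests. The thresholds `f_in` and `f_out` and all function names are hypothetical.

```python
import numpy as np

def _rss(X, y, cols):
    """Residual sum of squares of a least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def stepwise_select(X, y, f_in=4.0, f_out=3.9, max_iter=100):
    """Return the indices of the columns kept by stepwise regression."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = X.shape
    selected = []
    for _ in range(max_iter):
        changed = False
        # Introduce the candidate with the largest partial F, if significant.
        rss_cur = _rss(X, y, selected)
        best_f, best_c = 0.0, None
        dof = n - len(selected) - 2
        for c in range(m):
            if c in selected:
                continue
            rss_new = max(_rss(X, y, selected + [c]), 1e-12)
            f = (rss_cur - rss_new) / (rss_new / dof)
            if f > best_f:
                best_f, best_c = f, c
        if best_c is not None and best_f > f_in:
            selected.append(best_c)
            changed = True
        # Remove the selected variable with the smallest partial F,
        # if it is no longer significant.
        if len(selected) > 1:
            rss_full = max(_rss(X, y, selected), 1e-12)
            dof = n - len(selected) - 1
            partial = [
                ((_rss(X, y, [s for s in selected if s != c]) - rss_full)
                 / (rss_full / dof), c)
                for c in selected
            ]
            worst_f, worst_c = min(partial)
            if worst_f < f_out:
                selected.remove(worst_c)
                changed = True
        if not changed:
            break
    return selected
```

Keeping `f_out` slightly below `f_in` prevents a variable from being removed in the same pass in which it was introduced.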

S5, evaluation and analysis. First, stage comparison of the fusion results. To evaluate the reliability of the weighted-correlation feature optimization-stepwise regression feature-level data fusion, its fusion result and the BP neural network fusion result are subjected to stage discrimination analysis and compared. A multi-input single-output BP neural network fusion model containing two hidden layers is established, with the independent variables as system inputs and the dependent variable as system output. Two indexes, the improved tangent angle and the deformation rate, are used to evaluate the stages of both fusion models so as to demonstrate the effectiveness of the fusion result of the invention. Second, prediction comparison of the fusion results. An LSTM (long short-term memory artificial neural network) is used to perform prediction comparison analysis on the dependent-variable single-point data of the GNSS monitoring point, the weighted-correlation feature optimization-stepwise regression fusion data, and the BP neural network fusion data. The LSTM model is built with Keras in Python, and the prediction precision is evaluated with two indexes, MRE (mean relative error) and MAE (mean absolute error), so as to demonstrate the reliability of the fusion result of the invention.

Specifically, the method comprises the following steps:

S1, calculating the maximum mutual information coefficient. The multi-source heterogeneous sensors on the landslide body are correlated, and one sensor may be jointly influenced by several others; correlation calculation is therefore performed on the multi-source heterogeneous landslide monitoring variables, and the characteristic factors with the greatest influence on landslide deformation are screened out for subsequent fusion prediction. Before MIC (maximum mutual information coefficient) is calculated, the concepts of information entropy and mutual information are needed. Mutual information measures the degree of association between two random variables, and the average amount of information remaining after redundancy is removed is called "information entropy"; MI (mutual information) can therefore be used to screen out characteristic factors that influence landslide deformation while carrying less redundant information. MIC has higher accuracy than MI, is not limited to a particular function type, and yields the degree of correlation between variables. If two variables are correlated, the set of their corresponding data points is distributed in a two-dimensional space. The data space is divided using an m × n grid, and the frequency of data points falling in the (x, y)-th grid cell is taken as the estimate of P(x, y):

$$P(x, y) \approx \frac{n_{x,y}}{N}$$

where $n_{x,y}$ is the number of data points falling in the (x, y)-th grid cell and N is the total number of data points; the estimates of P(x) and P(y) are obtained in the same way. The mutual information between the random variables is then calculated. Since there is more than one way to divide the data points into an m × n grid, the division that maximizes the mutual information is sought, and the mutual information value is converted into the interval (0, 1) using a normalization factor. Finally, the grid resolution that maximizes the normalized mutual information gives the MIC value. The grid resolution is limited to m × n < B, with B = f(data_size) = N^0.6, and MIC is calculated as

$$\mathrm{MIC}(X; Y) = \max_{m \times n < B} \frac{I(X; Y)}{\log \min(m, n)}$$

The specific calculation steps are as follows: 1) Calculating the maximum mutual information value. Given i and j, the scatter diagram formed by the two variables X and Y is gridded into i columns and j rows. For a given i and j, several different gridding schemes are possible, so the mutual information value of each scheme is calculated and the scheme that maximizes the mutual information is found. 2) Normalizing the maximum mutual information value: the maximum mutual information obtained is divided by log(min(i, j)). 3) Selecting the maximum normalized mutual information value over the different scales as the MIC value. The features with a larger influence on landslide deformation are then selected and the features carrying less information are eliminated, so that the variables used for modeling are more representative.
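The steps above can be sketched in NumPy as follows. This is a deliberately simplified approximation and not the library implementation referred to in the text: for each grid size it evaluates only one partition (equal-frequency bins) instead of searching for the partition that maximizes mutual information, so it generally underestimates the true MIC. The function names and the `exponent` parameter are assumptions for illustration.

```python
import numpy as np

def mutual_information(x, y, nx, ny):
    """Mutual information (bits) of x and y on an nx-by-ny equal-frequency grid."""
    # Interior bin edges taken from quantiles of each variable.
    ex = np.quantile(x, np.linspace(0, 1, nx + 1)[1:-1])
    ey = np.quantile(y, np.linspace(0, 1, ny + 1)[1:-1])
    bx = np.searchsorted(ex, x, side="right")
    by = np.searchsorted(ey, y, side="right")
    counts = np.zeros((nx, ny))
    np.add.at(counts, (bx, by), 1)
    pxy = counts / counts.sum()            # P(x, y) estimated as n_xy / N
    px = pxy.sum(axis=1, keepdims=True)    # marginal P(x)
    py = pxy.sum(axis=0, keepdims=True)    # marginal P(y)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def mic_approx(x, y, exponent=0.6):
    """Approximate MIC: maximum normalized MI over grids with nx * ny < N**exponent."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    bound = x.size ** exponent             # B = N^0.6
    best = 0.0
    for nx in range(2, int(bound) + 1):
        for ny in range(2, int(bound) + 1):
            if nx * ny >= bound:
                continue
            mi = mutual_information(x, y, nx, ny)
            # Normalize by log(min(i, j)) so the score is at most 1.
            best = max(best, mi / np.log2(min(nx, ny)))
    return best
```

A noiseless monotonic relationship scores close to 1 under this approximation, while independent samples score much lower.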

S2, calculating the grey correlation degree. Feature selection using MIC results alone is not convincing; MIC analysis, which represents the degree of influence, can be combined with grey correlation analysis, which represents the degree of consistency, to obtain the characteristic factors better suited to data fusion. The multi-source heterogeneous landslide monitoring feature sequences are non-dimensionalized, and the correlation coefficient and correlation degree are calculated as follows: 1) Determining the reference sequence and the comparison sequences. Let the reference sequence be Y = {Y(k) | k = 1, 2, …, n} and the comparison sequences be X_i = {X_i(k) | k = 1, 2, …, n}, i = 1, 2, …, m. 2) Non-dimensionalizing the variables. Because characteristic factors with different dimensions are inconvenient to compare, non-dimensional processing is needed.

The non-dimensionalized data sequences form the following matrix:

3) Calculating one by one the absolute difference between corresponding elements of each evaluated-object index sequence (comparison sequence) and the reference sequence, namely $|x_0(k) - x_i(k)|$ (k = 1, …, m; i = 1, …, n), where n represents the number of evaluated objects. 4) Determining the two-level minimum difference and the two-level maximum difference:

$$\Delta_{\min} = \min_i \min_k |x_0(k) - x_i(k)|$$

and

$$\Delta_{\max} = \max_i \max_k |x_0(k) - x_i(k)|$$

5) Calculating the correlation coefficient. The correlation coefficient calculation formula is as follows:

$$\xi_i(k) = \frac{\Delta_{\min} + \rho\,\Delta_{\max}}{|x_0(k) - x_i(k)| + \rho\,\Delta_{\max}}$$

where ρ is the resolution coefficient, 0 < ρ < 1; the smaller ρ is, the larger the difference between the correlation coefficients and the stronger the distinguishing capability, and ρ is usually taken as 0.5. 6) Calculating the correlation degree. For each evaluated object (comparison sequence), the mean of the correlation coefficients between its indexes and the corresponding elements of the reference sequence is calculated to reflect the relationship between that evaluated object and the reference sequence, and is recorded as the correlation degree:

$$r_i = \frac{1}{m} \sum_{k=1}^{m} \xi_i(k)$$
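A minimal NumPy sketch of the grey correlation calculation follows. The non-dimensionalization formula is not reproduced in this text, so dividing each sequence by its own mean is an assumption made here, as is the function name.

```python
import numpy as np

def grey_relational_degree(reference, comparisons, rho=0.5):
    """Grey correlation degree of each comparison sequence with the reference.

    reference:   1-D sequence (e.g. a displacement series)
    comparisons: 2-D array, one comparison sequence per row
    rho:         resolution coefficient, 0 < rho < 1
    """
    x0 = np.asarray(reference, dtype=float)
    xi = np.asarray(comparisons, dtype=float)

    # Non-dimensionalize (assumed scheme: divide each sequence by its mean).
    x0 = x0 / x0.mean()
    xi = xi / xi.mean(axis=1, keepdims=True)

    # Absolute differences and the two-level minimum / maximum difference.
    diff = np.abs(xi - x0)
    d_min, d_max = diff.min(), diff.max()
    if d_max == 0:
        # Every comparison sequence coincides with the reference.
        return np.ones(xi.shape[0])

    # Correlation coefficient at each point, then its mean per sequence.
    coeff = (d_min + rho * d_max) / (diff + rho * d_max)
    return coeff.mean(axis=1)
```

A comparison sequence that is a scaled copy of the reference obtains a degree of 1 under this normalization, and sequences that diverge from the reference obtain lower values.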

S3, calculating the weighted correlation degree. The maximum mutual information weight is combined with the grey correlation degree to obtain the weighted correlation degree of each feature, which reflects the importance of that feature: the greater the weighted correlation degree, the more important the feature. The calculation formula is:

where n is the total number of features to be selected.
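The weighted correlation formula itself is not reproduced in this text. As an illustration only, the sketch below assumes one plausible reading of the description: the MIC of each feature is divided by the sum of the MICs of all n candidate features to give a mutual information weight, which is then multiplied by the grey correlation degree of that feature. The function name and the exact form of the product are assumptions, not the formula of the invention.

```python
import numpy as np

def weighted_relevance(mic_values, grey_degrees):
    """Assumed form: (MIC_i / sum of MICs) * grey correlation degree_i."""
    mic = np.asarray(mic_values, dtype=float)
    grey = np.asarray(grey_degrees, dtype=float)
    mic_weight = mic / mic.sum()   # mutual information weight of each feature
    return mic_weight * grey
```

Under this reading, a feature ranks highly only when it both carries a large share of the mutual information and follows the trend of the reference displacement closely.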

S4, feature optimization. The calculated weighted correlation degrees are sorted in descending order; the feature with the largest weighted correlation degree is selected, added to the preferred set, and removed from the set to be screened. Screening proceeds in descending order, and the preferred feature weight is calculated:

where J_S is the sum of the weighted correlation degrees of the features in the preferred set, J_j is the weighted correlation degree of the j-th feature to be screened, and ω_j is the j-th preferred feature weight; when ω_j is less than α, feature screening is considered finished.
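The screening loop can be sketched as below. The expression for the preferred feature weight is missing from this text, so the form ω_j = J_j / (J_S + J_j) used here is an assumption chosen only to make the example concrete; the stopping rule ω_j < α follows the description. The feature names in the usage note are hypothetical.

```python
def select_features(relevance, alpha=0.1):
    """Greedy screening by weighted correlation degree.

    relevance: dict mapping feature name -> weighted correlation degree J
    alpha:     threshold on the preferred feature weight
    """
    # Sort candidate features from largest to smallest J.
    order = sorted(relevance, key=relevance.get, reverse=True)
    selected = [order[0]]
    j_s = relevance[order[0]]          # sum of J over the preferred set
    for name in order[1:]:
        j_j = relevance[name]
        omega = j_j / (j_s + j_j)      # assumed form of the preferred weight
        if omega < alpha:
            break                      # screening is finished
        selected.append(name)
        j_s += j_j
    return selected
```

For example, `select_features({"a": 0.5, "b": 0.3, "c": 0.05, "d": 0.01})` keeps `"a"` and `"b"` and stops at `"c"`, whose assumed weight 0.05 / 0.85 falls below 0.1.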

S5, stepwise regression analysis. The basic idea of stepwise regression is to introduce variables into the model one by one. After each explanatory variable is introduced, an F test is performed, and the explanatory variables already selected are t-tested one by one; an explanatory variable introduced earlier is deleted when it is no longer significant after the introduction of later variables. This ensures that the regression equation contains only significant variables before each new variable is introduced. The process is iterated until no significant explanatory variable remains to be selected into the regression equation and no insignificant explanatory variable remains to be removed from it. The stepwise regression specifically comprises the following steps:

The first step: building the augmented matrix.

Calculating l_ij, l_iy, l_yy and r_ij, r_iy, whose formulas are respectively as follows:

wherein

An expanded augmentation matrix can be obtained

where R = (r_ij)_{m×m}, r_yy = 1, and r_y = (r_1y, r_2y, …, r_my)′.

The second step: the elimination transformation is performed at the s-th step, and the result is

Wherein

The third step: and (5) factor elimination.

1) Selecting j₀ such that

2) Calculating

If F > F_out, executing the fourth step; otherwise, performing the (s+1)-th elimination transformation and then repeating the calculations of the second and third steps.

The fourth step: a regression factor was introduced. Let s, { j }, f still be defined in step two.

1) Selecting k₀ such that

2) Calculating

where

If F < F_in, executing the fifth step; otherwise, performing the (s+1)-th elimination transformation to introduce the k₀-th regression factor, and then repeating the calculations of the second, third and fourth steps.

The fifth step: variables are neither introduced nor removed. The regression equation finally obtained is $\hat{y} = \hat{b}_0 + \sum_{j \in \{j\}} \hat{b}_j x_j$, where

And S6, evaluation and analysis.

First, stage comparison of the fusion results. To evaluate the reliability of the weighted-correlation feature optimization-stepwise regression feature-level data fusion, the fusion result of the model and the BP neural network fusion result are subjected to stage discrimination analysis and compared. A multi-input single-output BP neural network fusion model containing two hidden layers is established, with the independent variables as system inputs and the dependent variable as system output. The BP neural network topology comprises an input layer, hidden layers and an output layer, and its operation comprises a forward multilayer feedforward stage and a backward error correction stage. The forward multilayer feedforward stage computes, layer by layer from the input layer, the actual input and output of each node; its mathematical model is

In the formula

is the output value of the i-th node of the l-th layer;

is the activation value of the i-th node of the l-th layer;

is the connection weight from the j-th node of layer l−1 to the i-th node of layer l;

is the threshold of the i-th node of the l-th layer; and N_l is the number of nodes in the l-th layer. To reduce the error of the output-layer neurons, a gradient descent algorithm is used for backward error propagation: the connection weights between the neurons of adjacent layers are adjusted so that the overall error moves in the direction of decrease. The algorithm formula is as follows:

where η is the learning rate, and the weight adjustment formula is

Two indexes, the improved tangent angle (see Xu Qiang et al., "An improved tangent angle and the corresponding landslide early-warning criteria") and the deformation rate, are used to evaluate the stages of the weighted-correlation feature optimization-stepwise regression fusion model and the BP neural network data fusion model. The evaluation criteria are given in Table 2, the comprehensive early-warning criteria based on the deformation rate threshold and the deformation process.

TABLE 2 comprehensive early warning criteria based on deformation rate threshold and deformation process

Second, prediction comparison of the fusion results. Feature-level fusion analyzes and processes the multi-source heterogeneous information obtained by landslide monitoring so as to improve prediction accuracy. To examine how effectively feature-level fusion improves landslide prediction accuracy, an LSTM (long short-term memory artificial neural network) is used to perform prediction comparison analysis on the dependent-variable single-point data of the GNSS monitoring point, the weighted-correlation feature optimization-stepwise regression fusion data, and the BP neural network fusion data. The LSTM model is built with Keras in Python. The LSTM neural network is an improvement on the ordinary recurrent neural network (RNN): it replaces the RNN cells of the hidden layer with LSTM cells, which effectively mitigates the rapid vanishing of gradients during back propagation, gives the model long-term memory, and allows it to process long time-series data. Compared with the RNN cell shown in FIG. 3, the LSTM cell shown in FIG. 4 contains 3 gates, where i is the input gate, f is the forget gate, c is the cell state, o is the output gate, and σ and tanh are the Sigmoid and hyperbolic tangent activation functions, respectively.

The forget gate takes $h_{t-1}$ and $x_t$ as input and uses a Sigmoid unit to output a vector whose entries lie between 0 and 1; each entry indicates how much of the corresponding information in the cell state $C_{t-1}$ is retained or discarded, with 0 meaning nothing is retained and 1 meaning everything is retained: $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$. The input gate is used to update the cell state. The previous hidden state and the current input are fed to a Sigmoid function, whose output between 0 and 1 decides which information to update, with 0 indicating unimportant and 1 indicating important. At the same time, the hidden state and the current input are passed to the tanh function, which compresses the values to between −1 and 1 to regulate the network; the tanh output is then multiplied by the Sigmoid output, which determines which information in the tanh output is important and should be kept: $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$, $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$. The output gate controls the value of the next hidden state, which can be used for prediction. The previous hidden state and the current input are first passed to a Sigmoid function, and the newly obtained cell state is passed to the tanh function; the tanh output is then multiplied by the Sigmoid output to give the new hidden state, which is output as the value of the current unit; finally, the new cell state and hidden state are passed on to the next time step: $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$, $h_t = o_t * \tanh(C_t)$. The LSTM model is trained with the classical back-propagation algorithm in 4 steps: (1) the output values of the LSTM cells are calculated by the forward calculation method,

the first derivative and the second derivative of the loss function l () are respectively, and the final obtained objective function is:

where W and b are the weight coefficient matrix and the bias term, respectively. (2) The error term of each LSTM cell is calculated backwards, propagating in two directions: through time and through the network layers. (3) The gradient of each weight is calculated from the corresponding error term. (4) A gradient-based optimization algorithm is applied to update the weights.
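To make the gate equations concrete, the following NumPy sketch performs a single forward step of one LSTM cell. It is not the Keras model used in the invention. The cell-state update C_t = f_t * C_{t-1} + i_t * C̃_t is the standard LSTM rule and is added here because the text uses C_t without writing the update out; the parameter dictionary layout and function names are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of an LSTM cell.

    params holds W_f, W_i, W_C, W_o with shape (hidden, hidden + inputs)
    and b_f, b_i, b_C, b_o with shape (hidden,).
    """
    concat = np.concatenate([h_prev, x_t])                       # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ concat + params["b_f"])        # forget gate
    i_t = sigmoid(params["W_i"] @ concat + params["b_i"])        # input gate
    c_tilde = np.tanh(params["W_C"] @ concat + params["b_C"])    # candidate state
    c_t = f_t * c_prev + i_t * c_tilde                           # standard cell-state update
    o_t = sigmoid(params["W_o"] @ concat + params["b_o"])        # output gate
    h_t = o_t * np.tanh(c_t)                                     # new hidden state
    return h_t, c_t
```

Iterating `lstm_step` over a time series, feeding each returned `h_t` and `c_t` into the next call, gives the forward calculation referred to in training step (1).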

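The prediction accuracy is evaluated with two indexes, MRE (mean relative error) and MAE (mean absolute error):

$$\mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|\hat{y}_i - y_i|}{|y_i|}, \qquad \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |\hat{y}_i - y_i|$$

where $y_i$ is the actual value and $\hat{y}_i$ is the predicted value. Both indexes measure the distance between the predicted values and the actual values and thus the prediction precision.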
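The two accuracy indexes can be computed as in the short sketch below, which uses the usual definitions of mean absolute error and mean relative error; the function names are illustrative.

```python
import numpy as np

def mean_absolute_error(actual, predicted):
    """MAE: average absolute distance between prediction and observation."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - actual)))

def mean_relative_error(actual, predicted):
    """MRE: average absolute error relative to the observed value.

    Assumes no observed value is zero.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - actual) / np.abs(actual)))
```

For observations `[10, 20]` and predictions `[11, 18]`, the MAE is 1.5 and the MRE is 0.1.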

Example:

The experimental data of the invention are landslide monitoring data of landslide body 7# at Heifangtai, Yongjing County, Gansu Province, from March 28 to October 4, 2019, sampled daily for 191 d in total. They comprise two groups of GNSS monitoring data (HF06 and HF07), three groups of displacement meter monitoring data (DCF11, DCF14 and DCF15), and 3 kinds of meteorological data: humidity, temperature and rainfall. The landslide body slid at 4 o'clock on October 5, 2019, and all 5 groups of monitoring equipment recorded the displacement changes of the landslide deformation, so the multi-source heterogeneous sensors can be used for data fusion and precision evaluation. The distribution of the monitoring points in the experimental area is shown in FIG. 5.

The multi-sensor variables and environmental factors are first preprocessed, including abnormal value elimination, missing value completion, and data smoothing and denoising. MIC (maximum mutual information coefficient) is then calculated on the preprocessed data to determine the mutual information weights and identify the characteristic factors with the largest influence on landslide deformation. Grey correlation calculation is then performed to obtain the grey correlation degrees, the weighted correlation degrees are obtained with the weighted correlation formula, and the final characteristic factors are determined by calculating the preferred feature weights. Table 3 shows the MIC weight, grey correlation degree and weighted correlation degree between GNSS monitoring point HF06 and the other GNSS monitoring data, the displacement meter monitoring data, rainfall, temperature and humidity. The higher the weighted correlation degree, the greater the influence of the feature on landslide deformation and the closer its trend is to that of the deformation.

Table 3 weighted association calculation table

The ranking obtained by the weighted correlation method combining MIC and grey correlation is: GNSS monitoring point HF07, displacement meter DCF11, displacement meter DCF14, displacement meter DCF15, accumulated rainfall in the previous 48 hours, humidity, and rainfall; that is, the displacement meter monitoring data, the GNSS monitoring data, the accumulated rainfall in the previous 48 hours, the temperature and the humidity have a large influence on the landslide. Table 4 shows the preferred feature weights calculated from the weighted correlation degrees, by which feature optimization is performed.

Table 4 characteristic preferred results

The weighted correlation degrees are sorted and the preferred feature weights are calculated with a threshold α = 0.1: when a feature's weight is less than 0.1, its influence on landslide deformation is considered negligible, which completes the selection of characteristic factors. Stepwise regression fitting analysis is then performed on the factors influencing landslide deformation obtained by the weighted correlation optimization. According to the analysis results, the GNSS monitoring point HF07 data, the displacement meter monitoring data, the previous 48-hour accumulated rainfall, the temperature and the humidity are taken as independent variables and the GNSS monitoring point HF06 data as the dependent variable; stepwise regression yields the corresponding regression coefficients, from which the final regression equation and the fusion result are calculated. In the stepwise regression analysis, the optimal model is obtained by comparing the correlation coefficient, the residual variance, the F value, the significance and other statistics of the different candidate models. The resulting regression coefficients are shown in table 5.
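The threshold-based feature selection can be sketched as follows. The formula for the preferred feature weight is not reproduced in this passage, so the sketch assumes, purely for illustration, that each weight is the feature's weighted correlation degree divided by the sum of all weighted correlation degrees. The function name, feature names and numeric values are hypothetical.

```python
def select_features(weighted_correlation, alpha=0.1):
    """Rank features and keep those whose weight reaches the threshold alpha.

    weighted_correlation -- dict mapping feature name to weighted correlation degree
    alpha                -- selection threshold (0.1 in the description)
    """
    # Sort the weighted correlation degrees from large to small.
    ranked = sorted(weighted_correlation.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(value for _, value in ranked)
    # ASSUMPTION: weight = share of the total weighted correlation degree.
    weights = {name: value / total for name, value in ranked}
    # Features whose weight is below alpha are treated as negligible.
    kept = [name for name, _ in ranked if weights[name] >= alpha]
    return kept, weights


# Hypothetical weighted correlation degrees, not the values of Table 3.
example = {"feature_a": 0.60, "feature_b": 0.35, "feature_c": 0.05}
kept, weights = select_features(example)
print(kept)
```

With these illustrative inputs, only the two features whose share is at least 0.1 are retained for the subsequent stepwise regression.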

TABLE 5 regression coefficient Table

The stepwise regression model expression for the landslide surface displacement is obtained as follows: landslide surface displacement = (GNSS monitoring point HF07 data × 0.587) + (displacement meter DCF11 data × 0.036) + (displacement meter DCF14 data × 0.519) - (displacement meter DCF15 data × 0.159) + (temperature × 0.028) + (humidity × 0.026) - (previous 48-hour accumulated rainfall × 0.010). The feature optimization-stepwise regression feature-level fusion result based on the weighted correlation degree is then obtained, as shown in fig. 6.
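As a minimal sketch, the regression expression above can be evaluated for one daily sample as shown below. The coefficients are those stated in the expression; the dictionary keys, the function name and the sample values are hypothetical, and no intercept term is included because none is stated in the expression. The input units are assumed to match those used when the coefficients of table 5 were fitted.

```python
# Regression coefficients taken from the stepwise regression expression.
COEFFICIENTS = {
    "hf07": 0.587,         # GNSS monitoring point HF07
    "dcf11": 0.036,        # displacement meter DCF11
    "dcf14": 0.519,        # displacement meter DCF14
    "dcf15": -0.159,       # displacement meter DCF15
    "temperature": 0.028,
    "humidity": 0.026,
    "rain_48h": -0.010,    # accumulated rainfall in the previous 48 hours
}


def fused_displacement(sample):
    """Return the fused landslide surface displacement for one sample.

    sample -- dict with one value per key of COEFFICIENTS
    """
    return sum(COEFFICIENTS[name] * sample[name] for name in COEFFICIENTS)


# Hypothetical sample: every monitored quantity set to 1.0.
unit_sample = {name: 1.0 for name in COEFFICIENTS}
print(round(fused_displacement(unit_sample), 3))
```

Applying the function to each of the 191 daily samples would give a fused displacement series of the kind plotted in fig. 6.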

Stage discrimination analysis is then performed to compare the feature optimization-stepwise regression fusion result based on the weighted correlation degree with the BP neural network fusion result. First, a BP neural network fusion model is established: the GNSS monitoring point HF07 data, the displacement meter monitoring data, the accumulated rainfall in the previous 48 hours, the temperature, the humidity and the like are taken as input data, the GNSS monitoring point HF06 data is taken as the expected output, and, with reference to the MIC analysis result, a multi-input single-output BP neural network fusion model with two hidden layers is built. The BP neural network feature-level fusion result obtained from the experiment is shown in fig. 7. According to the literature, a landslide is in the medium acceleration stage when the tangent angle is larger than 80 degrees. In this experiment, the two fusion results are compared only in terms of the tangent angle before the landslide; the tangent angle comparison is shown in table 6 and the deformation rate comparison in table 7.
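The tangent angle check can be illustrated with a short sketch. It assumes the commonly cited form of the improved tangent angle, in which the displacement is divided by the deformation rate of the uniform deformation stage so that both axes share the time dimension, and the angle is the arctangent of the slope of the transformed curve. The function name, the uniform-stage rate and the displacement values are hypothetical.

```python
import math


def improved_tangent_angles(displacement, uniform_rate, dt=1.0):
    """Return the improved tangent angle (degrees) for each sampling interval.

    displacement -- cumulative displacement series (e.g. in mm)
    uniform_rate -- deformation rate of the uniform stage (same unit per dt)
    dt           -- sampling interval (1 day in the experiment)
    """
    if uniform_rate <= 0:
        raise ValueError("uniform_rate must be positive")
    angles = []
    for previous, current in zip(displacement, displacement[1:]):
        rate = (current - previous) / dt
        # ASSUMPTION: angle = arctan(rate / uniform-stage rate).
        angles.append(math.degrees(math.atan(rate / uniform_rate)))
    return angles


# Hypothetical series: uniform deformation followed by a sharp acceleration.
angles = improved_tangent_angles([0.0, 1.0, 2.0, 12.0], uniform_rate=1.0)
# An angle above 80 degrees indicates the medium acceleration stage.
print([angle > 80.0 for angle in angles])
```

Under this assumption, an interval deforming at the uniform-stage rate gives an angle of 45 degrees, and the angle approaches 90 degrees as the rate grows.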

TABLE 6 improved tangent angle analysis results of two fusion results

TABLE 7 results of deformation Rate analysis of two fusion results

Comparing the two stage indexes shows that the improved tangent angle obtained from the feature optimization-stepwise regression fusion based on the weighted correlation degree is closer to the moment of landslide instability, so it fits the true development state of the landslide more closely when used for landslide stage discrimination. The feature optimization-stepwise regression fusion result based on the weighted correlation degree is therefore more reliable and accurate in landslide stage discrimination, and is the better fusion result.

Next, an LSTM network algorithm is used to carry out landslide trend prediction on the GNSS monitoring point HF06 data, the feature optimization-stepwise regression fusion data based on the weighted correlation degree, and the BP neural network fusion data, and the prediction accuracy is compared using two evaluation indexes, MRE and MAE.

TABLE 8 comparison of prediction accuracy of two fusion results

As can be seen from Table 8, the MAE and MRE of the prediction based on the feature optimization-stepwise regression fusion result are 9.9 mm and 3.46%, respectively; those of the BP neural network fusion result are 15.1 mm and 4.33%; and those of the GNSS monitoring point HF06 data are 19.9 mm and 4.51%. The prediction accuracy of the feature optimization-stepwise regression fusion result based on the weighted correlation degree is therefore the highest, which shows that this feature-level fusion result is more accurate and reliable.
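The two accuracy indexes can be sketched as follows, assuming that MAE denotes the mean absolute error and MRE the mean relative error expressed as a percentage of the observed value. The function names and the sample values are hypothetical and do not reproduce the data behind Table 8.

```python
def mean_absolute_error(observed, predicted):
    """Mean of |predicted - observed|, in the unit of the data (e.g. mm)."""
    errors = [abs(p - o) for o, p in zip(observed, predicted)]
    return sum(errors) / len(errors)


def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed|, in percent.

    Observed values must be non-zero.
    """
    ratios = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical observed and predicted displacements in mm.
observed = [100.0, 200.0]
predicted = [110.0, 190.0]
print(mean_absolute_error(observed, predicted))
print(mean_relative_error(observed, predicted))
```

Lower values of both indexes indicate a prediction that follows the observed displacement more closely.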

Although the embodiments of the present invention have been disclosed in the foregoing for illustrative purposes, those skilled in the art will appreciate that various modifications, additions and substitutions are possible, without departing from the scope and spirit of the invention as disclosed in the accompanying drawings.

Claims

1. A multi-source heterogeneous landslide data monitoring and fusion method is characterized by comprising the following steps:

acquiring multi-source heterogeneous monitoring variable data;

calculating the maximum mutual information coefficient MIC of the dependent variable and the characteristic variable, and screening out the characteristic variable which has the maximum influence on landslide deformation;

calculating a gray correlation coefficient and a gray correlation degree between the reference number series and the comparison number series;

performing stepwise regression fitting analysis on the preferably obtained characteristic variables;

2. The multi-source heterogeneous landslide data monitoring fusion method of claim 1, further comprising preprocessing multi-source heterogeneous monitoring variable data:

3. The multi-source heterogeneous landslide data monitoring and fusion method of claim 1, wherein the step of calculating the maximum mutual information coefficient MIC of the dependent variable and the characteristic variable comprises:

carrying out normalization processing on the maximum mutual information value;

4. The multi-source heterogeneous landslide data monitoring and fusing method of claim 1, wherein the grey correlation coefficient calculation formula comprises:

where ρ is the resolution coefficient, 0<ρ<1; the smaller ρ is, the larger the difference between the correlation coefficients and the stronger the distinguishing capability, and usually ρ = 0.5; |x₀(k) - x_i(k)| represents the absolute difference between the corresponding elements of each comparison sequence and the reference sequence,

and min_i min_k |x₀(k) - x_i(k)| and max_i max_k |x₀(k) - x_i(k)| represent the two-level minimum difference and the two-level maximum difference, respectively.

5. The multi-source heterogeneous landslide data monitoring and fusion method of claim 1, wherein the calculation formula of the weighted correlation degree comprises:

6. The multi-source heterogeneous landslide data monitoring and fusion method of claim 5, wherein the feature optimization step comprises:

sorting the calculated weighted association degrees from big to small;

calculating the weight of each sorted preferred feature;

when the feature weight is preferred

7. The multi-source heterogeneous landslide data monitoring fusion method of claim 1, further comprising: analyzing and comparing the feature optimization-stepwise regression fusion result based on the weighted correlation degree with the BP neural network fusion result, wherein the method comprises the following steps:

and performing stage evaluation on the feature optimization-stepwise regression fusion and BP neural network data fusion model based on the weighted association degree by adopting two indexes of an improved tangent angle and a deformation rate.

8. The multi-source heterogeneous landslide data monitoring fusion method of claim 7 further comprising: