CN114358192B - Multi-source heterogeneous landslide data monitoring and fusing method - Google Patents

Publication number
CN114358192B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210013094.0A
Other languages
Chinese (zh)
Other versions
CN114358192A (en)
Inventor
王利
张懿恺
许豪
赵超英
刘万林
成伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changan University
Original Assignee
Changan University
Application filed by Changan University
Priority to CN202210013094.0A
Publication of CN114358192A
Application granted
Publication of CN114358192B


Abstract

The invention discloses a multi-source heterogeneous landslide data monitoring and fusion method comprising the following steps: combining the maximum mutual information coefficient (MIC) method with grey correlation analysis (GRA) to obtain a weighted correlation degree that reflects both the influence of the multi-source heterogeneous landslide monitoring variables on landslide deformation displacement and their common change trend; selecting preferred characteristic factors according to the weighted correlation degree; performing stepwise regression fitting on the preferred characteristic factors to obtain the corresponding regression coefficients, from which the final regression equation and the fusion result are calculated; and evaluating the reliability and effectiveness of the data fusion result by landslide stage discrimination and trend prediction. By fusing multi-source heterogeneous data, the method analyzes and processes multivariate data effectively, yields a more reliable and accurate fusion result, and provides useful reference information for landslide prediction, thereby improving prediction accuracy.

Description

Multi-source heterogeneous landslide data monitoring and fusing method
Technical Field
The invention relates to the technical field of data fusion, in particular to a multi-source heterogeneous landslide data monitoring fusion method.
Background
Multi-source data fusion is an emerging interdisciplinary technology that, after more than ten years of exploration and development, has been widely applied to landslide deformation monitoring. As the number and variety of sensors grow, extracting comprehensive information from multi-source heterogeneous sensors and fusing it effectively remains a research difficulty. In landslide monitoring, the information from a single sensor cannot fully reflect the deformation characteristics of the landslide, and predictions based on it have low reliability. Effective feature extraction and comprehensive analysis therefore need to combine multi-source heterogeneous sensor information to eliminate redundancy and mutual exclusion among the data and obtain more reliable and accurate predictions.
At present, landslide prediction based on a single source of monitoring information is inaccurate because that information is one-sided and unreliable, and the effective feature extraction and comprehensive analysis of multi-source heterogeneous sensor information remain to be solved.
Disclosure of Invention
The embodiment of the invention provides a multi-source heterogeneous landslide data monitoring and fusing method, which comprises the following steps:
acquiring multi-source heterogeneous monitoring variable data;
dividing multi-source heterogeneous monitoring variables into dependent variables and characteristic variables;
calculating the maximum mutual information coefficient MIC of every two multisource heterogeneous landslide monitoring variables, and screening out the characteristic variables which influence the landslide to the maximum extent;
determining a single-point displacement sequence reflecting landslide deformation characteristics as a reference column, and determining a data sequence consisting of factors influencing landslide deformation as a comparison column;
calculating the grey correlation coefficient and grey correlation degree of the reference number sequence and the comparison number sequence;
calculating a weighted correlation degree according to the maximum mutual information coefficient MIC and the grey correlation degree;
performing characteristic optimization according to the weighted relevance degree, and screening out final characteristic variables;
constructing a feature optimization-stepwise regression feature level data fusion model based on the weighted association degree;
and performing multi-source heterogeneous information fusion by using a feature optimization-stepwise regression feature level data fusion model based on the weighted relevance, and providing effective auxiliary information for landslide prediction.
Further, the multi-source heterogeneous monitoring variable data are preprocessed by:
removing abnormal values, filling in missing values, and smoothing and denoising the data.
Further, calculating the maximum mutual information coefficient MIC of every two multi-source heterogeneous landslide monitoring variables comprises the following steps:
given grid dimensions i and j, partitioning the scatter plot formed by the two variables into i columns and j rows, and finding the maximum mutual information value;
carrying out normalization processing on the maximum mutual information value;
selecting the maximum value of mutual information under different scales as an MIC value;
and obtaining the characteristic variable with the highest degree of correlation with the dependent variable.
Further, the grey correlation coefficient is calculated by the formula:
$$\xi_i(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}$$
where $\xi_i(k)$ is the grey correlation coefficient of the i-th comparison sequence at point k; ρ is the resolution coefficient, 0 < ρ < 1 (the smaller ρ is, the larger the differences between correlation coefficients and the stronger the discrimination ability; usually ρ = 0.5); $|x_0(k) - x_i(k)|$ is the absolute difference between corresponding elements of each comparison sequence and the reference sequence; and $\min_i \min_k |x_0(k) - x_i(k)|$ and $\max_i \max_k |x_0(k) - x_i(k)|$ are the two-level minimum difference and the two-level maximum difference, respectively.
Further, the calculation formula of the weighted relevance includes:
Figure BDA0003458512130000024
where n is the total number of characteristic variables to be selected, and $\mathrm{MIC}(A, B_i)$ is the maximum mutual information coefficient between characteristic variable A and characteristic variable $B_i$.
In a further aspect, the method comprises the following steps:
sorting the calculated weighted association degrees from big to small;
sorting and screening the characteristic variables according to the weighted relevance;
calculating the weight of each sorted preferred feature;
when the preferred feature weight satisfies
Figure BDA0003458512130000031
screening stops and the final characteristic variables are obtained;
wherein $J_S$ is the sum of the weighted correlation degrees of the characteristic variables in the preferred set, $J_j$ is the weighted correlation degree of the j-th characteristic variable to be screened, $\omega_j$ is the j-th preferred feature weight, and α is a given threshold.
Further, the method comprises the following steps: analyzing and comparing the feature optimization-stepwise regression fusion result based on the weighted correlation degree with the BP neural network fusion result;
establishing a BP neural network fusion model, taking an independent variable as a system input variable and a dependent variable as a system output variable;
establishing a multi-input single-output BP neural network fusion model containing two hidden layers;
performing stage evaluation on a feature optimization-stepwise regression fusion and BP neural network data fusion model based on the weighted correlation degree by adopting two indexes of an improved tangent angle and a deformation rate;
and performing prediction comparison analysis with a long short-term memory (LSTM) neural network on the feature optimization-stepwise regression fusion data based on the weighted correlation degree and on the BP neural network fusion data, respectively.
The embodiment of the invention provides a multi-source heterogeneous landslide data monitoring and fusing method, which has the following beneficial effects compared with the prior art:
1. MIC measures the degree of correlation between two variables. Compared with other correlation analysis methods, it applies to both linear and nonlinear data and offers generality, equitability, symmetry and high accuracy.
2. The mutual information weight is combined with the grey correlation degree into a weighted correlation degree that measures the importance of each characteristic factor to landslide deformation; the preferred feature weight is calculated and the characteristic factors are screened against a threshold. Because feature optimization draws on the properties of both mutual information and grey correlation, its result is more reliable.
Drawings
FIG. 1 is a flow chart of the feature optimization based on weighted relevance in the fusion method of the present invention;
FIG. 2 is a diagram of an RNN model in the evaluation analysis of the present invention;
FIG. 3 is a schematic diagram of the hidden-layer cell structure of the RNN model in the evaluation analysis of the present invention;
FIG. 4 is a schematic diagram of the cell structure of the hidden layer of the LSTM model in the evaluation analysis of the present invention;
FIG. 5 is a distribution diagram of monitoring points in an experimental study area according to the present invention;
FIG. 6 shows the fusion result of the feature optimization-stepwise regression model based on the weighted correlation degree in the experimental part of the present invention;
FIG. 7 shows the fusion result of BP neural network model in the experimental part of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1 to 7, an embodiment of the present invention provides a multi-source heterogeneous landslide data monitoring and fusion method, including:
S1, calculating the maximum mutual information coefficient. The MIC algorithm is implemented with the minepy library in Python. It mainly comprises the following four steps: 1) given grid dimensions i and j, partition the scatter plot formed by the two variables into i columns and j rows and find the maximum mutual information value; 2) normalize the maximum mutual information value; 3) take the maximum of the normalized mutual information over the different scales as the MIC value; 4) obtain the characteristic variables most strongly correlated with the dependent variable for the subsequent regression prediction. MIC measures the degree of correlation between two variables. Compared with other correlation analysis methods, it applies to both linear and nonlinear data and offers generality, equitability, symmetry and high accuracy.
And S2, calculating the grey correlation degree. Determining a single-point displacement sequence capable of reflecting landslide deformation characteristics as a reference column, determining a data sequence composed of factors influencing landslide deformation as a comparison column, carrying out non-dimensionalization processing on the reference column and the comparison column, and then obtaining a gray correlation coefficient and a gray correlation degree of the reference column and the comparison column.
S3, feature optimization by weighted correlation degree. The mutual information value measures the influence of a feature on landslide deformation, its weight reflects the effectiveness of the feature, and the grey correlation degree quantifies the consistency between the feature and the landslide deformation. The mutual information weight is combined with the grey correlation degree into a weighted correlation degree that measures the importance of each characteristic factor to landslide deformation; the preferred feature weight is then calculated and the characteristic factors are screened against a threshold. Combining the properties of mutual information and grey correlation makes the feature optimization result more reliable.
S4, stepwise regression analysis. The characteristic factors are introduced into the model one at a time. After each explanatory variable is introduced, an F test is performed, and the explanatory variables already selected are t-tested one by one; an explanatory variable introduced earlier is deleted when later introductions make it no longer significant. This ensures that the regression equation contains only significant variables before each new variable is introduced. The process is repeated until no significant explanatory variable can be added to the regression equation and no insignificant explanatory variable can be removed from it.
S5, evaluation and analysis. First, stage discrimination comparison of the fusion results. To evaluate the reliability of the feature optimization-stepwise regression feature-level data fusion based on the weighted correlation degree, its fusion result is compared with the BP neural network fusion result by stage discrimination analysis. A BP neural network fusion model is established with the independent variables as system inputs and the dependent variable as system output, giving a multi-input single-output BP neural network fusion model with two hidden layers. Two indexes, the improved tangent angle and the deformation rate, are used to evaluate by stage the feature optimization-stepwise regression fusion model based on the weighted correlation degree and the BP neural network data fusion model, so as to demonstrate the effectiveness of the fusion result of the invention. Second, prediction comparison of the fusion results. An LSTM (long short-term memory) neural network is used to perform prediction comparison analysis on the single-point dependent-variable data of the GNSS monitoring point, the feature optimization-stepwise regression fusion data based on the weighted correlation degree, and the BP neural network fusion data. The LSTM model is built with Keras in Python, and the prediction accuracy is evaluated with two indexes, MRE (mean relative error) and MAE (mean absolute error), so as to demonstrate the reliability of the fusion result of the invention.
Specifically, the method comprises the following steps:
S1, calculating the maximum mutual information coefficient. The multi-source heterogeneous sensors on a landslide body are correlated, and one sensor can be jointly influenced by several others. The degree of correlation among the multi-source heterogeneous landslide monitoring variables therefore needs to be calculated, and the characteristic factors with the greatest influence on landslide deformation screened out for the subsequent fusion and prediction. Computing the MIC (maximum mutual information coefficient) first requires the notions of information entropy and mutual information. Information entropy is the average amount of information remaining after redundancy is removed, and mutual information measures the degree of association between two random variables, so MI (mutual information) can be used to screen out the characteristic factors that influence landslide deformation while carrying little redundant information. MIC is more accurate than MI, is not limited to a particular function type, and gives the degree of correlation between variables. If two variables are correlated, their data points are distributed in a two-dimensional space. That space is divided by an m × n grid, and the frequency of data points falling in the (x, y)-th cell is used as the estimate of P(x, y):
$$P(x, y) \approx \frac{n_{x,y}}{N}$$
where $n_{x,y}$ is the number of data points falling in the (x, y)-th cell and N is the total number of data points. The estimates of P(x) and P(y) are obtained in the same way. The mutual information between the random variables is then calculated. Because an m × n grid can partition the data points in more than one way, the partition that maximizes the mutual information is taken, and a normalization factor maps this value into the interval [0, 1]. Finally, the grid resolution that maximizes the normalized mutual information gives the MIC value. The grid resolution is limited to m × n < B, with $B = f(\text{data\_size}) = N^{0.6}$. MIC is calculated as
$$I(X; Y) = \sum_{x} \sum_{y} P(x, y) \log \frac{P(x, y)}{P(x) P(y)}$$
$$\mathrm{MIC}(X; Y) = \max_{m \times n < B} \frac{I(X; Y)}{\log \min(m, n)}$$
The specific calculation steps are as follows: 1) Calculate the maximum mutual information value. Given grid dimensions i and j, the scatter plot formed by the two variables X and Y is partitioned into i columns and j rows. Since a given i and j admit many different partitions, the mutual information of each partition is calculated and the partition that maximizes it is kept. 2) Normalize the maximum mutual information value by dividing it by log(min(i, j)). 3) Take the maximum of the normalized mutual information over the different grid scales as the MIC value. The features with the greater influence on landslide deformation are then selected and those carrying little information are eliminated, so that the variables used for modeling are more representative.
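By way of a non-limiting illustration, the three steps above can be sketched in plain Python. This is a simplified estimate, not the minepy implementation: for each grid size it tries only one equal-frequency partition instead of searching all partitions, so it can underestimate the true MIC. The function names and the `exponent` parameter are hypothetical.

```python
import math
from collections import Counter


def equal_frequency_bins(values, bins):
    """Assign each value a bin index 0..bins-1 so bins hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * bins // len(values)
    return labels


def mutual_information(x_labels, y_labels):
    """Mutual information (in nats) estimated from grid-cell frequencies."""
    n = len(x_labels)
    joint = Counter(zip(x_labels, y_labels))
    px, py = Counter(x_labels), Counter(y_labels)
    mi = 0.0
    for (a, b), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) written with raw counts
        mi += p_xy * math.log(p_xy * n * n / (px[a] * py[b]))
    return mi


def approx_mic(x, y, exponent=0.6):
    """Maximum normalized mutual information over grids with cols * rows < n ** exponent."""
    budget = len(x) ** exponent
    best = 0.0
    for cols in range(2, int(budget) + 1):
        for rows in range(2, int(budget) + 1):
            if cols * rows >= budget:
                break
            mi = mutual_information(equal_frequency_bins(x, cols),
                                    equal_frequency_bins(y, rows))
            best = max(best, mi / math.log(min(cols, rows)))
    return best
```

A strictly monotone relation scores 1 under this sketch, because the equal-frequency bins of the two variables coincide. Tied values are split arbitrarily between bins, which a production implementation would need to handle.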
S2, calculating the grey correlation degree. Feature selection based on the MIC result alone is not convincing. MIC analysis, which represents the degree of influence, can be combined with grey correlation analysis, which represents the degree of consistency, to obtain characteristic factors better suited to data fusion. The multi-source heterogeneous landslide monitoring characteristic sequences are non-dimensionalized, and the correlation coefficient and correlation degree are calculated as follows: 1) Determine the reference sequence and the comparison sequences. Let the reference sequence be Y = {Y(k) | k = 1, 2, …, m} and the comparison sequences be $X_i$ = {$X_i$(k) | k = 1, 2, …, m}, i = 1, 2, …, n. 2) Non-dimensionalize the variables. Because the characteristic factors have different dimensions and cannot be compared directly, they must be made dimensionless.
Figure BDA0003458512130000071
Figure BDA0003458512130000072
The non-dimensionalized data sequences form the following matrix:
Figure BDA0003458512130000073
Figure BDA0003458512130000074
3) Calculate, element by element, the absolute difference between each evaluated object's index sequence (comparison sequence) and the reference sequence, namely $|x_0(k) - x_i(k)|$ (k = 1, …, m; i = 1, …, n), where n is the number of evaluated objects. 4) Determine the two-level minimum difference and the two-level maximum difference:
$$\min_i \min_k |x_0(k) - x_i(k)| \quad \text{and} \quad \max_i \max_k |x_0(k) - x_i(k)|$$
5) Calculate the correlation coefficient by the formula:
$$\xi_i(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}$$
where ρ is the resolution coefficient, 0 < ρ < 1; the smaller ρ is, the larger the differences between correlation coefficients and the stronger the discrimination ability; usually ρ = 0.5. 6) Calculate the correlation degree. For each evaluated object (comparison sequence), the mean of the correlation coefficients between its indexes and the corresponding elements of the reference sequence is calculated to reflect its relation to the reference sequence, and is recorded as the correlation degree:
$$r_i = \frac{1}{m} \sum_{k=1}^{m} \xi_i(k)$$
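By way of a non-limiting illustration, steps 1) to 6) can be sketched as follows. The non-dimensionalization used here (dividing each sequence by its mean) is an assumption, since the description does not state which method is applied; the function names are hypothetical.

```python
def mean_normalize(seq):
    """Assumed non-dimensionalization: divide every element by the sequence mean."""
    mean = sum(seq) / len(seq)
    return [v / mean for v in seq]


def grey_correlation_degrees(reference, comparisons, rho=0.5):
    """Return one grey correlation degree per comparison sequence."""
    x0 = mean_normalize(reference)
    xs = [mean_normalize(c) for c in comparisons]
    # Absolute differences |x0(k) - xi(k)| for every sequence i and point k
    diffs = [[abs(a - b) for a, b in zip(x0, xi)] for xi in xs]
    d_min = min(min(row) for row in diffs)  # two-level minimum difference
    d_max = max(max(row) for row in diffs)  # two-level maximum difference
    if d_max == 0.0:
        # Every comparison sequence coincides with the reference
        return [1.0] * len(comparisons)
    degrees = []
    for row in diffs:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees.append(sum(coeffs) / len(coeffs))
    return degrees
```

For example, `grey_correlation_degrees([1, 2, 3, 4], [[2, 4, 6, 8], [4, 3, 2, 1]])` ranks the first comparison sequence, which is proportional to the reference, above the second, which runs in the opposite direction. A sequence whose mean is zero cannot be normalized this way and would need a different scheme.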
S3, calculating the weighted correlation degree. The maximum mutual information weight is combined with the grey correlation degree to obtain the weighted correlation degree of each feature. The greater the weighted correlation degree, the more important the feature. The calculation formula is:
Figure BDA00034585121300000711
where n is the total number of features to be selected.
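By way of a non-limiting illustration, one plausible reading of this step is sketched below. It assumes that the mutual information weight of a feature is its MIC divided by the sum of the MIC values of all n candidate features, and that the weighted correlation degree is that weight multiplied by the grey correlation degree. This form is an assumption made for illustration, and the function name is hypothetical.

```python
def weighted_correlation(mic_values, grey_degrees):
    """Assumed form: (MIC_i / sum of all MIC values) * grey correlation degree_i."""
    total = sum(mic_values)
    return [m / total * r for m, r in zip(mic_values, grey_degrees)]
```

With MIC values `[0.6, 0.3, 0.1]` and grey correlation degrees `[0.9, 0.8, 0.5]`, the first feature receives the largest weighted correlation degree because it leads on both measures.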
S4, feature optimization. The calculated weighted correlation degrees are sorted in descending order. The feature with the largest weighted correlation degree is added to the preferred set and removed from the candidate set. The remaining features are screened in order from largest to smallest, and the preferred feature weight is calculated:
Figure BDA00034585121300000712
where $J_S$ is the sum of the weighted correlation degrees of the features in the preferred set, $J_j$ is the weighted correlation degree of the j-th feature to be screened, and $\omega_j$ is the j-th preferred feature weight. When $\omega_j$ is less than the threshold α, feature screening is considered complete.
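By way of a non-limiting illustration, the screening loop can be sketched as follows. It assumes that the preferred feature weight is the ratio of the candidate's weighted correlation degree to the sum over the current preferred set (before the candidate is added). That ratio, the default threshold and the function name are assumptions made for illustration.

```python
def select_features(names, relevance, alpha=0.1):
    """Greedy screening in descending order of weighted correlation degree."""
    ranked = sorted(zip(names, relevance), key=lambda pair: pair[1], reverse=True)
    selected = [ranked[0][0]]      # the top-ranked feature always enters the preferred set
    preferred_sum = ranked[0][1]   # J_S: sum over the preferred set
    for name, score in ranked[1:]:
        weight = score / preferred_sum   # assumed form of the preferred feature weight
        if weight < alpha:
            break                        # screening is complete
        selected.append(name)
        preferred_sum += score
    return selected
```

For instance, `select_features(["a", "b", "c", "d"], [0.4, 0.3, 0.02, 0.25])` keeps `a`, `b` and `d` and stops at `c`, whose weight relative to the preferred set falls below 0.1.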
S5, stepwise regression analysis. The basic idea of stepwise regression is to introduce variables into the model one at a time. After each explanatory variable is introduced, an F test is performed, and the explanatory variables already selected are t-tested one by one; an explanatory variable introduced earlier is deleted when later introductions make it no longer significant. This ensures that the regression equation contains only significant variables before each new variable is introduced. The process iterates until no significant explanatory variable can be added to the regression equation and no insignificant explanatory variable can be removed from it. The stepwise regression specifically comprises the following steps:
The first step: build the augmented matrix.
Calculate $l_{ij}$, $l_{iy}$, $l_{yy}$ and $r_{ij}$, $r_{iy}$ by the following formulas:
Figure BDA0003458512130000081
Figure BDA0003458512130000082
where
Figure BDA0003458512130000083
Figure BDA0003458512130000084
The augmented matrix is then obtained:
Figure BDA0003458512130000085
where $R = (r_{ij})_{m \times m}$, $r_{yy} = 1$, and $r_y = (r_{1y}, r_{2y}, \ldots, r_{my})'$.
The second step: after the s-th elimination transformation, the result is
Figure BDA0003458512130000086
Figure BDA0003458512130000087
where
Figure BDA0003458512130000088
Figure BDA0003458512130000089
The third step: factor elimination.
(1) Select $j_0$ such that
Figure BDA0003458512130000091
(2) Calculate
Figure BDA0003458512130000092
(3) If $F > F_{out}$, go to the fourth step; otherwise, perform the (s+1)-th elimination transformation to eliminate the $j_0$-th factor, and then repeat the second and third steps.
The fourth step: introduce a regression factor. Let s, {j} and f be as defined in the second step.
(1) Select $k_0$ such that
Figure BDA0003458512130000093
(2) Calculate
Figure BDA0003458512130000094
where
Figure BDA0003458512130000095
Figure BDA0003458512130000096
(3) If $F < F_{in}$, go to the fifth step; otherwise, perform the (s+1)-th elimination transformation to introduce the $k_0$-th regression factor, and then repeat the second, third and fourth steps.
The fifth step: no variable is introduced and no variable is removed. The final regression equation is $\hat{y} = \hat{b}_0 + \sum_{j \in \{j\}} \hat{b}_j x_j$, where
Figure BDA0003458512130000097
Figure BDA0003458512130000098
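By way of a non-limiting illustration, the enter/remove logic of the five steps can be sketched in plain Python. Instead of the elimination transformations on the augmented correlation matrix described above, this sketch refits ordinary least squares for each candidate model and compares residual sums of squares, which yields the same partial F statistics at a higher computational cost. The fixed thresholds `f_in` and `f_out` and all function names are assumptions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(columns, y):
    """Ordinary least squares with an intercept; returns (coefficients, RSS)."""
    rows = [[1.0] + [col[k] for col in columns] for k in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yk for r, yk in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yk - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yk in zip(rows, y))
    return beta, rss


def partial_f(rss_small, rss_big, dof_big):
    """F statistic for the single variable that separates two nested models."""
    return (rss_small - rss_big) / max(rss_big / dof_big, 1e-12)


def stepwise(features, y, f_in=4.0, f_out=3.9):
    """features maps a name to its data column. Returns (selected names, coefficients)."""
    selected, n = [], len(y)
    while True:
        cols = [features[s] for s in selected]
        _, rss_cur = fit(cols, y)
        if selected:
            # Removal: drop the selected variable with the smallest partial F if below f_out
            dof = n - len(selected) - 1
            scores = {s: partial_f(fit([features[t] for t in selected if t != s], y)[1],
                                   rss_cur, dof) for s in selected}
            worst = min(scores, key=scores.get)
            if scores[worst] < f_out:
                selected.remove(worst)
                continue
        # Entry: add the candidate with the largest partial F if it exceeds f_in
        dof = n - len(selected) - 2
        scores = {c: partial_f(rss_cur, fit(cols + [features[c]], y)[1], dof)
                  for c in features if c not in selected}
        best = max(scores, key=scores.get) if scores else None
        if best is None or scores[best] <= f_in:
            return selected, fit(cols, y)[0]
        selected.append(best)
```

The returned coefficient list starts with the intercept, followed by one coefficient per selected name in order. The sketch assumes more samples than variables and no exactly collinear columns; a collinear set would make the normal equations singular.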
And S6, evaluating and analyzing.
First, stage discrimination comparison of the fusion results. To evaluate the reliability of the feature optimization-stepwise regression feature-level data fusion based on the weighted correlation degree, the fusion result of this model and the fusion result of a BP neural network are compared by stage discrimination analysis. A BP neural network fusion model is established with the independent variables as system inputs and the dependent variable as system output, giving a multi-input single-output BP neural network fusion model with two hidden layers. The BP neural network topology comprises an input layer, hidden layers and an output layer, and its operation comprises a forward multi-layer feedforward stage and a backward error-correction stage. The forward stage starts from the input layer and calculates the actual input and output of each node layer by layer; its mathematical model is
Figure BDA0003458512130000099
In the formula
Figure BDA00034585121300000910
is the output value of the i-th node of layer l;
Figure BDA00034585121300000911
is the activation value of the i-th node of layer l;
Figure BDA00034585121300000912
is the connection weight from the j-th node of layer l-1 to the i-th node of layer l;
Figure BDA00034585121300000913
is the threshold of the i-th node of layer l; and $N_l$ is the number of nodes in layer l. To reduce the error of the output-layer neurons, the error is propagated backward with a gradient descent algorithm. The connection weights between the neurons of adjacent layers are adjusted by gradient descent so that the overall error decreases. The algorithm formula is as follows:
Figure BDA0003458512130000101
(where η is the learning rate), and the weights are adjusted by the formula
Figure BDA0003458512130000102
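By way of a non-limiting illustration, the forward feedforward stage of a multi-input single-output network with two hidden layers can be sketched as follows. The sketch assumes that the threshold enters each node as an additive bias, that the hidden layers use the Sigmoid function and that the output node is linear; these choices and all names are assumptions, and the backward error-correction stage is not shown.

```python
import math


def sigmoid(a):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-a))


def identity(a):
    """Linear activation for the single output node."""
    return a


def layer_forward(inputs, weights, biases, activate):
    """Compute every node's activation value, then its output value."""
    outputs = []
    for w_row, b in zip(weights, biases):
        a = sum(w * o for w, o in zip(w_row, inputs)) + b  # weighted sum plus bias
        outputs.append(activate(a))
    return outputs


def forward(x, layers):
    """Propagate x through the layers; each layer is (weights, biases, activation)."""
    out = x
    for weights, biases, activate in layers:
        out = layer_forward(out, weights, biases, activate)
    return out[0]  # single output node
```

A network is described by a list such as `[(W1, b1, sigmoid), (W2, b2, sigmoid), (W3, b3, identity)]`, where each row of a weight matrix holds the incoming weights of one node.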
The feature optimization-stepwise regression fusion model based on the weighted correlation degree and the BP neural network data fusion model are then evaluated by stage using two indexes, the improved tangent angle (see the reference on the improved tangent angle and the corresponding landslide early-warning criterion) and the deformation rate. The evaluation criteria are the comprehensive early-warning criteria based on the deformation rate threshold and the deformation process shown in Table 2.
TABLE 2 comprehensive early warning criteria based on deformation rate threshold and deformation process
Figure BDA0003458512130000103
Second, prediction comparison of the fusion results. Feature-level fusion analyzes and processes the multi-source heterogeneous information obtained by landslide monitoring in order to improve prediction accuracy. To examine how effectively feature-level fusion improves landslide prediction accuracy, an LSTM (long short-term memory) neural network is used to perform prediction comparison analysis on the single-point dependent-variable data of the GNSS monitoring point, the feature optimization-stepwise regression fusion data based on the weighted correlation degree, and the BP neural network fusion data. The LSTM model is built with Keras in Python. The LSTM neural network is an improvement on the ordinary recurrent neural network (RNN): the RNN cells of the hidden layer are replaced by LSTM cells, which effectively mitigates the rapid vanishing of gradients during back propagation, gives the model long-term memory, and allows it to process long sequences. Compared with the RNN cell, the LSTM cell has 3 gates inside, as shown in FIG. 4, where i is the input gate, f is the forget gate, c is the cell state, o is the output gate, and σ and tanh are the Sigmoid and hyperbolic tangent activation functions, respectively.
The forget gate takes $h_{t-1}$ and $x_t$ as input and uses a Sigmoid unit to output a vector of values between 0 and 1. Each value indicates how much of the corresponding information in the cell state $C_{t-1}$ is retained: 0 means discard everything and 1 means keep everything. $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$.
The input gate is used to update the cell state. The previous hidden state and the current input are passed to the Sigmoid function, which maps the output to between 0 and 1 to decide which information to update, where 0 means unimportant and 1 means important. The hidden state and the current input are also passed to the tanh function, which compresses the values to between -1 and 1 to regulate the network. The tanh output is then multiplied by the Sigmoid output, which determines which information in the tanh output is important and needs to be kept. $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$, $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$, and the cell state is updated as $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$.
The output gate controls the value of the next hidden state, which can be used for prediction. The previous hidden state and the current input are passed to the Sigmoid function, and the newly obtained cell state is passed to the tanh function. The tanh output is multiplied by the Sigmoid output to give the new hidden state, which is the output value of the current cell. Finally, the new cell state and the new hidden state are passed on to the next time step. $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$, $h_t = o_t * \tanh(C_t)$.
The LSTM model is trained with the classical back propagation algorithm in 4 steps: (1) The output values of the LSTM cells are calculated by the forward calculation method,
Figure BDA0003458512130000111
are the first and second derivatives of the loss function l(·), respectively, and the resulting objective function is:
Figure BDA0003458512130000112
Figure BDA0003458512130000113
where W and b denote the weight coefficient matrices and the bias terms, respectively. (2) The error term of each LSTM cell is computed backwards, in two propagation directions: through time and through the network layers. (3) The gradient of each weight is computed from the corresponding error term. (4) A gradient-based optimization algorithm is applied to update the weights.
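The forward calculation of step (1) can be illustrated with a minimal NumPy sketch of a single LSTM cell step. It is not the Keras implementation used in the experiment. The function names `lstm_step` and `init_params` and the parameter dictionary layout are hypothetical, and the cell-state update $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$ is the standard LSTM form, assumed here because the text above does not write it out.

```python
import numpy as np

def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of an LSTM cell.

    params maps 'W_f', 'W_i', 'W_C', 'W_o' to arrays of shape
    (hidden, hidden + input) and 'b_f', 'b_i', 'b_C', 'b_o' to arrays
    of shape (hidden,). This layout is an assumption of the sketch.
    """
    z = np.concatenate([h_prev, x_t])                      # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])       # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])       # input gate
    c_tilde = np.tanh(params["W_C"] @ z + params["b_C"])   # candidate state
    c_t = f_t * c_prev + i_t * c_tilde                     # assumed standard update
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])       # output gate
    h_t = o_t * np.tanh(c_t)                               # new hidden state
    return h_t, c_t

def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases, for illustration only."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("f", "i", "C", "o"):
        params[f"W_{gate}"] = rng.normal(
            0.0, 0.1, (hidden_size, hidden_size + input_size))
        params[f"b_{gate}"] = np.zeros(hidden_size)
    return params
```

Iterating `lstm_step` over a sequence and carrying `h_t` and `c_t` forward gives the forward pass; steps (2) to (4) would normally be left to the automatic differentiation of the chosen framework.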
Prediction accuracy is evaluated with two indexes: MRE (mean relative error) and MAE (mean absolute error).
Figure BDA0003458512130000114
Both indexes measure the distance between the predicted value and the actual value, and therefore the prediction accuracy.
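As an illustration, the two indexes can be computed as follows. The sketch assumes the usual definitions, namely MAE as the mean of the absolute errors and MRE as the mean of the absolute errors divided by the absolute actual values, expressed in percent; the function names are hypothetical.

```python
import numpy as np

def mean_absolute_error(actual, predicted):
    """MAE: mean of |actual - predicted|, in the unit of the data (e.g. mm)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

def mean_relative_error(actual, predicted):
    """MRE in percent: mean of |actual - predicted| / |actual| * 100."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if np.any(actual == 0):
        raise ValueError("MRE is undefined when an actual value is zero")
    return float(np.mean(np.abs(actual - predicted) / np.abs(actual)) * 100.0)
```

For example, for actual values of 100 and 200 and predictions of 110 and 190, the MAE is 10 and the MRE is 7.5 %.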
The embodiment is as follows:
The experimental data are landslide monitoring data for the No. 7 landslide body at Heifangtai, Yongjing County, Gansu Province, from March 28 to October 4, 2019. The data are sampled daily and cover 191 d in total. They comprise two groups of GNSS monitoring data (HF06 and HF07), three groups of displacement meter monitoring data (DCF11, DCF14 and DCF15) and three kinds of meteorological data: humidity, temperature and rainfall. The landslide body failed at 04:00 on October 5, 2019, and all five groups of monitoring equipment recorded the displacement changes of the landslide deformation, so the multi-source heterogeneous sensor data can be used for data fusion and accuracy evaluation. The distribution of the monitoring points in the experimental area is shown in fig. 5.
First, the multi-sensor variables and environmental factors are preprocessed, including outlier elimination, missing-value completion, and data smoothing and denoising. The maximum mutual information coefficient (MIC) is then calculated on the preprocessed data to determine the mutual information weights and to identify the characteristic factors with the greatest influence on landslide deformation. Next, the grey correlation calculation is performed to obtain the grey correlation degree; the weighted correlation degree is then obtained with the weighted correlation formula, and the final characteristic factors are determined by calculating the feature preference weights. Table 3 shows the MIC weight, grey correlation degree and weighted correlation degree between GNSS monitoring point HF06 and the other GNSS monitoring data, the displacement meter monitoring data, rainfall, temperature and humidity. The higher the weighted correlation degree, the greater the influence of the feature on landslide deformation and the closer its trend is to that of the deformation.
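The MIC calculation can be approximated with the following NumPy sketch. It is a simplification and not the exact MIC algorithm: each candidate grid uses equal-frequency bins instead of searching for the optimal partition, and the grid-size limit of n^0.6 cells is an assumed default. The function names are hypothetical.

```python
import numpy as np

def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    ranks = np.argsort(np.argsort(values))       # ranks 0 .. n-1
    return (ranks * k // len(values)).astype(int)

def mutual_information(x_bins, y_bins, a, b):
    """Mutual information (in nats) of two binned variables on an a-by-b grid."""
    joint = np.zeros((a, b))
    np.add.at(joint, (x_bins, y_bins), 1.0)       # 2-D histogram of bin pairs
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def approx_mic(x, y, exponent=0.6):
    """Approximate MIC: largest normalised mutual information over grids
    with a * b <= n ** exponent cells (equal-frequency bins only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    budget = max(int(len(x) ** exponent), 4)
    best = 0.0
    for a in range(2, budget // 2 + 1):           # number of columns
        x_bins = equal_frequency_bins(x, a)
        for b in range(2, budget // a + 1):       # number of rows
            y_bins = equal_frequency_bins(y, b)
            mi = mutual_information(x_bins, y_bins, a, b)
            best = max(best, mi / np.log(min(a, b)))   # normalise to [0, 1]
    return best
```

A strictly monotonic relation gives a value of about 1, while unrelated series give a value close to 0. A dedicated MIC implementation that optimizes the grid partition would be preferable for production use.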
Table 3 Weighted correlation degree calculation results
Figure BDA0003458512130000121
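The grey correlation degree between the reference sequence and each comparison sequence can be sketched as follows. The sketch assumes the standard grey relational coefficient with resolution coefficient ρ = 0.5 and adds a min-max normalisation of every sequence so that variables with different units are comparable; the normalisation step and the function name are assumptions.

```python
import numpy as np

def grey_relational_degree(reference, comparisons, rho=0.5):
    """Grey correlation degree of each comparison sequence with the reference.

    reference   : sequence of length n (e.g. the displacement of one point)
    comparisons : array-like of shape (m, n), one row per influencing factor
    rho         : resolution coefficient, 0 < rho < 1
    """
    def scale(a):
        # Min-max normalisation along the time axis (assumed preprocessing).
        low = a.min(axis=-1, keepdims=True)
        span = a.max(axis=-1, keepdims=True) - low
        return (a - low) / np.where(span == 0, 1.0, span)

    x0 = scale(np.asarray(reference, dtype=float))
    xi = scale(np.asarray(comparisons, dtype=float))
    diff = np.abs(xi - x0)                     # |x0(k) - xi(k)|
    d_min, d_max = diff.min(), diff.max()      # two-level minimum and maximum
    if d_max == 0:
        return np.ones(xi.shape[0])            # all sequences coincide
    coeff = (d_min + rho * d_max) / (diff + rho * d_max)
    return coeff.mean(axis=1)                  # mean coefficient per sequence
```

A comparison sequence that follows the reference exactly after normalisation receives a degree of 1; sequences with an opposite trend receive lower values.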
Ranking the results of the weighted correlation method that combines MIC and grey correlation gives, in descending order: GNSS monitoring point HF07, displacement meter DCF11, displacement meter DCF14, displacement meter DCF15, accumulated rainfall over the preceding 48 hours, humidity, temperature and rainfall. That is, the displacement meter monitoring data, the GNSS monitoring data, the accumulated rainfall over the preceding 48 hours, temperature and humidity have the larger influence on the landslide. Table 4 shows the feature preference weights calculated from the weighted correlation degree; feature preference is performed according to these values.
Table 4 Feature preference results
Figure BDA0003458512130000131
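The feature preference step can be sketched as follows, under the assumption that the preference weight of a feature is its weighted correlation degree divided by the sum of all weighted correlation degrees, and that screening stops at the first ranked feature whose weight falls below the threshold α. The function name and the feature names in the usage note are hypothetical.

```python
def select_features(weighted_degrees, alpha=0.1):
    """Rank features by weighted correlation degree and keep those whose
    preference weight J_j / J_S is at least alpha (assumed rule).

    weighted_degrees : dict mapping feature name -> weighted correlation degree
    Returns a list of (feature name, preference weight) in ranked order.
    """
    total = sum(weighted_degrees.values())            # J_S
    ranked = sorted(weighted_degrees.items(),
                    key=lambda item: item[1], reverse=True)
    selected = []
    for name, degree in ranked:
        weight = degree / total                       # omega_j
        if weight < alpha:
            break                                     # stop screening here
        selected.append((name, weight))
    return selected
```

With made-up degrees such as `{"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}` and α = 0.1, the features `a`, `b` and `c` are kept and `d` is dropped.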
After the weighted correlation degrees are ranked and the feature preference weights are calculated, the threshold α is set to 0.1: when a feature weight is less than 0.1, the influence of that feature on landslide deformation is considered negligible and feature selection stops. Stepwise regression fitting analysis is then performed on the factors influencing landslide deformation obtained by the weighted correlation preference. According to the analysis result, the GNSS monitoring point HF07 data, the displacement meter monitoring data, the accumulated rainfall over the preceding 48 hours, temperature and humidity are taken as independent variables and the GNSS monitoring point HF06 data as the dependent variable; stepwise regression yields the corresponding regression coefficients, from which the final regression equation and the fusion result are calculated. In the stepwise regression analysis, the optimal model is obtained by comparing the correlation coefficient, residual variance, F value, significance and other statistics of the candidate models. The resulting regression coefficients are shown in table 5.
TABLE 5 regression coefficient Table
Figure BDA0003458512130000132
The stepwise regression model expression of the landslide surface displacement is: landslide surface displacement = (GNSS monitoring point HF07 data × 0.587) + (displacement meter DCF11 data × 0.036) + (displacement meter DCF14 data × 0.519) - (displacement meter DCF15 data × 0.159) + (temperature × 0.028) + (humidity × 0.026) - (accumulated rainfall over the preceding 48 hours). This gives the feature-level fusion result of the feature preference-stepwise regression based on the weighted correlation degree, as shown in fig. 6.
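A stepwise regression of this kind can be sketched in NumPy as follows. This is an illustrative version, not the procedure used to obtain table 5: variables enter and leave according to a partial F statistic compared with fixed thresholds (`f_enter` and `f_remove` are assumed values), which for a single coefficient is equivalent to the squared t statistic. All function names are hypothetical.

```python
import numpy as np

def fit_rss(X, y, cols):
    """Least-squares fit with intercept on the chosen columns.
    Returns the residual sum of squares and the coefficients."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid), beta

def partial_f(rss_small, rss_big, dof):
    """Partial F statistic for one extra variable in the bigger model."""
    return (rss_small - rss_big) / max(rss_big / dof, 1e-12)

def stepwise_regression(X, y, f_enter=4.0, f_remove=3.9, max_iter=100):
    """Forward selection with backward elimination after each entry."""
    n, p = X.shape
    selected = []
    for _ in range(max_iter):
        changed = False
        # Forward step: add the candidate with the largest partial F.
        rss_cur, _ = fit_rss(X, y, selected)
        dof = n - len(selected) - 2
        candidates = [c for c in range(p) if c not in selected]
        if candidates and dof > 0:
            scores = {c: partial_f(rss_cur, fit_rss(X, y, selected + [c])[0], dof)
                      for c in candidates}
            best = max(scores, key=scores.get)
            if scores[best] > f_enter:
                selected.append(best)
                changed = True
        # Backward step: drop a variable that is no longer significant.
        if len(selected) > 1:
            rss_full, _ = fit_rss(X, y, selected)
            dof_full = n - len(selected) - 1
            scores = {c: partial_f(
                          fit_rss(X, y, [s for s in selected if s != c])[0],
                          rss_full, dof_full)
                      for c in selected}
            worst = min(scores, key=scores.get)
            if scores[worst] < f_remove:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    _, beta = fit_rss(X, y, selected)
    return selected, float(beta[0]), dict(zip(selected, beta[1:]))
```

The function returns the indices of the retained columns, the intercept and a mapping from column index to regression coefficient; the fused series is then the intercept plus the coefficient-weighted sum of the retained columns.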
Stage discrimination analysis is then used to compare the feature preference-stepwise regression fusion result based on the weighted correlation degree with the BP neural network fusion result. First, a BP neural network fusion model is established: the GNSS monitoring point HF07 data, the displacement meter monitoring data, the accumulated rainfall over the preceding 48 hours, temperature, humidity and so on are taken as input data, the GNSS monitoring point HF06 data as the expected output, and, with reference to the MIC analysis result, a multi-input single-output BP neural network fusion model with two hidden layers is built. The resulting BP neural network feature-level fusion result is shown in fig. 7. According to the literature, a landslide is in the mid-acceleration stage when the tangent angle exceeds 80 degrees. In this experiment the two fusion results are compared only in terms of the tangent angle before the landslide; the tangent angle comparison is shown in table 6 and the deformation rate comparison in table 7.
TABLE 6 improved tangent angle analysis results of two fusion results
Figure BDA0003458512130000141
TABLE 7 results of deformation Rate analysis of two fusion results
Figure BDA0003458512130000142
Comparison of the two stage indexes shows that the improved tangent angle obtained from the feature preference-stepwise regression fusion based on the weighted correlation degree is closer to the moment of landslide instability and, when used for landslide stage discrimination, fits the true development state of the landslide more closely. The feature preference-stepwise regression fusion result based on the weighted correlation degree is therefore more reliable and accurate in landslide stage discrimination, and is the better fusion result.
Next, the LSTM network is used to perform comparative landslide trend prediction on the GNSS monitoring point HF06 data, the feature preference-stepwise regression fusion data based on the weighted correlation degree, and the BP neural network fusion data, and the prediction accuracy is compared using the two evaluation indexes MRE and MAE.
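Before a displacement series is passed to an LSTM network, it is usually arranged into supervised samples. The sketch below shows one common arrangement with sliding windows; the window length, the one-step horizon and the function name are assumptions, since the embodiment does not state how the samples were formed.

```python
import numpy as np

def make_windows(series, window=7, horizon=1):
    """Turn a 1-D series into LSTM samples.

    Each input holds `window` consecutive values and the target is the
    value `horizon` steps after the end of that window.
    Returns X with shape (samples, window, 1) and y with shape (samples,).
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - window - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the chosen window and horizon")
    X = np.stack([series[i:i + window] for i in range(n_samples)])
    y = np.array([series[i + window + horizon - 1] for i in range(n_samples)])
    return X[..., np.newaxis], y
```

The three data sets compared here (single-point data, stepwise regression fusion data and BP fusion data) would each be windowed in the same way so that the resulting MAE and MRE values are comparable.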
TABLE 8 comparison of prediction accuracy of two fusion results
Figure BDA0003458512130000151
As can be seen from table 8, the MAE and MRE of the prediction based on the feature preference-stepwise regression fusion result are 9.9 mm and 3.46%, respectively; those of the prediction based on the BP neural network fusion result are 15.1 mm and 4.33%; and those of the prediction based on the GNSS monitoring point HF06 data alone are 19.9 mm and 4.51%. The prediction accuracy of the feature preference-stepwise regression fusion result based on the weighted correlation degree is therefore the highest, which shows that this feature-level fusion result is more accurate and reliable.
Although the embodiments of the present invention have been disclosed in the foregoing for illustrative purposes, those skilled in the art will appreciate that various modifications, additions and substitutions are possible, without departing from the scope and spirit of the invention as disclosed in the accompanying drawings.

Claims (5)

1. A multi-source heterogeneous landslide data monitoring and fusing method is characterized by comprising the following steps:
acquiring multi-source heterogeneous monitoring variable data;
dividing multi-source heterogeneous monitoring variables into dependent variables and characteristic variables;
calculating the maximum mutual information coefficient MIC of the dependent variable and the characteristic variable, and screening out the characteristic variable which has the maximum influence on landslide deformation;
determining a single-point displacement sequence reflecting landslide deformation characteristics as a reference sequence, and determining the data sequences consisting of factors influencing landslide deformation as comparison sequences;
calculating a grey correlation coefficient and a grey correlation degree between the reference sequence and the comparison sequences;
calculating a weighted correlation degree according to the maximum mutual information coefficient MIC and the grey correlation degree;
performing characteristic optimization according to the weighted relevance degree, and screening out final characteristic variables;
performing stepwise regression fitting analysis on the preferably obtained characteristic variables;
constructing a feature optimization-stepwise regression feature level data fusion model based on the weighted relevance;
performing multi-source heterogeneous information fusion by using a feature optimization-stepwise regression feature level data fusion model based on the weighted relevance degree, and providing effective auxiliary information for landslide prediction;
the grey correlation coefficient is calculated according to the formula:
$$\xi_i(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}$$
where ρ is the resolution coefficient, 0<ρ<1; the smaller ρ is, the larger the difference between the correlation coefficients and the stronger the distinguishing capability; ρ is usually taken as 0.5; $|x_0(k) - x_i(k)|$ represents the absolute difference between corresponding elements of each comparison sequence and the reference sequence; $\min_i \min_k |x_0(k) - x_i(k)|$ and $\max_i \max_k |x_0(k) - x_i(k)|$ represent the two-level minimum difference and the two-level maximum difference, respectively; n is the number of the evaluated objects;
calculating the grey correlation degree: for each evaluated object, calculating the mean value of the correlation coefficients between its indexes and the corresponding elements of the reference sequence, to reflect the correlation between that evaluated object and the reference sequence, and recording the mean value as:
Figure FDA0003898362060000014
the weighted association degree, the calculation formula comprises:
Figure FDA0003898362060000021
where n is the total number of feature variables to be selected, and $MIC(A, B_i)$ represents the maximum mutual information coefficient between characteristic variable A and characteristic variable $B_i$;
the feature preference comprises the steps of:
sorting the calculated weighted association degrees from big to small;
sorting and screening the characteristic variables according to the weighted relevance;
calculating the weight of each sorted preferred feature;
when the preferred feature weight satisfies
Figure FDA0003898362060000022
stopping the screening and obtaining the final characteristic variables;
wherein, J S Is the sum of the weighted relevance of the characteristic variables, J j Is the weighted relevance, omega, of the jth feature variable to be screened j Is the jth preferred feature weight, α is a given threshold;
the stepwise regression fitting analysis comprises: introducing the characteristic factors into the model one by one; performing an F test after each explanatory variable is introduced, and performing t tests one by one on the explanatory variables already selected; when a previously introduced explanatory variable is no longer significant because of the later introduction of other explanatory variables, deleting it, so as to ensure that the regression equation contains only significant variables before each new variable is introduced; and repeating this process until no significant explanatory variable can be selected into the regression equation and no insignificant explanatory variable can be removed from the regression equation.
2. The multi-source heterogeneous landslide data monitoring fusion method of claim 1, further comprising preprocessing multi-source heterogeneous monitoring variable data:
removing abnormal values, complementing missing values and smoothly denoising data.
3. The multi-source heterogeneous landslide data monitoring and fusion method of claim 1, wherein the step of calculating the maximum mutual information coefficient MIC of the dependent variable and the characteristic variable comprises:
given i and j, dividing the scatter diagram formed by the two variables into a grid of i columns and j rows, and solving the maximum mutual information value;
carrying out normalization processing on the maximum mutual information value;
selecting the maximum value of mutual information under different scales as an MIC value;
and obtaining the characteristic variable with the highest correlation degree with the dependent variable.
4. The multi-source heterogeneous landslide data monitoring fusion method of claim 1, further comprising: analyzing and comparing the feature optimization-stepwise regression fusion result based on the weighted correlation degree with the BP neural network fusion result, wherein the method comprises the following steps:
establishing a BP neural network fusion model, taking an independent variable as a system input variable and a dependent variable as a system output variable;
establishing a multi-input single-output BP neural network fusion model containing two hidden layers;
and performing stage evaluation on the feature optimization-stepwise regression fusion and BP neural network data fusion model based on the weighted correlation degree by adopting two indexes of the improved tangent angle and the deformation rate.
5. The multi-source heterogeneous landslide data monitoring fusion method of claim 4 further comprising:
performing comparative prediction analysis with the long short-term memory (LSTM) neural network on the feature optimization-stepwise regression fusion data based on the weighted correlation degree and on the BP neural network fusion data, respectively.
CN202210013094.0A 2022-01-06 2022-01-06 Multi-source heterogeneous landslide data monitoring and fusing method Active CN114358192B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210013094.0A CN114358192B (en) 2022-01-06 2022-01-06 Multi-source heterogeneous landslide data monitoring and fusing method

Publications (2)

Publication Number Publication Date
CN114358192A CN114358192A (en) 2022-04-15
CN114358192B true CN114358192B (en) 2022-11-25

Family

ID=81106472

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210013094.0A Active CN114358192B (en) 2022-01-06 2022-01-06 Multi-source heterogeneous landslide data monitoring and fusing method

Country Status (1)

Country Link
CN (1) CN114358192B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115238850B (en) * 2022-06-30 2023-05-09 西南交通大学 Mountain area slope displacement prediction method based on MI-GRA and improved PSO-LSTM
CN115928810A (en) * 2022-11-09 2023-04-07 中国十七冶集团有限公司 Foundation pit intelligent monitoring method based on multi-sensor data fusion
CN116975008B (en) * 2023-09-22 2023-12-15 青岛海联智信息科技有限公司 Ship meteorological monitoring data optimal storage method
CN117490675B (en) * 2024-01-03 2024-03-15 西北工业大学 High-precision anti-interference control method for array MEMS gyroscope
CN118311362B (en) * 2024-06-07 2024-08-09 湖州积微电子科技有限公司 Running state monitoring method for energy-saving medium-power direct-current speed regulating device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112270400A (en) * 2020-10-16 2021-01-26 西安工程大学 Landslide displacement dynamic prediction method based on multiple influence factors
CN112488395A (en) * 2020-12-01 2021-03-12 湖南大学 Power distribution network line loss prediction method and system
CN113507118A (en) * 2021-07-11 2021-10-15 湘潭大学 Wind power prediction method and system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
I-Maculaweb: A Tool to Support Data;MONICA BONETTO et al.;《IEEE journal of translational engineering in health and medicine》;20151228;第1-10页 *
Intelligent Transportation Application and;Ang Li et al.;《IEEE SENSORS JOURNAL》;20211115;第21卷(第22期);第25035-25042页 *
MIC-PCA 耦合算法在径流预报因子筛选中的应用;王丽萍等;《中国农村水利水电》;20181231(第9期);第36-41、51页 *
基于改进有序聚类法的立式加工中心进给系统;李传珍等;《工程设计学报》;20200430;第27卷(第2期);第223-231页 *

Also Published As

Publication number Publication date
CN114358192A (en) 2022-04-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant