CN114358192A - Multi-source heterogeneous landslide data monitoring and fusing method - Google Patents

Multi-source heterogeneous landslide data monitoring and fusing method

Info

Publication number
CN114358192A
Authority
CN
China
Prior art keywords
landslide
data
fusion
variable
weighted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210013094.0A
Other languages
Chinese (zh)
Other versions
CN114358192B (en)
Inventor
王利
张懿恺
许豪
赵超英
刘万林
成伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changan University
Original Assignee
Changan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changan University
Priority to CN202210013094.0A
Publication of CN114358192A
Application granted
Publication of CN114358192B
Legal status: Active
Anticipated expiration


Abstract

The invention discloses a multi-source heterogeneous landslide data monitoring and fusing method, which comprises the following steps: combining the maximum mutual information coefficient (MIC) method with the gray correlation analysis (GRA) method to obtain a weighted correlation degree that reflects both the influence of the multi-source heterogeneous landslide monitoring variables on landslide deformation displacement and their common trend of change; selecting characteristic factors according to the weighted correlation degree; performing stepwise regression fitting on the selected characteristic factors to obtain the corresponding regression coefficients, and from them the final regression equation and the fusion result; and evaluating the reliability and effectiveness of the data fusion result by landslide stage discrimination and trend prediction. By fusing multi-source heterogeneous data, the method analyzes and processes multivariate data effectively, obtains a more reliable and accurate data fusion result, and provides valuable reference information for landslide prediction, thereby improving the precision of landslide prediction.

Description

Multi-source heterogeneous landslide data monitoring and fusing method
Technical Field
The invention relates to the technical field of data fusion, in particular to a multi-source heterogeneous landslide data monitoring fusion method.
Background
The multi-source data fusion technology is used as an emerging interdisciplinary subject in multiple fields, and has been widely applied to landslide deformation monitoring after more than ten years of exploration and development. With the appearance of a plurality of sensors, how to extract comprehensive information of multi-source heterogeneous sensor information and perform effective fusion processing is a current research difficulty. In landslide monitoring, single sensor information cannot comprehensively reflect the deformation characteristics of landslide, and the obtained prediction result is not high in reliability, so that effective feature extraction and comprehensive analysis processing need to be carried out by combining multi-source heterogeneous sensor information to eliminate redundancy and mutual exclusion among data, and a more reliable and accurate prediction result can be obtained.
At present, landslide prediction based on a single source of monitoring information is inaccurate because such information is one-sided and unreliable, and effective feature extraction and comprehensive analysis of multi-source heterogeneous sensor information remain an unsolved problem.
Disclosure of Invention
The embodiment of the invention provides a multi-source heterogeneous landslide data monitoring and fusing method, which comprises the following steps:
acquiring multi-source heterogeneous monitoring variable data;
dividing multi-source heterogeneous monitoring variables into dependent variables and characteristic variables;
calculating the maximum mutual information coefficient MIC of every two multisource heterogeneous landslide monitoring variables, and screening out the characteristic variables which influence the landslide to the maximum extent;
determining a single-point displacement sequence reflecting landslide deformation characteristics as a reference column, and determining a data sequence consisting of factors influencing landslide deformation as a comparison column;
calculating the grey correlation coefficient and grey correlation degree of the reference number sequence and the comparison number sequence;
calculating a weighted correlation degree according to the maximum mutual information coefficient MIC and the grey correlation degree;
performing characteristic optimization according to the weighted relevance degree, and screening out final characteristic variables;
constructing a feature optimization-stepwise regression feature level data fusion model based on the weighted relevance;
and performing multi-source heterogeneous information fusion by using a feature optimization-stepwise regression feature level data fusion model based on the weighted relevance, and providing effective auxiliary information for landslide prediction.
Further, the multi-source heterogeneous monitoring variable data are preprocessed by:
removing abnormal values, complementing missing values and smoothly denoising data.
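The three preprocessing operations can be illustrated with a minimal sketch. The patent does not name specific techniques, so the median-absolute-deviation outlier rule, linear interpolation, and moving-average smoothing used here are assumptions, as are the function name `preprocess` and its parameters `k` and `window`. A global outlier rule like this one suits roughly stationary series (temperature, humidity); a strongly trending displacement series would need a detrended or rolling variant.

```python
from statistics import median

def preprocess(series, k=3.0, window=3):
    """Remove outliers, fill gaps, and smooth one monitoring series.

    `series` is a list of floats with None marking missing samples.
    All three techniques are illustrative assumptions.
    """
    values = [v for v in series if v is not None]
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1e-12
    # 1) Flag outliers: points further than k scaled MADs from the median.
    cleaned = [
        None if v is None or abs(v - med) / (1.4826 * mad) > k else v
        for v in series
    ]
    # 2) Fill gaps by linear interpolation between the nearest valid neighbours.
    known = [i for i, v in enumerate(cleaned) if v is not None]
    filled = []
    for i, v in enumerate(cleaned):
        if v is not None:
            filled.append(v)
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled.append(cleaned[right])
        elif right is None:
            filled.append(cleaned[left])
        else:
            t = (i - left) / (right - left)
            filled.append(cleaned[left] + t * (cleaned[right] - cleaned[left]))
    # 3) Smooth with a centred moving average (the window shrinks at the edges).
    half = window // 2
    smoothed = []
    for i in range(len(filled)):
        chunk = filled[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

For example, `preprocess([1.0, 1.0, None, 1.0, 100.0, 1.0, 1.0])` drops the spike, fills both gaps, and returns a flat series.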
Further, calculating the maximum mutual information coefficient MIC of every two multi-source heterogeneous landslide monitoring variables comprises the following steps:
given grid dimensions i and j, partitioning the scatter diagram formed by the two variables into a grid of i columns and j rows, and solving the maximum mutual information value;
carrying out normalization processing on the maximum mutual information value;
selecting the maximum value of mutual information under different scales as an MIC value;
and obtaining the characteristic variable with the highest correlation degree with the dependent variable.
Further, the grey correlation coefficient is calculated by the formula:
$$\xi_i(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}$$
where ρ is the resolution coefficient, 0<ρ<1; the smaller ρ is, the larger the difference between the correlation coefficients and the stronger the distinguishing capability, and usually ρ = 0.5; |x0(k)-xi(k)| represents the absolute difference between the corresponding elements of each comparison sequence and the reference sequence; and $\min_i \min_k |x_0(k) - x_i(k)|$ and $\max_i \max_k |x_0(k) - x_i(k)|$ represent the two-level minimum difference and the two-level maximum difference, respectively.
Further, the weighted correlation degree is calculated by the formula:
Figure BDA0003458512130000024
where n is the total number of feature variables to be selected, and MIC(A, B_i) represents the maximum mutual information coefficient MIC between characteristic variable A and characteristic variable B_i.
Further, performing feature optimization according to the weighted relevance comprises the following steps:
sorting the calculated weighted association degrees from big to small;
sorting and screening the characteristic variables according to the weighted relevance;
calculating the weight of each sorted preferred feature;
when the preferred feature weight satisfies
Figure BDA0003458512130000031
stopping the screening to obtain the final characteristic variables;
wherein J_S is the sum of the weighted relevance of the characteristic variables, J_j is the weighted relevance of the j-th feature variable to be screened, ω_j is the j-th preferred feature weight, and α is a given threshold.
Further, the method comprises the following steps: analyzing and comparing the feature optimization-stepwise regression fusion result based on the weighted correlation degree with the BP neural network fusion result;
establishing a BP neural network fusion model, taking an independent variable as a system input variable and a dependent variable as a system output variable;
establishing a multi-input single-output BP neural network fusion model containing two hidden layers;
performing stage evaluation on a feature optimization-stepwise regression fusion and BP neural network data fusion model based on the weighted correlation degree by adopting two indexes of an improved tangent angle and a deformation rate;
and performing prediction comparison analysis with a long short-term memory (LSTM) artificial neural network on the feature optimization-stepwise regression fusion data based on the weighted correlation degree and on the BP neural network fusion data, respectively.
The embodiment of the invention provides a multi-source heterogeneous landslide data monitoring and fusing method, which has the following beneficial effects compared with the prior art:
1. MIC is used to measure the degree of correlation between two variables. Compared with other correlation analysis methods, MIC applies to both linear and nonlinear data, has generality, equitability and symmetry, and is highly accurate.
2. Combining the mutual information weight and the grey correlation degree, adopting the weighted correlation degree to measure the importance degree of the characteristic factors to the landslide deformation, calculating the optimal characteristic weight, screening out the characteristic factors according to a threshold value, and combining the characteristics of the mutual information and the grey correlation to carry out characteristic optimization, so that the characteristic optimization result is more reliable.
Drawings
FIG. 1 is a flow chart of a feature optimization based on weighted relevance in the fusion method of the present invention;
FIG. 2 is a diagram of an RNN model in the evaluation analysis of the present invention;
FIG. 3 is a schematic diagram of the hidden-layer cell structure of the RNN model in the evaluation analysis of the present invention;
FIG. 4 is a schematic diagram of the cell structure of the hidden layer of the LSTM model in the evaluation analysis of the present invention;
FIG. 5 is a distribution diagram of monitoring points in an experimental study area according to the present invention;
FIG. 6 shows the feature optimization-stepwise regression fusion result based on weighted relevance in the experimental part of the present invention;
FIG. 7 shows the fusion result of BP neural network model in the experimental part of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1 to 7, an embodiment of the invention provides a multi-source heterogeneous landslide data monitoring and fusion method, which includes:
S1, calculating the maximum mutual information coefficient. The MIC algorithm is implemented with a miniature class library in Python. It comprises four main steps: 1) given i and j, partitioning the scatter diagram formed by the two variables into a grid of i columns and j rows, and solving the maximum mutual information value; 2) normalizing the maximum mutual information value; 3) selecting the maximum normalized mutual information over the different scales as the MIC value; 4) obtaining the characteristic variable with the highest correlation with the dependent variable for subsequent regression prediction. MIC measures the degree of correlation between two variables; compared with other correlation analysis methods, it applies to both linear and nonlinear data, has generality, equitability and symmetry, and is highly accurate.
And S2, calculating the grey correlation degree. Determining a single-point displacement sequence capable of reflecting landslide deformation characteristics as a reference column, determining a data sequence composed of factors influencing landslide deformation as a comparison column, carrying out non-dimensionalization processing on the reference column and the comparison column, and then obtaining a gray correlation coefficient and a gray correlation degree of the reference column and the comparison column.
S3, feature optimization by weighted correlation degree. The mutual information value measures the influence of a feature on landslide deformation, the mutual information weight reflects the effectiveness of the feature, and the grey correlation degree quantifies the consistency between the feature and the landslide deformation. The mutual information weight is combined with the grey correlation degree, the resulting weighted correlation degree is used to measure the importance of each characteristic factor to landslide deformation, the preferred feature weight is calculated, and the characteristic factors are screened according to a threshold. Because the optimization draws on the properties of both mutual information and grey correlation, its result is more reliable.
S4, stepwise regression analysis. The characteristic factors are introduced into the model one by one. After each explanatory variable is introduced, an F test is performed, and the explanatory variables already selected are t-tested one by one; an explanatory variable introduced earlier is deleted when it is no longer significant after later variables have been introduced. This ensures that the regression equation contains only significant variables before each new variable is introduced. The process is repeated until no significant explanatory variable remains to be selected into the regression equation and no insignificant explanatory variable remains to be removed from it.
S5, evaluation and analysis. First, the stage discrimination of the fusion results is compared. To evaluate the reliability of the feature optimization-stepwise regression feature-level data fusion based on the weighted association degree, its fusion result and the BP neural network fusion result are subjected to stage discrimination analysis and compared. A multi-input single-output BP neural network fusion model with two hidden layers is established, taking the independent variables as the system input variables and the dependent variable as the system output variable. Two indexes, the improved tangent angle and the deformation rate, are used for stage evaluation of the two fusion models to demonstrate the effectiveness of the fusion result of the invention. Second, predictions based on the fusion results are compared. An LSTM (long short-term memory) artificial neural network is used for prediction comparison analysis on the single-point dependent variable data of the GNSS monitoring point, the feature optimization-stepwise regression fusion data based on the weighted association degree, and the BP neural network fusion data, respectively. The LSTM model is built with Keras in Python, and the prediction accuracy is evaluated with two indexes, MRE (mean relative error) and MAE (mean absolute error), to demonstrate the reliability of the fusion result of the invention.
Specifically, the method comprises the following steps:
S1, calculating the maximum mutual information coefficient. The multi-source heterogeneous sensors on the landslide body are correlated, and one sensor may be jointly influenced by several others, so correlation must be calculated among the multi-source heterogeneous landslide monitoring variables, and the characteristic factors with the greatest influence on landslide deformation are screened out for subsequent fusion prediction. Before MIC (maximum mutual information coefficient) is calculated, information entropy and mutual information must be introduced. Mutual information measures the degree of association between two random variables, and the average amount of information remaining after redundancy is eliminated is called "information entropy"; the characteristic factors influencing landslide deformation can therefore be screened by MI (mutual information) with little redundant information. MIC is more accurate than MI, is not limited to a particular function type, and gives the degree of correlation between variables. If two variables are correlated, their corresponding data points are distributed in a two-dimensional space. The data space is divided by an m × n grid, and the frequency of data points falling in the (x, y)-th cell is used as the estimate of P(x, y):
$$P(x, y) = \frac{n_{x,y}}{n}$$
where n_{x,y} is the number of data points falling in the (x, y)-th cell and n is the total number of data points; estimates of P(x) and P(y) are obtained in the same way. The mutual information between the random variables is then calculated. Because an m × n grid can be placed on the data points in more than one way, the partition that maximizes the mutual information is sought, and the mutual information value is mapped into the interval (0, 1) by a normalization factor. Finally, the grid resolution that maximizes the normalized mutual information gives the MIC value. The grid resolution is limited to m × n < B, where B = f(data_size) = n^{0.6}. MIC is calculated as
$$I(X;Y) = \sum_{x,y} P(x,y) \log_2 \frac{P(x,y)}{P(x)\,P(y)}$$
$$\mathrm{MIC}(X;Y) = \max_{m \times n < B} \frac{I(X;Y)}{\log_2 \min(m, n)}$$
The specific calculation steps are as follows: 1) Calculating the maximum mutual information value. Given i and j, the scatter diagram formed by the two variables X and Y is partitioned into a grid of i columns and j rows, and the maximum mutual information value is found. Because many different partitions exist for given i and j, the mutual information of each partition is calculated to find the partition with the largest mutual information. 2) Normalizing the maximum mutual information value. The maximum mutual information obtained is divided by log(min(i, j)) to normalize it. 3) Selecting the maximum normalized mutual information over the different scales as the MIC value. The features with greater influence on landslide deformation are then selected and those carrying less information are eliminated, so that the variables used for modeling are more representative.
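The three calculation steps can be sketched in plain Python. This is a simplified approximation, not the algorithm of any particular library: it only tries equal-width cells for each grid size, whereas a full MIC implementation also optimizes where the cell boundaries are placed, so this sketch generally underestimates the true MIC. The function names and the `exponent` parameter are hypothetical.

```python
import math

def grid_mutual_information(x, y, cols, rows):
    """Mutual information (in bits) of x and y on an equal-width cols x rows grid."""
    n = len(x)

    def bin_index(v, lo, hi, bins):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    x_lo, x_hi, y_lo, y_hi = min(x), max(x), min(y), max(y)
    joint = {}
    for xv, yv in zip(x, y):
        cell = (bin_index(xv, x_lo, x_hi, cols), bin_index(yv, y_lo, y_hi, rows))
        joint[cell] = joint.get(cell, 0) + 1
    px, py = {}, {}
    for (cx, cy), count in joint.items():
        px[cx] = px.get(cx, 0) + count
        py[cy] = py.get(cy, 0) + count
    # I(X;Y) = sum p(x,y) * log2(p(x,y) / (p(x) p(y))), with p = count / n.
    return sum(
        (c / n) * math.log2(c * n / (px[cx] * py[cy]))
        for (cx, cy), c in joint.items()
    )

def approx_mic(x, y, exponent=0.6):
    """Largest normalised grid mutual information over grids with cols * rows < n ** exponent."""
    budget = len(x) ** exponent
    best = 0.0
    for cols in range(2, int(budget) + 1):
        for rows in range(2, int(budget) + 1):
            if cols * rows >= budget:
                break
            mi = grid_mutual_information(x, y, cols, rows)
            best = max(best, mi / math.log2(min(cols, rows)))
    return best
```

A perfectly linear pair such as `approx_mic(list(range(100)), list(range(100)))` scores approximately 1, and a constant second variable scores 0.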
S2, calculating the grey correlation degree. Feature selection based on MIC results alone is not convincing; MIC analysis, which represents the degree of influence, can be combined with grey correlation analysis, which represents the degree of consistency, to obtain characteristic factors better suited to data fusion. The multi-source heterogeneous landslide monitoring characteristic sequences are non-dimensionalized, and the correlation coefficient and correlation degree are calculated as follows: 1) Determining the reference sequence and the comparison sequences. Let the reference sequence be Y = {Y(k) | k = 1, 2, …, n} and the comparison sequences be X_i = {X_i(k) | k = 1, 2, …, n}, i = 1, 2, …, m. 2) Non-dimensionalizing the variables. Because the characteristic factors have different dimensions and cannot be compared directly, they must be non-dimensionalized.
Figure BDA0003458512130000071
Figure BDA0003458512130000072
The non-dimensionalized data sequences form the following matrix:
Figure BDA0003458512130000073
Figure BDA0003458512130000074
3) Calculating one by one the absolute difference between the corresponding elements of each evaluated object's index sequence (comparison sequence) and the reference sequence, namely |x0(k)-xi(k)| (k = 1, …, m; i = 1, …, n), where n is the number of evaluated objects. 4) Determining the two-level minimum difference $\min_i \min_k |x_0(k) - x_i(k)|$ and the two-level maximum difference $\max_i \max_k |x_0(k) - x_i(k)|$. 5) Calculating the correlation coefficient by the formula:
$$\xi_i(k) = \frac{\min_i \min_k |x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}{|x_0(k) - x_i(k)| + \rho \max_i \max_k |x_0(k) - x_i(k)|}$$
where ρ is the resolution coefficient, 0<ρ<1; the smaller ρ is, the larger the difference between the correlation coefficients and the stronger the distinguishing capability, and usually ρ = 0.5. 6) Calculating the correlation degree. For each evaluated object (comparison sequence), the mean of the correlation coefficients between its indexes and the corresponding elements of the reference sequence is calculated to reflect the relationship between that object and the reference sequence, and is recorded as the correlation degree:
$$r_i = \frac{1}{m} \sum_{k=1}^{m} \xi_i(k)$$
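The grey correlation procedure (non-dimensionalization, absolute differences, two-level extremes, coefficients, and their mean) can be sketched as follows. The function name is hypothetical, and mean-value scaling is an assumed choice of non-dimensionalization because the text does not recover which scaling the patent uses; it requires every sequence to have a nonzero mean.

```python
def grey_relational_degrees(reference, comparisons, rho=0.5):
    """Return the grey correlation degree of each comparison sequence."""

    def normalise(seq):
        # Mean-value non-dimensionalisation (an assumption; initial-value
        # scaling is another common choice).
        mean = sum(seq) / len(seq)
        return [v / mean for v in seq]

    x0 = normalise(reference)
    xs = [normalise(seq) for seq in comparisons]
    # Absolute differences |x0(k) - xi(k)| for every sequence i and epoch k.
    diffs = [[abs(a - b) for a, b in zip(x0, xi)] for xi in xs]
    d_min = min(min(row) for row in diffs)  # two-level minimum difference
    d_max = max(max(row) for row in diffs)  # two-level maximum difference
    if d_max == 0:
        return [1.0 for _ in xs]  # every sequence coincides with the reference
    degrees = []
    for row in diffs:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        degrees.append(sum(coeffs) / len(coeffs))
    return degrees
```

With reference `[1, 2, 3]`, the proportional sequence `[2, 4, 6]` receives degree 1.0 and the reversed sequence `[3, 2, 1]` receives 5/9, which shows that the degree rewards a shared trend rather than shared magnitude.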
S3, calculating the weighted association degree. The maximum mutual information weight is combined with the grey correlation degree to obtain the weighted association degree of each feature. The greater the weighted association degree, the more important the feature. The calculation formula is:
Figure BDA00034585121300000711
where n is the total number of features to be selected.
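Since the formula image is not recoverable from the text, the following sketch assumes one plausible reading of "combining the mutual information weight with the grey correlation degree": each feature's MIC is normalized by the sum of all n MIC values to give a weight, which then multiplies that feature's grey correlation degree. Both this form and the function name are assumptions.

```python
def weighted_relevance(mic_values, grey_degrees):
    """Combine MIC weights with grey correlation degrees, feature by feature.

    Assumed form: J_i = (MIC_i / sum of all MIC values) * r_i.
    """
    total = sum(mic_values)
    return [(m / total) * r for m, r in zip(mic_values, grey_degrees)]
```

Under this reading, a feature needs both a strong influence (high MIC share) and a consistent trend (high grey correlation degree) to rank highly.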
S4, selecting preferred features. The calculated weighted association degrees are sorted from largest to smallest; the feature with the largest weighted association degree is added to the preferred set and removed from the set to be screened. Screening proceeds in this order, and the preferred feature weight is calculated at each step:
Figure BDA00034585121300000712
where J_S is the sum of the weighted association degrees of the features in the preferred set, J_j is the weighted association degree of the j-th feature to be screened, and ω_j is the j-th preferred feature weight; when ω_j is less than α, feature screening is considered finished.
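The screening loop can be sketched as below. The stopping formula is not recoverable from the text, so the ratio ω_j = J_j / J_S is an assumption inferred from the symbol definitions, and the function name, feature names, and default `alpha` are hypothetical.

```python
def select_features(weighted, alpha=0.1):
    """Greedy screening of features by descending weighted association degree.

    `weighted` maps feature name -> weighted association degree J.
    Assumed stopping rule: omega_j = J_j / J_S, where J_S is the sum of J
    over the features already selected; stop once omega_j < alpha.
    """
    ranked = sorted(weighted.items(), key=lambda item: item[1], reverse=True)
    selected, j_sum = [], 0.0
    for name, j_value in ranked:
        if selected and j_sum > 0 and j_value / j_sum < alpha:
            break
        selected.append(name)
        j_sum += j_value
    return selected
```

For instance, `select_features({"a": 0.5, "b": 0.3, "c": 0.02})` keeps `a` and `b` and stops at `c`, whose contribution relative to the selected set falls below the threshold.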
S5, stepwise regression analysis. The basic idea of stepwise regression is to introduce variables into the model one by one. After each explanatory variable is introduced, an F test is performed, and the explanatory variables already selected are t-tested one by one; an explanatory variable introduced earlier is deleted when it is no longer significant after later variables have been introduced. This ensures that the regression equation contains only significant variables before each new variable is introduced. The process iterates until no significant explanatory variable remains to be selected into the regression equation and no insignificant explanatory variable remains to be removed from it. The stepwise regression specifically comprises the following steps:
The first step: building the augmented matrix.
Calculate l_ij, l_iy, l_yy and r_ij, r_iy by the following formulas, respectively:
Figure BDA0003458512130000081
Figure BDA0003458512130000082
wherein
Figure BDA0003458512130000083
Figure BDA0003458512130000084
The augmented matrix is then obtained as
Figure BDA0003458512130000085
where R = (r_ij)_{m×m}, r_yy = 1, r_y = (r_1y, r_2y, …, r_my)'.
The second step: after the s-th elimination transformation, the result is
Figure BDA0003458512130000086
Figure BDA0003458512130000087
Wherein
Figure BDA0003458512130000088
Figure BDA0003458512130000089
The third step: eliminating factors.
1. Select j0 so that
Figure BDA0003458512130000091
2. Calculate
Figure BDA0003458512130000092
If F > F_out, execute the fourth step; otherwise, perform the (s+1)-th elimination transformation and then repeat the second and third steps.
The fourth step: introducing a regression factor. Let s, {j} and f be as defined in the second step.
1. Select k0 so that
Figure BDA0003458512130000093
2. Calculate
Figure BDA0003458512130000094
where
Figure BDA0003458512130000095
Figure BDA0003458512130000096
If F < F_in, execute the fifth step; otherwise, perform the (s+1)-th elimination transformation to introduce the k0-th regression factor, and then repeat the second, third and fourth steps.
The fifth step: no variable is introduced or removed. The final regression equation is $\hat{y} = \hat{b}_0 + \sum_{j \in \{j\}} \hat{b}_j x_j$, where
Figure BDA0003458512130000097
Figure BDA0003458512130000098
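The enter-and-remove logic of stepwise regression can be sketched in plain Python. This sketch refits an ordinary least-squares model for every candidate instead of applying elimination transformations to the correlation matrix, so it illustrates the same selection rule by a different computation. The thresholds `f_in` and `f_out` are hypothetical defaults, and the sketch assumes the candidate features are not collinear.

```python
def _solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def _fit(columns, y):
    """Least-squares fit with intercept; returns (coefficients, residual sum of squares)."""
    design = [[1.0] + [col[k] for col in columns] for k in range(len(y))]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yk for r, yk in zip(design, y)) for i in range(p)]
    beta = _solve(xtx, xty)
    rss = sum((yk - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yk in zip(design, y))
    return beta, rss

def stepwise(features, y, f_in=4.0, f_out=3.9):
    """Forward-backward stepwise regression; `features` maps name -> list of values."""
    n, chosen = len(y), []
    while True:
        changed = False
        _, rss_now = _fit([features[c] for c in chosen], y)
        # Entry: try each unused feature and keep the one with the largest partial F.
        best_name, best_f = None, f_in
        for name in features:
            if name in chosen:
                continue
            _, rss_new = _fit([features[c] for c in chosen + [name]], y)
            dof = n - len(chosen) - 2
            f_stat = (rss_now - rss_new) / (rss_new / dof) if rss_new > 0 else float("inf")
            if f_stat > best_f:
                best_name, best_f = name, f_stat
        if best_name is not None:
            chosen.append(best_name)
            changed = True
        # Removal: drop the chosen feature with the smallest partial F if it is below f_out.
        if len(chosen) > 1:
            _, rss_full = _fit([features[c] for c in chosen], y)
            dof = n - len(chosen) - 1
            worst_name, worst_f = None, f_out
            for name in chosen:
                rest = [c for c in chosen if c != name]
                _, rss_red = _fit([features[c] for c in rest], y)
                f_stat = (rss_red - rss_full) / (rss_full / dof) if rss_full > 0 else float("inf")
                if f_stat < worst_f:
                    worst_name, worst_f = name, f_stat
            if worst_name is not None:
                chosen.remove(worst_name)
                changed = True
        if not changed:
            beta, _ = _fit([features[c] for c in chosen], y)
            return chosen, beta
```

Keeping `f_in` slightly above `f_out` prevents a variable from being added and removed in an endless cycle. The returned `beta` holds the intercept followed by one coefficient per selected feature, matching the form of the final regression equation.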
And S6, evaluation and analysis.
First, the stage discrimination of the fusion results is compared. To evaluate the reliability of the feature optimization-stepwise regression feature-level data fusion based on weighted relevance, the fusion result of this model and the BP neural network fusion result are subjected to stage discrimination analysis and compared. A multi-input single-output BP neural network fusion model with two hidden layers is established, taking the independent variables as the system input variables and the dependent variable as the system output variable. The BP neural network topology comprises an input layer, hidden layers and an output layer, and its operation comprises a forward multi-layer feedforward stage and a backward error correction stage. The forward multi-layer feedforward stage calculates, layer by layer from the input layer, the actual input and output of each node, and its mathematical model is
Figure BDA0003458512130000099
where
Figure BDA00034585121300000910
is the output value of the i-th node of layer l;
Figure BDA00034585121300000911
is the activation value of the i-th node of layer l;
Figure BDA00034585121300000912
is the connection weight from the j-th node of layer l-1 to the i-th node of layer l;
Figure BDA00034585121300000913
is the threshold of the i-th node of layer l; and N_l is the number of nodes of layer l. To reduce the error of the output-layer neurons, a gradient descent algorithm is used for backward error propagation. The connection weights between the neurons of adjacent layers are adjusted by gradient descent so that the overall error changes in the decreasing direction. The algorithm formula is:
Figure BDA0003458512130000101
where η is the learning rate, and the weight adjustment formula is
Figure BDA0003458512130000102
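The forward feedforward stage and the gradient-descent weight correction can be illustrated with a small multi-input, single-output network with two hidden layers. This is a sketch under assumptions: the class name, the hidden-layer sizes, the sigmoid activation with a linear output node, the squared-error loss, and the bias terms standing in for thresholds are all illustrative choices that the text does not specify.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class BPNetwork:
    """Multi-input, single-output network with two sigmoid hidden layers."""

    def __init__(self, n_inputs, hidden=(4, 4), seed=0):
        rng = random.Random(seed)
        sizes = [n_inputs, *hidden, 1]
        # weights[l][i][j]: weight from node j of layer l to node i of layer l + 1.
        self.weights = [[[rng.uniform(-0.5, 0.5) for _ in range(sizes[l])]
                         for _ in range(sizes[l + 1])] for l in range(len(sizes) - 1)]
        self.biases = [[0.0] * sizes[l + 1] for l in range(len(sizes) - 1)]

    def forward(self, x):
        """Return the activations of every layer, input included."""
        acts = [list(x)]
        last = len(self.weights) - 1
        for l, (w, b) in enumerate(zip(self.weights, self.biases)):
            z = [sum(wij * a for wij, a in zip(row, acts[-1])) + bi for row, bi in zip(w, b)]
            acts.append(z if l == last else [sigmoid(v) for v in z])  # linear output node
        return acts

    def train_step(self, x, target, eta=0.1):
        """One gradient-descent update on the squared error of a single sample."""
        acts = self.forward(x)
        delta = [acts[-1][0] - target]  # dE/dz at the output node
        for l in range(len(self.weights) - 1, -1, -1):
            prev = acts[l]
            # Error signal for the layer below, computed before the weights change.
            if l > 0:
                below = [sum(self.weights[l][i][j] * delta[i] for i in range(len(delta)))
                         * prev[j] * (1.0 - prev[j]) for j in range(len(prev))]
            for i, d in enumerate(delta):
                for j, a in enumerate(prev):
                    self.weights[l][i][j] -= eta * d * a
                self.biases[l][i] -= eta * d
            if l > 0:
                delta = below
        return 0.5 * (acts[-1][0] - target) ** 2
```

Calling `train_step` repeatedly on the same sample drives its error toward zero, which mirrors the statement that the weights are adjusted so the overall error moves in the decreasing direction.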
The feature optimization-stepwise regression fusion model based on the weighted correlation degree and the BP neural network data fusion model are evaluated by stage using two indexes, the improved tangent angle (see the cited work on the improved tangent angle and the corresponding landslide early warning criterion) and the deformation rate. The comprehensive early warning criteria based on the deformation rate threshold and the deformation process are shown in Table 2.
TABLE 2 comprehensive early warning criteria based on deformation rate threshold and deformation process
Figure BDA0003458512130000103
Second, predictions based on the fusion results are compared. Feature-level fusion effectively analyzes and processes the multi-source heterogeneous information obtained by landslide monitoring, thereby improving prediction accuracy. To examine how effectively feature-level fusion improves the accuracy of landslide prediction and forecasting, an LSTM (long short-term memory) artificial neural network is used for prediction comparison analysis on the single-point dependent variable data of the GNSS monitoring point, the feature optimization-stepwise regression fusion data based on the weighted correlation degree, and the BP neural network fusion data. The LSTM model is built with Keras in Python. The LSTM neural network is an improvement on the ordinary recurrent neural network (RNN): it replaces the RNN cells of the hidden layer with LSTM cells, which effectively alleviates the rapid vanishing of gradients during back propagation, gives the model long-term memory, and allows long time series to be processed. Compared with the RNN model, the LSTM unit has 3 gating switches inside, as shown in fig. 3, where i is the input gate, f is the forget gate, c is the cell state, o is the output gate, and σ and tanh are the Sigmoid and hyperbolic tangent activation functions, respectively.
The forget gate takes h_{t-1} and x_t as input and uses a Sigmoid unit to output a vector with entries between 0 and 1; each entry indicates how much of the corresponding information in the cell state c_{t-1} is retained or discarded, where 0 means nothing is retained and 1 means everything is retained: f_t = σ(W_f·[h_{t-1}, x_t] + b_f). The input gate is used to update the cell state. The previous hidden state and the current input are passed to a Sigmoid function, whose output between 0 and 1 decides which information to update, with 0 indicating unimportant and 1 indicating important. The hidden state and the current input are also passed to a tanh function, which compresses the values to between -1 and 1 to regulate the network; the tanh output is then multiplied by the Sigmoid output, which determines which information in the tanh output is important and must be kept: i_t = σ(W_i·[h_{t-1}, x_t] + b_i), C̃_t = tanh(W_C·[h_{t-1}, x_t] + b_C). The output gate controls the value of the next hidden state, which can be used for prediction. The previous hidden state and the current input are first passed to a Sigmoid function, and the newly obtained cell state is passed to a tanh function; the tanh output is multiplied by the Sigmoid output to give the new hidden state, which is output as the value of the current unit; finally, the new cell state and hidden state are carried to the next time step: o_t = σ(W_o·[h_{t-1}, x_t] + b_o), h_t = o_t * tanh(C_t). The LSTM model is trained with the classical back-propagation algorithm in 4 steps: (1) the output values of the LSTM cells are calculated by the forward calculation method,
Figure BDA0003458512130000111
the first derivative and the second derivative of the loss function l () are respectively, and the final obtained objective function is:
Figure BDA0003458512130000112
Figure BDA0003458512130000113
where W and b are the weight coefficient matrix and the bias term, respectively. (2) The error term of each LSTM cell is calculated backward, in two back-propagation directions: through time and through the network layers. (3) The gradient of each weight is calculated from the corresponding error term. (4) A gradient-based optimization algorithm is applied to update the weights.
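The gate equations of one LSTM time step can be written out directly. This sketch covers only the forward calculation of step (1), not training. The function names and the `params` layout are hypothetical, and the cell-state update C_t = f_t * C_{t-1} + i_t * C̃_t is the standard LSTM rule, which the surrounding text describes in words but does not write as a formula.

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def _affine(weights, bias, vector):
    """Return W @ vector + b for a weight matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; `params` maps gate name -> (W, b) acting on [h_prev, x_t]."""
    concat = list(h_prev) + list(x_t)
    f_t = [_sigmoid(z) for z in _affine(*params["f"], concat)]     # forget gate
    i_t = [_sigmoid(z) for z in _affine(*params["i"], concat)]     # input gate
    c_hat = [math.tanh(z) for z in _affine(*params["c"], concat)]  # candidate state
    o_t = [_sigmoid(z) for z in _affine(*params["o"], concat)]     # output gate
    # Standard cell-state update: keep part of the old state, add the gated candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate outputs 0.5 and the candidate state is 0, so the cell state is simply halved at each step; this is a convenient sanity check before loading trained weights.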
Two indices, MRE (mean relative error) and MAE (mean absolute error), are used to evaluate the prediction accuracy.
Figure BDA0003458512130000114
Both indices measure the distance between the predicted value and the actual value and thus quantify the prediction accuracy.
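As a hedged illustration of the two indices, the sketch below uses the usual textbook definitions: MAE as the mean of the absolute errors and MRE as the mean of the absolute errors divided by the actual values. The exact formulas of the patent are in the figure above and may differ in detail; the function names are hypothetical.

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error, in the same unit as the data (e.g. mm)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - actual)))

def mre(actual, predicted):
    """Mean relative error as a fraction; multiply by 100 for percent.
    Assumes no actual value is zero."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - actual) / np.abs(actual)))

# Usage with made-up displacement values (mm), not data from the experiment.
print(mae([100.0, 200.0], [110.0, 190.0]))
print(mre([100.0, 200.0], [110.0, 190.0]))
```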
Example:
The experimental data are landslide monitoring data for landslide body 7# in Chang Jing county, Hei, Gansu province, from March 28 to October 4, 2019, sampled daily for 191 d in total. They comprise two groups of GNSS monitoring data (HF06 and HF07), three groups of displacement meter monitoring data (DCF11, DCF14 and DCF15), and 3 kinds of meteorological data: humidity, temperature and rainfall. The landslide body failed on October 5, 2019, and all 5 groups of monitoring equipment recorded the displacement changes of the landslide deformation, so the multi-source heterogeneous sensor data can be used for data fusion and accuracy assessment. The distribution of the monitoring points in the experimental area is shown in figure 5.
First, the multi-sensor variables and environmental factors are preprocessed, including outlier removal, missing-value completion, and data smoothing and denoising. The maximal information coefficient (MIC) is then computed on the preprocessed data to determine the mutual information weights and to identify the characteristic factors with the largest influence on landslide deformation. Next, the gray correlation degree is calculated, the weighted correlation degree is obtained with the weighted correlation formula, and the final characteristic factors are determined by calculating the feature preference weights. Table 3 shows the MIC weight, gray correlation degree and weighted correlation degree between GNSS monitoring point HF06 and the other GNSS monitoring data, displacement meter monitoring data, rainfall, temperature and humidity. The higher the weighted correlation degree, the greater the feature's influence on landslide deformation and the more closely it follows the deformation trend.
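The gray correlation step described above can be sketched as follows. The gray relational coefficient uses the common form with resolution coefficient ρ = 0.5 and two-level minimum and maximum differences. Several points are assumptions rather than statements of the patent: the min-max normalization of each sequence, the function names, and in particular the weighting rule in `weighted_correlation` (MIC values normalized to sum to 1 and multiplied by the gray correlation degree), since the actual weighted correlation formula is given only in a figure. MIC values are taken as precomputed inputs.

```python
import numpy as np

def _minmax(a):
    """Scale each sequence to [0, 1] along its last axis (assumed preprocessing)."""
    lo = a.min(axis=-1, keepdims=True)
    span = a.max(axis=-1, keepdims=True) - lo
    return (a - lo) / np.where(span == 0, 1.0, span)

def gray_relational_degree(reference, comparisons, rho=0.5):
    """Gray correlation degree of each comparison sequence with the reference."""
    x0 = _minmax(np.asarray(reference, dtype=float))      # shape (n,)
    xs = _minmax(np.asarray(comparisons, dtype=float))    # shape (m, n)
    diff = np.abs(xs - x0)                                # |x0(k) - xi(k)|
    d_min, d_max = diff.min(), diff.max()                 # two-level min / max
    if d_max == 0:                                        # all sequences identical
        return np.ones(xs.shape[0])
    coeff = (d_min + rho * d_max) / (diff + rho * d_max)  # gray correlation coefficient
    return coeff.mean(axis=1)                             # average over time steps

def weighted_correlation(mic_values, gray_degrees):
    """ASSUMED rule: normalized MIC weight times gray correlation degree."""
    mic = np.asarray(mic_values, dtype=float)
    return (mic / mic.sum()) * np.asarray(gray_degrees, dtype=float)

# Usage with toy sequences (not the monitoring data of the experiment).
ref = [1.0, 2.0, 3.0, 4.0]
cmp_seqs = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
degrees = gray_relational_degree(ref, cmp_seqs)
scores = weighted_correlation([0.8, 0.2], degrees)
```

A sequence that moves with the reference scores close to 1, while one that moves against it scores lower, which is the "common change trend" that the gray correlation is meant to capture.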
Table 3 Weighted correlation calculation results
Figure BDA0003458512130000121
Ranking by the weighted correlation method that combines MIC and gray correlation gives: GNSS monitoring point HF07, displacement meter DCF11, displacement meter DCF14, displacement meter DCF15, accumulated rainfall in the previous 48 hours, humidity, and rainfall. That is, the displacement meter monitoring data, the GNSS monitoring data, the accumulated rainfall in the previous 48 hours, the temperature and the humidity have a large influence on the landslide. Table 4 shows the feature preference weights calculated from the weighted correlation degrees; feature preference is performed according to these values.
Table 4 Feature preference results
Figure BDA0003458512130000131
After the weighted correlation degrees are ranked and the feature preference weights calculated, a threshold α = 0.1 is selected: when a feature weight is less than 0.1, the influence of that feature on landslide deformation is considered negligible and the selection of characteristic factors is complete. Stepwise regression fitting analysis is then performed on the factors influencing landslide deformation that were obtained by the weighted correlation preference. Based on the analysis result, the GNSS monitoring point HF07 data, the displacement meter monitoring data, the previous 48-hour accumulated rainfall data, the temperature and the humidity are taken as independent variables and the GNSS monitoring point HF06 data as the dependent variable. Stepwise regression yields the corresponding regression coefficients, from which the final regression equation and the fusion result are calculated. In the stepwise regression analysis, the best model is chosen by comparing the correlation coefficient, the residual variance, the F value, the significance and so on across the candidate models. The regression coefficients obtained are shown in table 5.
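The threshold rule can be sketched as below. The weight is assumed to be ω_j = J_j / J_S, the weighted correlation degree of feature j divided by the sum over all candidate features; this reading follows the symbol definitions in the claims, but the formula itself is in a figure, so it should be treated as an assumption. The function name and the sample numbers are hypothetical and are not the values of Tables 3 and 4.

```python
def select_features(weighted_corr, alpha=0.1):
    """Rank features by weighted correlation degree and keep them until the
    preference weight J_j / J_S (assumed form) drops below the threshold alpha."""
    total = sum(weighted_corr.values())                  # J_S
    ranked = sorted(weighted_corr.items(), key=lambda kv: kv[1], reverse=True)
    selected = []
    for name, j in ranked:
        weight = j / total                               # omega_j
        if weight < alpha:                               # influence negligible: stop
            break
        selected.append((name, weight))
    return selected

# Usage with made-up weighted correlation degrees.
candidates = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
kept = select_features(candidates, alpha=0.1)
print([name for name, _ in kept])
```

Because the features are sorted before the loop, stopping at the first weight below α discards that feature and every lower-ranked one.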
Table 5 Regression coefficients
Figure BDA0003458512130000132
The resulting stepwise regression model for the landslide surface displacement is: landslide surface displacement = (GNSS monitoring point HF07 data × 0.587) + (displacement meter DCF11 data × 0.036) + (displacement meter DCF14 data × 0.519) - (displacement meter DCF15 data × 0.159) + (temperature × 0.028) + (humidity × 0.026) - (previous 48-hour accumulated rainfall × 0.010). This gives the feature-level fusion result of feature preference-stepwise regression based on the weighted correlation degree, as shown in fig. 6.
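Applying the regression expression to one daily sample is a plain weighted sum, sketched below. The coefficients are the ones stated above; the dictionary keys, the function name and the sample values are hypothetical, and no intercept is used because none is stated. Units and any normalization of the inputs are not specified in the text, so the sample is purely illustrative.

```python
# Coefficients from the stepwise regression expression; key names are assumed.
COEFFS = {
    "HF07": 0.587,
    "DCF11": 0.036,
    "DCF14": 0.519,
    "DCF15": -0.159,
    "temperature": 0.028,
    "humidity": 0.026,
    "rain_48h": -0.010,
}

def fused_displacement(sample):
    """Fused landslide surface displacement for one sample of the 7 variables."""
    return sum(COEFFS[name] * sample[name] for name in COEFFS)

# Usage with an illustrative sample in which every variable equals 1.0.
sample = {name: 1.0 for name in COEFFS}
print(fused_displacement(sample))
```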
The feature preference-stepwise regression fusion result based on the weighted correlation degree is then compared with a BP neural network fusion result by stage discrimination analysis. First, a BP neural network fusion model is established: the GNSS monitoring point HF07 data, displacement meter monitoring data, previous 48-hour accumulated rainfall, temperature, humidity and so on are taken as input data of the BP neural network model, the GNSS monitoring point HF06 data are taken as the expected output, and, with reference to the MIC analysis result, a multi-input single-output BP neural network fusion model with two hidden layers is built. The BP neural network feature-level fusion result obtained by experimental analysis is shown in fig. 7. According to the literature, a landslide is in the middle acceleration stage when the tangent angle is larger than 80 degrees. In this experiment, the two fusion results are compared only with respect to the tangent angle before the landslide; the tangent angle comparison is shown in table 6 and the deformation rate comparison in table 7.
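The two stage-discrimination indices can be sketched as follows. The patent does not spell out the improved tangent angle formula in this passage, so the code assumes the definition commonly used in landslide early-warning literature: cumulative displacement S is divided by the velocity v of the uniform deformation stage to give a quantity T with the dimension of time, and the angle is arctan(ΔT / Δt). The function names and the sample numbers are hypothetical.

```python
import numpy as np

def deformation_rate(times, displacement):
    """Displacement increment per unit time between consecutive samples."""
    t = np.asarray(times, dtype=float)
    s = np.asarray(displacement, dtype=float)
    return np.diff(s) / np.diff(t)

def improved_tangent_angle(times, displacement, v_uniform):
    """Improved tangent angle in degrees (assumed definition):
    T = S / v_uniform, angle_i = arctan((T_i - T_{i-1}) / (t_i - t_{i-1}))."""
    rate = deformation_rate(times, displacement)
    return np.degrees(np.arctan(rate / v_uniform))

# Usage: a slope moving at the uniform-stage velocity gives 45 degrees,
# and one moving 10 times faster exceeds the 80-degree threshold.
days = [0.0, 1.0, 2.0, 3.0]
steady = improved_tangent_angle(days, [0.0, 2.0, 4.0, 6.0], v_uniform=2.0)
fast = improved_tangent_angle(days, [0.0, 20.0, 40.0, 60.0], v_uniform=2.0)
```

Dividing by the uniform-stage velocity makes the angle independent of the displacement unit, which is why the 80-degree threshold can be compared across fusion results.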
Table 6 Improved tangent angle analysis results of the two fusion results
Figure BDA0003458512130000141
Table 7 Deformation rate analysis results of the two fusion results
Figure BDA0003458512130000142
Comparing the two stage-discrimination indices shows that the improved tangent angle obtained from the feature preference-stepwise regression fusion based on the weighted correlation degree is closer to the moment of landslide instability and, when used for landslide stage discrimination, better fits the true development state of the landslide. The feature preference-stepwise regression fusion result based on the weighted correlation degree is therefore more reliable and accurate in landslide stage discrimination analysis, and the fusion result is better.
Next, an LSTM network is used to predict the landslide trend separately from the GNSS monitoring point HF06 data, the feature preference-stepwise regression fusion data based on the weighted correlation degree, and the BP neural network fusion data. The accuracy of the prediction results is compared using the two evaluation indices MRE and MAE.
Table 8 Comparison of prediction accuracy of the two fusion results
Figure BDA0003458512130000151
As can be seen from Table 8, prediction from the feature preference-stepwise regression fusion result based on the weighted correlation degree gives an MAE of 9.9 mm and an MRE of 3.46%, prediction from the BP neural network fusion result gives an MAE of 15.1 mm and an MRE of 4.33%, and prediction from the GNSS monitoring point HF06 data gives an MAE of 19.9 mm and an MRE of 4.51%. The prediction accuracy of the feature preference-stepwise regression fusion result based on the weighted correlation degree is therefore the highest, which shows that this feature-level fusion result is more accurate and reliable.
Although the embodiments of the present invention have been disclosed in the foregoing for illustrative purposes, those skilled in the art will appreciate that various modifications, additions and substitutions are possible, without departing from the scope and spirit of the invention as disclosed in the accompanying drawings.

Claims (8)

1. A multi-source heterogeneous landslide data monitoring and fusion method is characterized by comprising the following steps:
acquiring multi-source heterogeneous monitoring variable data;
dividing multi-source heterogeneous monitoring variables into dependent variables and characteristic variables;
calculating the maximum mutual information coefficient MIC of the dependent variable and the characteristic variable, and screening out the characteristic variable which has the maximum influence on landslide deformation;
determining a single-point displacement sequence reflecting landslide deformation characteristics as a reference sequence, and determining data sequences consisting of factors influencing landslide deformation as comparison sequences;
calculating a gray correlation coefficient and a gray correlation degree between the reference sequence and the comparison sequences;
calculating a weighted correlation degree according to the maximum mutual information coefficient MIC and the grey correlation degree;
performing characteristic optimization according to the weighted relevance degree, and screening out final characteristic variables;
performing stepwise regression fitting analysis on the preferably obtained characteristic variables;
constructing a feature optimization-stepwise regression feature level data fusion model based on the weighted relevance;
and performing multi-source heterogeneous information fusion by using a feature optimization-stepwise regression feature level data fusion model based on the weighted relevance, and providing effective auxiliary information for landslide prediction.
2. The multi-source heterogeneous landslide data monitoring fusion method of claim 1, further comprising preprocessing multi-source heterogeneous monitoring variable data:
removing outliers, filling in missing values, and smoothing and denoising the data.
3. The multi-source heterogeneous landslide data monitoring and fusion method of claim 1, wherein the step of calculating the maximum mutual information coefficient MIC of the dependent variable and the characteristic variable comprises:
given variables i and j, carrying out i-column and j-row meshing on a scatter diagram formed by the two variables, and solving the maximum mutual information value;
carrying out normalization processing on the maximum mutual information value;
selecting the maximum value of mutual information under different scales as an MIC value;
and obtaining the characteristic variable with the highest correlation degree with the dependent variable.
4. The multi-source heterogeneous landslide data monitoring and fusing method of claim 1, wherein the grey correlation coefficient calculation formula comprises:
Figure FDA0003458512120000021
where ρ is a resolution coefficient, 0 < ρ < 1; the smaller ρ is, the larger the difference between correlation coefficients and the stronger the distinguishing capability, and usually ρ = 0.5; $|x_0(k) - x_i(k)|$ represents the absolute difference between the corresponding elements of each comparison sequence and the reference sequence,
Figure FDA0003458512120000022
and
Figure FDA0003458512120000023
representing the two-level minimum difference and the two-level maximum difference, respectively.
5. The multi-source heterogeneous landslide data monitoring and fusion method of claim 1, wherein the weighted correlation, the calculation formula comprises:
Figure FDA0003458512120000024
where n is the total number of characteristic variables to be selected, and $MIC(A, B_i)$ represents the maximum mutual information coefficient MIC between a characteristic variable A and a characteristic variable $B_i$.
6. The multi-source heterogeneous landslide data monitoring and fusion method of claim 5, wherein the feature optimization step comprises:
sorting the calculated weighted association degrees from big to small;
sorting and screening the characteristic variables according to the weighted relevance;
calculating the weight of each sorted preferred feature;
when the preferred feature weight satisfies
Figure FDA0003458512120000025
the screening of characteristic variables is stopped, and the final characteristic variables are obtained;
wherein $J_S$ is the sum of the weighted correlation degrees of the characteristic variables, $J_j$ is the weighted correlation degree of the jth characteristic variable to be screened, $\omega_j$ is the jth preferred feature weight, and α is a given threshold.
7. The multi-source heterogeneous landslide data monitoring fusion method of claim 1, further comprising: analyzing and comparing the feature optimization-stepwise regression fusion result based on the weighted correlation degree with the BP neural network fusion result, wherein the method comprises the following steps:
establishing a BP neural network fusion model, taking an independent variable as a system input variable and a dependent variable as a system output variable;
establishing a multi-input single-output BP neural network fusion model containing two hidden layers;
and performing stage evaluation on the feature optimization-stepwise regression fusion and BP neural network data fusion model based on the weighted association degree by adopting two indexes of an improved tangent angle and a deformation rate.
8. The multi-source heterogeneous landslide data monitoring fusion method of claim 7 further comprising:
performing prediction comparison analysis with a long short-term memory (LSTM) neural network on the feature optimization-stepwise regression fusion data based on the weighted correlation degree and on the BP neural network fusion data, respectively.
CN202210013094.0A 2022-01-06 2022-01-06 Multi-source heterogeneous landslide data monitoring and fusing method Active CN114358192B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210013094.0A CN114358192B (en) 2022-01-06 2022-01-06 Multi-source heterogeneous landslide data monitoring and fusing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210013094.0A CN114358192B (en) 2022-01-06 2022-01-06 Multi-source heterogeneous landslide data monitoring and fusing method

Publications (2)

Publication Number Publication Date
CN114358192A true CN114358192A (en) 2022-04-15
CN114358192B CN114358192B (en) 2022-11-25

Family

ID=81106472

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210013094.0A Active CN114358192B (en) 2022-01-06 2022-01-06 Multi-source heterogeneous landslide data monitoring and fusing method

Country Status (1)

Country Link
CN (1) CN114358192B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115238850A (en) * 2022-06-30 2022-10-25 西南交通大学 Mountain slope displacement prediction method based on MI-GRA and improved PSO-LSTM
CN115928810A (en) * 2022-11-09 2023-04-07 中国十七冶集团有限公司 Foundation pit intelligent monitoring method based on multi-sensor data fusion
CN116975008A (en) * 2023-09-22 2023-10-31 青岛海联智信息科技有限公司 Ship meteorological monitoring data optimal storage method
CN117490675A (en) * 2024-01-03 2024-02-02 西北工业大学 High-precision anti-interference control method for array MEMS gyroscope

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112270400A (en) * 2020-10-16 2021-01-26 西安工程大学 Landslide displacement dynamic prediction method based on multiple influence factors
CN112488395A (en) * 2020-12-01 2021-03-12 湖南大学 Power distribution network line loss prediction method and system
CN113507118A (en) * 2021-07-11 2021-10-15 湘潭大学 Wind power prediction method and system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ANG LI ET AL.: "Intelligent Transportation Application and", 《IEEE SENSORS JOURNAL》 *
MONICA BONETTO ET AL.: "I-Maculaweb: A Tool to Support Data", 《IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE》 *
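Li Chuanzhen et al.: "Feed system of a vertical machining center based on an improved ordered clustering method", 《Chinese Journal of Engineering Design》 *
Wang Liping et al.: "Application of the MIC-PCA coupling algorithm in the screening of runoff forecast factors", 《China Rural Water and Hydropower》 *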

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115238850A (en) * 2022-06-30 2022-10-25 西南交通大学 Mountain slope displacement prediction method based on MI-GRA and improved PSO-LSTM
CN115928810A (en) * 2022-11-09 2023-04-07 中国十七冶集团有限公司 Foundation pit intelligent monitoring method based on multi-sensor data fusion
CN116975008A (en) * 2023-09-22 2023-10-31 青岛海联智信息科技有限公司 Ship meteorological monitoring data optimal storage method
CN116975008B (en) * 2023-09-22 2023-12-15 青岛海联智信息科技有限公司 Ship meteorological monitoring data optimal storage method
CN117490675A (en) * 2024-01-03 2024-02-02 西北工业大学 High-precision anti-interference control method for array MEMS gyroscope
CN117490675B (en) * 2024-01-03 2024-03-15 西北工业大学 High-precision anti-interference control method for array MEMS gyroscope

Also Published As

Publication number Publication date
CN114358192B (en) 2022-11-25

Similar Documents

Publication Publication Date Title
CN114358192B (en) Multi-source heterogeneous landslide data monitoring and fusing method
CN108231201B (en) Construction method, system and application method of disease data analysis processing model
CN112446419B (en) Attention mechanism-based space-time neural network radar echo extrapolation prediction method
CN114757309B (en) Multi-physical-field monitoring data collaborative fusion engineering disaster early warning method and system
CN108596327B (en) Seismic velocity spectrum artificial intelligence picking method based on deep learning
CN107463993B (en) Medium-and-long-term runoff forecasting method based on mutual information-kernel principal component analysis-Elman network
Al-Zwainy et al. Development of the construction productivity estimation model using artificial neural network for finishing works for floors with marble
CN114547974A (en) Dynamic soft measurement modeling method based on input variable selection and LSTM neural network
CN115238850A (en) Mountain slope displacement prediction method based on MI-GRA and improved PSO-LSTM
CN112904756A (en) Pipe network big data detection system
CN112990585A (en) Hen laying rate prediction method based on LSTM-Kalman model
CN114911185A (en) Security big data Internet of things intelligent system based on cloud platform and mobile terminal App
CN111723990B (en) Shared bicycle flow prediction method based on bidirectional long-short term memory neural network
CN109635346B (en) Reliability analysis method of mechanical connection structure
CN113688770A (en) Long-term wind pressure missing data completion method and device for high-rise building
CN115062764B (en) Intelligent illuminance adjustment and environmental parameter Internet of things big data system
Boussabaine et al. Modelling cost‐flow forecasting for water pipeline projects using neural networks
CN112184037B (en) Multi-modal process fault detection method based on weighted SVDD
CN115359197A (en) Geological curved surface reconstruction method based on spatial autocorrelation neural network
CN115510948A (en) Block chain fishing detection method based on robust graph classification
CN114065335A (en) Building energy consumption prediction method based on multi-scale convolution cyclic neural network
CN114066036A (en) Cost prediction method and device based on self-correction fusion model
Stupen et al. Crop Yielding Capacity Modeling using Artificial Neural Networks
Huang The prediction of the earthquake based on neutral networks
Vadi et al. Artificial neural networks (ANNS) for prediction of California bearing ratio of soils

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant