CN114818500A

CN114818500A - Method for predicting soil bin pressure based on LSTM algorithm

Info

Publication number: CN114818500A
Application number: CN202210494436.5A
Authority: CN
Inventors: 凌静秀; 成晓元; 吴勉; 邵家诚
Original assignee: Fujian University of Technology
Current assignee: Fujian University of Technology
Priority date: 2022-05-07
Filing date: 2022-05-07
Publication date: 2022-07-29

Abstract

The invention discloses a method for predicting soil bin pressure based on an LSTM algorithm. Real-time tunneling factor data are obtained from the data acquisition system of a shield tunneling machine; Spearman correlation analysis is carried out between each tunneling factor and the soil bin pressure data, and the factors that correlate with the soil bin pressure are screened out in descending order of the absolute value of the Spearman correlation coefficient. The screened factor data are randomly divided into a training set and a test set. An LSTM model is trained with the training set as input; the optimal model parameters are obtained by stochastic gradient descent and stored as a PTH file. The test set is then used as input, the stored optimal model parameters are loaded to obtain prediction data, and the prediction performance is evaluated. Finally, the LSTM prediction model predicts the soil bin pressure of the earth pressure balance shield machine. The method addresses the overfitting problem in neural network prediction and improves prediction accuracy.

Description

Method for predicting soil bin pressure based on LSTM algorithm

Technical Field

The invention relates to the technical field of engineering tunneling, in particular to a method for predicting soil bin pressure based on an LSTM algorithm.

Background

With the rapid development of China's economy in recent years, the earth pressure balance shield machine has become one of the principal pieces of construction equipment for subways and tunnels. A reasonable reference value for the soil bin pressure is an important guarantee of construction safety. Predicting future changes in soil bin pressure is therefore of great significance for improving the construction safety and efficiency of the shield tunneling machine.

In existing techniques that predict soil bin pressure with a neural network, the number of hidden layers and hidden neurons is increased in pursuit of accuracy, so the prediction model becomes too complex and overfits. Meanwhile, the tunneling process of an earth pressure balance shield machine involves numerous tunneling factors that are clearly correlated with one another, and the parameter data at one moment influence the parameters at the next; that is, shield tunneling data form a time series.

The prior art includes a method for predicting the earth pressure of a shield machine based on the XGBoost algorithm (application number CN202011296026.7). In that method, feature variables with little correlation to earth pressure changes are screened out by the XGBoost algorithm, and the variables related to earth pressure changes are selected as feature vectors; features are extracted from the raw sensor data, and a shift transformation is applied to divide the data into a training set and a validation set; the training set is fed in to obtain an initial XGBoost regression model, and the optimal model parameters are found by grid search; the validation samples are input into the parameter-optimized XGBoost regression model to obtain an optimal XGBoost soil pressure regression model; and that model is used to calculate the soil pressure in a future period. The method predicts changes in soil bin pressure during shield construction, provides technical support for early warning of abnormal soil pressure and for resolving potential construction safety hazards, and improves the safety of shield construction. However, the XGBoost algorithm has many parameters that are complicated to tune, and its training set must cover the full range of the data: once features that never appeared in the training set appear in the test set, the model degrades and prediction accuracy falls.

Disclosure of Invention

The invention aims to provide a method for predicting the pressure of a soil bin based on an LSTM algorithm.

The technical scheme adopted by the invention is as follows:

A method for predicting soil bin pressure based on an LSTM algorithm comprises the following steps:

Step 1: real-time tunneling factor data are obtained from the data acquisition system of the shield tunneling machine; Spearman correlation analysis is carried out between each tunneling factor and the soil bin pressure data; and the factors that correlate with the soil bin pressure are screened out in descending order of the absolute value of the Spearman correlation coefficient;

Step 2: the screened factor data are randomly divided into a training set and a test set;

Step 3: an LSTM training model is obtained with the training set data as input; overfitting is prevented with the Dropout and Early Stopping methods; the optimal model parameters are obtained by stochastic gradient descent and stored as a PTH file;

Step 4: the test set data are used as input, the stored optimal model parameters are loaded to obtain prediction data, and the prediction performance is evaluated;

Step 5: the soil bin pressure of the earth pressure balance shield machine is predicted by the LSTM prediction model: the factors screened out in step 1 are used as the input of the LSTM training model, the model file stored in step 3 is loaded, and the soil bin pressure value at a future time is calculated.

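Further, the Spearman correlation coefficient in step 1 is calculated as:

$$ r = \frac{\sum_{i=1}^{n}(R_i-\bar{R})(S_i-\bar{S})}{\sqrt{\sum_{i=1}^{n}(R_i-\bar{R})^2 \, \sum_{i=1}^{n}(S_i-\bar{S})^2}} $$

where $R_i$ and $S_i$ are the ranks of observation $i$ for the variables $x$ and $y$, respectively; $\bar{R}$ and $\bar{S}$ are the mean ranks of the variables $x$ and $y$, respectively; and $n$ is the total number of observations.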

Specifically, the Spearman correlation coefficient $r$ lies between -1 and 1: values between -1 and 0 indicate negative correlation, and values between 0 and 1 indicate positive correlation. The larger the absolute value $|r|$, the stronger the correlation between the two factors. On this basis, the factors correlated with the soil bin pressure in the invention are, in order: total thrust, screw conveyor rotation speed, advance speed, cutterhead rotation speed, cutterhead torque, group B thrust pressure, foam system average pressure, front sealed chamber shield tail seal average pressure, rear sealed chamber shield tail seal average pressure, screw conveyor oil temperature, group A articulation cylinder stroke, and group C articulation cylinder stroke.
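The screening in step 1 can be illustrated with a minimal, dependency-free sketch. It computes the Spearman coefficient as the Pearson correlation of average ranks and keeps the factors sorted by $|r|$. The function names, the toy readings, and the cut-off `threshold` are assumptions made for illustration; the source does not state a numeric cut-off.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the ranks of x and y."""
    r, s = average_ranks(x), average_ranks(y)
    n = len(r)
    r_mean, s_mean = sum(r) / n, sum(s) / n
    cov = sum((a - r_mean) * (b - s_mean) for a, b in zip(r, s))
    var_r = sum((a - r_mean) ** 2 for a in r)
    var_s = sum((b - s_mean) ** 2 for b in s)
    return cov / sqrt(var_r * var_s)

def screen_factors(factors, pressure, threshold):
    """Keep factors with |r| >= threshold (assumed cut-off), sorted by |r| descending."""
    scored = [(name, spearman(col, pressure)) for name, col in factors.items()]
    kept = [(name, r) for name, r in scored if abs(r) >= threshold]
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)

# Toy, made-up readings (not data from the source).
pressure = [1.0, 1.2, 1.1, 1.5, 1.7, 1.6]
factors = {
    "total_thrust": [10, 12, 11, 15, 18, 16],  # rises with pressure
    "screw_speed": [9, 7, 8, 4, 2, 3],         # falls as pressure rises
    "unrelated": [2, 3, 1, 1, 2, 3],
}
selected = screen_factors(factors, pressure, threshold=0.5)
```

Both a strongly positive and a strongly negative factor survive the screening, because the ordering uses the absolute value of the coefficient.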

Further, in step 2, the screened factor data are randomly divided into a training set and a test set by the train_test_split() function of the scikit-learn (sklearn) toolkit, which reduces the influence of human factors on the soil bin pressure prediction result.
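The random division can be sketched without third-party dependencies as below. This is a stand-in for scikit-learn's `train_test_split`, not the library call itself; the `test_fraction` and `seed` values are assumptions, since the source does not give the split ratio.

```python
import random

def random_split(rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and split the rows into training and test lists."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(len(rows) * test_fraction)
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]

# Hypothetical rows of (factor value, soil bin pressure).
rows = [(i, i * 0.1) for i in range(10)]
train_rows, test_rows = random_split(rows, test_fraction=0.2, seed=42)
```

With scikit-learn installed, the equivalent call would be along the lines of `train_test_split(X, y, test_size=0.2, random_state=42)`.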

Further, the LSTM model in step 3 builds on the short-term memory of the recurrent neural network and uses a gating mechanism to establish dependencies between states separated by long time intervals. The LSTM prediction model is computed as follows: the external state $h_{t-1}$ of the previous moment and the input $x_t$ of the current moment are used to calculate the three gates and the candidate state $g_t$; the forget gate $f_t$ and the input gate $i_t$ are then combined to update the memory cell $c_t$; finally, the output gate $o_t$ passes information from the internal state to the external state $h_t$. That is, the LSTM introduces a new internal state $c_t$ dedicated to recurrent information transfer, and outputs information to the external state $h_t$ of the hidden layer. The specific expressions are:

$$ c_t = f_t \odot c_{t-1} + i_t \odot g_t $$

$$ h_t = o_t \odot \tanh(c_t) $$

where $f_t$ is the forget gate, which controls how much information of the previous internal state $c_{t-1}$ is forgotten; $i_t$ is the input gate, which controls how much information of the current candidate state $g_t$ is saved; $o_t$ is the output gate, which controls how much information of the current internal state $c_t$ is output to the external state $h_t$; and $\tanh$ is the hyperbolic tangent function, whose output interval is (-1, 1).

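The three gates and the candidate state are calculated as:

$$ f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) $$

$$ i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) $$

$$ g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}) $$

$$ o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) $$

where $\sigma$ is the logistic function, whose output interval is (0, 1); $x_t$ is the input at the current moment; $W_{if}$, $b_{if}$, $W_{ii}$, $b_{ii}$, $W_{ig}$, $b_{ig}$, $W_{io}$, $b_{io}$ are the parameters of the linear transformations applied to $x_t$; and $W_{hf}$, $b_{hf}$, $W_{hi}$, $b_{hi}$, $W_{hg}$, $b_{hg}$, $W_{ho}$, $b_{ho}$ are the parameters of the linear transformations applied to $h_{t-1}$.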
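A single LSTM time step of the kind described here can be sketched in plain Python. This is an illustrative sketch rather than the implementation used in the invention, which is built with a deep learning framework; the `params` layout (gate name mapped to `(W_x, b_x, W_h, b_h)`), the list-based matrices, and the all-zero example weights are assumptions.

```python
from math import exp, tanh

def sigmoid(z):
    """Logistic function with output in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))

def affine(W_x, b_x, W_h, b_h, x, h):
    """Compute W_x @ x + b_x + W_h @ h + b_h for list-based matrices."""
    out = []
    for row in range(len(b_x)):
        total = b_x[row] + b_h[row]
        total += sum(w * v for w, v in zip(W_x[row], x))
        total += sum(w * v for w, v in zip(W_h[row], h))
        out.append(total)
    return out

def lstm_step(params, x_t, h_prev, c_prev):
    """One LSTM time step; params maps gate name -> (W_x, b_x, W_h, b_h)."""
    i_t = [sigmoid(z) for z in affine(*params["i"], x_t, h_prev)]  # input gate
    f_t = [sigmoid(z) for z in affine(*params["f"], x_t, h_prev)]  # forget gate
    g_t = [tanh(z) for z in affine(*params["g"], x_t, h_prev)]     # candidate state
    o_t = [sigmoid(z) for z in affine(*params["o"], x_t, h_prev)]  # output gate
    # Internal state: forget part of the old cell, add gated candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    # External state: gated, squashed view of the internal state.
    h_t = [o * tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Input size 2, hidden size 1, all-zero weights (purely illustrative).
zero = ([[0.0, 0.0]], [0.0], [[0.0]], [0.0])
params = {name: zero for name in "ifgo"}
h_t, c_t = lstm_step(params, x_t=[0.3, -0.2], h_prev=[0.0], c_prev=[1.0])
```

With zero weights every gate evaluates to 0.5 and the candidate state to 0, so the cell simply halves its previous internal state; trained weights make the gates depend on $x_t$ and $h_{t-1}$.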

Furthermore, Dropout in step 3 means randomly discarding a portion of the neurons during deep learning training in order to avoid overfitting: a drop mask with as many entries as there are nodes is generated at random from a Bernoulli distribution with probability p, the mask is multiplied with the input so that some of the nodes are masked out, and the remaining nodes are used for subsequent calculation;

Early Stopping means evaluating the model parameters on the test set during training and stopping training when their prediction accuracy on the test set begins to decrease, so as to avoid the overfitting caused by continued training.
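Both regularization techniques can be sketched briefly. The sketch below makes two assumptions that the source does not state: p is treated as the probability of dropping a node, and surviving values are rescaled by 1/(1-p) (the common "inverted dropout" convention). The `patience` parameter is also hypothetical; `patience=1` matches the literal description of stopping as soon as accuracy begins to decrease.

```python
import random

def dropout(values, p, rng, training=True):
    """Zero each value with probability p; rescale survivors by 1/(1-p) (assumed)."""
    if not training or p == 0.0:
        return list(values)  # no masking at prediction time
    keep = 1.0 - p
    # Bernoulli mask: 1 keeps the node, 0 masks it out.
    mask = [1.0 if rng.random() < keep else 0.0 for _ in values]
    return [v * m / keep for v, m in zip(values, mask)]

def stop_epoch(accuracy_per_epoch, patience=1):
    """Return the epoch index at which training stops, or None if it never stops."""
    best = float("-inf")
    bad_epochs = 0
    for epoch, acc in enumerate(accuracy_per_epoch):
        if acc > best:
            best, bad_epochs = acc, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return None

rng = random.Random(0)
activations = [1.0] * 1000
dropped = dropout(activations, p=0.5, rng=rng)

# Made-up accuracy history: the first decrease happens at epoch index 3.
history = [0.70, 0.80, 0.85, 0.84, 0.86]
stopped_at = stop_epoch(history, patience=1)
```

A larger `patience` tolerates short dips: with `patience=2` the same history never triggers a stop, because accuracy recovers at the next epoch.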

Further, in step 3, the following method is adopted to determine the optimal model parameters;

Step 3-1: the coefficient of determination $R^2$ is calculated from the predicted values and the actual values:

$$ R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2} $$

where $\hat{y}_i$ is the predicted value and $y_i$ is the actual value;

Step 3-2: the adjusted $R^2_{adj}$ is calculated from the coefficient of determination $R^2$ and used to evaluate the quality of the model, so as to eliminate the influence on the evaluation index of differing numbers of factors:

$$ R^2_{adj} = 1 - \frac{(1-R^2)(n-1)}{n-p-1} $$

where $n$ is the number of samples; $\bar{y}$ is the mean of the actual values; and $p$ is the number of features;

Step 3-3: it is judged whether the difference between the current $R^2_{adj}$ and 1 is smaller than the difference between the best $R^2_{adj}$ so far and 1; if so, the current $R^2_{adj}$ is taken as the best $R^2_{adj}$ and the corresponding model parameters are taken as the optimal model parameters; otherwise, the best $R^2_{adj}$ and the optimal model parameters are kept unchanged;

Step 3-4: it is judged whether the difference between the best $R^2_{adj}$ and 1 is smaller than a set value; if so, training ends and the optimal model parameters are stored as a PTH file; otherwise, the next round of training is performed.
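Steps 3-1 to 3-4 amount to a "keep the best so far, stop when close enough to 1" loop. A minimal sketch follows, assuming the predictions of each training round are already available as lists; the names `select_best` and `tolerance` and the toy numbers are hypothetical, and saving the parameters to a PTH file is only indicated by a comment.

```python
def r2_score(actual, predicted):
    """Coefficient of determination R^2 (step 3-1)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    ss_tot = sum((y - mean) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot

def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n samples and p features (step 3-2)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def select_best(actual, predictions_per_round, p, tolerance):
    """Track the training round whose adjusted R^2 is closest to 1."""
    best_round, best_adj = None, None
    n = len(actual)
    for round_idx, predicted in enumerate(predictions_per_round):
        adj = adjusted_r2(r2_score(actual, predicted), n, p)
        # Step 3-3: keep this round if it is closer to 1 than the best so far.
        if best_adj is None or abs(1.0 - adj) < abs(1.0 - best_adj):
            best_round, best_adj = round_idx, adj
        # Step 3-4: stop once the best value is within the set tolerance of 1.
        if abs(1.0 - best_adj) < tolerance:
            break  # the source stores the optimal parameters as a PTH file here
    return best_round, best_adj

# Made-up values: six samples, two features, three training rounds.
actual = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rounds = [
    [1.5, 2.5, 2.5, 4.5, 4.5, 6.5],
    [1.1, 1.9, 3.1, 3.9, 5.1, 5.9],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
]
best_round, best_adj = select_best(actual, rounds, p=2, tolerance=0.01)
```

In this toy run the second round already falls within the tolerance, so the loop ends before the third round is evaluated.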

Specifically, the mean absolute error (MAE) reflects the actual error between the predicted values and the actual values; the smaller its value, the better the prediction model describes the data. The mean squared error (MSE) is commonly used to evaluate the degree of variation of the data; the smaller its value, the better the prediction model describes the data. Because the input dimensionality of the prediction model of the invention grows with the number of influencing factors, the adjusted $R^2_{adj}$ is a more meaningful measure of the model; the closer its value is to 1, the better the prediction model describes the data.

$$ MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right| $$

$$ MSE = \frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)^2 $$

$$ R^2_{adj} = 1 - \frac{(1-R^2)(n-1)}{n-p-1}, \qquad R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2} $$

where $n$ is the number of samples; $\hat{y}_i$ is the predicted value; $y_i$ is the actual value; $\bar{y}$ is the mean of the actual values; and $p$ is the number of features.
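The two error measures can be computed directly from paired lists of actual and predicted values. The sketch below uses made-up numbers purely for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)

# Made-up pressures: absolute errors are 0.5, 0.5, 0.0 and 1.0.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.0, 9.0]
mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
```

Because MSE squares each error, the single error of 1.0 contributes two thirds of the MSE here but only half of the MAE.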

Further, the method also comprises a step 6 of drawing a predicted value trend graph based on the calculated soil bin pressure value at the future moment.

By adopting the above technical scheme, the invention has the following advantages over the prior art: (1) Model training and prediction use the LSTM algorithm, which is designed for time series problems, so that both the feature dimension of the input data and the influence of the time dimension of the features on the soil bin pressure are considered, improving prediction accuracy. (2) The geological environment is complex and changeable and the tunneling factors of the earth pressure balance shield machine are numerous, so the factors differ in how strongly they correlate with fluctuations of the soil bin pressure; the factors with a clear relation to the soil bin pressure are therefore screened out by the Spearman rank correlation coefficient. (3) By predicting the future soil bin pressure, the method lets the shield driver and the monitoring personnel know in advance how the soil bin pressure will change at the next moment, so that a possible soil pressure imbalance can be corrected in time and construction safety is ensured. (4) The method addresses the overfitting problem in neural network prediction and improves prediction accuracy.

Drawings

The invention is described in further detail below with reference to the accompanying drawings and the detailed description;

FIG. 1 is a schematic diagram of the LSTM structure of the present invention;

FIG. 2 is a schematic flow chart of a method for predicting soil bin pressure based on an LSTM algorithm according to the present invention;

FIG. 3 is a graph showing the results of the Spearman correlation analysis of the present invention;

FIG. 4 is a comparison graph of the actual and predicted soil bin pressure values for ring 427 of Fuzhou Metro Line 4 according to the present invention;

FIG. 5 is a predicted value trend graph drawn by the present invention based on calculated soil bin pressure values at future times.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.

To handle the large amount of tunneling factor data produced while a shield tunneling machine advances, the invention screens, as far as possible, all factors correlated with the soil bin pressure; adds the Dropout and Early Stopping algorithms to the LSTM prediction model to prevent overfitting during prediction; and builds a dynamic neural network with the PyTorch deep learning framework, thereby providing a method for accurately predicting soil bin pressure. The invention fills a gap in soil bin pressure prediction. Because the complexity of shield construction conditions makes it difficult to establish an explicit mathematical model, the LSTM neural network is used to learn the regularities, known and unknown, contained in the raw data. Dropout and Early Stopping prevent overfitting, and the mean absolute error (MAE), the mean squared error (MSE) and the adjusted $R^2_{adj}$ serve as evaluation indices, improving the accuracy and precision of soil bin pressure prediction.

As shown in FIGS. 1 to 5, the present invention discloses a method for predicting soil bin pressure based on an LSTM algorithm, which comprises the following steps:

Step 3: the LSTM training model shown in FIG. 1 is obtained with the training set data as input (its structure is conventional and can be obtained directly by a person skilled in the art); overfitting is prevented with the Dropout and Early Stopping methods; the optimal model parameters are obtained by stochastic gradient descent and stored as a PTH file;

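Further, the Spearman correlation coefficient in step 1 is calculated as:

$$ r = \frac{\sum_{i=1}^{n}(R_i-\bar{R})(S_i-\bar{S})}{\sqrt{\sum_{i=1}^{n}(R_i-\bar{R})^2 \, \sum_{i=1}^{n}(S_i-\bar{S})^2}} $$

where $R_i$ and $S_i$ are the ranks of observation $i$ for the two variables, $\bar{R}$ and $\bar{S}$ are the mean ranks, and $n$ is the total number of observations.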

Specifically, to better explain the invention, data from rings 407-427 of Fuzhou Metro Line 4, under construction in Fuzhou, Fujian Province, are taken as the research object: 20 rings in total, comprising 2579 groups of data and 31 tunneling factors. All the factors are listed in Table 1.

The correlation between each of the 31 shield tunneling factors and the soil bin pressure is obtained by the Spearman rank correlation coefficient method; the results are shown in Table 1 and FIG. 3. The Spearman correlation coefficient $r$ lies between -1 and 1: values between -1 and 0 indicate negative correlation, and values between 0 and 1 indicate positive correlation. The larger the absolute value $|r|$, the stronger the correlation between the two factors. On this basis, the factors correlated with the soil bin pressure are, in order: total thrust, screw conveyor rotation speed, advance speed, cutterhead rotation speed, cutterhead torque, group B thrust pressure, foam system average pressure, front sealed chamber shield tail seal average pressure, rear sealed chamber shield tail seal average pressure, screw conveyor oil temperature, group A articulation cylinder stroke, and group C articulation cylinder stroke.

Table 1. List of tunneling factors

These factors are used as the input data of the LSTM. The data of rings 407-426 form the training set and are used to predict the soil bin pressure of ring 427; the prediction result is shown in FIG. 4. The soil bin pressure trend predicted by the LSTM is almost consistent with the actual values, which supports the prediction accuracy of the method.
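The ring-based division used in this example (train on rings 407-426, predict ring 427) can be sketched as follows. The record layout `(ring_number, features, pressure)` and the generated values are hypothetical placeholders, not the project data.

```python
def split_by_ring(records, test_ring):
    """Use every ring before test_ring for training and test_ring for prediction."""
    train = [r for r in records if r[0] < test_ring]
    test = [r for r in records if r[0] == test_ring]
    return train, test

# Placeholder records: three made-up samples per ring for rings 407..427.
records = [
    (ring, [float(ring), float(k)], 1.0 + 0.01 * k)
    for ring in range(407, 428)
    for k in range(3)
]
train, test = split_by_ring(records, test_ring=427)
```

Holding out the last ring keeps the evaluation chronological: the model is only asked about data recorded after everything it was trained on.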

Further, the LSTM model in step 3 builds on the short-term memory of the recurrent neural network and uses a gating mechanism to establish dependencies between states separated by long time intervals. The LSTM prediction model is computed as follows: the external state $h_{t-1}$ of the previous moment and the input $x_t$ of the current moment are used to calculate the three gates and the candidate state $g_t$; the forget gate $f_t$ and the input gate $i_t$ are then combined to update the memory cell $c_t$; finally, the output gate $o_t$ passes information from the internal state to the external state $h_t$. That is, the LSTM introduces a new internal state $c_t$ dedicated to recurrent information transfer, and outputs information to the external state $h_t$ of the hidden layer. The specific expressions are:

$$ c_t = f_t \odot c_{t-1} + i_t \odot g_t $$

$$ h_t = o_t \odot \tanh(c_t) $$

The three gates and the candidate state are calculated as:

$$ f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) $$

$$ i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) $$

$$ g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}) $$

$$ o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) $$

where $\sigma$ is the logistic function, whose output interval is (0, 1); $x_t$ is the input at the current moment; $W_{if}$, $b_{if}$, $W_{ii}$, $b_{ii}$, $W_{ig}$, $b_{ig}$, $W_{io}$, $b_{io}$ are the parameters of the linear transformations applied to $x_t$; and $W_{hf}$, $b_{hf}$, $W_{hi}$, $b_{hi}$, $W_{hg}$, $b_{hg}$, $W_{ho}$, $b_{ho}$ are the parameters of the linear transformations applied to $h_{t-1}$.

Step 3-1: the coefficient of determination $R^2$ is calculated from the predicted values and the actual values:

$$ R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2} $$

where $\hat{y}_i$ is the predicted value and $y_i$ is the actual value;

Step 3-2: the adjusted $R^2_{adj}$ is calculated from the coefficient of determination $R^2$ and used to evaluate the quality of the model, so as to eliminate the influence on the evaluation index of differing numbers of factors:

$$ R^2_{adj} = 1 - \frac{(1-R^2)(n-1)}{n-p-1} $$

where $n$ is the number of samples; $\bar{y}$ is the mean of the actual values; and $p$ is the number of features.

Owing to the complexity of the working environment of the earth pressure balance shield machine, the number of factors related to the soil bin pressure found in step 1 varies, and $R^2$ increases with the number of factors. The adjusted $R^2_{adj}$ is therefore used to evaluate the quality of the model. Note: $R^2_{adj}$ is derived from the coefficient of determination $R^2$ in order to eliminate the influence on the evaluation index of differing numbers of factors.
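A short worked calculation shows how the adjustment penalizes additional factors. The numbers are hypothetical and chosen only for illustration: the same $R^2 = 0.95$ on $n = 100$ samples is evaluated once with $p = 12$ features and once with $p = 31$ features, using the usual definition $R^2_{adj} = 1 - (1-R^2)(n-1)/(n-p-1)$.

```latex
\begin{aligned}
p = 12:\quad R^2_{adj} &= 1 - \frac{(1-0.95)(100-1)}{100-12-1}
                        = 1 - \frac{4.95}{87} \approx 0.9431 \\
p = 31:\quad R^2_{adj} &= 1 - \frac{(1-0.95)(100-1)}{100-31-1}
                        = 1 - \frac{4.95}{68} \approx 0.9272
\end{aligned}
```

With an identical raw $R^2$, the model that uses more factors receives the lower adjusted score, which is why the adjusted value is the fairer index when the number of screened factors changes.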

By adopting the above technical scheme, all influencing factors with a clear relation to the soil bin pressure are screened out by the Spearman rank correlation coefficient method, which improves prediction accuracy. The LSTM neural network retains information from earlier inputs, so the influence of past tunneling parameters on future tunneling parameters is fully considered, which further improves prediction accuracy. The Dropout and Early Stopping algorithms prevent the prediction model from overfitting. The prediction model is built with the PyTorch deep learning framework, which accelerates the matrix multiplications in the neural network.

The invention performs model training and prediction with the LSTM algorithm, which is designed for time series problems, so that both the feature dimension of the input data and the influence of the time dimension of the features on the soil bin pressure are considered, improving prediction accuracy. The geological environment is complex and changeable and the tunneling factors of the earth pressure balance shield machine are numerous, so the factors differ in how strongly they correlate with fluctuations of the soil bin pressure; the factors with a clear relation to the soil bin pressure are therefore screened out by the Spearman rank correlation coefficient. By predicting the future soil bin pressure, the method lets the shield driver and the monitoring personnel know in advance how the soil bin pressure will change at the next moment, so that a possible soil pressure imbalance can be corrected in time and construction safety is ensured. The method addresses the overfitting problem in neural network prediction and improves prediction accuracy.

It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. The embodiments and features of the embodiments in the present application may be combined with each other without conflict. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the detailed description of the embodiments of the present application is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Claims

1. A method for predicting soil bin pressure based on an LSTM algorithm, characterized by comprising the following steps:

step 4: taking the test set data as input, loading the stored optimal model parameters to obtain prediction data, and evaluating the prediction performance;

step 5: predicting the soil bin pressure of the earth pressure balance shield machine by the LSTM prediction model: taking the screened factors as the input of the LSTM training model, and loading the model file stored in step 3 to calculate the soil bin pressure value at a future moment.

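2. The method for predicting soil bin pressure based on the LSTM algorithm according to claim 1, wherein the Spearman correlation coefficient in step 1 is calculated as:

$$ r = \frac{\sum_{i=1}^{n}(R_i-\bar{R})(S_i-\bar{S})}{\sqrt{\sum_{i=1}^{n}(R_i-\bar{R})^2 \, \sum_{i=1}^{n}(S_i-\bar{S})^2}} $$

where $R_i$ and $S_i$ are the ranks of observation $i$ for the two variables, $\bar{R}$ and $\bar{S}$ are the mean ranks, and $n$ is the total number of observations.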

3. The method for predicting soil bin pressure based on the LSTM algorithm according to claim 1, wherein the factors correlated with the soil bin pressure are, in order: total thrust, screw conveyor rotation speed, advance speed, cutterhead rotation speed, cutterhead torque, group B thrust pressure, foam system average pressure, front sealed chamber shield tail seal average pressure, rear sealed chamber shield tail seal average pressure, screw conveyor oil temperature, group A articulation cylinder stroke, and group C articulation cylinder stroke.

4. The method for predicting soil bin pressure based on the LSTM algorithm according to claim 1, wherein in step 2 the screened factor data are randomly divided into a training set and a test set by the train_test_split() function of the scikit-learn (sklearn) toolkit, reducing the influence of human factors on the soil bin pressure prediction result.

5. The method for predicting soil bin pressure based on the LSTM algorithm according to claim 1, wherein the LSTM prediction model in step 3 is computed as follows: the external state $h_{t-1}$ of the previous moment and the input $x_t$ of the current moment are used to calculate the three gates and the candidate state $g_t$; the forget gate $f_t$ and the input gate $i_t$ are combined to update the memory cell $c_t$; finally, the output gate $o_t$ passes information from the internal state to the external state $h_t$; that is, the LSTM introduces a new internal state $c_t$ dedicated to recurrent information transfer, and outputs information to the external state $h_t$ of the hidden layer; the specific expressions are:

$$ c_t = f_t \odot c_{t-1} + i_t \odot g_t $$

$$ h_t = o_t \odot \tanh(c_t) $$

where $f_t$ is the forget gate, which controls how much information of the previous internal state $c_{t-1}$ is forgotten; $i_t$ is the input gate, which controls how much information of the current candidate state $g_t$ is saved; $o_t$ is the output gate, which controls how much information of the current internal state $c_t$ is output to the external state $h_t$; and $\tanh$ is the hyperbolic tangent function, whose output interval is (-1, 1);

the three gates and the candidate state are calculated as:

$$ f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) $$

$$ i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) $$

$$ g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}) $$

$$ o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) $$

6. The method for predicting soil bin pressure based on the LSTM algorithm according to claim 1, wherein Dropout in step 3 means randomly discarding a portion of the neurons during deep learning training in order to avoid overfitting: a drop mask with as many entries as there are nodes is generated at random from a Bernoulli distribution with probability p, the mask is multiplied with the input so that some of the nodes are masked out, and the remaining nodes are used for subsequent calculation;

7. The method for predicting soil bin pressure based on the LSTM algorithm according to claim 1, wherein the optimal model parameters are determined in step 3 as follows:

step 3-1: calculating the coefficient of determination $R^2$ from the predicted values and the actual values:

$$ R^2 = 1 - \frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2} $$

where $\hat{y}_i$ is the predicted value and $y_i$ is the actual value;

step 3-2: calculating the adjusted $R^2_{adj}$ from the coefficient of determination $R^2$:

$$ R^2_{adj} = 1 - \frac{(1-R^2)(n-1)}{n-p-1} $$

where $n$ is the number of samples; $\bar{y}$ is the mean of the actual values; and $p$ is the number of features;

step 3-3: judging whether the difference between the current $R^2_{adj}$ and 1 is smaller than the difference between the best $R^2_{adj}$ so far and 1; if so, taking the current $R^2_{adj}$ as the best $R^2_{adj}$ and the corresponding model parameters as the optimal model parameters; otherwise, keeping the best $R^2_{adj}$ and the optimal model parameters unchanged;

step 3-4: judging whether the difference between the best $R^2_{adj}$ and 1 is smaller than a set value; if so, ending training and storing the optimal model parameters as a PTH file; otherwise, performing the next round of training.

8. The method for predicting soil bin pressure based on the LSTM algorithm according to claim 1, further comprising step 6: drawing a predicted value trend graph based on the calculated soil bin pressure value at the future moment.