CN114117599A - Shield attitude position deviation prediction method - Google Patents
Shield attitude position deviation prediction method
- Publication number
- CN114117599A CN202111386994.1A
- Authority
- CN
- China
- Prior art keywords
- shield
- data
- layer
- training
- current
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/10—Geometric CAD
- G06F30/13—Architectural design, e.g. computer-aided architectural design [CAAD] related to design of buildings, bridges, landscapes, production plants or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/10—Geometric CAD
- G06F30/17—Mechanical parametric or variational design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention belongs to the technical field of shield tunneling, and particularly relates to a shield attitude position deviation prediction method. The method takes the parameters of a finished shield construction project as source domain data and trains a pre-training model on them. It then extracts the parameters of the feature extraction layer of the pre-training model and stacks two new fully connected layers behind that layer to form a shield attitude deviation prediction model. The parameters of the current shield construction project serve as target data, and the shield attitude deviation prediction model is trained on the target data to predict the shield tunneling deviation. Because the parameters of the finished shield construction project are used as source domain data, enough training data are available in the initial stage of shield construction, which safeguards the prediction effect of the shield attitude deviation prediction model and avoids the inaccurate predictions that result from training on only a small amount of data in the initial stage of shield tunneling.
Description
Technical Field
The invention belongs to the technical field of shield tunneling, and particularly relates to a shield attitude position deviation prediction method.
Background
In the shield tunneling process, factors that are difficult to predict, such as unbalanced cutter loading, stratum changes and interference from the shield's own weight, often cause the shield attitude position to deviate, so that the shield advances through the tunnel along a snake-shaped track. When the offset exceeds the allowable range, the shield must be adjusted promptly to return to the preset track axis. However, control of the shield tunneling attitude position currently lacks an effective auxiliary means, and adjusting the attitude position involves many uncertain factors such as operation timing and adjustment force. The accuracy and efficiency of deviation correction therefore cannot be guaranteed, the offset of a shield machine tunneling along a snake-shaped track is easily amplified further, construction quality and schedule are seriously affected, and greater economic loss results. A shield attitude deviation prediction model is therefore urgently needed: one that predicts changes in the shield attitude position from shield state parameters and historical data of shield driver operation, so as to achieve accurate control of shield tunneling and improve tunnel construction quality.
Disclosure of Invention
The invention discloses a shield attitude position deviation prediction method that predicts shield deviation, and thereby addresses the technical problem that a shield machine tunneling along a snake-shaped track degrades construction quality, lengthens the construction period and causes economic loss.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
A shield attitude position deviation prediction method comprises the following steps:
Step 1: taking the shield tunneling parameters of a finished shield construction project and the characteristic information of its construction geological environment as source domain data, and training a pre-training model on the source domain data;
Step 2: using transfer learning to migrate the feature extraction layer of the pre-training model, stacking two new fully connected layers behind the feature extraction layer to form a shield attitude deviation prediction model, and creating in a storage medium a structural space suitable for storing the shield attitude deviation prediction model; the feature extraction layer contains the knowledge obtained by training the pre-training model on the source domain data; the feature extraction layer comprises two CNN layers, five LSTM layers, one fully connected layer and the skip connection of a residual network;
Step 3: taking the shield tunneling parameters of the current shield construction project and the characteristic information of its construction geological environment as target data, reading the shield attitude deviation prediction model out of the structural space, and training the shield attitude deviation prediction model on the target data.
The method takes the parameters of a finished shield construction project as source domain data and trains a pre-training model on them. It then extracts the parameters of the feature extraction layer of the pre-training model and stacks two new fully connected layers behind that layer to form a shield attitude deviation prediction model. The parameters of the current shield construction project serve as target data, and the shield attitude deviation prediction model is trained on the target data to predict the shield tunneling deviation. Because the parameters of the finished shield construction project are used as source domain data, enough training data are available in the initial stage of shield construction, which safeguards the prediction effect of the shield attitude deviation prediction model and avoids the inaccurate predictions that result from training on only a small amount of data in the initial stage of shield tunneling.
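The parameter migration in step 2 can be illustrated with a minimal sketch. It assumes a hypothetical layout in which each layer is a dictionary of weights and biases; the layer names (`cnn1`, `lstm1`, `new_fc1`, ...), the layer sizes and the helper functions are illustrative assumptions and are not taken from the invention. The sketch only shows copying the feature extraction layers and stacking two freshly initialized fully connected layers behind them; it does not implement the CNN or LSTM computations.

```python
import copy
import random

def init_layer(n_in, n_out, rng):
    """Return a fresh layer (hypothetical layout): small random weights, zero biases."""
    return {
        "w": [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }

def build_deviation_model(pretrained, feature_layer_names, head_sizes, rng):
    """Copy the feature extraction layers and stack two new fully connected layers.

    pretrained: dict mapping layer name -> parameter dict.
    feature_layer_names: ordered names of the layers to transfer.
    head_sizes: (n_in, n_hidden, n_out) for the two new layers.
    """
    model = {}
    for name in feature_layer_names:
        # Deep copy so that later fine-tuning leaves the pre-trained model untouched.
        model[name] = copy.deepcopy(pretrained[name])
    n_in, n_hidden, n_out = head_sizes
    model["new_fc1"] = init_layer(n_in, n_hidden, rng)
    model["new_fc2"] = init_layer(n_hidden, n_out, rng)
    return model

rng = random.Random(0)
# Two CNN layers, five LSTM layers and one fully connected layer, as in step 2.
feature_names = ["cnn1", "cnn2"] + [f"lstm{i}" for i in range(1, 6)] + ["fc"]
pretrained = {name: init_layer(4, 4, rng) for name in feature_names}
pretrained["output"] = init_layer(4, 1, rng)  # old output layer, not transferred

model = build_deviation_model(pretrained, feature_names, (4, 2, 1), rng)
```

The old output layer is deliberately left behind, so only the layers that carry source domain knowledge are reused before training on the target data.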
Preferably, the steps for training the pre-training model on the source domain data are as follows:
Step A: obtaining a predicted value through forward propagation of the hybrid neural network;
Step B: calculating the prediction loss on the source domain data;
Step C: updating the parameters of the pre-training model by backpropagation, and optimizing the prediction effect of the pre-training model until a preset number of iterations is reached, wherein the number of iterations is counted and stored by a first counter.
Preferably, the steps for training the shield attitude deviation prediction model on the target data are as follows:
Step A: obtaining a predicted value through forward propagation of the shield attitude deviation prediction model;
Step B: calculating the prediction loss on the target data;
Step C: updating the model by backpropagation, and optimizing the prediction effect of the shield attitude deviation prediction model until the preset number of iterations is reached, wherein the number of iterations is counted and stored by a second counter.
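The three training steps and the iteration counter can be sketched as follows. This is a minimal illustration under stated assumptions: a single linear unit stands in for the hybrid network, the loss is mean squared error, and the learning rate, the sample data and the function name `train` are hypothetical choices rather than details of the invention.

```python
def train(samples, n_iterations, learning_rate=0.1):
    """Fit y ~ w*x + b by gradient descent; a stand-in for the hybrid network."""
    w, b = 0.0, 0.0
    counter = 0          # counts and stores the number of completed iterations
    losses = []
    n = len(samples)
    while counter < n_iterations:
        # Step A: forward propagation gives the predicted values.
        preds = [w * x + b for x, _ in samples]
        # Step B: prediction loss (mean squared error, an assumed choice).
        errors = [p - y for p, (_, y) in zip(preds, samples)]
        losses.append(sum(e * e for e in errors) / n)
        # Step C: backpropagation, here the analytic MSE gradients, then update.
        grad_w = 2.0 * sum(e * x for e, (x, _) in zip(errors, samples)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        counter += 1
    return w, b, counter, losses

# Hypothetical data generated from y = 2x + 1.
samples = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]
w, b, counter, losses = train(samples, n_iterations=500)
```

The same loop shape applies to both training stages; only the model, the data and the counter differ.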
Preferably, the pre-training model and the shield attitude deviation prediction model both comprise a plurality of CNN layers, a plurality of LSTM layers, a residual network and a plurality of fully connected layers;
the CNN layers filter noise and perform preliminary feature extraction: the CNN layers build a plurality of filters that remove noise from the construction data and convert the long input sequence into a shorter sequence of high-level features, which the subsequent LSTM layers can process more effectively.
The LSTM layers extract time-varying features: because shield construction data have an obvious temporal order, prediction of the shield tunneling attitude position deviation can be regarded as a time series prediction problem. The LSTM layers, which are designed to carry information across many time steps, can therefore extract the time-varying features of the data.
The residual network safeguards the performance of the deep neural network;
the fully connected layers generate the predicted variable: they integrate the feature information extracted by the CNN layers and the LSTM layers to predict the shield tunneling attitude position at a future moment, thereby realizing the nonlinear mapping from input to output.
Preferably, the CNN layer is composed of a convolutional layer and an activation layer; the CNN layer extracts features through the convolutional layer and the activation layer and passes the extracted features to the LSTM layer, and the regression result is finally output through the fully connected layer;
the convolutional layer first performs a convolution operation on its input and then applies a nonlinear activation function to construct the output of the convolutional network layer, according to the formula:
$$a_k = f(w_k x_{k-1} + b_k)$$
in the formula: $k$ denotes the k-th layer; $x_{k-1}$ is the input to the current convolutional layer; $w_k$ is the weight; $b_k$ is the bias; $f(\cdot)$ is the activation function; $a_k$ is the output of the k-th layer.
Further, a rectified linear unit (ReLU) is adopted to set the output values of some neurons to zero; the rectified linear unit is calculated as:
$$a_k = f(y_k) = \max\{0, y_k\}$$
in the formula: $y_k$ is the output value of the convolution operation; $a_k$ is the activation value of $y_k$; $f(\cdot)$ is the activation function; $\max\{\cdot\}$ takes the maximum value.
By setting the output values of some neurons to zero with the rectified linear unit, the invention improves network sparsity and alleviates overfitting.
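A one-dimensional convolution followed by the rectified linear unit can be sketched in a few lines. The kernel, bias and input signal below are hypothetical values chosen for illustration, and the sliding dot product without kernel flipping (the convention used by most deep learning libraries) is an assumption about how the convolution is carried out.

```python
def relu(y):
    """Rectified linear unit: max{0, y}."""
    return max(0.0, y)

def conv1d_relu(x, kernel, bias):
    """Slide the kernel over x (no padding, stride 1), add the bias, apply ReLU."""
    k = len(kernel)
    out = []
    for start in range(len(x) - k + 1):
        window = x[start:start + k]
        y = sum(w * v for w, v in zip(kernel, window)) + bias
        out.append(relu(y))
    return out

# Hypothetical input signal and a two-tap difference kernel.
signal = [1.0, 2.0, 0.0, -1.0, 3.0]
features = conv1d_relu(signal, kernel=[1.0, -1.0], bias=0.0)
# features == [0.0, 2.0, 1.0, 0.0]: negative responses are set to zero
```

The output is one element shorter than the input, and the two negative responses are zeroed, which is the sparsity effect described above.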
Preferably, the gates in the LSTM layer include an input gate, a forget gate and an output gate, calculated as follows:
$$f_t = \sigma(w_f \cdot [h_{t-1}, x_t] + b_f)$$
$$i_t = \sigma(w_i \cdot [h_{t-1}, x_t] + b_i)$$
$$o_t = \sigma(w_o \cdot [h_{t-1}, x_t] + b_o)$$
$$\tilde{c}_t = \tanh(w_c \cdot [h_{t-1}, x_t] + b_c)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$h_t = o_t \odot \tanh(c_t)$$
in the formula: $h_{t-1}$ is the output of the previous LSTM unit (time step $t-1$); $x_t$ is the input to the current LSTM unit; $\sigma(\cdot)$ is the Sigmoid activation function; $\tanh(\cdot)$ is the tanh activation function; $w_f$, $w_i$, $w_o$, $w_c$ are the network weights; $b_f$, $b_i$, $b_o$, $b_c$ are the biases; $c_{t-1}$ is the memory cell of the previous LSTM unit; $c_t$ is the memory cell of the current LSTM unit; $f_t$ is the output value of the forget gate; $i_t$ is the output value of the input gate; $o_t$ is the output value of the output gate; $\tilde{c}_t$ is the temporary memory cell of the current LSTM unit; $h_t$ is the output of the current LSTM unit.
In the pre-training model and the attitude deviation prediction model, a long short-term memory (LSTM) network is adopted as the main time series feature extractor for predicting the attitude position deviation of the shield tunneling machine. LSTM is a special RNN structure designed to solve the vanishing or exploding gradient problem that long-term dependencies cause in recurrent neural networks. LSTM introduces a "cell state" that preserves meaningful information, and adds or discards information through "gates". During weight correction, errors either pass through a gate or are forgotten by it directly, which effectively solves the long-term dependence problem.
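A single LSTM time step with the three gates can be sketched as follows. The candidate cell and the cell state update follow the standard LSTM formulation, which is an assumption here; the parameter dictionary layout, the helper `affine` and the all-zero weights are hypothetical and chosen only so that the result is easy to check by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(weights, bias, vec):
    """Compute weights . vec + bias for a matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new output h_t and cell state c_t."""
    concat = h_prev + x_t  # the concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(params["w_f"], params["b_f"], concat)]
    i_t = [sigmoid(z) for z in affine(params["w_i"], params["b_i"], concat)]
    o_t = [sigmoid(z) for z in affine(params["w_o"], params["b_o"], concat)]
    # Temporary (candidate) memory cell, standard formulation (assumed).
    c_tilde = [math.tanh(z)
               for z in affine(params["w_c"], params["b_c"], concat)]
    # Forget part of the old cell state and add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Hypothetical sizes; all-zero weights make every gate equal sigmoid(0) = 0.5.
hidden, n_inputs = 2, 3
zeros = [[0.0] * (hidden + n_inputs) for _ in range(hidden)]
params = {name: zeros for name in ("w_f", "w_i", "w_o", "w_c")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_o", "b_c")})

h_t, c_t = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -1.0], params)
```

With zero weights the forget gate keeps exactly half of the previous cell state and the candidate contributes nothing, so `c_t` is `[0.5, -0.5]`.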
Further, the residual network passes shallow features directly to deeper layers by adding skip connections; the gradient of the loss function in the residual network is therefore calculated with the following formula:
in the formula: $H(X_L)$, abbreviated $H(\cdot)$, is an identity or convolution function; $F_L(X_L, W_L, b_L)$, abbreviated $F(\cdot)$, is the nonlinear function of the network. Because the gradient calculation contains an identity term, the gradient can be propagated backwards effectively.
The invention combines $H(\cdot)$ and $F(\cdot)$ through skip connections. Adding the shallow mapping $H(\cdot)$ alleviates the degradation problem of the deep neural network and thus effectively avoids performance reduction in the deep neural network.
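As a sketch of how the identity term arises, the skip connection and its gradient can be written in the commonly used residual form. This is an assumption about the formulation, not a formula taken from the invention: $H$ is taken to be the identity, and the loss symbol $\mathcal{L}$ is introduced here for illustration only.

```latex
% Forward pass through one residual block (standard form, assumed):
X_{L+1} = H(X_L) + F_L(X_L, W_L, b_L), \qquad H(X_L) = X_L

% Chain rule applied to the loss \mathcal{L}:
\frac{\partial \mathcal{L}}{\partial X_L}
  = \frac{\partial \mathcal{L}}{\partial X_{L+1}}
    \left( 1 + \frac{\partial F_L(X_L, W_L, b_L)}{\partial X_L} \right)
```

The constant 1 is the identity term: even when the derivative of $F_L$ is small, the gradient from layer $L+1$ still reaches layer $L$.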
The method for acquiring the source domain data and the target data comprises the following steps:
acquiring the construction data, comprising shield tunneling parameters and construction geological environment data, of the current shield construction project and of the finished shield construction project respectively;
deleting the data recorded in the non-tunneling state from the current and the finished construction data respectively;
deleting the data of the shield starting stage from the current and the finished construction data respectively;
deleting missing values, duplicate records and outliers from the current and the finished construction data respectively;
performing one-hot encoding on the categorical features in the cleaned current and finished construction data respectively, converting them into numerical variables; the categorical features include geological data such as round gravel, gravel sand and medium coarse sand.
Normalizing the encoded current and finished data respectively with the MinMaxScale method, which scales the data to the range [0, 1], according to the formula:
$$x^* = \frac{x - \min(x)}{\max(x) - \min(x)}$$
where $x$ represents the original data value, $\min(\cdot)$ the minimum value, $\max(\cdot)$ the maximum value, and $x^*$ the normalized value.
Dividing the scaled current and finished data respectively into a plurality of samples, wherein each sample consists of 89 input variables over 5 time intervals and 1 output variable over 1 time interval; the input of each subsequent sample slides forward by 1 time interval, which serves as the time window.
Preferably, one time interval is 30 seconds.
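The cleaning, encoding and scaling steps above can be sketched as follows. The field names (`thrust`, `soil`), the record layout and the helper functions are hypothetical; the removal of outliers and of non-tunneling and starting-stage records depends on project-specific criteria and is not shown.

```python
def clean(records):
    """Drop records with missing values and exact duplicates, keeping order."""
    seen, kept = set(), []
    for rec in records:
        if any(v is None for v in rec.values()):
            continue  # missing value
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        kept.append(rec)
    return kept

def one_hot(records, field):
    """Replace a categorical field with one 0/1 column per category."""
    categories = sorted({rec[field] for rec in records})
    encoded = []
    for rec in records:
        row = {k: v for k, v in rec.items() if k != field}
        for cat in categories:
            row[f"{field}={cat}"] = 1.0 if rec[field] == cat else 0.0
        encoded.append(row)
    return encoded

def min_max_scale(values):
    """Scale values to [0, 1] with (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical construction records.
records = [
    {"thrust": 10.0, "soil": "round gravel"},
    {"thrust": 20.0, "soil": "gravel sand"},
    {"thrust": 20.0, "soil": "gravel sand"},         # duplicate record
    {"thrust": None, "soil": "medium coarse sand"},  # missing value
    {"thrust": 30.0, "soil": "medium coarse sand"},
]
rows = one_hot(clean(records), "soil")
scaled_thrust = min_max_scale([r["thrust"] for r in rows])
# scaled_thrust == [0.0, 0.5, 1.0]
```

Scaling is applied per numeric column; the one-hot columns already lie in [0, 1].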
In summary, owing to the adoption of the above technical scheme, the invention has the following beneficial effects:
1. The method takes the parameters of a finished shield construction project as source domain data and trains a pre-training model on them. It then extracts the feature extraction layer of the pre-training model and stacks two new fully connected layers behind that layer to form a shield attitude deviation prediction model. The parameters of the current shield construction project serve as target data, and the shield attitude deviation prediction model is trained on the target data to predict the shield tunneling deviation. Because the parameters of the finished shield construction project are used as source domain data, enough training data are available in the initial stage of shield construction, which safeguards the prediction effect of the shield attitude deviation prediction model and avoids the inaccurate predictions that result from training on only a small amount of data in the initial stage of shield tunneling.
2. The CNN layers build a plurality of filters that remove noise from the construction data and convert the long input sequence into a shorter sequence of high-level features, which the subsequent LSTM layers can process more effectively.
3. Because shield construction data have an obvious temporal order, prediction of the shield tunneling attitude position deviation can be regarded as a time series prediction problem. The LSTM layers, which are designed to carry information across many time steps, can therefore extract the time-varying features of the data.
4. In the pre-training model and the attitude deviation prediction model, a long short-term memory (LSTM) network is adopted as the main time series feature extractor for predicting the attitude position deviation of the shield tunneling machine. LSTM is a special RNN structure designed to solve the vanishing or exploding gradient problem that long-term dependencies cause in recurrent neural networks. LSTM introduces a "cell state" that preserves meaningful information, and adds or discards information through "gates". During weight correction, errors either pass through a gate or are forgotten by it directly, which effectively solves the long-term dependence problem.
5. The invention combines $H(\cdot)$ and $F(\cdot)$ through skip connections. Adding the shallow mapping $H(\cdot)$ alleviates the degradation problem of the deep neural network and thus effectively avoids performance reduction in the deep neural network.
Drawings
The invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 shows the shield attitude deviation prediction model of the present invention.
FIG. 2 is a flow chart of the training and testing of the shield attitude deviation prediction model of the present invention.
FIG. 3 shows the training framework of the shield attitude deviation prediction model of the present invention.
FIG. 4 is a schematic diagram of a data set structure formed in an example of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of embodiments of the present application, generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
Referring to FIGS. 1 to 3, a shield attitude position deviation prediction method includes the following steps:
Step 1: taking the shield tunneling parameters of a finished shield construction project and the characteristic information of its construction geological environment as source domain data, and training a pre-training model on the source domain data;
the method for acquiring the source domain data comprises the following steps:
acquiring the construction data, comprising shield tunneling parameters and construction geological environment data, of the finished shield construction project;
deleting the data recorded in the non-tunneling state from the finished construction data;
deleting the data of the shield starting stage from the finished construction data;
deleting missing values, duplicate records and outliers from the finished construction data;
performing one-hot encoding on the categorical features in the cleaned construction data, converting them into numerical variables; the categorical features include geological data such as round gravel, gravel sand and medium coarse sand.
Normalizing the encoded finished data with the MinMaxScale method, which scales the data to the range [0, 1], according to the formula:
$$x^* = \frac{x - \min(x)}{\max(x) - \min(x)}$$
where $x$ represents the original data value, $\min(\cdot)$ the minimum value, $\max(\cdot)$ the maximum value, and $x^*$ the normalized value.
Dividing the scaled finished data into a plurality of samples, each sample consisting of 89 input variables over 5 time intervals (2-3 minutes) and 1 output variable over 1 time interval (30 seconds); the input of each subsequent sample slides forward by 1 time interval, which serves as the time window.
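The sample division can be sketched with a sliding window. The sketch assumes that the output variable of a sample is the value recorded in the interval immediately after its 5 input intervals; that pairing, the function name `make_samples` and the toy data (2 input variables instead of 89) are illustrative assumptions.

```python
def make_samples(series, targets, n_steps=5):
    """Split a time series into (input window, output value) samples.

    series: one input vector per time interval.
    targets: one output value per time interval.
    Each window covers n_steps consecutive intervals and slides forward
    by one interval; the output is taken from the interval that follows
    the window (an assumed pairing).
    """
    samples = []
    for start in range(len(series) - n_steps):
        window = series[start:start + n_steps]
        samples.append((window, targets[start + n_steps]))
    return samples

# Hypothetical data: 8 intervals, 2 input variables per interval.
series = [[t, 10 * t] for t in range(8)]
targets = [100 + t for t in range(8)]
samples = make_samples(series, targets)
# 8 intervals with a window of 5 give 3 samples
```

With 30-second intervals, each window of 5 intervals spans 2.5 minutes of construction data.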
The steps for training the pre-training model on the source domain data are as follows:
Step A: obtaining a predicted value through forward propagation of the hybrid neural network;
Step B: calculating the prediction loss on the source domain data;
Step C: updating the parameters of the pre-training model by backpropagation, and optimizing the prediction effect of the pre-training model until the preset number of iterations is reached.
Step 2: using transfer learning to migrate the feature extraction layer of the pre-training model, and stacking two new fully connected layers behind the feature extraction layer to form a shield attitude deviation prediction model. The feature extraction layer contains the knowledge obtained by training the pre-training model on the source domain data; the feature extraction layer comprises two CNN layers, five LSTM layers, one fully connected layer and the skip connection of a residual network.
Step 3: taking the shield tunneling parameters of the current shield construction project and the characteristic information of its construction geological environment as target data, and training the shield attitude deviation prediction model on the target data.
The method for acquiring the target data comprises the following steps:
acquiring the construction data, comprising shield tunneling parameters and construction geological environment data, of the current shield construction project;
deleting the data recorded in the non-tunneling state from the current construction data;
deleting the data of the shield starting stage from the current construction data;
deleting missing values, duplicate records and outliers from the current construction data;
performing one-hot encoding on the categorical features in the cleaned construction data, converting them into numerical variables; the categorical features include geological data such as round gravel, gravel sand and medium coarse sand.
Normalizing the encoded current data with the MinMaxScale method, which scales the data to the range [0, 1], according to the formula:
$$x^* = \frac{x - \min(x)}{\max(x) - \min(x)}$$
where $x$ represents the original data value, $\min(\cdot)$ the minimum value, $\max(\cdot)$ the maximum value, and $x^*$ the normalized value.
Dividing the scaled current data into a plurality of samples, each sample consisting of 89 input variables over 5 time intervals (2-3 minutes) and 1 output variable over 1 time interval; the input of each subsequent sample slides forward by 1 time interval (30 seconds), which serves as the time window.
The steps for training the shield attitude deviation prediction model on the target data are as follows:
Step A: obtaining a predicted value through forward propagation of the shield attitude deviation prediction model;
Step B: calculating the prediction loss on the target data;
Step C: updating the model by backpropagation, and optimizing the prediction effect of the shield attitude deviation prediction model until the preset number of iterations is reached.
The pre-training model and the shield attitude deviation prediction model each comprise a plurality of CNN layers, a plurality of LSTM layers, a residual network and a plurality of fully connected layers;
the CNN layers filter noise and perform preliminary feature extraction: the CNN layers build a plurality of filters that remove noise from the construction data and convert the long input sequence into a shorter sequence of high-level features, which the subsequent LSTM layers can process more effectively.
The LSTM layers extract time-varying features: because shield construction data have an obvious temporal order, prediction of the shield tunneling attitude position deviation can be regarded as a time series prediction problem. The LSTM layers, which are designed to carry information across many time steps, can therefore extract the time-varying features of the data.
The residual network safeguards the performance of the deep neural network;
the fully connected layers generate the predicted variable: they integrate the feature information extracted by the CNN layers and the LSTM layers to predict the shield tunneling attitude position at a future moment, thereby realizing the nonlinear mapping from input to output.
The CNN layer consists of a convolution layer and an activation layer; the CNN layer extracts features through the convolution layer and the activation layer, inputs the extracted features into the LSTM layer, and finally outputs a regression result through the full connection layer;
the convolutional layer firstly performs convolution operation on the input of the current convolutional layer, and then uses a nonlinear activation function to construct the output of the convolutional network layer, and the formula is as follows:
$a_k = f(w_k x_{k-1} + b_k)$;
in the formula: $k$ denotes the $k$-th layer; $x_{k-1}$ is the input to the current convolutional layer; $w_k$ is the weight; $b_k$ is the bias; $f(\cdot)$ represents the activation function; $a_k$ is the output of the $k$-th layer.
A rectified linear unit (ReLU) is adopted so that the output values of some neurons are zero; the calculation formula of the rectified linear unit is as follows:
$a_k = f(y_k) = \max\{0, y_k\}$;
in the formula: $y_k$ represents the output value of the convolution operation; $a_k$ is the activation value of $y_k$; $f(\cdot)$ represents the activation function; $\max\{\cdot\}$ means taking the maximum value.
The invention adopts the rectified linear unit to set the output values of some neurons to zero, thereby improving network sparsity and alleviating overfitting.
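The convolution and activation formulas above can be illustrated with a minimal NumPy sketch. It assumes a one-dimensional convolution along the time axis with no padding; the function names, the array shapes and the `stride` parameter are illustrative choices and are not taken from the patent.

```python
import numpy as np

def relu(y):
    """Rectified linear unit: a_k = max{0, y_k}, applied element-wise."""
    return np.maximum(0.0, y)

def conv1d_layer(x_prev, w, b, stride=1):
    """One convolutional layer a_k = f(w_k x_{k-1} + b_k) along the time axis.

    x_prev: (time_steps, in_channels) input x_{k-1}
    w:      (kernel_size, in_channels, out_channels) filter weights w_k
    b:      (out_channels,) bias b_k
    """
    kernel_size, _, out_channels = w.shape
    out_len = (x_prev.shape[0] - kernel_size) // stride + 1
    y = np.empty((out_len, out_channels))
    for t in range(out_len):
        # Slice one window of the input and contract it with every filter.
        window = x_prev[t * stride : t * stride + kernel_size]
        y[t] = np.tensordot(window, w, axes=([0, 1], [0, 1])) + b
    return relu(y)

# Hypothetical example: 5 time steps, 2 input channels, 3 filters of width 2.
x = np.ones((5, 2))
w = np.ones((2, 2, 3))
b = np.array([0.0, -5.0, 1.0])
a = conv1d_layer(x, w, b)   # shape (4, 3); the second filter is zeroed by ReLU
```

The output sequence is shorter than the input (4 steps instead of 5), which mirrors the statement that the CNN layer converts a long sequence into a shorter sequence of higher-level features.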
Preferably, the gates in the LSTM layer include an input gate, a forget gate and an output gate, and the calculation formulas are as follows:
$f_t = \sigma(w_f \cdot [h_{t-1}, x_t] + b_f)$;
$i_t = \sigma(w_i \cdot [h_{t-1}, x_t] + b_i)$;
$o_t = \sigma(w_o \cdot [h_{t-1}, x_t] + b_o)$;
$h_t = o_t \odot \tanh(c_t)$;
in the formula, $h_{t-1}$ is the output of the previous LSTM unit (time step $t-1$), $x_t$ is the input to the current LSTM unit, $\sigma(\cdot)$ is the Sigmoid activation function, $\tanh(\cdot)$ is the tanh activation function, $w_f$, $w_i$, $w_o$, $w_c$ are the network weights, $b_f$, $b_i$, $b_o$, $b_c$ are the biases, $c_{t-1}$ is the memory cell of the previous LSTM unit, $c_t$ is the memory cell of the current LSTM unit, $f_t$ is the output value of the forget gate, $i_t$ is the output value of the input gate, $o_t$ is the output value of the output gate, $\tilde{c}_t$ is the temporary memory cell of the current LSTM unit, and $h_t$ is the output of the current LSTM unit.
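The gate formulas can be sketched as a single LSTM time step in NumPy. The text lists a temporary memory cell and the parameters w_c and b_c but does not reproduce the equations that use them, so the two cell-state lines below follow the standard LSTM formulation and should be read as an assumption. The function name `lstm_step` and the `params` dictionary are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step using the gate formulas of the description.

    params maps "w_f", "w_i", "w_o", "w_c" to (hidden, hidden + input) matrices
    and "b_f", "b_i", "b_o", "b_c" to (hidden,) vectors.
    """
    z = np.concatenate([h_prev, x_t])                   # [h_{t-1}, x_t]
    f_t = sigmoid(params["w_f"] @ z + params["b_f"])    # forget gate
    i_t = sigmoid(params["w_i"] @ z + params["b_i"])    # input gate
    o_t = sigmoid(params["w_o"] @ z + params["b_o"])    # output gate
    # Assumed standard form (not spelled out in the text):
    c_tilde = np.tanh(params["w_c"] @ z + params["b_c"])  # temporary memory cell
    c_t = f_t * c_prev + i_t * c_tilde                    # updated memory cell
    h_t = o_t * np.tanh(c_t)                              # h_t = o_t ⊙ tanh(c_t)
    return h_t, c_t

# Hypothetical sizes: 2 input features, 3 hidden units, all parameters zero.
hidden, n_in = 3, 2
params = {}
for gate in ("f", "i", "o", "c"):
    params["w_" + gate] = np.zeros((hidden, hidden + n_in))
    params["b_" + gate] = np.zeros(hidden)
h_t, c_t = lstm_step(np.ones(n_in), np.zeros(hidden), np.ones(hidden), params)
```

With all parameters at zero every gate outputs 0.5, so the memory cell is simply halved; this makes the role of the forget gate easy to see in isolation.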
In the pre-training model and the attitude deviation prediction model, a long short-term memory network (LSTM) is adopted as the main time-series feature extractor for predicting the attitude position deviation of the shield tunneling machine. The LSTM is a special RNN structure designed to solve the vanishing or exploding gradient problem that long-term dependence causes in recurrent neural networks. The LSTM introduces a "cell state" that preserves meaningful information, and adds or discards information through "gates". During weight correction, errors either pass through a gate or are forgotten by it, which effectively alleviates the long-term dependence problem.
The residual network directly transfers shallow-layer features to the deep layers by adding skip connections; therefore, the gradient of the loss function is calculated in the residual network by adopting the following formula:
in the formula: $H(X_L)$, denoted $H(\cdot)$, is an identity or convolution function; $F_L(X_L, W_L, b_L)$, denoted $F(\cdot)$, describes the nonlinear function of the network; since the gradient calculation adds an identity term, the gradient can be propagated backwards effectively.
The present invention combines $H(\cdot)$ and $F(\cdot)$ through skip connections. With the mapping $H(\cdot)$ of the shallow network added, the degradation problem of the deep neural network is alleviated, which effectively avoids the performance decline of deep neural networks.
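The gradient formula itself is not reproduced in this text. As a hedged sketch only, assuming the usual residual block in which the block output is the sum of the skip branch and the nonlinear branch, and writing $\mathcal{L}$ for the loss (a symbol introduced here for illustration), the chain rule gives:

```latex
X_{L+1} = H(X_L) + F_L(X_L, W_L, b_L)

\frac{\partial \mathcal{L}}{\partial X_L}
  = \frac{\partial \mathcal{L}}{\partial X_{L+1}}
    \left( \frac{\partial H(X_L)}{\partial X_L}
         + \frac{\partial F_L(X_L, W_L, b_L)}{\partial X_L} \right)
  = \frac{\partial \mathcal{L}}{\partial X_{L+1}}
    \left( 1 + \frac{\partial F_L(X_L, W_L, b_L)}{\partial X_L} \right)
  \quad \text{when } H \text{ is the identity.}
```

The constant term 1 is the identity term mentioned above: even if the derivative of the nonlinear branch becomes very small, the gradient from layer $L+1$ still reaches layer $L$ unattenuated.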
The invention takes the parameters of a finished shield construction project as source domain data and trains the pre-training model on them; the relevant parameters of the pre-training model are then extracted to form the shield attitude deviation prediction model, the parameters of the current shield construction project are taken as target data, and the shield attitude deviation prediction model is trained on the target data to obtain a prediction of the shield tunneling deviation. In addition, because the parameters of the finished shield construction project serve as source domain data, sufficient training data are available in the initial stage of shield construction, which safeguards the prediction effect of the shield attitude deviation prediction model and avoids the inaccurate predictions that result from training on only a small amount of data in the initial stage of shield construction.
Referring to fig. 4, the implementation of the shield attitude deviation prediction model in the shield tunneling process is further described as follows:
Pre-processing of data
The geological data surveyed on site, the tunnel geometric characteristic data, the construction data recorded in real time by the shield machine acquisition system, and the VMT guidance data of the current construction project are integrated as the current construction data set. Construction data of finished shield projects with similar geological and construction environments are selected to form the completed data set. The current construction data set and the completed data set are hereinafter collectively called the data set;
(a) Deleting the data of the non-tunneling state in the data set. Although the attitude and position of the shield may change while it is stopped, under the influence of its self-weight and the soil environment, this is outside the scope of the invention. The invention aims to predict the movement track of the shield tunneling machine, so only data of the tunneling state are retained in the data set.
(b) Deleting the data of the shield starting stage. Because the first 100 rings of the shield belong to the starting stage of excavation and their data have little reference value, the data of the shield starting stage are deleted.
(c) Deleting missing value and repeated value records in the data, and deleting data outliers.
(d) Categorical features, such as the stratum, are one-hot encoded and converted into numerical variables.
(e) Normalizing the data set with the MinMaxScale method, scaling the data to [0,1] according to the formula:
$x^{*} = \dfrac{x - \min(x)}{\max(x) - \min(x)}$;
where $x$ represents the original value of the data, $\min(\cdot)$ represents the minimum value, $\max(\cdot)$ represents the maximum value, and $x^{*}$ represents the normalized value.
(f) Dividing the scaled finished data into a plurality of samples, each sample consisting of 89 input variables over 5 time intervals (2-3 minutes) and the output variables of 1 time interval; the input of the next sample slides forward by 1 time interval, which serves as the time window.
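Steps (d) to (f) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the output interval is assumed to be the one immediately following the 5 input intervals, and the toy arrays stand in for the 89 real input variables.

```python
import numpy as np

def one_hot(labels):
    """Step (d): one-hot encode a list of category labels (e.g. stratum names)."""
    categories = sorted(set(labels))
    index = {c: j for j, c in enumerate(categories)}
    encoded = np.zeros((len(labels), len(categories)))
    for row, label in enumerate(labels):
        encoded[row, index[label]] = 1.0
    return encoded, categories

def min_max_scale(data):
    """Step (e): scale each column to [0, 1] with x* = (x - min) / (max - min)."""
    lo = data.min(axis=0)
    hi = data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (data - lo) / span

def make_windows(features, targets, n_in=5, n_out=1):
    """Step (f): slide a window forward by one time interval per sample.

    Each sample takes n_in consecutive rows as input and, by assumption,
    the next n_out rows of the targets as output.
    """
    xs, ys = [], []
    for start in range(len(features) - n_in - n_out + 1):
        xs.append(features[start : start + n_in])
        ys.append(targets[start + n_in : start + n_in + n_out])
    return np.array(xs), np.array(ys)

# Toy data: 10 time intervals, 2 numeric variables, 1 target variable.
raw = np.arange(20.0).reshape(10, 2)
target = np.arange(10.0).reshape(10, 1)
scaled = min_max_scale(raw)
X, Y = make_windows(scaled, target)
encoded, categories = one_hot(["clay", "sand", "clay"])
```

With 10 intervals, a 5-interval input and a 1-interval output, the window can start at 5 positions, so `X` has shape (5, 5, 2) and `Y` has shape (5, 1, 1).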
Training of models
Firstly, training a pre-training model on source domain data, and then training a shield attitude deviation prediction model formed after migration on target domain data.
In the pre-training process, MAE and RMSProp are selected as the loss function and the optimizer of model training, and the back propagation algorithm is applied to update the model parameters. The initial learning rate of the pre-training model is set to 0.005 and is then decreased by a factor of 10 every 40 training cycles.
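The learning-rate schedule and the optimizer update can be sketched as below. The step-decay reading of "decreased by a factor of 10 every 40 training cycles", the zero-based epoch index, and the RMSProp constants `rho` and `eps` are assumptions; the description gives only the initial rate, the factor and the period.

```python
import numpy as np

def pretrain_learning_rate(epoch, initial_lr=0.005, drop_factor=10.0, epochs_per_drop=40):
    """Step decay: divide the rate by drop_factor after every epochs_per_drop epochs.

    epoch is zero-based, so epochs 0-39 use 0.005 and epochs 40-79 use 0.0005.
    """
    return initial_lr / (drop_factor ** (epoch // epochs_per_drop))

def rmsprop_step(w, grad, cache, lr, rho=0.9, eps=1e-8):
    """One RMSProp update; rho and eps are assumed defaults, not patent values."""
    cache = rho * cache + (1.0 - rho) * grad ** 2   # running mean of squared gradients
    w = w - lr * grad / (np.sqrt(cache) + eps)      # scaled gradient step
    return w, cache

# Hypothetical single-parameter update at the first epoch.
w, cache = rmsprop_step(np.array([1.0]), np.array([2.0]), np.zeros(1),
                        lr=pretrain_learning_rate(0))
```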
After the pre-training process is finished, firstly, the feature extraction layer of the pre-training model is migrated, and then two new full connection layers are superposed.
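The migration step can be sketched with a toy model represented as a list of per-layer parameter dictionaries. This representation, the function name `build_retraining_model` and the layer shapes are hypothetical; the count of 8 migrated layers and the zero initialisation of the two new fully connected layers follow the figures given in this description.

```python
import copy
import numpy as np

def build_retraining_model(pretrained_layers, n_feature_layers, fc_shapes):
    """Copy the feature extraction layers and append new fully connected layers.

    pretrained_layers: list of {"w": ndarray, "b": ndarray}, one dict per layer
    n_feature_layers:  number of leading layers to migrate
    fc_shapes:         list of (n_in, n_out) shapes for the new layers
    """
    # Deep copies keep the pre-training model untouched during retraining.
    model = [copy.deepcopy(layer) for layer in pretrained_layers[:n_feature_layers]]
    for n_in, n_out in fc_shapes:
        # New layers start from zero weights and biases.
        model.append({"w": np.zeros((n_in, n_out)), "b": np.zeros(n_out)})
    return model

# Hypothetical pre-trained model with 10 layers of toy 2x2 parameters.
pretrained = [{"w": np.ones((2, 2)), "b": np.ones(2)} for _ in range(10)]
model = build_retraining_model(pretrained, n_feature_layers=8, fc_shapes=[(2, 4), (4, 1)])
```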
In the model retraining process (i.e. the training process of the shield attitude deviation prediction model), the weights and biases of the first 8 layers of the newly constructed model are set to the values migrated from the corresponding layers of the pre-training model. For the last two fully connected layers, the weights and biases are initialized to 0. The retraining process uses a learning rate of 0.001 (80% lower than the original learning rate of 0.005), and the number of iterations of the retraining process is set to 80. The performance of the proposed method is evaluated using the mean absolute error (MAE), according to the formula:
$\mathrm{MAE} = \dfrac{1}{n}\sum_{i=1}^{n} \left| y_i - p_i \right|$;
in the formula: $n$ is the number of data set samples; $y_i$ is the true value of the sample; $p_i$ is the value predicted by the shield attitude deviation prediction model; MAE is the mean absolute error.
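The evaluation metric can be computed as in the following short sketch; the function name and the sample values are illustrative.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """MAE = (1/n) * sum(|y_i - p_i|) over the n samples."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Hypothetical deviations: absolute errors are 1, 0 and 2.
mae = mean_absolute_error([1.0, 2.0, 3.0], [2.0, 2.0, 1.0])
```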
The above embodiments only express specific implementations of the present application; their description is specific and detailed but should not be construed as limiting the scope of the present application. It should be noted that those skilled in the art can make several changes and modifications without departing from the technical idea of the present application, all of which fall within the protection scope of the present application.
Claims (10)
1. A shield attitude position deviation prediction method is characterized by comprising the following steps:
step 1: adopting shield tunneling parameters of a finished shield construction project and characteristic information of a construction geological environment as source domain data, and training on a pre-training model by using the source domain data;
step 2: using transfer learning to migrate the feature extraction layer of the pre-training model, superposing two new fully connected layers behind the feature extraction layer to form a shield attitude deviation prediction model, and creating in a storage medium a structural space suitable for storing the shield attitude deviation prediction model;
step 3: taking the shield tunneling parameters of the current shield construction project and the characteristic information of the construction geological environment as target data, reading the shield attitude deviation prediction model from the structural space, and training the shield attitude deviation prediction model with the target data.
2. The method for predicting the position deviation of the shield attitude according to claim 1, wherein the training of the source domain data in the pre-trained model comprises the following steps:
step A: obtaining a predicted value through forward propagation of a hybrid neural network;
step B: calculating the prediction loss on the source domain data;
step C: updating the parameters of the pre-training model by back propagation and optimizing the prediction effect of the pre-training model until a preset number of iterations is reached, wherein the number of iterations is counted and stored by a first counter.
3. The method for predicting the position deviation of the shield attitude according to claim 1, wherein the step of training the target data in the shield attitude deviation prediction model is as follows:
step A: obtaining a predicted value through forward propagation of the shield attitude deviation prediction model;
step B: calculating the prediction loss on the target data;
step C: updating the model by back propagation and optimizing the prediction effect of the shield attitude deviation prediction model until a preset number of iterations is reached, wherein the number of iterations is counted and stored by a second counter.
4. The method of claim 1, wherein the pre-trained model and the shield attitude deviation prediction model each comprise a plurality of CNN layers, a plurality of LSTM layers, a residual network, and a plurality of fully connected layers;
the CNN layer is used for filtering noise and preliminarily extracting features; the LSTM layer is used for extracting time-varying features; the residual network is used for safeguarding the performance of the deep neural network; and the fully connected layer is used for generating the prediction variable, predicting the shield tunneling attitude position at a future moment by integrating the feature information extracted by the CNN layer and the LSTM layer.
5. The method according to claim 4, wherein the CNN layer is composed of a convolutional layer and an activation layer; the CNN layer extracts features through the convolutional layer and the activation layer, inputs the extracted features into the LSTM layer, and finally outputs a regression result through the fully connected layer;
the convolutional layer firstly performs convolution operation on the input of the current convolutional layer, and then uses a nonlinear activation function to construct the output of the convolutional network layer, and the formula is as follows:
$a_k = f(w_k x_{k-1} + b_k)$;
in the formula: $k$ denotes the $k$-th layer; $x_{k-1}$ is the input to the current convolutional layer; $w_k$ is the weight; $b_k$ is the bias; $f(\cdot)$ represents the activation function; $a_k$ is the output of the $k$-th layer.
6. The method for predicting the position deviation of the shield attitude according to claim 5, wherein a rectified linear unit is adopted to make the output values of some neurons zero, and the calculation formula of the rectified linear unit is as follows:
$a_k = f(y_k) = \max\{0, y_k\}$;
in the formula: $y_k$ represents the output value of the convolution operation; $a_k$ is the activation value of $y_k$; $f(\cdot)$ represents the activation function; $\max\{\cdot\}$ means taking the maximum value.
7. The method of claim 4, wherein the gates in the LSTM layer include an input gate, a forget gate and an output gate, and the calculation formulas are as follows:
$f_t = \sigma(w_f \cdot [h_{t-1}, x_t] + b_f)$;
$i_t = \sigma(w_i \cdot [h_{t-1}, x_t] + b_i)$;
$o_t = \sigma(w_o \cdot [h_{t-1}, x_t] + b_o)$;
$h_t = o_t \odot \tanh(c_t)$;
in the formula, $h_{t-1}$ is the output of the previous LSTM unit (time step $t-1$), $x_t$ is the input to the current LSTM unit, $\sigma(\cdot)$ is the Sigmoid activation function, $\tanh(\cdot)$ is the tanh activation function, $w_f$, $w_i$, $w_o$, $w_c$ are the network weights, $b_f$, $b_i$, $b_o$, $b_c$ are the biases, $c_{t-1}$ is the memory cell of the previous LSTM unit, $c_t$ is the memory cell of the current LSTM unit, $f_t$ is the output value of the forget gate, $i_t$ is the output value of the input gate, $o_t$ is the output value of the output gate, $\tilde{c}_t$ is the temporary memory cell of the current LSTM unit, and $h_t$ is the output of the current LSTM unit.
8. The method for predicting the deviation of the shield attitude position according to claim 4, wherein the residual network directly transfers shallow-layer features to the deep layers by adding skip connections; therefore, the gradient of the loss function is calculated in the residual network by adopting the following formula:
in the formula: $H(X_L)$, denoted $H(\cdot)$, is an identity or convolution function; $F_L(X_L, W_L, b_L)$, denoted $F(\cdot)$, describes the nonlinear function of the network.
9. The method for predicting the position deviation of the shield attitude according to any one of claims 1 to 8, wherein the method for acquiring the source domain data and the target data is as follows:
respectively acquiring shield tunneling parameters and construction data of a construction geological environment of a current shield construction project and a finished shield construction project;
respectively deleting the data of the non-tunneling state in the current and finished construction data;
respectively deleting data of a shield starting stage in the current and finished construction data;
deleting missing values, repeated value records and data outliers in the current and finished construction data respectively;
respectively carrying out one-hot coding on the classification features in the current and finished construction data after the data are deleted, and converting the classification features into numerical variables;
respectively normalizing the encoded current and finished data with the MinMaxScale method, scaling the data to [0,1] according to the formula:
$x^{*} = \dfrac{x - \min(x)}{\max(x) - \min(x)}$;
where $x$ represents the original value of the data, $\min(\cdot)$ represents the minimum value, $\max(\cdot)$ represents the maximum value, and $x^{*}$ represents the normalized value;
dividing the scaled current and finished data into a plurality of samples respectively, wherein each sample consists of 89 input variables over 5 time intervals and 1 output variable of 1 time interval; the input of the next sample slides forward by 1 time interval, which serves as the time window.
10. The method of claim 9, wherein one of the time intervals is 30 seconds.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111386994.1A CN114117599B (en) | 2021-11-22 | 2021-11-22 | Method for predicting deviation of attitude and position of shield |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114117599A true CN114117599A (en) | 2022-03-01 |
CN114117599B CN114117599B (en) | 2024-08-13 |
Family
ID=80439317
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111386994.1A Active CN114117599B (en) | 2021-11-22 | 2021-11-22 | Method for predicting deviation of attitude and position of shield |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114117599B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN114117599B (en) | 2024-08-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |