CN114492174A - Full life cycle shield tunneling parameter prediction method and device - Google Patents

Full life cycle shield tunneling parameter prediction method and device

Info

Publication number
CN114492174A
CN114492174A (application CN202210017848.XA)
Authority
CN
China
Prior art keywords
data
tunneling
parameters
geological
full
Prior art date
Legal status
Pending
Application number
CN202210017848.XA
Other languages
Chinese (zh)
Inventor
宁焕生
高大智
李荣洋
徐阳
毛凌锋
李莎
Current Assignee
University of Science and Technology Beijing USTB
Original Assignee
University of Science and Technology Beijing USTB
Priority date
Filing date
Publication date
Application filed by University of Science and Technology Beijing USTB filed Critical University of Science and Technology Beijing USTB
Priority to CN202210017848.XA
Publication of CN114492174A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 Computer-aided design [CAD]
    • G06F 30/20 Design optimisation, verification or simulation
    • G06F 30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a full life cycle shield tunneling parameter prediction method and device. The method comprises: acquiring the historical data and the prediction step number of a tunneling task, preprocessing the historical data, and constructing a sample data set from the preprocessed historical data, wherein the historical data comprise geological parameters and tunneling parameters; training an attention-based deep convolutional network with the sample data set, the input of the network being the geological parameters and the historical tunneling parameters, the output being the future tunneling parameters, and the output dimension being the prediction step number; and predicting the tunneling parameters to be predicted with the trained deep convolutional network to obtain a prediction result. The invention provides a standardized preprocessing procedure, realizes dynamic allocation of input-feature weights with an attention-based deep convolutional network, fully considers ease of use across different machines, accounts for the relation between geological information and tunneling parameters, and enables multi-step prediction.

Description

Full life cycle shield tunneling parameter prediction method and device
Technical Field
The invention relates to the technical field of intelligent shield tunneling control, and in particular to a method and a device for predicting full-life-cycle shield tunneling parameters.
Background
The shield method is currently a frequently used, fully mechanized construction method for tunnel excavation; it offers high construction safety, has little effect on the ground surface, is largely unaffected by weather, and causes little disturbance to buildings and underground pipelines. During shield construction an operator adjusts a series of tunneling parameters, but in current practice operators mostly rely on subjective experience: the accuracy of the parameter adjustments depends on the current state, historical experience, personal judgment and other factors, so the construction quality cannot be guaranteed. At the same time, training a qualified operator carries a significant time cost. An intelligent tunneling parameter adjustment method is therefore urgently needed to improve the quality of shield construction.
Existing intelligent tunneling parameter adjustment methods fall into three categories: 1. Methods based on traditional machine learning algorithms, such as the least-squares support vector machine, adaptive neuro-fuzzy inference, and random forests with Bayesian optimization. 2. Methods based on traditional neural networks and deep network algorithms, for example the BP neural network and its optimization algorithms (such as particle swarm optimization), which dynamically adjust hyper-parameters such as the number of neurons. 3. Methods based on time-series networks, most commonly algorithms built on the long short-term memory (LSTM) network and its optimized variants; algorithms based on the convolutional neural network (CNN) are relatively close to these, and the optimization mainly adjusts the model hyper-parameters, similar to automatic parameter tuning.
For category 1, owing to algorithmic limitations, traditional machine learning methods show poor applicability and reliability when facing data sets with so many features; they only achieve good results on individual problems, each new problem must be analyzed separately, and their practicality is therefore limited.
For category 2, conventional neural networks such as the error back-propagation (BP) algorithm easily fall into local minima and depend heavily on the initial network parameters. Although some optimization algorithms alleviate these problems, they cannot fundamentally remove the defects, and the accompanying issues of overfitting and slow convergence also become bottlenecks in algorithm development. Deep learning methods such as CNN can solve some of the above problems, but they require experts to carefully select features and run many experiments, and they cannot reveal how strongly different features influence the output, which is one of the main reasons the applicability of such models is limited. Moreover, the models of categories 1 and 2 only verify existing data and offer no prediction of the future, so their practical value is small.
For category 3, although methods based on time-series networks do predict the future, they focus excessively on the temporal sequence of the tunneling parameters, which places extremely high demands on data set accuracy; by concentrating on parameter trends, they also ignore the interactions among internal system parameters and the changes introduced by external geological conditions. Simply fitting future parameter changes from the shield machine's own history is not comprehensive for a machine operating in complex geology. Meanwhile, the larger number of parameters in time-series models increases the computational burden.
In addition, existing intelligent tunneling parameter adjustment methods give no standard, unified description of the preprocessing procedure, even though preprocessing is closely tied to the final output quality, and they impose strict restrictions on the input features, which limits their convenience for shield machines of different models. Meanwhile, the prior art does not fully consider the relation between the geological information and each sub-device of the shield machine and, apart from time-series models, does not consider multi-step prediction of the future but merely verifies existing data.
Disclosure of Invention
The invention provides a method and a device for predicting full-life-cycle shield tunneling parameters, aiming to solve, at least to some extent, the technical problems of the prior art described in the background.
In order to solve the technical problems, the invention provides the following technical scheme:
in one aspect, the invention provides a full life cycle shield tunneling parameter prediction method, which comprises the following steps:
acquiring historical data and predicted step number of a tunneling task, preprocessing the historical data, and constructing a sample data set by the preprocessed historical data; wherein the historical data comprises geological parameters and tunneling parameters;
training an attention-based deep convolutional network with the sample data set; the input of the deep convolutional network is the geological parameters and the historical tunneling parameters, the output is the future tunneling parameters, and the output dimension is the prediction step number;
and predicting the tunneling parameters to be predicted by using the trained deep convolution network to obtain a prediction result.
Further, acquiring the historical data and the prediction step number of the tunneling task, preprocessing the historical data, and constructing a sample data set from the preprocessed historical data comprises:
acquiring the relevant geological parameters of the tunneled stratum, the actual tunneling parameters of the tunneling task, and the prediction step number;
labeling the geological parameters, and putting the labeling results into a database;
matching the labeled geological parameters with the tunneling parameters according to mileage and ring number to generate geological-tunneling data;
discretizing the geological parameters;
deleting non-excavation segment data and irrelevant variables from the discretized geological-tunneling data, and removing redundant data;
completing missing data in the geological-tunneling data from which the redundant data has been removed;
performing data normalization on the geological-tunneling data after missing-data completion;
and dividing the normalized geological-tunneling data into a training set, a verification set and a test set according to a preset proportion.
Further, labeling the geological parameters and putting the labeling results into a database comprises:
classifying the geological parameters into text data and table data;
labeling the text data with a BERT model: a BERT model is trained on historical training texts selected according to the geological parameters to be labeled, the text to be labeled is input into the trained BERT model to obtain the labeling result, and the labeling result is put into the database; and extracting the data in tables from the table data with a Key-Value method, labeling the extracted data, and putting the labeling results into the database.
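The patent text itself contains no code, so the following Python sketch only illustrates how this labeling step could be organized; the model path `bert-geology-labeler`, the table columns and the database schema are hypothetical placeholders rather than details from the original.

```python
import sqlite3
import pandas as pd
from transformers import pipeline

# Text data: label geological descriptions with a BERT classifier assumed to be
# fine-tuned on historical training texts (the model path is a placeholder).
labeler = pipeline("text-classification", model="bert-geology-labeler")

def label_texts(descriptions):
    # Returns (description, predicted label) pairs.
    return [(d, labeler(d)[0]["label"]) for d in descriptions]

# Table data: extract key-value pairs directly (assumed "key"/"value" columns).
def extract_table(path):
    df = pd.read_csv(path)
    return list(zip(df["key"].astype(str), df["value"].astype(str)))

# Put labeling results into a database.
def store_labels(records, db="geology.db"):
    con = sqlite3.connect(db)
    con.execute("CREATE TABLE IF NOT EXISTS labels (item TEXT, label TEXT)")
    con.executemany("INSERT INTO labels VALUES (?, ?)", records)
    con.commit()
    con.close()
```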
Further, completing missing data in the geological-tunneling data from which the redundant data has been removed comprises:
taking the mean of the two data values before and the two data values after the missing entry as the value of the current missing entry.
Further, data normalization is performed on the geological-tunneling data after missing-data completion, using the formula:
X' = (X - X_mean) / X_std
wherein X' represents the normalization result, X represents the original sequence to be normalized, X_mean represents the mean of the original sequence, and X_std represents the standard deviation of the original sequence.
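As a minimal illustration of the two preprocessing rules above (mean completion of missing values and z-score normalization), the following NumPy sketch assumes that missing entries are encoded as NaN; the sample values are made up.

```python
import numpy as np

def fill_missing(seq):
    """Replace each missing entry (NaN) with the mean of the two values before
    and the two values after it, as described above."""
    seq = np.asarray(seq, dtype=float).copy()
    for i in np.where(np.isnan(seq))[0]:
        neighbors = np.concatenate([seq[max(i - 2, 0):i], seq[i + 1:i + 3]])
        neighbors = neighbors[~np.isnan(neighbors)]
        seq[i] = neighbors.mean() if neighbors.size else 0.0
    return seq

def zscore(x):
    """X' = (X - X_mean) / X_std."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

raw = [820.0, 835.0, np.nan, 850.0, 861.0]   # e.g. thrust readings with one gap
print(zscore(fill_missing(raw)))
```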
Further, the deep convolutional network comprises an input layer, a one-dimensional convolutional layer, an attention layer and a fully connected layer;
training a deep convolutional network based on an attention mechanism by using the sample data set, wherein the training comprises the following steps:
randomly initializing the one-dimensional convolutional layer and the attention layer, and putting the training set as input into the one-dimensional convolutional layer for the convolution calculation, with the formula:
y_i = f(W * x_{i:i+k} + b)
wherein y_i denotes the convolution result, x_i represents the input features, k represents the step size of each convolution, W represents the convolution kernel, b represents the bias term, and f(·) represents the ReLU activation function;
the convolution output Y = {y_1, y_2, …, y_i} is input into the attention layer to calculate the weights of the different features, with the formulas:
α_i = softmax(e(y_i, q_j))
C_i = Σ_{i=1}^{N} α_i · y_i
wherein e(y_i, q_j) is the attention scoring function between the convolution output y_i and the current input context q_j; α_i is the attention distribution, with values in [0, 1]; C_i is the importance value of the feature; and N represents the length of the input sequence;
and then converting the output into a one-dimensional vector, inputting the vector into the fully connected layer, and outputting the final prediction result.
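A minimal PyTorch sketch of this pipeline (one-dimensional convolution, attention weighting of the convolved features, flattening, fully connected output) is given below; the channel count, kernel width and the additive scoring layer are illustrative assumptions, since the patent does not fix these values, and keeping the weighted features per position before flattening is one reasonable reading of the flatten step.

```python
import torch
import torch.nn as nn

class AttentionCNN(nn.Module):
    """Sketch of the attention-based deep convolutional network:
    Conv1d -> attention over convolved features -> flatten -> fully connected."""
    def __init__(self, n_features, seq_len, pred_steps, channels=32, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(n_features, channels, kernel_size=kernel, padding=kernel // 2)
        self.score = nn.Linear(channels, 1)              # assumed form of e(y_i, q_j)
        self.fc = nn.Linear(channels * seq_len, pred_steps)

    def forward(self, x):                                # x: (batch, n_features, seq_len)
        y = torch.relu(self.conv(x))                     # y_i = f(W * x_{i:i+k} + b)
        y = y.transpose(1, 2)                            # (batch, seq_len, channels)
        alpha = torch.softmax(self.score(y), dim=1)      # attention distribution in [0, 1]
        weighted = alpha * y                             # importance-weighted features
        return self.fc(weighted.flatten(1))              # flatten -> fully connected layer

# geological + historical tunneling features over 50 steps -> 3 future steps (made-up sizes)
model = AttentionCNN(n_features=20, seq_len=50, pred_steps=3)
prediction = model(torch.randn(8, 20, 50))               # output shape: (8, 3)
```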
Further, the back-propagation optimization in the deep convolutional network adopts an Adam optimizer, whose parameter update is:
θ_t = θ_{t-1} - α · m̂_t / (√v̂_t + ε)
wherein α is the step size; β1 and β2 are the exponential decay rates of the moment estimates; θ_t and θ_{t-1} are the parameter vectors; m_t and v_t are the first- and second-order moment vectors (exponential moving averages of the gradient and of its square), with m̂_t and v̂_t their bias-corrected values; t is the time step; and ε is included to avoid a divisor of 0.
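In PyTorch the same optimizer can be instantiated directly; the learning rate and decay rates below are the common Adam defaults, shown only as an assumed configuration since the patent does not state hyper-parameter values, and the linear layer is a stand-in for the network above.

```python
import torch
import torch.nn as nn

model = nn.Linear(20, 3)   # placeholder for the attention-based deep convolutional network

# lr (alpha), betas (exponential decay rates of the moment estimates) and eps
# (avoids a divisor of 0) correspond to the symbols in the update rule above.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8)
```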
Further, the loss function in the deep convolutional network is the Smooth L1 function:
SmoothL1(x) = 0.5 · x², if |x| < 1; |x| - 0.5, otherwise
where x is the prediction error. The metric function adopts the coefficient of determination:
R² = 1 - Σ_{i=1}^{n} (t_i - p_i)² / Σ_{i=1}^{n} (t_i - t̄)²
wherein T = {t_1, t_2, …, t_n} is the true-value sequence, t̄ is the mean of the true-value sequence, and n is the sequence length; P = {p_1, p_2, …, p_n} is the predicted-value sequence finally output by the model, also of length n; R² denotes the coefficient of determination.
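A small PyTorch sketch of the loss and the metric follows; torch's built-in SmoothL1Loss matches the piecewise definition above, and the R² function is written out directly (the sample tensors are made-up values).

```python
import torch
import torch.nn as nn

loss_fn = nn.SmoothL1Loss()   # 0.5*x^2 if |x| < 1, else |x| - 0.5

def r2_score(t, p):
    """R^2 = 1 - sum((t_i - p_i)^2) / sum((t_i - t_mean)^2)."""
    ss_res = torch.sum((t - p) ** 2)
    ss_tot = torch.sum((t - t.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

t = torch.tensor([1.0, 2.0, 3.0, 4.0])   # true tunneling-parameter sequence (illustrative)
p = torch.tensor([1.1, 1.9, 3.2, 3.8])   # model output of the same length
print(loss_fn(p, t).item(), r2_score(t, p).item())
```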
Further, after the tunneling parameters to be predicted are predicted with the trained deep convolutional network to obtain a prediction result, the full-life-cycle shield tunneling parameter prediction method further comprises:
putting the prediction result and the corresponding geological parameters, as new samples, into the training set of the database for storage, and dynamically updating the attention-based deep convolutional network after the system has run for a period of time.
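The sketch below shows one way such a rolling update could be wired up; the SQLite schema, JSON encoding and the 100-ring update interval are assumptions for illustration only.

```python
import json
import sqlite3

def store_new_sample(geology, prediction, db="samples.db"):
    """Append a (geological parameters, predicted tunneling parameters) pair to the
    training-set table so the network can be refreshed later."""
    con = sqlite3.connect(db)
    con.execute("CREATE TABLE IF NOT EXISTS train_set (geology TEXT, target TEXT)")
    con.execute("INSERT INTO train_set VALUES (?, ?)",
                (json.dumps(geology), json.dumps(prediction)))
    con.commit()
    con.close()

def maybe_update(retrain_fn, ring_count, update_every=100):
    """Dynamically update the network after a chosen number of rings have been stored."""
    if ring_count > 0 and ring_count % update_every == 0:
        retrain_fn()   # re-run training on the enlarged training set

store_new_sample({"ring": 101, "soil": "silty clay"}, [11200.0, 11350.0, 11400.0])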
On the other hand, the invention also provides a full life cycle shield tunneling parameter prediction device, which comprises:
the data preprocessing module is used for acquiring historical data and the predicted step number of the tunneling task, preprocessing the historical data and constructing a sample data set by the preprocessed historical data; wherein the historical data comprises geological parameters and tunneling parameters;
the model training module is used for training the attention-based deep convolutional network with the sample data set obtained by the data preprocessing module; the input of the deep convolutional network is the geological parameters and the historical tunneling parameters, the output is the future tunneling parameters, and the output dimension is the prediction step number;
and the tunneling parameter prediction module is used for predicting the tunneling parameters to be predicted by utilizing the deep convolutional network trained by the model training module to obtain a prediction result.
In yet another aspect, the present invention also provides an electronic device comprising a processor and a memory; wherein the memory has stored therein at least one instruction that is loaded and executed by the processor to implement the above-described method.
In yet another aspect, the present invention also provides a computer-readable storage medium having at least one instruction stored therein, the instruction being loaded and executed by a processor to implement the above method.
The technical solutions provided by the invention have at least the following beneficial effects:
1. The invention provides a standardized processing method for the original geological and tunneling data, which minimizes the influence of uneven raw-data quality on the training results, improves the stability of the model input, and thereby improves the stability of the prediction results.
2. The invention fully considers the relation between the geological information and each sub-device of the shield machine, maps temporal change through the change of the geological information, and covers the influencing factors in the shield operation process more comprehensively.
3. The invention provides an attention-based deep convolutional network that accounts for the differences in the tunneling parameters output by different machines and for the instability of model results when the outputs are replaced. The attention layer keeps learning, automatically adjusts the distribution of importance among the features, and assigns each feature an importance weight, making model learning more targeted. This also ensures that model accuracy does not drop substantially after the parameters are changed. The CNN layer learns the different characteristics of the data well and takes the locality of the data into account.
4. The method takes actual operating conditions into account, can achieve model training and high-precision prediction with a small amount of computing resources, and has high robustness, feasibility and stability.
5. The possibility of multi-step prediction is considered, and because the attention layer automatically learns and adjusts the feature weight distribution, the proposed model can be conveniently extended to multi-step prediction at any time, improving its practical usability.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a full-life-cycle shield tunneling parameter prediction method according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of an execution process of the full-life-cycle shield tunneling parameter prediction method in actual engineering application according to the embodiment of the present invention;
fig. 3 is a network diagram of a deep convolutional network based on an attention mechanism according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
First embodiment
The embodiment provides a full-life-cycle shield tunneling parameter prediction method, which standardizes the preprocessing process and predicts the tunneling parameters with an attention-based deep convolutional network; the principle is shown in fig. 1, and the execution flow in actual engineering applications is shown in fig. 2. The method may be implemented by an electronic device, which may be a terminal or a server. Specifically, the execution flow of the method comprises the following steps:
s1, acquiring historical data and predicted step number of a tunneling task, preprocessing the historical data, and constructing a sample data set by the preprocessed historical data; wherein the historical data comprises geological parameters and tunneling parameters;
specifically, in this embodiment, the implementation process of S1 is as follows:
s11, acquiring related geological parameters under a tunneling stratum, historical data of a tunneling task and predicted step number;
s12, classifying the geological parameters into text data and table data; marking text data by using a BERT model, selecting a corresponding historical training text training BERT model according to geological parameters to be marked, inputting a text to be marked into the trained BERT model to obtain a marking result, and putting the marking result into a database; extracting data in the table from the table data which is not successfully labeled by using a Key-Value method, labeling the extracted data, and putting the labeling result into a database;
s13, matching the geological parameters with the tunneling parameters according to the mileage and the ring number to generate geological-tunneling data;
s14, discretizing the geological parameters; deleting non-excavation segment data and irrelevant variables such as left rotation of a cutter head, right rotation of the cutter head and the like in the discretized geological-excavation data, and removing redundant data; performing missing data completion on the geological-tunneling data from which the redundant data is removed; finally, carrying out normalization processing on the data;
wherein the missing value q_i is completed with the average-value method:
q_i = (q_{i-2} + q_{i-1} + q_{i+1} + q_{i+2}) / 4
The normalization formula is:
X' = (X - X_mean) / X_std
wherein X' denotes the normalization result, X denotes the original sequence to be normalized, X_mean denotes the mean of the original sequence, and X_std denotes the standard deviation of the original sequence;
and S15, taking the tunneling parameters to be predicted as the output of the attention-based deep convolutional network, with the prediction step number as the dimension of each output, and taking the remaining data as the input of the attention-based deep convolutional network. The data above are divided into a training set, a validation set and a test set at a ratio of 0.7 : 0.2 : 0.1.
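For illustration only, the 0.7 : 0.2 : 0.1 division could be written as below; the chronological (unshuffled) split and the array sizes are assumptions.

```python
import numpy as np

def split_dataset(samples, ratios=(0.7, 0.2, 0.1)):
    """Divide preprocessed geological-tunneling samples into training, validation
    and test sets at the 0.7 : 0.2 : 0.1 ratio, keeping ring order."""
    n = len(samples)
    n_train, n_val = int(n * ratios[0]), int(n * ratios[1])
    return samples[:n_train], samples[n_train:n_train + n_val], samples[n_train + n_val:]

data = np.random.rand(1000, 25)          # e.g. 1000 rings x 25 features (made up)
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))   # 700 200 100
```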
S2, training the attention-based deep convolutional network with the sample data set; the input of the deep convolutional network is the geological parameters and the historical tunneling parameters, the output is the future tunneling parameters, and the output dimension is the prediction step number;
specifically, in the present embodiment, the attention-based deep convolutional network is shown in fig. 3 and includes an input layer, a one-dimensional convolutional layer, an attention layer, and a fully-connected layer; the implementation process of S2 is as follows:
inputting data into a deep convolution network based on an attention mechanism for training and predicting, specifically:
s21, initializing the one-dimensional convolutional layer and the attention layer randomly, putting the training set obtained in S1 as input into the one-dimensional convolutional layer for convolution calculation, wherein the formula is as follows:
y_i = f(W * x_{i:i+k} + b)
wherein y_i represents the convolution result, x_i represents the input features, k represents the step size of each convolution, W represents the convolution kernel, b represents the bias term, and f(·) represents the ReLU activation function;
S22, inputting the convolution output Y = {y_1, y_2, …, y_i} into the attention layer to calculate the weights of the different features, with the formulas:
α_i = softmax(e(y_i, q_j))
C_i = Σ_{i=1}^{N} α_i · y_i
wherein e(y_i, q_j) is the attention scoring function between the convolution output y_i and the current input context q_j; α_i is the attention distribution, with values in [0, 1]; C_i is the importance value of the feature; and N represents the length of the input sequence;
and S23, the output then passes through the flattening layer to become a one-dimensional vector, enters the fully connected layer, and the final prediction result is output.
The Adam optimizer is adopted for the back-propagation optimization in the deep convolutional network; it adapts to sparse gradients and alleviates gradient oscillation. Its parameter update is:
θ_t = θ_{t-1} - α · m̂_t / (√v̂_t + ε)
wherein α is the step size; β1 and β2 are the exponential decay rates of the moment estimates; θ_t and θ_{t-1} are the parameter vectors; m_t and v_t are the first- and second-order moment vectors (exponential moving averages of the gradient and of its square), with m̂_t and v̂_t their bias-corrected values; t is the time step; and ε is included to avoid a divisor of 0.
The loss function is the Smooth L1 function:
SmoothL1(x) = 0.5 · x², if |x| < 1; |x| - 0.5, otherwise
where x is the prediction error. The metric function adopts the coefficient of determination:
R² = 1 - Σ_{i=1}^{n} (t_i - p_i)² / Σ_{i=1}^{n} (t_i - t̄)²
wherein T = {t_1, t_2, …, t_n} is the true-value sequence, t̄ is the mean of the true-value sequence, and n is the sequence length; P = {p_1, p_2, …, p_n} is the predicted-value sequence finally output by the model, also of length n; R² denotes the coefficient of determination.
Using the coefficient of determination as the metric avoids inconsistent evaluation results caused by the different dimensions (units) of the parameters.
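Putting the pieces of S2 together, a compact and purely illustrative PyTorch training loop might look as follows; the stand-in model, tensor shapes, learning rate and epoch count are assumptions, and the real network would be the attention-based CNN of fig. 3 rather than the placeholder used here.

```python
import torch
import torch.nn as nn

# Stand-in for the attention-based deep convolutional network of fig. 3.
model = nn.Sequential(nn.Flatten(), nn.Linear(20 * 50, 3))   # 20 features x 50 steps -> 3 future steps
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8)
loss_fn = nn.SmoothL1Loss()

x_train, y_train = torch.randn(700, 20, 50), torch.randn(700, 3)   # placeholder training data
x_val, y_val = torch.randn(200, 20, 50), torch.randn(200, 3)       # placeholder validation data

for epoch in range(10):                      # epoch count chosen arbitrarily
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)  # Smooth L1 loss on the training set
    loss.backward()                          # back-propagation
    optimizer.step()                         # Adam update

    model.eval()
    with torch.no_grad():                    # coefficient of determination on validation data
        p = model(x_val)
        r2 = 1.0 - torch.sum((y_val - p) ** 2) / torch.sum((y_val - y_val.mean()) ** 2)
    print(f"epoch {epoch}: loss={loss.item():.4f}, val R2={r2.item():.4f}")
```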
And S3, predicting the tunneling parameters to be predicted by using the trained deep convolution network to obtain a prediction result.
Further, in this embodiment, after the trained model is obtained with the method described above and is used to predict the parameters to be predicted, the full-life-cycle shield tunneling parameter prediction method further includes:
and putting the result as a new input into a training sample set of the database for storage, and dynamically updating the attention mechanism-based deep convolutional network after running for a certain time.
It should be noted that although the method of this embodiment introduces an attention mechanism and increases a certain amount of calculation, the model training batches can be significantly reduced and higher prediction accuracy can be obtained.
In conclusion, this embodiment provides a method for standardizing the original geological and tunneling data, which minimizes the influence of uneven raw-data quality on the training results, improves the stability of the model input, and thereby improves the stability of the prediction results. In addition, this embodiment considers the differences in the tunneling parameters output by different machines and the instability of model results when the outputs are replaced, and proposes an attention-based deep convolutional network. The attention layer keeps learning, automatically adjusts the distribution of importance among the features, and assigns each feature an importance weight, making model learning more targeted; this also ensures that model accuracy does not drop substantially after the parameters are changed. The CNN layer learns the different characteristics of the data well and takes the locality of the data into account. Meanwhile, actual operating conditions are considered: model training and high-precision prediction can be achieved with limited computing resources, with high robustness, feasibility and stability. In addition, the possibility of multi-step prediction is considered, so the model of this embodiment can be conveniently extended to multi-step prediction at any time, improving its practical usability.
Second embodiment
The embodiment provides a full life cycle shield tunneling parameter prediction device, which comprises the following modules:
the data preprocessing module is used for acquiring historical data and the predicted step number of the tunneling task, preprocessing the historical data and constructing a sample data set by the preprocessed historical data; wherein the historical data comprises geological parameters and tunneling parameters;
the model training module is used for training the attention-based deep convolutional network with the sample data set obtained by the data preprocessing module; the input of the deep convolutional network is the geological parameters and the historical tunneling parameters, the output is the future tunneling parameters, and the output dimension is the prediction step number;
and the tunneling parameter prediction module is used for predicting the tunneling parameters to be predicted by utilizing the deep convolutional network trained by the model training module to obtain a prediction result.
The full-life-cycle shield tunneling parameter prediction device of this embodiment corresponds to the full-life-cycle shield tunneling parameter prediction method of the first embodiment; the functions realized by the functional modules of the device correspond one to one to the flow steps of the method of the first embodiment; therefore, the details are not repeated here.
Third embodiment
The present embodiment provides an electronic device, which includes a processor and a memory; wherein the memory has stored therein at least one instruction that is loaded and executed by the processor to implement the method of the first embodiment.
The electronic device may vary considerably in configuration or performance, and may include one or more processors (CPUs) and one or more memories, where at least one instruction is stored in the memory and is loaded and executed by the processor to perform the above method.
Fourth embodiment
The present embodiment provides a computer-readable storage medium, in which at least one instruction is stored, and the instruction is loaded and executed by a processor to implement the method of the first embodiment. The computer readable storage medium may be, among others, ROM, random access memory, CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like. The instructions stored therein may be loaded by a processor in the terminal and perform the above-described method.
Furthermore, it should be noted that the present invention may be provided as a method, apparatus or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied in the medium.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
Finally, it should be noted that while the above describes a preferred embodiment of the invention, it will be appreciated by those skilled in the art that, once the basic inventive concepts have been learned, numerous changes and modifications may be made without departing from the principles of the invention, which shall be deemed to be within the scope of the invention. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all changes and modifications that fall within the true scope of the embodiments of the present invention.

Claims (10)

1. A full life cycle shield tunneling parameter prediction method is characterized by comprising the following steps:
acquiring historical data and prediction steps of a tunneling task, preprocessing the historical data, and constructing a sample data set by the preprocessed historical data; wherein the historical data comprises geological parameters and tunneling parameters;
training a deep convolutional network based on an attention mechanism by using the sample data set; the input of the deep convolutional network is geological parameters and historical tunneling parameters, the output is future tunneling parameters, and the output dimension is the predicted step number;
and predicting the tunneling parameters to be predicted by using the trained deep convolution network to obtain a prediction result.
2. The full-life-cycle shield tunneling parameter prediction method of claim 1, wherein the obtaining historical data and the prediction step number of the tunneling task, preprocessing the historical data, and constructing a sample data set with the preprocessed historical data comprises:
acquiring related geological parameters under a tunneling stratum, actual tunneling parameters of a tunneling task and predicted step numbers;
marking the geological parameters, and putting marking results into a database;
matching the marked geological parameters with tunneling parameters according to the mileage and the ring number to generate geological-tunneling data;
discretizing the geological parameters;
deleting non-excavation segment data and irrelevant variables in the discretized geological-excavation data, and removing redundant data;
performing missing data completion on the geological-tunneling data from which the redundant data is removed;
performing data normalization on the geological-tunneling data with the supplemented missing data;
and dividing the normalized geological-tunneling data into a training set, a verification set and a test set according to a preset proportion.
3. The full-life-cycle shield tunneling parameter prediction method of claim 2, wherein the labeling geological parameters and placing the labeling results into a database comprises:
classifying the geological parameters into text data and table data;
marking text data by using a BERT model, selecting a corresponding historical training text training BERT model according to geological parameters to be marked, inputting a text to be marked into the trained BERT model to obtain a marking result, and putting the marking result into a database; and extracting data in the table from the table data by using a Key-Value method, labeling the extracted data, and putting the labeling result into a database.
4. The full-life-cycle shield tunneling parameter prediction method of claim 2, wherein the missing data completion of the geological-tunneling data from which the redundant data is removed comprises:
and taking the average value of the two data values before and the two data values after the missing entry as the value of the current missing data.
5. The full life cycle shield tunneling parameter prediction method of claim 2, wherein the geological-tunneling data of the completion missing data is normalized by the formula:
X' = (X - X_mean) / X_std
wherein X' represents the normalization result, X represents the original sequence to be normalized, X_mean represents the mean of the original sequence, and X_std represents the standard deviation of the original sequence.
6. The full-life-cycle shield tunneling parameter prediction method of claim 2, wherein the deep convolutional network comprises an input layer, a one-dimensional convolutional layer, an attention layer, and a fully connected layer;
training a deep convolutional network based on an attention mechanism by using the sample data set, wherein the training comprises the following steps:
randomly initializing a one-dimensional convolutional layer and an attention layer, putting the training set as input into the one-dimensional convolutional layer for convolution calculation, wherein the formula is as follows:
y_i = f(W * x_{i:i+k} + b)
wherein y_i denotes the convolution result, x_i represents the input features, k represents the step size of each convolution, W represents the convolution kernel, b represents the bias term, and f(·) represents the ReLU activation function;
the convolution output Y = {y_1, y_2, …, y_i} is input into the attention layer to calculate the weights of the different features, with the formulas:
α_i = softmax(e(y_i, q_j))
C_i = Σ_{i=1}^{N} α_i · y_i
wherein e(y_i, q_j) is the attention scoring function between the convolution output y_i and the current input context q_j; α_i is the attention distribution, with values in [0, 1]; C_i is the importance value of the feature; and N represents the length of the input sequence;
and then converting the output into a one-dimensional vector, inputting the vector into the fully connected layer, and outputting the final prediction result.
7. The full-life-cycle shield tunneling parameter prediction method of claim 6, wherein the back-propagation optimization in the deep convolutional network adopts an Adam optimizer, whose parameter update is:
θ_t = θ_{t-1} - α · m̂_t / (√v̂_t + ε)
wherein α is the step size; β1 and β2 are the exponential decay rates of the moment estimates; θ_t and θ_{t-1} are the parameter vectors; m_t and v_t are the first- and second-order moment vectors, with m̂_t and v̂_t their bias-corrected values; t is the time step; and ε is included to avoid a divisor of 0.
8. The full-life-cycle shield tunneling parameter prediction method according to claim 7, wherein a Smooth L1 function is adopted as the loss function in the deep convolutional network:
SmoothL1(x) = 0.5 · x², if |x| < 1; |x| - 0.5, otherwise
where x is the prediction error; the metric function adopts the coefficient of determination:
R² = 1 - Σ_{i=1}^{n} (t_i - p_i)² / Σ_{i=1}^{n} (t_i - t̄)²
wherein T = {t_1, t_2, …, t_n} is the true-value sequence, t̄ is the mean of the true-value sequence, and n is the sequence length; P = {p_1, p_2, …, p_n} is the predicted-value sequence finally output by the model, also of length n; and R² is the coefficient of determination.
9. The full-life-cycle shield tunneling parameter prediction method according to claim 1, wherein after the prediction of the to-be-predicted tunneling parameter is performed by using the trained deep convolutional network to obtain a prediction result, the full-life-cycle shield tunneling parameter prediction method further comprises:
and (3) taking the prediction result and the corresponding geological parameters as new samples to be put into a training set of a database for storage, and dynamically updating the depth convolution network based on the attention mechanism after running for a certain time.
10. A full life cycle shield tunneling parameter prediction device is characterized by comprising:
the data preprocessing module is used for acquiring historical data and the predicted step number of the tunneling task, preprocessing the historical data and constructing a sample data set by the preprocessed historical data; wherein the historical data comprises geological parameters and tunneling parameters;
the model training module is used for training the attention-based deep convolutional network by utilizing the sample data set obtained by the data preprocessing module; the input of the deep convolutional network is geological parameters and historical tunneling parameters, the output is future tunneling parameters, and the output dimension is the predicted step number;
and the tunneling parameter prediction module is used for predicting the tunneling parameters to be predicted by utilizing the deep convolutional network trained by the model training module to obtain a prediction result.
CN202210017848.XA 2022-01-07 2022-01-07 Full life cycle shield tunneling parameter prediction method and device Pending CN114492174A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210017848.XA CN114492174A (en) 2022-01-07 2022-01-07 Full life cycle shield tunneling parameter prediction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210017848.XA CN114492174A (en) 2022-01-07 2022-01-07 Full life cycle shield tunneling parameter prediction method and device

Publications (1)

Publication Number Publication Date
CN114492174A true CN114492174A (en) 2022-05-13

Family

ID=81509895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210017848.XA Pending CN114492174A (en) 2022-01-07 2022-01-07 Full life cycle shield tunneling parameter prediction method and device

Country Status (1)

Country Link
CN (1) CN114492174A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116402219A (en) * 2023-03-29 2023-07-07 中科航迈数控软件(深圳)有限公司 Full life cycle operation and maintenance strategy method and device based on prediction model

Similar Documents

Publication Publication Date Title
CN108959728B (en) Radio frequency device parameter optimization method based on deep learning
CN111406267A (en) Neural architecture search using performance-predictive neural networks
CN111950810B (en) Multi-variable time sequence prediction method and equipment based on self-evolution pre-training
US20220067588A1 (en) Transforming a trained artificial intelligence model into a trustworthy artificial intelligence model
CN111127246A (en) Intelligent prediction method for transmission line engineering cost
CN111258909B (en) Test sample generation method and device
CN113139570A (en) Dam safety monitoring data completion method based on optimal hybrid valuation
US20220092429A1 (en) Training neural networks using learned optimizers
CN114492174A (en) Full life cycle shield tunneling parameter prediction method and device
US20210166131A1 (en) Training spectral inference neural networks using bilevel optimization
CN113641525A (en) Variable exception recovery method, apparatus, medium, and computer program product
CN117521063A (en) Malicious software detection method and device based on residual neural network and combined with transfer learning
CN117076921A (en) Prediction method of logging-while-drilling resistivity curve based on residual fully-connected network
CN114077883A (en) Data fusion method based on deep belief network in ambient air forecast
WO2020152267A1 (en) Learning non-differentiable weights of neural networks using evolutionary strategies
CN110781978A (en) Feature processing method and system for machine learning
CN111695989A (en) Modeling method and platform of wind-control credit model
EP4198831A1 (en) Automated feature engineering for predictive modeling using deep reinforcement learning
JP4662702B2 (en) Outlier degree calculator
CN117688387B (en) Reservoir classification model training and classifying method, related equipment and storage medium
US20240169707A1 (en) Forecasting Uncertainty in Machine Learning Models
CN118072976B (en) System and method for predicting respiratory tract diseases of children based on data analysis
CN117390515B (en) Data classification method and system based on deep learning and SimHash
CN113886579B (en) Construction method and system, identification method and system for positive and negative surface models of industry information
US20220405599A1 (en) Automated design of architectures of artificial neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination