CN113051828A - Online prediction method for natural gas water dew point driven by technological parameters - Google Patents

Online prediction method for natural gas water dew point driven by technological parameters

Info

Publication number
CN113051828A
CN113051828A (application CN202110344489.4A)
Authority
CN
China
Prior art keywords
model
training
natural gas
dew point
gas water
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110344489.4A
Other languages
Chinese (zh)
Other versions
CN113051828B (en)
Inventor
尹爱军
谭治斌
任宏基
何彦霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202110344489.4A priority Critical patent/CN113051828B/en
Publication of CN113051828A publication Critical patent/CN113051828A/en
Application granted granted Critical
Publication of CN113051828B publication Critical patent/CN113051828B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Abstract

The invention relates to the field of natural gas gathering and transportation and discloses a process-parameter-driven method for online prediction of the natural gas water dew point. It addresses two shortcomings: conventional natural gas water dew point detectors are easily damaged and costly to operate, and traditional data-driven methods cannot effectively capture the influence relationships between the water dew point of an actual dehydration system and each monitored parameter. Key parameters for predicting the natural gas water dew point are selected to eliminate irrelevant and redundant features and to establish a water dew point prediction training data set; a Neural Process (NP) model is trained on this data set to learn the multivariate regression relationship among the process monitoring parameters of the triethylene glycol dehydration device; the real-time process monitoring data of the dehydration device are then used as the target set of the NP prediction model to realize online prediction of the natural gas water dew point. Compared with the prior art, the method offers the beneficial effect of high accuracy.

Description

Online prediction method for natural gas water dew point driven by technological parameters
Technical Field
The invention relates to the field of natural gas gathering and transportation, in particular to a process parameter driven natural gas water dew point online prediction method.
Background
The natural gas water dew point is an important technical index of natural gas quality. Accumulation of free water reduces pipeline capacity and increases corrosion, and hydrate formation caused by a high water dew point can block pipelines and valves. Removing free water from natural gas is therefore one of the important tasks of actual production. In practice, the water dew point is typically monitored with a natural gas dew point meter, but because the detector is prone to blockage, sensor damage and similar problems, this conventional monitoring approach cannot track dehydration performance accurately and reliably. Predicting the natural gas water dew point with a data-driven method avoids these detector defects; traditional methods use parameters such as the natural gas contactor temperature, the triethylene glycol concentration and the natural gas pressure for the prediction. In an actual dehydration system, however, not only the contactor temperature and the triethylene glycol concentration influence the water dew point; the monitored parameters and the water dew point also influence one another. To address the limitations of the existing detection technology and the traditional data-driven methods, the sensitivity of the parameters in the original natural gas feature set to the water dew point is evaluated with a Gradient Boosting Decision Tree (GBDT) algorithm, which reduces the influence of redundant or irrelevant parameters on dew point prediction and realizes the selection of key parameters. A Neural Process (NP) model is then trained and optimized with the training samples obtained after parameter selection. The optimized NP prediction model can predict the natural gas water dew point online in real time from the process monitoring parameters of the dehydration device and evaluate its dehydration performance.
Disclosure of Invention
The invention discloses a natural gas water dew point online prediction method driven by process parameters, aiming at the defects that a conventional natural gas water dew point detector is easy to damage and high in detection cost and a traditional data driving method cannot effectively reflect the influence relationship between the natural gas water dew point of an actual dehydration system and each monitored parameter.
The invention is realized by the following technical scheme:
a natural gas water dew point online prediction method driven by process parameters comprises the following steps:
step S1: for a dehydration system with N process monitoring parameters, sequentially numbering the parameters from 1 to N, forming a prediction model original training data set by the N process parameters and historical monitoring data of a natural gas water dew point, wherein the data set comprises P samples, sequentially numbering the samples from 1 to P, taking the N process parameters as variables, and taking the natural gas water dew point as a label or a target value;
step S2: establishing a Gradient Boosting Decision Tree (GBDT) model for selecting the key parameters. The GBDT model is an additive model composed of regression trees: each new decision tree is built on top of the trees already established, i.e. when the current t-th regression tree is built, the negative gradients of the model formed by the previous t-1 trees are used as the new target values that the regression tree fits, and this continues until the model converges. When the model is finished, the average importance of each of the N process parameters over all trees is calculated in order to evaluate and select the important parameters. The average importance of process parameter j is calculated as

$$\hat{I}_j^2=\frac{1}{M}\sum_{m=1}^{M}\sum_{t=1}^{L-1}\hat{i}_t^2\,\mathbb{1}(v_t=j)$$

where M is the number of decision trees established by the GBDT, the inner sum runs over the L-1 internal nodes of each tree (every tree is assumed to be a binary tree), $v_t$ is the parameter on which node t is split, and $\hat{i}_t^2$ is the squared-loss criterion of splitting node t,

$$\hat{i}_t^2=n_{root}\,imp_{root}-n_l\,imp_l-n_r\,imp_r$$

where $n_{root}$ is the number of samples before the node splits, $n_l$ and $n_r$ are the numbers of samples in the left and right child nodes after splitting, and $imp_{root}$, $imp_l$ and $imp_r$ are the mean square errors of the parent node, the left child node and the right child node, respectively. The training parameters of the GBDT model are the loss function, the number of decision trees, the maximum depth of the decision trees and the learning rate: the loss function determines how decision tree nodes are split, while the number of trees, the maximum depth and the learning rate prevent overfitting of the model. The model $F_0(x)$ is initialized with the mean of all sample target values of the training set;
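As an illustrative sketch of the importance calculation above (Python/NumPy; the function name and the way per-tree values are collected are assumptions for illustration, not part of the patented method):

```python
import numpy as np

def node_importance(y_parent, y_left, y_right):
    """Squared-loss criterion of one split:
    i_t^2 = n_root*imp_root - n_l*imp_l - n_r*imp_r,
    where imp is the mean square error of the samples in a node."""
    mse = lambda y: float(np.mean((y - np.mean(y)) ** 2))
    return (len(y_parent) * mse(y_parent)
            - len(y_left) * mse(y_left)
            - len(y_right) * mse(y_right))

# Average importance of parameter j: sum node_importance over the splits that
# used parameter j in each tree, then average the per-tree sums over the M trees.
```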
Step S3: training the t regression tree of the model, wherein t is more than or equal to 1 and less than or equal to M, and calculating the model F after the t-1 tree splitting is completedt-1(x) The training of the regression tree is a node splitting process, splitting is stopped when the maximum depth of the tree is reached or only one sample is available in a node, the training of the regression tree is completed when all the nodes stop splitting, the splitting process of the regression tree selects the optimal splitting characteristic and value through the minimum square error function, and the optimal splitting attribute calculation process is as follows: sequentially traversing all parameter characteristics h ═ 1, 2., N of the nodegAnd the sample value of the h parameter at the node, namely taking the h parameter characteristic as a splitting variable and taking a certain sample value s of the parameter characteristic as a splitting condition value according to
Figure BDA0002999015790000022
Figure BDA0002999015790000023
Selecting optimal splitting characteristics and splitting values, yiIs a sample target value, γ1Is R1Mean and gamma of all sample target values2Is R2Calculating the mean value a of each leaf node of the current regression tree by taking the mean value of all the sample target valuestiDetermining model weights by linear search
Figure BDA0002999015790000024
Updating model Ft(x)=Ft-1(x)+γtatSaid Ft(x) Splitting for the t treeFinished model atFor each leaf node a of the t-th regression treetiThe t-th regression tree is trained;
step S4: repeating step S3 until all regression trees are trained, obtaining the GBDT model used for process parameter feature selection;
step S5: calculating and ranking the average importance of the N process parameters of the original data set over all trees of the GBDT to obtain the importance of each process parameter for the natural gas water dew point, and setting a threshold to select the key process parameters, giving N' process parameters after feature selection; the selected parameters are numbered sequentially from 1 to N', and the N' process parameters (as independent-variable data) together with the natural gas water dew point monitoring data (as label or target values) form the feature-selected data set;
step S6: dividing the feature-selected training data set into a training context set and a training target set at a ratio of 7:3 and establishing a Neural Process (NP) model for natural gas water dew point prediction. The NP model combines the advantages of a neural network and a Gaussian process and consists of an encoder and a decoder. The model generates a high-dimensional random hidden variable z through the encoder, so that predicting the target distribution of the Gaussian process is converted into calculating the hidden variable z, which follows a multidimensional Gaussian distribution. The encoder generates an intermediate variable through a multilayer perceptron (MLP) and uses this intermediate variable to parameterize the hidden variable z: the distribution of z is established from the mean vector and the variance vector obtained from the intermediate variable through further MLPs. The intermediate variable itself is obtained by passing the point set formed by the N' process parameter data and the natural gas water dew point through an MLP and averaging the resulting array over the points. In the neural network that generates the intermediate variable, the number of input-layer neurons is determined by the input data; the network contains three hidden layers of 128 neurons each and an output layer of 128 neurons, and its outputs are averaged. The mean vector mu and the variance vector sigma of the hidden variable z are then generated through one hidden layer of 128 neurons and two output layers of 128 neurons, respectively, and the hidden variable z is sampled from this distribution. Because z is established by averaging the outputs of the intermediate-variable network, all data points of a single input share the same sample of z; the sample is replicated once per point of the input point set (each point consisting of the N' process parameter values and the natural gas water dew point) and expanded to obtain the decoder sampling inputs z_C and z_T. The decoder, also built from MLPs, serves two purposes: model optimization and model prediction. For model optimization, the sample z_C generated from the training context set, the sample z_T generated from the training target set and the training target set itself are input, and the model parameters are optimized through the evidence lower bound. The number of input-layer neurons of the decoder network is determined by the input data; the decoder network contains two hidden layers of 128 neurons each and an output layer of 2 neurons that outputs the predicted value and the predicted variance of that value. For model prediction, the training data set is used as the context set and newly generated monitoring data as the target set (the target set contains no target values); the sample z_C generated from the context set and the independent-variable data of the target set are input into the decoder network to realize the target prediction. Wherever the data input to the decoder or the encoder consists of several inputs, they are concatenated into a single new input vector;
step S7: training the NP model: K samples are randomly drawn from the training context set, with K not less than 30% of the size of the training context set, and Q samples are randomly drawn from the training target set, with Q not less than 30% of the size of the training target set, and together they form one training sample of the NP model; the selected training sample is fed into the NP model and the model loss is obtained through the evidence lower bound used in the NP model optimization, this loss being the loss of a single training iteration;
step S8: repeating the step S7 until the training process converges to obtain a natural gas water dew point prediction model;
step S9: performing online prediction of the natural gas water dew point: for real-time process parameter data of the dehydration system, the training data set is used as the context set and the new monitoring data as the target set, as in step S6; the context set and the target set are input into the NP model, and online prediction of the natural gas water dew point is realized through the target prediction of the decoder.
The principle of the method is as follows: the important parameters in the original data set are evaluated and selected through the GBDT algorithm, reducing the influence of redundant or irrelevant features on the prediction result and thereby improving the prediction accuracy of the natural gas water dew point; the training data set is divided into a training context set and a training target set, and samples drawn from both form the multidimensional sample sequences used as training samples of the NP model, so that the natural gas water dew point prediction model built from the process monitoring parameters makes full use of the influence relationships between the relevant parameters and the dew point in the production system; the NP model parameterizes the target function through a high-dimensional random variable and realizes a Gaussian process through neural network simulation, so it can adaptively learn the kernel function of the Gaussian process and fully learn the influence relationships between the relevant parameters and the dew point; with the trained and optimized NP model, the natural gas water dew point can be predicted online in real time from new process monitoring parameters.
Further, in the encoder stage of step S6 an attention mechanism between the training target set and the training context set is added. The attention mechanism makes full use of the information of the training context set and gives more weight to important context points, which improves the learning efficiency and the prediction effect. The attention mechanism is generated from the independent variables of the training context set, the independent variables of the training target set and the training context set itself. The training context set is passed through an MLP to generate the intermediate representation r; this network consists of three hidden layers of 128 neurons each and an output layer of 128 neurons. At the same time, the independent variables of the training context set and of the training target set are passed through an MLP with one hidden layer of 128 neurons and an output layer of 128 neurons to generate the intermediate representation k of the context-set independent variables and the intermediate representation q of the target-set independent variables. The attention is computed as

$$\mathrm{Laplace}(q,k,r)=\sum_i w_i\,r_i,\qquad w_i=\operatorname{softmax}_i\bigl(-\lVert q-k_i\rVert_1\bigr)$$

i.e. a weighted sum of the context representations $r_i$ with weights determined by the L1 distance between q and each $k_i$. During model prediction, the attention mechanism between the context set and the target set is calculated in the same way as between the independent variables of the training context set and the training target set during training. The attention mechanism is generated in the encoding stage; the decoder receives it as an additional input in the decoding stage and uses it for both model optimization and prediction.
The method has high prediction accuracy and is an effective way to realize online prediction of the natural gas water dew point. Conventional detectors are affected by blockage, sensor damage and similar factors, so that detection of the natural gas water dew point is costly and its accuracy is poor. The invention selects key parameters to form a high-dimensional data set and combines it with the ability of the NP model to adaptively learn complex functional relationships within that data set; the coupling relationships among the process monitoring parameters are learned from the process monitoring data, and online prediction of the natural gas water dew point of the dehydration system is realized with high accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:
FIG. 1 is a flow chart of an implementation of a process parameter driven natural gas water dew point on-line prediction method of the present invention;
fig. 2 is a schematic diagram of the GBDT model.
FIG. 3 is a schematic representation of the NP model.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to examples and accompanying drawings, and the exemplary embodiments and descriptions thereof are only used for explaining the present invention and are not meant to limit the present invention.
The first embodiment is as follows:
the invention discloses a process parameter-driven natural gas water dew point online prediction method, which is shown in a flow chart shown in figure 1 and comprises the following steps:
step S1: for a dehydration system with N process monitoring parameters, sequentially numbering the parameters from 1 to N, forming a prediction model original training data set by the N process parameters and historical monitoring data of a natural gas water dew point, wherein the data set comprises P samples, sequentially numbering the samples from 1 to P, taking the N process parameters as variables, and taking the natural gas water dew point as a label or a target value;
for a certain natural gas triethylene glycol dehydration device, the monitored parameters comprise 33 process monitoring parameters such as absorption tower pressure, triethylene glycol circulation amount, flash tank liquid level, pressure control valve opening degree, instantaneous treatment capacity, reboiler temperature, rectification column temperature and the like of each sub-component, and natural gas water dew point daily detection data, wherein the total number of the monitored parameters is 495 samples, and all historical data are used as an original training data set.
Step S2: establishing a Gradient Boosting Decision Tree (GBDT) model for selecting key parameters. The GBDT model is an additive model consisting of a regression tree. Establishing the latest decision tree according to the established decision tree model, namely, when the current t-th regression tree is established, the negative gradient of the previous t-1 decision tree is used as a new target value of each sample of the regression tree fitting until the model is finished when the model is converged, when the model is finished, calculating the average importance degree of N process parameters in each tree to evaluate and select the important parameters, wherein the average importance degree of the process parameter j is based on the established decision tree model
Figure BDA0002999015790000051
Calculating, wherein M is the number of decision trees established by GBDT, L is the depth of a binary tree, and each tree is assumed to be a binary tree, and the M is the number of decision trees established by GBDT
Figure BDA0002999015790000052
Is the square loss criterion l after the node t is splitt=nrootimproot-nlimpl-nrimprCalculating, where n isrootIs the number of samples before node splittingAmount, nlIs the number of samples of the left node after splitting, nrIs the number of samples of the right node after splitting. improotIs the mean square error, imp, before node splittinglIs the mean square error, imp, of the left node after splittingrDetermining the training parameters, loss functions, the number of decision trees, the maximum depth of the decision trees and the learning rate of the GBDT model, determining the splitting of the decision tree nodes by the loss functions, and initializing a model F by using the mean value of all sample target values of a training set to prevent overfitting of the model by using the number of decision trees, the maximum depth of the decision trees and the learning rate0(x);
For the original training data set of the triethylene glycol dehydration device, all process monitoring data form the high-dimensional independent variables and the natural gas water dew point is the dependent variable; the squared error function is taken as the loss function, the number of decision trees is 100, the maximum depth of the decision trees is 3 and the learning rate is 0.1, and the initial model is established.
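As an illustrative sketch of this GBDT configuration using scikit-learn (the library choice and the arrays X and y are assumptions; the patent describes the model itself generically):

```python
from sklearn.ensemble import GradientBoostingRegressor

# Squared-error loss, 100 trees, maximum depth 3, learning rate 0.1.
gbdt = GradientBoostingRegressor(
    loss="squared_error", n_estimators=100, max_depth=3, learning_rate=0.1
)
gbdt.fit(X, y)                            # X: process parameters, y: water dew point
importances = gbdt.feature_importances_   # average importance of each parameter
```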
Step S3: training the t regression tree of the model, wherein t is more than or equal to 1 and less than or equal to M, and calculating the model F after the t-1 tree splitting is completedt-1(x) The training of the regression tree is a node splitting process, splitting is stopped when the maximum depth of the tree is reached or only one sample is available in a node, the training of the regression tree is completed when all the nodes stop splitting, the splitting process of the regression tree selects the optimal splitting characteristic and value through the minimum square error function, and the optimal splitting attribute calculation process is as follows: sequentially traversing all parameter characteristics h ═ 1, 2., N of the nodegAnd the sample value of the h parameter at the node, namely taking the h parameter characteristic as a splitting variable and taking a certain sample value s of the parameter characteristic as a splitting condition value according to
Figure BDA0002999015790000061
Figure BDA0002999015790000062
Selecting optimal splitting characteristics and splitting values, yiIs a sample target value, γ1Is R1All samplesMean value and gamma of the target value2Is R2Calculating the mean value a of each leaf node of the current regression tree by taking the mean value of all the sample target valuestiDetermining model weights by linear search
Figure BDA0002999015790000063
Updating model Ft(x)=Ft-1(x)+γtatSaid Ft(x) Splitting the t-th tree into the finished model atFor each leaf node a of the t-th regression treetiThe t-th regression tree is trained;
the GBDT model optimizes the model by calculating the gradient direction of the residual error through connecting the trees in series, adds new trees to the model, and continuously adds new trees to calculate the gradient direction of the residual error of the current model so as to improve the accuracy of the model.
Step S4: repeating the step S3 until all regression trees are trained to obtain a GBDT model selected by the process parameter characteristics;
and finishing the training when the training loss is converged, and finishing the model training.
Step S5: respectively calculating and sequencing the average importance degree of N process parameters of the original data set in each tree of the GBDT to obtain the importance degree of each process parameter to the natural gas water dew point, setting a threshold value to select key process parameters to obtain N 'process parameters after feature selection, sequentially numbering the selected N' parameters from 1 to N ', wherein the N' process parameters are variable data and natural gas water dew point monitoring data are tag values or target values to form a data set after feature selection;
when the key parameters are selected, the degree of each process monitoring parameter of the natural gas triethylene glycol dehydration device in each tree in the GBDT model is calculated to obtain the key parameters. After ranking the importance of all the parameters, setting a threshold value to select important parameters. For the dehydration device, 15 key parameters such as the rich glycol temperature of the absorption tower, the differential pressure of the absorption tower, the metering temperature, the magnetic float liquid level of the absorption tower, the temperature of triethylene glycol before pumping are selected. And forming a new training data set by using the key parameters and the natural gas water dew point, wherein the key parameters are independent variables, and the natural gas water dew point is a label or a target value.
Step S6: dividing a training context set and a training target set according to a ratio of 7: 3 by a training data set after feature selection, establishing a Neural Process (NP) model for natural gas water dew point prediction, wherein the NP model combines the advantages of a neural network and a Gaussian Process, the NP model consists of an encoder and a decoder, the model generates a high-dimensional random hidden variable z through the encoder, the prediction of target distribution of the Gaussian Process is converted into the calculation of the hidden variable z, the high-dimensional random hidden variable z obeys multi-dimensional Gaussian distribution, the encoder generates an intermediate variable through a multi-layer Perceptron (MLP) and parameterizes the high-dimensional random hidden variable z by utilizing the intermediate variable, the distribution is established by utilizing a mean vector and a variance vector of distribution of the intermediate variable obtained through MLP, and the intermediate variable is obtained by utilizing the mean value of an array generated through the MLP by utilizing a point set consisting of N' Process parameter data and natural gas water dew point, the neural network for generating the intermediate variable comprises three hidden layers, wherein the number of neurons of the input layer is determined by input data, each hidden layer comprises 128 neurons, the output value of the neural network for the intermediate variable is averaged, the average value vector mu and the variance vector sigma of a high-dimensional random hidden variable z are generated through the hidden layer of one 128 neuron and the output layers of two 128 neurons respectively, the high-dimensional random hidden variable z is sampled, as z is established by averaging the output value of the neural network for the intermediate variable, all data points have the same sampling for one input, the sampling is copied for N times, the input N 'is the size of a point set consisting of N' process parameter data and a natural gas water dew point, and the sampling input z of the high-dimensional random variable z of the decoder is obtained by sampling and implicit expansionC、zTThe decoder is divided into two parts of model optimization and model prediction, the decoder is established by MLP, and the model optimization is generated by inputting training context setCTraining target set generated samples zTAnd training a target set to optimize the model, wherein the model optimization optimizes parameters of the model through evidence lower bound, the number of neurons of an input layer of the decoder neural network is determined by input data, the decoder neural network comprises two hidden layers, and each hidden layer comprises two hidden layers128 neurons in the hidden layer, an output layer consisting of 2 neurons and respectively outputting a predicted value and a predicted variance of the predicted value, wherein the model prediction takes a training data set as a context set and newly generated monitoring data as a target set, the target set does not contain a target value, and a sample z generated by the context setcAnd the target set independent variable data input decoder neural network realizes target prediction, and when the data input in all the decoders and the encoders comprises a plurality of inputs, the plurality of inputs form a new single input in a vector tandem mode;
the gaussian process prediction is probabilistic and can be adapted to fit and regress. The distribution of multidimensional monitoring data of a dehydration device or other industrial equipment is complex, and a kernel function is difficult to accurately select. The NP model converts a Gaussian process modeling process into calculation of high-dimensional random variables by introducing the high-dimensional random variables, and the high-dimensional random variables are established through a neural network. Therefore, the limitation of a kernel function is eliminated, the distribution rule of complex multidimensional process monitoring data of the dehydration device can be learned in a self-adaptive mode, and accurate regression calculation is achieved.
Step S7: training an NP model, randomly sampling K samples from a training upper text set, wherein the number of the samples is not less than 30% of the number of the training upper text set, randomly sampling Q samples from a training target set, wherein the number of the samples is not less than 30% of the number of the training target set, forming a primary training sample of the NP model, sending the selected training sample into the NP model, and obtaining model loss through evidence lower bound in NP model optimization, wherein the model loss is single training loss;
training a dehydration device data set according to the following steps of 7: 3, the training context set and the training target set are divided into a training context set and a training target set, a training sample is formed by randomly sampling 105 data points from the training context set and randomly sampling 45 data points from the training target set in each training, and the time complexity can be reduced by a sampling mode. And then calculating the loss of the target set predicted value and the target set actual value through the evidence lower bound.
Step S8: repeating the step S7 until the training process converges to obtain a natural gas water dew point prediction model;
and finishing the training when the training loss is converged, and finishing the NP model training.
Step S9: and (3) performing on-line prediction of the natural gas water dew point, regarding real-time process parameter data of a dehydration system, taking a training data set as a context set, taking new monitoring data as a target set in the step (6), inputting the context set and the target set into an NP (non-point) model, and realizing on-line prediction of the natural gas water dew point through target prediction of a decoder.
For online prediction of the natural gas water dew point, the newly generated real-time process monitoring parameter data of the dehydration device are used as the target set and the training set data as the context set to form the online prediction data sample, and the online prediction of the natural gas water dew point is realized through the trained NP prediction model.
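An illustrative sketch of the online prediction stage, building on the sketches above (new_monitoring_rows stands for newly monitored values of the selected key parameters and is an assumed variable):

```python
import torch

x_new = torch.tensor(new_monitoring_rows, dtype=torch.float32)  # target set (no labels)
with torch.no_grad():
    q_ctx = model.latent_dist(X_t, y_t)        # z from the full training (context) set
    z = q_ctx.sample()
    dew_point_mean, dew_point_var = model.decode(x_new, z)
```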
Further, in the encoder stage of step S6 an attention mechanism between the training target set and the training context set is added. The attention mechanism makes full use of the information of the training context set and gives more weight to important context points, which improves the learning efficiency and the prediction effect. The attention mechanism is generated from the independent variables of the training context set, the independent variables of the training target set and the training context set itself. The training context set is passed through an MLP to generate the intermediate representation r; this network consists of three hidden layers of 128 neurons each and an output layer of 128 neurons. At the same time, the independent variables of the training context set and of the training target set are passed through an MLP with one hidden layer of 128 neurons and an output layer of 128 neurons to generate the intermediate representation k of the context-set independent variables and the intermediate representation q of the target-set independent variables. The attention is computed as

$$\mathrm{Laplace}(q,k,r)=\sum_i w_i\,r_i,\qquad w_i=\operatorname{softmax}_i\bigl(-\lVert q-k_i\rVert_1\bigr)$$

i.e. a weighted sum of the context representations $r_i$ with weights determined by the L1 distance between q and each $k_i$. During model prediction, the attention mechanism between the context set and the target set is calculated in the same way as between the independent variables of the training context set and the training target set during training. The attention mechanism is generated in the encoding stage; the decoder receives it as an additional input in the decoding stage and uses it for both model optimization and prediction.
When the data point in the target set is close to a certain context set point, the predicted value of the target point is close to the natural gas water dew point value of the context point. By calculating the attention mechanism between the target point and the context point, more weight is given to the important context point, so that the training and predicting effects are improved.
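As an illustrative PyTorch sketch of this Laplace attention (following the formulation commonly used in attentive neural processes; the scale parameter is an assumption):

```python
import torch

def laplace_attention(q, k, r, scale=1.0):
    """q: (n_target, d) target-set keys; k: (n_context, d) context-set keys;
    r: (n_context, h) context representations. Weights follow a Laplace
    kernel on the L1 distance between each query and each key."""
    dists = torch.cdist(q, k, p=1)                   # (n_target, n_context) L1 distances
    weights = torch.softmax(-dists / scale, dim=-1)  # more weight to nearby context points
    return weights @ r                               # weighted sum of context representations
```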
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (2)

1. A natural gas water dew point online prediction method driven by process parameters is characterized by comprising the following steps:
step S1: for a dehydration system with N process monitoring parameters, sequentially numbering the parameters from 1 to N, forming a prediction model original training data set by the N process parameters and historical monitoring data of a natural gas water dew point, wherein the data set comprises P samples, sequentially numbering the samples from 1 to P, taking the N process parameters as variables, and taking the natural gas water dew point as a label or a target value;
step S2: establishing a Gradient Boosting Decision Tree (GBDT) model for selecting the key parameters. The GBDT model is an additive model composed of regression trees: each new decision tree is built on top of the trees already established, i.e. when the current t-th regression tree is built, the negative gradients of the model formed by the previous t-1 trees are used as the new target values that the regression tree fits, and this continues until the model converges. When the model is finished, the average importance of each of the N process parameters over all trees is calculated in order to evaluate and select the important parameters. The average importance of process parameter j is calculated as

$$\hat{I}_j^2=\frac{1}{M}\sum_{m=1}^{M}\sum_{t=1}^{L-1}\hat{i}_t^2\,\mathbb{1}(v_t=j)$$

where M is the number of decision trees established by the GBDT, the inner sum runs over the L-1 internal nodes of each tree (every tree is assumed to be a binary tree), $v_t$ is the parameter on which node t is split, and $\hat{i}_t^2$ is the squared-loss criterion of splitting node t,

$$\hat{i}_t^2=n_{root}\,imp_{root}-n_l\,imp_l-n_r\,imp_r$$

where $n_{root}$ is the number of samples before the node splits, $n_l$ and $n_r$ are the numbers of samples in the left and right child nodes after splitting, and $imp_{root}$, $imp_l$ and $imp_r$ are the mean square errors of the parent node, the left child node and the right child node, respectively. The training parameters of the GBDT model are the loss function, the number of decision trees, the maximum depth of the decision trees and the learning rate: the loss function determines how decision tree nodes are split, while the number of trees, the maximum depth and the learning rate prevent overfitting of the model. The model $F_0(x)$ is initialized with the mean of all sample target values of the training set;
Step S3: training the t regression tree of the model, wherein t is more than or equal to 1 and less than or equal to M, and calculating the model F after the t-1 tree splitting is completedt-1(x) The training of the regression tree is a node splitting process, splitting is stopped when the maximum depth of the tree is reached or only one sample is available in a node, the training of the regression tree is completed when all the nodes stop splitting, the splitting process of the regression tree selects the optimal splitting characteristic and value through the minimum square error function, and the optimal splitting attribute calculation process is as follows: sequentially traversing all parameter characteristics h ═ 1, 2., N of the nodegAnd the sample value of the h parameter at the node, namely taking the h parameter characteristic as a splitting variable and taking a certain sample value s of the parameter characteristic as a splitting condition value according to
Figure FDA0002999015780000012
R1(h,s)={x|x(h)≤s},R2(h,s)={x|x(h)S, selecting optimal splitting characteristics and splitting values, yiIs a sample target value, γ1Is R1Mean and gamma of all sample target values2Is R2Calculating the mean value a of each leaf node of the current regression tree by taking the mean value of all the sample target valuestiDetermining model weights by linear search
Figure FDA0002999015780000013
Updating model Ft(x)=Ft-1(x)+γtatSaid Ft(x) Splitting the t-th tree into the finished model atFor each leaf node a of the t-th regression treetiThe t-th regression tree is trained;
step S4: repeating step S3 until all regression trees are trained, obtaining the GBDT model used for process parameter feature selection;
step S5: calculating and ranking the average importance of the N process parameters of the original data set over all trees of the GBDT to obtain the importance of each process parameter for the natural gas water dew point, and setting a threshold to select the key process parameters, giving N' process parameters after feature selection; the selected parameters are numbered sequentially from 1 to N', and the N' process parameters (as independent-variable data) together with the natural gas water dew point monitoring data (as label or target values) form the feature-selected data set;
step S6: dividing the feature-selected training data set into a training context set and a training target set at a ratio of 7:3 and establishing a Neural Process (NP) model for natural gas water dew point prediction. The NP model combines the advantages of a neural network and a Gaussian process and consists of an encoder and a decoder. The model generates a high-dimensional random hidden variable z through the encoder, so that predicting the target distribution of the Gaussian process is converted into calculating the hidden variable z, which follows a multidimensional Gaussian distribution. The encoder generates an intermediate variable through a multilayer perceptron (MLP) and uses this intermediate variable to parameterize the hidden variable z: the distribution of z is established from the mean vector and the variance vector obtained from the intermediate variable through further MLPs. The intermediate variable itself is obtained by passing the point set formed by the N' process parameter data and the natural gas water dew point through an MLP and averaging the resulting array over the points. In the neural network that generates the intermediate variable, the number of input-layer neurons is determined by the input data; the network contains three hidden layers of 128 neurons each and an output layer of 128 neurons, and its outputs are averaged. The mean vector mu and the variance vector sigma of the hidden variable z are then generated through one hidden layer of 128 neurons and two output layers of 128 neurons, respectively, and the hidden variable z is sampled from this distribution. Because z is established by averaging the outputs of the intermediate-variable network, all data points of a single input share the same sample of z; the sample is replicated once per point of the input point set (each point consisting of the N' process parameter values and the natural gas water dew point) and expanded to obtain the decoder sampling inputs z_C and z_T. The decoder, also built from MLPs, serves two purposes: model optimization and model prediction. For model optimization, the sample z_C generated from the training context set, the sample z_T generated from the training target set and the training target set itself are input, and the model parameters are optimized through the evidence lower bound. The number of input-layer neurons of the decoder network is determined by the input data; the decoder network contains two hidden layers of 128 neurons each and an output layer of 2 neurons that outputs the predicted value and the predicted variance of that value. For model prediction, the training data set is used as the context set and newly generated monitoring data as the target set (the target set contains no target values); the sample z_C generated from the context set and the independent-variable data of the target set are input into the decoder network to realize the target prediction. Wherever the data input to the decoder or the encoder consists of several inputs, they are concatenated into a single new input vector;
step S7: training the NP model: K samples are randomly drawn from the training context set, with K not less than 30% of the size of the training context set, and Q samples are randomly drawn from the training target set, with Q not less than 30% of the size of the training target set, and together they form one training sample of the NP model; the selected training sample is fed into the NP model and the model loss is obtained through the evidence lower bound used in the NP model optimization, this loss being the loss of a single training iteration;
step S8: repeating the step S7 until the training process converges to obtain a natural gas water dew point prediction model;
step S9: performing online prediction of the natural gas water dew point: for real-time process parameter data of the dehydration system, the training data set is used as the context set and the new monitoring data as the target set, as in step S6; the context set and the target set are input into the NP model, and online prediction of the natural gas water dew point is realized through the target prediction of the decoder.
2. The on-line natural gas water dew point prediction method driven by process parameters as claimed in claim 1, wherein in the encoder stage of step S6 an attention mechanism between the training target set and the training context set is added. The attention mechanism makes full use of the information of the training context set and gives more weight to important context points, which improves the learning efficiency and the prediction effect. The attention mechanism is generated from the independent variables of the training context set, the independent variables of the training target set and the training context set itself. The training context set is passed through an MLP to generate the intermediate representation r; this network consists of three hidden layers of 128 neurons each and an output layer of 128 neurons. At the same time, the independent variables of the training context set and of the training target set are passed through an MLP with one hidden layer of 128 neurons and an output layer of 128 neurons to generate the intermediate representation k of the context-set independent variables and the intermediate representation q of the target-set independent variables. The attention is computed as

$$\mathrm{Laplace}(q,k,r)=\sum_i w_i\,r_i,\qquad w_i=\operatorname{softmax}_i\bigl(-\lVert q-k_i\rVert_1\bigr)$$

During model prediction, the attention mechanism between the context set and the target set is calculated in the same way as between the independent variables of the training context set and the training target set during training. The attention mechanism is generated in the encoding stage; the decoder receives it as an additional input in the decoding stage and uses it for both model optimization and prediction.
CN202110344489.4A 2021-03-30 2021-03-30 Online prediction method for natural gas water dew point driven by technological parameters Active CN113051828B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110344489.4A CN113051828B (en) 2021-03-30 2021-03-30 Online prediction method for natural gas water dew point driven by technological parameters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110344489.4A CN113051828B (en) 2021-03-30 2021-03-30 Online prediction method for natural gas water dew point driven by technological parameters

Publications (2)

Publication Number Publication Date
CN113051828A true CN113051828A (en) 2021-06-29
CN113051828B CN113051828B (en) 2022-09-02

Family

ID=76516875

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110344489.4A Active CN113051828B (en) 2021-03-30 2021-03-30 Online prediction method for natural gas water dew point driven by technological parameters

Country Status (1)

Country Link
CN (1) CN113051828B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114970375A (en) * 2022-07-29 2022-08-30 山东飞扬化工有限公司 Rectification process monitoring method based on real-time sampling data
CN115393316A (en) * 2022-08-24 2022-11-25 维都利阀门有限公司 Flash valve with erosion state monitoring system and monitoring method thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150186790A1 (en) * 2013-12-31 2015-07-02 Soshoma Inc. Systems and Methods for Automatic Understanding of Consumer Evaluations of Product Attributes from Consumer-Generated Reviews
CN104931646A (en) * 2014-03-23 2015-09-23 阿斯派克国际(2015)私人有限公司 Means and methods for multimodality analysis and processing of drilling mud
CN108302329A (en) * 2018-01-25 2018-07-20 福建双环能源科技股份有限公司 A kind of dew point data exception detection method
CN109711428A (en) * 2018-11-20 2019-05-03 佛山科学技术学院 A kind of saturated gas pipeline internal corrosion speed predicting method and device
CN110467950A (en) * 2019-08-19 2019-11-19 中海石油深海开发有限公司 A kind of Gas Dehydration System and its dry gas water dew point test method
CN111625953A (en) * 2020-05-21 2020-09-04 中国石油大学(华东) Gas high-pressure isothermal adsorption curve prediction method and system, storage medium and terminal

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150186790A1 (en) * 2013-12-31 2015-07-02 Soshoma Inc. Systems and Methods for Automatic Understanding of Consumer Evaluations of Product Attributes from Consumer-Generated Reviews
CN104931646A (en) * 2014-03-23 2015-09-23 阿斯派克国际(2015)私人有限公司 Means and methods for multimodality analysis and processing of drilling mud
CN108302329A (en) * 2018-01-25 2018-07-20 福建双环能源科技股份有限公司 A kind of dew point data exception detection method
CN109711428A (en) * 2018-11-20 2019-05-03 佛山科学技术学院 A kind of saturated gas pipeline internal corrosion speed predicting method and device
CN110467950A (en) * 2019-08-19 2019-11-19 中海石油深海开发有限公司 A kind of Gas Dehydration System and its dry gas water dew point test method
CN111625953A (en) * 2020-05-21 2020-09-04 中国石油大学(华东) Gas high-pressure isothermal adsorption curve prediction method and system, storage medium and terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
M. HASSAN et al.: "Gas classification using binary decision tree classifier", 2014 IEEE International Symposium on Circuits and Systems (ISCAS) *
张德政 et al.: "Application of the decision tree C4.5 algorithm in natural gas transmission discrepancy analysis", 《计算机工程与应用》 (Computer Engineering and Applications) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114970375A (en) * 2022-07-29 2022-08-30 山东飞扬化工有限公司 Rectification process monitoring method based on real-time sampling data
CN114970375B (en) * 2022-07-29 2022-11-04 山东飞扬化工有限公司 Rectification process monitoring method based on real-time sampling data
CN115393316A (en) * 2022-08-24 2022-11-25 维都利阀门有限公司 Flash valve with erosion state monitoring system and monitoring method thereof

Also Published As

Publication number Publication date
CN113051828B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
CN109241291B (en) Knowledge graph optimal path query system and method based on deep reinforcement learning
CN113051828B (en) Online prediction method for natural gas water dew point driven by technological parameters
CN111291937A (en) Method for predicting quality of treated sewage based on combination of support vector classification and GRU neural network
CN113239504B (en) Pipeline corrosion defect prediction method based on optimized neural network
CN110689171A (en) Turbine health state prediction method based on E-LSTM
CN109919356B (en) BP neural network-based interval water demand prediction method
CN112884056A (en) Optimized LSTM neural network-based sewage quality prediction method
WO2021185044A1 (en) Heavy metal wastewater treatment process abnormal working condition intelligent monitoring method and apparatus based on transfer learning, and storage medium
CN110598929B (en) Wind power nonparametric probability interval ultrashort term prediction method
CN110245390B (en) Automobile engine oil consumption prediction method based on RS-BP neural network
CN111754034A (en) Time sequence prediction method based on chaos optimization neural network model
CN112735541A (en) Sewage treatment water quality prediction method based on simple circulation unit neural network
CN114777192B (en) Secondary network heat supply autonomous optimization regulation and control method based on data association and deep learning
CN112733273A (en) Method for determining Bayesian network parameters based on genetic algorithm and maximum likelihood estimation
CN116107279A (en) Flow industrial energy consumption multi-objective optimization method based on attention depth neural network
CN106292296B (en) Water island dosing On-Line Control Method and device based on GA SVM
CN109408896B (en) Multi-element intelligent real-time monitoring method for anaerobic sewage treatment gas production
Yang et al. Teacher-student uncertainty autoencoder for the process-relevant and quality-relevant fault detection in the industrial process
CN110096790B (en) Irregular defect magnetic flux leakage signal inversion method based on reinforcement learning
Wang et al. Cutting state estimation and time series prediction using deep learning for Cutter Suction Dredger
CN116663607A (en) Method for constructing oilfield separate layer water injection prediction model based on LSTM
CN113792919B (en) Wind power prediction method based on combination of transfer learning and deep learning
CN113111588B (en) NO of gas turbine X Emission concentration prediction method and device
CN115422945A (en) Rumor detection method and system integrating emotion mining
Qi et al. Using stacked auto-encoder and bi-directional LSTM for batch process quality prediction

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant