CN112749849A - Integrated learning online prediction method for key parameters of continuous catalytic reforming process

Info

Publication number: CN112749849A
Application number: CN202110068102.7A
Authority: CN
Inventors: 钟伟民; 杜文莉; 钱锋; 彭鑫; 李智
Original assignee: East China University of Science and Technology
Current assignee: East China University of Science and Technology
Priority date: 2021-01-19
Filing date: 2021-01-19
Publication date: 2021-05-04

Abstract

The invention relates to the field of continuous catalytic reforming processes, and more particularly to an integrated learning online prediction method for key parameters of a continuous catalytic reforming process. The method includes the following steps: S1, selecting auxiliary variables and main variables of the continuous catalytic reforming process; S2, preprocessing the process data sample set to generate a data set; S3, dividing the data set; S4, selecting spn data samples from the training data set as the overall data set, using the remaining data samples as the correction data set, and using the overall data set as the sub-training set to construct three adaptive regression sub-models; S5, extracting data samples from the correction data set in turn and updating the corresponding adaptive regression sub-model; S6, applying the evaluation data set to each adaptive regression sub-model to establish an integrated model; S7, predicting the main variable from the input auxiliary variables. The invention effectively improves the prediction accuracy of the model in the continuous catalytic reforming process.

Description

Integrated learning online prediction method for key parameters of continuous catalytic reforming process

Technical Field

The invention relates to the field of continuous catalytic reforming processes, in particular to an integrated learning online prediction method for key parameters of a continuous catalytic reforming process.

Background

In recent years, with the increasing demand for automotive oils and the increasing demand for energy, catalytic reforming technology has also attracted considerable attention. The continuous regeneration reforming technology of the catalyst has epoch-making significance in the development history of the catalyst, and provides infinite possibility for the high-efficiency production of high-quality gasoline and other chemical raw materials.

The continuous catalytic reforming process is mainly used in refineries to raise the octane number of gasoline and to produce aromatics-rich petrochemical feedstock. It is currently the most widely used naphtha reforming process in the world, and it allows catalyst to be withdrawn continuously while the reactors remain on stream.

Figure 1 shows a flow diagram of a continuous catalytic reforming process. As shown in Figure 1, heavy naphtha liquid is pumped by pump 300 to heat exchanger 400 and mixed with hydrogen-rich recycle gas from recycle compressor 600 during the catalytic cracking process.

The temperature of the liquid-gas feed mixture is raised to the reaction temperature in a heater (heater No. 1, 101, through heater No. 4, 104) before it enters each reactor (reactor No. 1, 201, through reactor No. 4, 204), to ensure complete vaporization of the mixture.

In the four reactors, several chemical reactions, mainly dehydrogenation, occur stably.

As the reactants flow rapidly through the catalyst bed, their temperature drops sharply, because the dominant dehydrogenation reactions are endothermic. The cycle of heating and reacting therefore repeats until the stream leaves reactor No. 4, 204, and enters the contacting and separation section.

The reforming catalyst is continuously consumed during operation: it comes into contact with various substances and undergoes chemical reactions, so its activity gradually decreases. The catalyst must therefore be regenerated continuously to maintain a high and stable activity.

The regenerator 500 functions to achieve continuous circulation of the catalyst and to complete regeneration of the catalyst at the same time.

In the separator 700, the hydrogen-rich gas is separated and returned to the recycle compressor 600.

In addition, the net hydrogen rich gas is used for additional hydrogen consuming petrochemical processes and the bottom product of the stabilizer column is a high octane liquid reformate.

The catalytic reforming process takes naphtha as a raw material, and improves the octane number of the naphtha through reforming reaction.

The octane number is a key parameter of the process, and online detection of the octane number of the reformed gasoline is important for improving economic benefit and for addressing pollution and energy safety concerns.

However, the continuous catalytic reforming process belongs to a very complicated industrial production technology and has the characteristics of large time lag, high coupling and nonlinearity.

In industrial engineering in recent decades, when variables cannot be directly obtained, soft measurement is widely applied to estimation of key performance indexes instead of hard measurement, and the performance is excellent in parameter prediction and process monitoring.

Many innovative soft measurement modeling methods such as artificial neural networks, gaussian process regression, and deep learning have been applied and developed.

However, traditional multivariate statistical methods such as principal component regression, partial least squares, and slow feature analysis are still popular in practical applications.

The partial least squares method is a common modeling technique because of its inherent structure and feature extraction capability, and it can project process data into low-dimensional latent variables, including the comprehensive information in the input and output data, and can well deal with the co-linearity between process variables. However, the partial least squares method is inherently limited by its linear characteristics, and cannot properly extract data characteristics when the data relationship is non-linear.

Therefore, a series of nonlinear improvement methods are proposed to integrate the nonlinear features in the linear partial least squares framework. These non-linear techniques combine neural network methods, kernel-based algorithms, and local modeling strategies.

Among them, LWPLS (local weighted partial least squares) is based on the combination of a sample weighting technique and a partial least squares method, and is one of the most widely applied local modeling methods.

However, because actual industrial processes are nonlinear and time-varying, a conventional soft measurement model may suffer severe performance degradation once the process no longer matches the conditions for which the model was originally designed.

While many new approaches have been used to alleviate this problem, each approach focuses only on certain aspects of the model features, and thus a comprehensive framework is needed to combine these features.

Disclosure of Invention

The invention aims to provide an integrated learning online prediction method for key parameters of a continuous catalytic reforming process, which solves the problems of poor prediction precision and poor adaptability of the key parameters of the conventional continuous catalytic reforming process.

In order to achieve the aim, the invention provides an integrated learning online prediction method for key parameters of a continuous catalytic reforming process, which comprises the following steps:

S1, selecting auxiliary variables and main variables of the continuous catalytic reforming process, and performing data acquisition to form a process data sample set, wherein the auxiliary variables reflect the operating condition of the continuous catalytic reforming process, and the main variables reflect the quality of the finished product;

S2, preprocessing the collected process data sample set of the continuous catalytic reforming process to generate a data set;

S3, dividing the data set into a training data set, an evaluation data set and a test data set;

S4, selecting spn data samples from the training data set as an overall data set, using the remaining data samples as a correction data set, and using the overall data set as the sub-training set of each of three adaptive regression submodels to be constructed, wherein the three adaptive regression submodels comprise a moving window model based on locally weighted partial least squares, a time difference model based on locally weighted partial least squares, and a just-in-time learning model based on locally weighted partial least squares;

S5, sequentially extracting data samples from the correction data set, verifying each adaptive regression submodel, adding the current data sample to the sub-training set of the adaptive regression submodel with the smallest estimation error, and updating the corresponding adaptive regression submodel;

S6, applying the evaluation data set to each adaptive regression submodel, calculating the root mean square error of the predicted values of each adaptive regression submodel, determining the weight coefficient of each adaptive regression submodel by Bayesian estimation, and establishing an integrated model;

S7, predicting the final output value of the main variable from the input auxiliary variables using the constructed integrated model.

In one embodiment, in step S1:

the auxiliary variables include the re-contact temperature, separation tank temperature, hydrocracked naphtha, reactor temperature, separation tank pressure, recycle hydrogen flow, primary alkane content, compressor pressure, reflux flow, tray temperature, bottom temperature, and reactor outlet temperature;

the main variable is the octane number, and the corresponding expression is as follows,

[equation not reproduced in this text]

wherein Y_RON and two further quantities, whose symbols are likewise not reproduced, are measured output parameters of the continuous catalytic reforming process, and X_Feed is a feed variable of the continuous catalytic reforming process.

In an embodiment, the preprocessing method in step S2 further includes:

removing abnormal values and missing values, and removing measurement noise.

In an embodiment, in the step S4, based on the moving window model of the local weighted partial least squares, the modeling process is as follows:

S411, setting {X₁, Y₁} as the training data matrix in the initial window of size H, where X₁ is the first H input samples in the selected overall data set and Y₁ is the first H output samples in the selected overall data set;

S412, establishing a locally weighted partial least squares regression model to predict the outputs of the sampled data points following the window;

S413, moving the window forward by the step length D and retraining the locally weighted partial least squares regression model on {X_w, Y_w} in the new window, where X_w is the H input samples after w steps of length D in the overall data set and Y_w is the corresponding H output samples.

In an embodiment, in the step S4, based on the time difference model of the local weighted partial least squares, the modeling process is as follows:

S421, calculating, for adjacent sampled data points, the first-order difference Δx(t) of the input variables and the first-order difference Δy(t) of the output variables;

S422, constructing a relation model between Δx(t) and Δy(t) using a locally weighted partial least squares regression model;

S423, calculating the first-order difference Δx_q(t) of a query sample x_q(t);

S424, inputting Δx_q(t) into the relation model to predict the difference value Δy_p(t) of the response variable;

S425, calculating the predicted value y_p(t) of the response variable from the value y_p(t-1) at the adjacent sampled data point and the difference value Δy_p(t), according to the expression

y_p(t) = Δy_p(t) + y_p(t-1).

In an embodiment, in step S4, the modeling process of the just-in-time learning model based on locally weighted partial least squares is as follows:

S431, when a query sample x_q ∈ R^m arrives, calculating the distance between x_q and each x_i,

wherein the data set consists of n data sample points (x_i, y_i), i = 1 to n, x ∈ R^m and y ∈ R^l denote the input and output training sets respectively, and x_i denotes the i-th input training value;

S432, selecting the N data samples with the smallest distance to form a related sample set;

S433, constructing a locally weighted partial least squares regression model based on the related sample set;

S434, obtaining the predicted value y_p of the response variable corresponding to the query sample x_q from the LWPLS regression model.

In one embodiment, in step S5:

the estimation error is an error value between the predicted value and the actual value;

the data sample with the smallest estimated error value is added to the corresponding sub-training set.

In one embodiment, in step S6, the root mean square error of the predicted value of each adaptive regression submodel is calculated as follows:

the root mean square error RMSE_i of the predicted values of the i-th adaptive regression submodel is

$$\mathrm{RMSE}_i = \sqrt{\frac{1}{D}\sum_{d=1}^{D}\left(y_d - y_{dp,i}\right)^2},$$

wherein y_d represents the actual value, y_{dp,i} represents the predicted value of the i-th adaptive regression submodel, and D represents the number of evaluation samples in the evaluation data set.

In one embodiment, the output y_p of the integrated model in step S6 is expressed as:

$$y_p = \sum_{i=1}^{I} P(M_i \mid C)\, y_{p,i},$$

wherein y_{p,i} represents the predicted output value of the i-th adaptive regression submodel, C is the current process state, P(M_i | C) is the probability of the i-th adaptive regression submodel M_i given the current process state C, and M₁, M₂, ..., M_I are the I different adaptive regression submodels.

In one embodiment, the probability P(M_i | C) in step S6 of the i-th adaptive regression submodel M_i given the current process state C is expressed as:

$$P(M_i \mid C) = \frac{P(C \mid M_i)\,P(M_i)}{\sum_{j=1}^{I} P(C \mid M_j)\,P(M_j)},$$

wherein P(M_i) is the prior probability of the i-th adaptive regression submodel, P(C | M_i) is the probability of the current process state C given the i-th adaptive regression submodel M_i, p is a tuning parameter, and RMSE_i is the root mean square error of the predicted values of the i-th adaptive regression submodel;

the expression for P(C | M_i) is [equation not reproduced in this text];

the expression for P(M_i) is [equation not reproduced in this text];

the expression for r_i is [equation not reproduced in this text].

The integrated learning online prediction method for key parameters of the continuous catalytic reforming process provided by the invention builds MW, TD and JITL adaptive regression submodels on the LWPLS regression algorithm and fuses their outputs with a Bayesian method to obtain an integrated model. The submodels respectively address the nonlinear, time-varying and abrupt-change characteristics encountered in soft measurement of the key parameters of the process, so the performance and efficiency of the respective adaptive regression submodels are greatly improved, the method can adapt to complex environments, and the prediction accuracy is effectively improved.

Drawings

The above and other features, properties and advantages of the present invention will become more apparent from the following description of the embodiments with reference to the accompanying drawings in which like reference numerals denote like features throughout the several views, wherein:

FIG. 1 discloses a flow diagram of a continuous catalytic reforming process;

FIG. 2 discloses a flow chart of a continuous catalytic reforming process key parameter ensemble learning online prediction method according to an embodiment of the present invention;

FIG. 3 discloses a modeling flow diagram of an integration model according to an embodiment of the invention;

FIG. 4a discloses a flow chart of modeling a moving window model based on local weighted partial least squares according to an embodiment of the present invention;

FIG. 4b discloses a flow chart of modeling a local weighted partial least squares based time difference model according to an embodiment of the invention;

FIG. 5 discloses a flow chart of modeling a just-in-time learning model based on local weighted partial least squares according to an embodiment of the invention;

FIG. 6 discloses a flow chart of a method for partitioning a training data set according to an embodiment of the invention;

FIG. 7 shows the octane number prediction plots of the five models.

The meanings of the reference symbols in the figures are as follows:

101 No. 1 heater;

102 No. 2 heater;

103 No. 3 heater;

104 No. 4 heater;

201 No. 1 reactor;

202 No. 2 reactor;

203 No. 3 reactor;

204 No. 4 reactor;

300 pump;

400 heat exchanger;

500 CCR regenerator;

600 recycle compressor;

700 separator.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

To address the nonlinear, time-varying and abrupt-change characteristics of the continuous catalytic reforming process, the invention provides an integrated learning online prediction method for key parameters of the process based on adaptive locally weighted partial least squares. The method analyzes the mechanism of the continuous catalytic reforming process and selects suitable auxiliary variables; divides the training data set into an overall data set and a correction data set; trains a moving window model, a time difference model and a just-in-time learning model, each based on locally weighted partial least squares; and combines the three models using Bayes' theorem, so that all characteristic information is taken into account when the prediction outputs are fused. The resulting soft measurement integrated model is more cost-effective, suits complex environments, correctly predicts changes in the main variable (the octane number), and improves the efficiency and accuracy of the model.

Fig. 2 shows a flowchart of the integrated learning online prediction method for key parameters of a continuous catalytic reforming process according to an embodiment of the present invention, and Fig. 3 shows a modeling flowchart of the integrated model according to an embodiment of the present invention. As shown in Fig. 2 and Fig. 3, the method provided by the present invention comprises steps S1 to S7.

Each step is described in detail below.

S1, selecting auxiliary variables and main variables of the continuous catalytic reforming process, and performing data acquisition to form a process data sample set, wherein the auxiliary variables reflect the operating condition of the continuous catalytic reforming process, and the main variables reflect the quality of the finished product.

Suitable auxiliary variables and main variables are selected according to the mechanism of the continuous catalytic reforming process.

The auxiliary and main variables can be identified directly from the actual continuous catalytic reforming process.

The auxiliary variables must reflect the operating conditions of the continuous catalytic reforming process.

Selected auxiliary variables include, but are not limited to:

the re-contact temperature, separation tank temperature, hydrocracked naphtha, reactor temperature, separation tank pressure, recycle hydrogen flow, primary alkane content, compressor pressure, reflux flow, tray temperature, bottom temperature, and the outlet temperature of reactor I.

The main variable must reflect the quality of the finished product.

The octane number reflects the quality of the finished product and is an important performance index.

Thus, the octane number is chosen as the main variable.

The octane number can be derived by the following formula:

[equation not reproduced in this text]

wherein Y_RON and two further quantities, whose symbols are likewise not reproduced, are the 3 measured outputs of the continuous catalytic reforming process, Y_RON is the research octane number, and X_Feed is a feed variable of the continuous catalytic reforming process.

S2, preprocessing the collected process data sample set of the continuous catalytic reforming process to generate a data set.

The measurement of continuous catalytic reforming industrial process variables often contains measurement noise and therefore requires pre-processing of the collected continuous catalytic reforming process data sample sets.

The pretreatment method comprises the following steps: removing abnormal values and missing values, and removing measurement noise.

In this embodiment, outliers are removed by applying the 3-sigma criterion, also called the Pauta (Laida) criterion. The criterion assumes that a set of measurements contains only random error; the standard deviation of the data is calculated and an interval is defined according to a chosen probability. An error that falls outside this interval is regarded as a gross error rather than a random error, and the data containing it are removed.
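The 3-sigma screening described above can be sketched as follows. This is a minimal illustration and not the patent's implementation: the function name `three_sigma_mask`, the per-column treatment, the use of the population standard deviation, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def three_sigma_mask(X):
    """Boolean mask of rows whose every column lies within mean +/- 3*std.

    Hypothetical helper: statistics are taken per column and ignore NaN.
    """
    X = np.asarray(X, dtype=float)
    mu = np.nanmean(X, axis=0)
    sigma = np.nanstd(X, axis=0)
    # A NaN entry compares as False, so rows with missing values are dropped too
    within = np.abs(X - mu) <= 3.0 * sigma
    return within.all(axis=1)

# Synthetic process data: 200 samples of 3 variables (illustrative values only)
rng = np.random.default_rng(0)
data = rng.normal(loc=100.0, scale=1.0, size=(200, 3))
data[10, 1] = 150.0    # injected gross error
data[20, 2] = np.nan   # injected missing value

clean = data[three_sigma_mask(data)]
```

Because a single pass uses a mean and standard deviation that are themselves inflated by the outliers, a practical pipeline might repeat the screening until no further rows are removed; that refinement is left out of the sketch.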

In this embodiment, measurement noise, treated as white noise, is removed using a Gaussian smoothing filter.

The expression of the Gaussian smoothing filter is as follows:

$$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),$$

where μ represents the mean of the data sample x and σ represents the standard deviation of the data sample x.
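As an illustration of Gaussian smoothing applied to a sampled process signal, the sketch below convolves a 1-D series with a normalized Gaussian kernel. It is an assumption-laden example rather than the patent's filter: here `sigma` is a kernel width measured in samples, the kernel is truncated at three standard deviations, and the edges are padded by repeating the end values; the names `gaussian_kernel` and `gaussian_smooth` are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Discrete Gaussian weights centred at zero, normalized to sum to 1."""
    radius = int(3 * sigma) if radius is None else radius
    offsets = np.arange(-radius, radius + 1)
    weights = np.exp(-offsets**2 / (2.0 * sigma**2))
    return weights / weights.sum()

def gaussian_smooth(signal, sigma=2.0):
    """Smooth a 1-D signal; the output has the same length as the input."""
    signal = np.asarray(signal, dtype=float)
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    # Repeat the end values so the filtered series keeps its length
    padded = np.pad(signal, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Noisy synthetic measurement (illustrative only)
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 300)
truth = np.sin(2.0 * np.pi * t)
noisy = truth + rng.normal(scale=0.2, size=t.size)
smooth = gaussian_smooth(noisy, sigma=3.0)
```

A wider kernel removes more noise but also blurs genuine fast changes in the process, so the width would normally be tuned against the sampling rate of the plant data.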

S3, dividing the data set into a training data set, an evaluation data set and a test data set.

By dividing the preprocessed process data sample set, an accurate model is constructed and the accuracy of the model can be evaluated.

Typically, 45% of the data set is divided into the training data set, 10% into the evaluation data set, and the remaining 45% into the test data set.

The training data set is used to train the model: input data and the corresponding output data are supplied to the model so that it learns the relation between them.

The evaluation data set is used to assess how well the model has been trained, for example through the classification accuracy of a classifier or the prediction error, and the best model can be selected according to performance on the evaluation data set.

The test data set is used to test the final model after training is complete: new input data are fed to the final model, which produces and outputs the prediction results.
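The 45% / 10% / 45% division can be sketched as below. The text does not say whether the samples are shuffled; this example assumes a chronological split, which is a common choice for time-ordered process data, and the function name `split_dataset` is hypothetical.

```python
import numpy as np

def split_dataset(X, y, train_frac=0.45, eval_frac=0.10):
    """Split time-ordered samples into training, evaluation and test sets.

    Assumption: the order of the samples is preserved (no shuffling).
    """
    n = len(X)
    n_train = int(round(n * train_frac))
    n_eval = int(round(n * eval_frac))
    train = (X[:n_train], y[:n_train])
    evaluation = (X[n_train:n_train + n_eval], y[n_train:n_train + n_eval])
    test = (X[n_train + n_eval:], y[n_train + n_eval:])
    return train, evaluation, test

# 100 illustrative samples with 2 auxiliary variables each
X = np.arange(200, dtype=float).reshape(100, 2)
y = np.arange(100, dtype=float)
train, evaluation, test = split_dataset(X, y)
```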

S4, selecting spn data samples from the training data set as an overall data set, using the remaining data samples as a correction data set, and using the overall data set as the sub-training set of each of three adaptive regression submodels to be constructed, wherein the three adaptive regression submodels comprise a moving window model based on locally weighted partial least squares (MW-LWPLS, Moving Window-LWPLS), a time difference model based on locally weighted partial least squares (TD-LWPLS, Time Difference-LWPLS), and a just-in-time learning model based on locally weighted partial least squares (JITL-LWPLS, Just-in-Time Learning-LWPLS).

First, spn data samples are selected from the training data set as the overall data set, and the remaining data samples are used as the correction data set.

Based on the overall data set, three adaptive regression sub-models are constructed.

Three adaptive regression submodels, including:

a moving window model based on local weighted partial least squares (MW-LWPLS);

a time difference model based on local weighted partial least squares (TD-LWPLS);

a local weighted partial least squares based just-in-time learning model (JITL-LWPLS).

The size of the initially partitioned overall data set determines to a large extent the performance of the final regression integration model, and thus the optimal value of the spn parameter needs to be determined by trial and error.

Fig. 4a shows a modeling flow chart of the moving window model based on locally weighted partial least squares according to an embodiment of the present invention. As shown in Fig. 4a, the modeling process of the moving window model is as follows:

S411, setting {X₁, Y₁} as the training data matrix in the initial window of size H, where X₁ is the first H input samples in the selected overall data set and Y₁ is the first H output samples in the selected overall data set;

S412, establishing an LWPLS regression model to predict the outputs of the sampled data points following the window;

S413, moving the window forward by the step length D and retraining the LWPLS regression model on {X_w, Y_w} in the new window, where X_w is the H input samples after w steps of length D in the overall data set and Y_w is the corresponding H output samples.

The key step of the moving window method is to determine the appropriate window size H according to the actual process, and usually a trial and error method is used to determine the window size H.

Further, for convenience, the step size D of the moving window is set to 1.
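The sliding-and-retraining loop of the moving window scheme can be sketched as follows. To keep the example short and self-contained, ordinary least squares is swapped in as a stand-in for the LWPLS regression model; that substitution, the function names and the toy data are assumptions, and only the window mechanics (size H, step D) follow the text.

```python
import numpy as np

def fit_linear(X, y):
    """Stand-in for LWPLS: ordinary least squares with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def moving_window_predict(X, y, H, D=1):
    """Train on a window of H samples, predict the next D samples, slide by D."""
    preds = np.full(len(y), np.nan)       # the first H samples have no prediction
    for start in range(0, len(y) - H, D):
        stop = start + H
        coef = fit_linear(X[start:stop], y[start:stop])   # retrain on {X_w, Y_w}
        nxt = slice(stop, min(stop + D, len(y)))
        preds[nxt] = predict_linear(coef, X[nxt])
    return preds

# Noise-free toy process (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = 2.0 * X[:, 0] - X[:, 1] + 3.0
preds = moving_window_predict(X, y, H=20, D=1)
```

With D = 1 the model is refitted before every new sample, which tracks slow drift most closely at the highest computational cost; a larger D trades tracking speed for fewer refits.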

Fig. 4b shows a modeling flow chart of the time difference model based on locally weighted partial least squares according to an embodiment of the present invention. As shown in Fig. 4b, the time difference model is built on the time differences of the variables, and the modeling process is as follows:

S421, calculating, for adjacent sampled data points, the first-order difference Δx(t) of the input variables and the first-order difference Δy(t) of the output variables, according to the expressions

Δx(t) = x(t) - x(t-1);

Δy(t) = y(t) - y(t-1);

S422, constructing a relation model between Δx(t) and Δy(t) using an LWPLS regression model, according to the expression

Δy(t) = f(Δx(t)),

where f denotes the fitted LWPLS regression model function;

S423, calculating the first-order difference Δx_q(t) of a query sample x_q(t), according to the expression

Δx_q(t) = x_q(t) - x_q(t-1);

S424, inputting Δx_q(t) into the relation model to predict the difference value Δy_p(t) of the response variable, according to the expression

Δy_p(t) = f(Δx_q(t));

S425, calculating the predicted value y_p(t) of the response variable from the value y_p(t-1) at the adjacent sampled data point and the difference value Δy_p(t), according to the expression

y_p(t) = Δy_p(t) + y_p(t-1).
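Steps S421 to S425 can be sketched as below. As a simplification, the relation model f is a least-squares fit without intercept standing in for LWPLS, and the previous output `y_prev` is passed in by the caller; both choices, along with the function name `td_predict`, are assumptions for illustration.

```python
import numpy as np

def td_predict(X_hist, y_hist, x_q_now, x_q_prev, y_prev):
    """Time-difference prediction of y at the current query time."""
    dX = np.diff(X_hist, axis=0)          # S421: dx(t) = x(t) - x(t-1)
    dy = np.diff(y_hist)                  # S421: dy(t) = y(t) - y(t-1)
    # S422: relation model dy = f(dx); least squares stands in for LWPLS
    coef, *_ = np.linalg.lstsq(dX, dy, rcond=None)
    dx_q = x_q_now - x_q_prev             # S423: difference of the query sample
    dy_p = dx_q @ coef                    # S424: predicted output difference
    return y_prev + dy_p                  # S425: y_p(t) = dy_p(t) + y_p(t-1)

# Toy process with a constant offset of 5 (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([2.0, -1.0]) + 5.0
y_p = td_predict(X[:40], y[:40], X[40], X[39], y[39])
```

Because only differences are modelled, the constant offset cancels out of the fit, which is why a difference model can remain usable when the output level shifts.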

The most critical step in the just-in-time learning framework is the selection of the neighborhood sample set used for model construction.

Similarity is generally evaluated using a distance-based metric, namely the Euclidean distance. The greater the distance, the lower the similarity between a historical sample point and the query sample, and vice versa.

Fig. 5 shows a modeling flow chart of the just-in-time learning model based on locally weighted partial least squares according to an embodiment of the present invention. As shown in Fig. 5, the modeling process of the just-in-time learning model is as follows:

Suppose a data set consists of n data sample points (x_i, y_i), i = 1 to n, where x ∈ R^m and y ∈ R^l denote the input and output training sets, respectively.

S431, measuring the similarity.

When a query sample x_q ∈ R^m arrives, the distance between x_q and each x_i is calculated as

$$d_i = \lVert x_i - x_q \rVert_2 = \sqrt{(x_i - x_q)^{\mathsf T}(x_i - x_q)}, \quad i = 1, \dots, n,$$

wherein x_i represents the i-th input training value.

The calculated distances d_i are sorted in ascending order.

S432, forming the related sample set.

The N data samples with the smallest distance, i.e. the highest similarity, are selected to form a related sample set for building the LWPLS regression model, because these data samples are the most similar to the query sample x_q.

The related sample set is {X_local, Y_local} = {(x₁, y₁), ..., (x_i, y_i), ..., (x_N, y_N)}, where N represents the number of samples used to build the LWPLS regression model.

S433, constructing an LWPLS regression model based on the related sample set, expressed as

y_i = f(x_i),

wherein (x_i, y_i) are the data sample points of the related sample set;

S434, obtaining the predicted value of the response variable corresponding to the query sample x_q from the LWPLS regression model and outputting it.

For the query sample x_q, the LWPLS regression model gives

y_p = f(x_q),

wherein x_q is the query sample and y_p is the predicted value of the response variable.

The LWPLS regression model is rebuilt for every prediction: once the prediction has been output, the constructed model is discarded, and a new LWPLS regression model is built when the next query sample arrives.
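Steps S431 to S434 can be sketched as a single function that is called afresh for every query. The local model here is an ordinary least-squares fit standing in for LWPLS, and the function name `jitl_predict` and the sine-curve data are assumptions chosen only to show a local model following a nonlinear relation.

```python
import numpy as np

def jitl_predict(X, y, x_q, N):
    """Just-in-time prediction: build a local model around x_q, use it once."""
    d = np.linalg.norm(X - x_q, axis=1)       # S431: Euclidean distances d_i
    nearest = np.argsort(d)[:N]               # S432: N most similar samples
    X_local, y_local = X[nearest], y[nearest]
    # S433: local regression model (least squares stands in for LWPLS)
    A = np.column_stack([np.ones(len(nearest)), X_local])
    coef, *_ = np.linalg.lstsq(A, y_local, rcond=None)
    # S434: predict for the query; the local model is then discarded
    return float(np.concatenate([[1.0], x_q]) @ coef)

# Nonlinear toy relation y = sin(x) (illustrative only)
X = np.linspace(0.0, 2.0 * np.pi, 200).reshape(-1, 1)
y = np.sin(X[:, 0])
x_q = np.array([1.0])
y_local_fit = jitl_predict(X, y, x_q, N=10)         # local model
y_global_fit = jitl_predict(X, y, x_q, N=len(X))    # all samples: one global line
```

Comparing the two results against sin(1.0) shows the point of the local model: with N equal to the full data set the fit degenerates into a single straight line, while a small N tracks the curve near the query.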

Notably, the window size in the MW-LWPLS submodel and the local area size in the JITL-LWPLS submodel must not exceed spn; otherwise the model fails.

The optimal values for the window size in the MW-LWPLS sub-model and the local area size in the JITL-LWPLS sub-model also need to be determined by trial and error.

When the overall data set is used as the sub-training set of the three adaptive regression submodels, it must not only cover all of the nonlinear, time-varying and abrupt-change characteristics, but each sub-training set must also suit the characteristic specific to its submodel among MW-LWPLS, TD-LWPLS and JITL-LWPLS:

time-varying characteristics dominate the sub-training set of the MW-LWPLS adaptive regression submodel;

abrupt-change characteristics dominate the sub-training set of the TD-LWPLS adaptive regression submodel;

nonlinear characteristics dominate the sub-training set of the JITL-LWPLS adaptive regression submodel.

Therefore, in the embodiment of the present invention, first, the total data set is respectively used as the sub-training sets of the three adaptive regression sub-models of MW-LWPLS, TD-LWPLS and JITL-LWPLS, so that the sub-training sets of the three adaptive regression sub-models can cover all the features;

and then, by updating the sub-training sets, the sub-training sets of the three self-adaptive regression sub-models can respectively obtain data of corresponding characteristics.

And S5, sequentially extracting data samples from the correction data set, verifying each self-adaptive regression sub-model, calculating a corresponding estimation error, adding the current data sample into a sub-training set of the self-adaptive regression sub-model with the minimum estimation error value, and updating the corresponding self-adaptive regression sub-model.

Fig. 6 is a flowchart of a method for dividing a training data set according to an embodiment of the present invention, and as shown in fig. 6, data samples of a correction data set are added to three adaptive regression submodels, MW-LWPLS, TD-LWPLS and JITL-LWPLS, for training and prediction, and corresponding estimation errors are calculated and sorted in ascending order.

The estimation error is an error between the calculated predicted value and the actual value.

The data sample is then appended to the end of the sub-training set of whichever of the three adaptive regression sub-models gives the smallest estimation error for it.

The estimation error serves as a performance index describing how well a model handles a given process characteristic: the smaller the error, the better the corresponding adaptive regression sub-model explains that particular characteristic.

Thus, a data sample carrying a particular characteristic is best explained by the adaptive regression sub-model that predicts it with the smallest estimation error, and such samples are added to the sub-training set of that sub-model.
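
The allocation rule of step S5 can be illustrated with a minimal sketch. It assumes the absolute difference between prediction and actual value as the estimation error, and it uses three simple stand-in predictors (`mean_of_recent`, `last_value`, `nearest_neighbour`) that are hypothetical placeholders, not the MW-LWPLS, TD-LWPLS and JITL-LWPLS sub-models themselves. All function and variable names are illustrative.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (auxiliary variables x, main variable y)
Predictor = Callable[[List[Sample], Sequence[float]], float]


def allocate_correction_samples(
    predictors: List[Predictor],
    sub_training_sets: List[List[Sample]],
    correction_set: List[Sample],
) -> List[int]:
    """Route each correction sample to the sub-model that predicts it best.

    Returns, for every correction sample, the index of the sub-model whose
    sub-training set received it.
    """
    assignments = []
    for x, y in correction_set:
        # Assumed error measure: absolute difference between prediction and actual value.
        errors = [
            abs(predict(train, x) - y)
            for predict, train in zip(predictors, sub_training_sets)
        ]
        best = min(range(len(errors)), key=errors.__getitem__)
        # Append to the end of the winning sub-training set. The stand-in
        # predictors read the set on every call, so this also updates the sub-model.
        sub_training_sets[best].append((x, y))
        assignments.append(best)
    return assignments


# Hypothetical stand-ins for the three sub-models (NOT the real LWPLS variants).
def mean_of_recent(train, x, window=3):
    recent = train[-window:]
    return sum(y for _, y in recent) / len(recent)


def last_value(train, x):
    return train[-1][1]


def nearest_neighbour(train, x):
    def sq_dist(sample):
        return sum((a - b) ** 2 for a, b in zip(sample[0], x))
    return min(train, key=sq_dist)[1]


overall = [([0.0], 1.0), ([1.0], 2.0), ([2.0], 3.0)]
sub_sets = [list(overall) for _ in range(3)]  # every sub-model starts from the overall data set
correction = [([1.0], 2.0), ([3.0], 3.0)]
assigned = allocate_correction_samples(
    [mean_of_recent, last_value, nearest_neighbour], sub_sets, correction
)
```

In this sketch, ties are resolved in favour of the first sub-model in the list; the source does not state a tie-breaking rule.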

S6: The evaluation data set is applied to each adaptive regression sub-model, the root mean square error of the predicted values of each sub-model is calculated, the weight coefficient of each sub-model is determined by Bayesian estimation, and the integrated model is established.

The evaluation samples of the evaluation data set are applied to each of the adaptive regression sub-models.

The root mean square error (RMSE) of the predicted values of each sub-model is calculated, the weight coefficients are determined by Bayesian estimation, and the integrated model used to predict the output is established.

Assume that there are I different adaptive regression sub-models, denoted M_1, M_2, ..., M_I respectively. In this embodiment, I = 3.

The current process state is denoted C. The probability that the current process state C corresponds to the i-th adaptive regression sub-model M_i is P(M_i | C), which can serve as the weight of the prediction of each adaptive regression sub-model in the fused output.

M_i and C are symbols introduced here to describe the integrated model more clearly.

Then, the output y_p of the integrated model is the probability-weighted sum of the sub-model outputs:

$$y_p = \sum_{i=1}^{I} P(M_i \mid C)\, y_{p,i}$$

where y_{p,i} is the predicted output value of the i-th adaptive regression sub-model, and P(M_i | C) is the probability that the current process state C corresponds to the i-th adaptive regression sub-model M_i.

P(M_i | C) can be calculated by Bayes' rule:

$$P(M_i \mid C) = \frac{P(C \mid M_i)\, P(M_i)}{\sum_{j=1}^{I} P(C \mid M_j)\, P(M_j)}$$

where P(C | M_i) is the probability of the current process state C given the i-th adaptive regression sub-model M_i, and P(M_i) is the prior probability of the i-th adaptive regression sub-model.

Assume that the prior probabilities of all adaptive regression sub-models are the same, i.e.

$$P(M_i) = \frac{1}{I}, \quad i = 1, 2, \ldots, I$$
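
With equal priors, Bayes' rule reduces the weight of each sub-model to its likelihood divided by the sum of all likelihoods. The following minimal sketch shows that normalisation and the weighted fusion of the sub-model outputs. The likelihood values P(C | M_i) are taken as given inputs and the numbers used are made up for illustration; the function names are hypothetical.

```python
from typing import List, Optional, Sequence


def posterior_weights(
    likelihoods: Sequence[float], priors: Optional[Sequence[float]] = None
) -> List[float]:
    """Bayes' rule: P(M_i | C) = P(C | M_i) P(M_i) / sum_j P(C | M_j) P(M_j)."""
    n = len(likelihoods)
    if priors is None:
        priors = [1.0 / n] * n  # equal priors, P(M_i) = 1 / I
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]


def fuse(predictions: Sequence[float], weights: Sequence[float]) -> float:
    """Integrated output: weighted sum of the sub-model predictions."""
    return sum(w * y for w, y in zip(weights, predictions))


# Made-up likelihoods P(C | M_i) and sub-model predictions for I = 3 (illustration only).
weights = posterior_weights([0.2, 0.3, 0.5])
y_p = fuse([90.0, 91.0, 92.0], weights)
```

Because the weights sum to one, the fused output always lies between the smallest and largest sub-model prediction.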

For a regression task, the Root Mean Square Error (RMSE) must be small if a high prediction accuracy is to be achieved, i.e. the prediction performance is inversely proportional to the value of RMSE.

The value of P(C | M_i) can be obtained by:

where r_i is given by the following expression,

and the parameter p is an adjustment coefficient chosen according to the actual application.

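The root mean square error RMSE_i of the predicted values of the i-th adaptive regression sub-model is calculated from the following formula:

$$\mathrm{RMSE}_i = \sqrt{\frac{1}{D} \sum_{d=1}^{D} \left( y_d - y_{dp,i} \right)^2}$$

where y_d is the actual value, y_{dp,i} is the predicted value of the i-th adaptive regression sub-model, and D is the number of evaluation samples in the evaluation data set.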
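
As a short illustration, the RMSE of one sub-model over the evaluation samples can be computed as sketched below, assuming the usual definition (square root of the mean squared difference between actual and predicted values). The function name and the sample values are illustrative only.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error over D evaluation samples."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up octane-like values for illustration.
example_rmse = rmse([90.0, 92.0, 94.0], [91.0, 92.0, 93.0])
```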

S7: The constructed integrated model is used to predict the final output value of the main variable based on the input auxiliary variables.

In this embodiment, the constructed integrated model is used to predict the main variable, the octane number, on the test data set, and the final output is obtained.

The invention provides an integrated learning online prediction method for key parameters of a continuous catalytic reforming process. When performing soft-measurement modeling with the integrated model, suitable auxiliary variables and main variables are selected according to the mechanism of the continuous catalytic reforming process; the acquired process data sample set is preprocessed and divided into a training data set, an evaluation data set and a test data set; the training data set is allocated among the sub-training sets of three adaptive regression sub-models, and the sub-training sets are updated so that each one both covers all of the process characteristics and is enriched in the characteristic its sub-model handles best; the evaluation data set is applied to each adaptive regression sub-model to calculate the RMSE of its predicted values; the weight coefficients are determined by Bayesian estimation to establish the integrated model; and the integrated model is used for prediction on the test data set to obtain the final output.

The integrated learning online prediction method based on adaptive locally weighted partial least squares is illustrated below by an embodiment that predicts a key parameter of the continuous catalytic reforming process. The specific steps are as follows:

Step S1: Data samples of the continuous catalytic reforming process are collected, and auxiliary variables and main variables are selected.

The collected data samples include 84 input measurements and 3 output measurements (RON, C₆(%) and C₇₊(%)); all measurements were sampled every 30 minutes.

The 84 input measurements are selected as auxiliary variables, including but not limited to the re-contact temperature, liquid separation tank temperature, hydrocracked naphtha, reactor temperature, separation tank pressure, recycle hydrogen flow rate, primary alkane content, compressor pressure, reflux flow rate, tray temperature, bottom temperature and reactor outlet temperature.

The octane number, as a performance index, is selected as the main variable, and can be derived by the following formula:

where X_Feed represents the feed variables of the continuous catalytic reforming process.

Step S2: Data preprocessing.

Because the data of the continuous catalytic reforming process contain measurement noise, noise elimination is required.

White noise is removed with a Gaussian smoothing filter, calculated as follows:

where μ represents the mean of the data x and σ represents the standard deviation of the data x.

Outliers are removed using the 3-sigma criterion, and missing values are removed at the same time.

A total of 737 data samples (approximately half a month of operation) were finally obtained.
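
The 3-sigma screening of step S2 can be sketched for a single measured variable as follows. The sketch assumes that missing values are represented by `None` and that the population standard deviation is used; neither detail is specified in the source, and the function name is illustrative. In the embodiment the same screening would be applied across the measured variables.

```python
import statistics
from typing import List, Optional, Sequence


def three_sigma_filter(values: Sequence[Optional[float]]) -> List[float]:
    """Drop missing values, then drop points farther than 3 sigma from the mean."""
    present = [v for v in values if v is not None]  # remove missing values
    mu = statistics.fmean(present)
    sigma = statistics.pstdev(present)  # assumption: population standard deviation
    return [v for v in present if abs(v - mu) <= 3 * sigma]


# Made-up series: twenty normal readings, one missing value, one gross outlier.
raw = [10.0] * 20 + [None, 100.0]
cleaned = three_sigma_filter(raw)
```

Note that the mean and standard deviation are themselves computed with the outlier included, so a single extreme point can only be rejected when enough regular samples are present.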

Step S3: The sample data are divided into three parts: 350 samples are used as the training data set, 50 samples as the evaluation data set, and 337 samples as the test data set.

Steps S4-S7: The MW-LWPLS, TD-LWPLS and JITL-LWPLS sub-models are established and combined into the integrated model EA-LWPLS.

Meanwhile, an LWPLS model is established for comparison; the number of latent variables in the LWPLS model is set to 5 and the weight coefficient is set to 1;

the window size in the MW-LWPLS sub-model is set to 28;

for the integrated model EA-LWPLS, spn and the adjustment coefficient in the weight function are set to 280 and 0.5, respectively.
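
With these settings, the 350 training samples yield an overall data set of spn = 280 samples and a correction data set of the remaining 70 samples. The sketch below reproduces this bookkeeping. It assumes contiguous slices taken in time order, which the source does not state, and the function name is illustrative.

```python
from typing import List, Sequence, Tuple


def split_samples(
    samples: Sequence, n_train: int = 350, n_eval: int = 50, spn: int = 280
) -> Tuple[List, List, List, List]:
    """Split samples into overall, correction, evaluation and test sets.

    Assumption: contiguous slices in time order; the selection rule is not
    specified in the source.
    """
    data = list(samples)
    train = data[:n_train]
    evaluation = data[n_train:n_train + n_eval]
    test = data[n_train + n_eval:]
    overall, correction = train[:spn], train[spn:]
    return overall, correction, evaluation, test


parts = split_samples(range(737))  # 737 samples remain after preprocessing
sizes = [len(p) for p in parts]
```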

Of the three adaptive regression sub-models, the JITL-LWPLS sub-model, obtained by adding the JITL model, shows the most significant improvement, with an RMSE of 0.417 and an R² of 0.819, where the degree of fit R² can be calculated from the following formula:
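
For illustration, the degree of fit can be computed as sketched below, assuming the standard coefficient of determination (one minus the ratio of the residual sum of squares to the total sum of squares). Whether the source formula is written in exactly this form cannot be confirmed from the text; the function name and sample values are illustrative.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed standard form)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration.
r2 = r_squared([90.0, 92.0, 94.0], [91.0, 92.0, 93.0])
```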

Fig. 7 shows the octane number prediction comparison graphs of the five models. As shown in Fig. 7, the JITL model reflects the overall variation trend of the octane number well, but does not accurately reflect the actual changes in the octane number.

Although the MW-LWPLS and TD-LWPLS sub-models do not match the JITL-LWPLS sub-model in reflecting the variation trend, they can reflect the changes in octane number during the process. The results of the TD-LWPLS sub-model are closer to the actual process than those of the MW-LWPLS sub-model, especially after the 300th sample point, where the TD-LWPLS sub-model, although it does not fully track the amplitude of the process, reflects its fluctuations.

However, between the 50th and 100th sampling points, when the actual octane number drops to a minimum, the predicted value of the TD-LWPLS sub-model does not change significantly but remains relatively stable.

Compared with the TD-LWPLS sub-model, the MW-LWPLS sub-model better reflects the slow variation of the system.

Fig. 7(e) shows that the integrated model proposed by the present invention not only achieves the best prediction performance, with an RMSE of 0.385 and an R² of 0.846, but also accurately reflects the real trend of the octane number over time.

The integrated learning online prediction method for key parameters in the continuous catalytic reforming process provided by the invention effectively improves the prediction accuracy of the model in the continuous catalytic reforming process containing characteristics of nonlinearity, mutation, time variation and the like.

While, for purposes of simplicity of explanation, the methodologies are shown and described as a series of acts, it is to be understood and appreciated that the methodologies are not limited by the order of acts, as some acts may, in accordance with one or more embodiments, occur in different orders and/or concurrently with other acts from that shown and described herein or not shown and described herein, as would be understood by one skilled in the art.

As used in this application and the appended claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may also include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may include other steps or elements.

The embodiments described above are provided to enable persons skilled in the art to make or use the invention and that modifications or variations can be made to the embodiments described above by persons skilled in the art without departing from the inventive concept of the present invention, so that the scope of protection of the present invention is not limited by the embodiments described above but should be accorded the widest scope consistent with the innovative features set forth in the claims.

Claims

1. An integrated learning online prediction method for key parameters of a continuous catalytic reforming process, characterized by comprising the following steps:

S1. Select auxiliary variables and main variables of the continuous catalytic reforming process and collect data to form a process data sample set, wherein the auxiliary variables reflect the operating conditions of the continuous catalytic reforming process and the main variables reflect the quality of the finished product;

S2. Preprocess the process data sample set to generate a data set;

S3. Divide the data set into a training data set, an evaluation data set and a test data set;

S4. Select spn data samples from the training data set as the overall data set and use the remaining data samples as the correction data set; use the overall data set as the sub-training set of each of three adaptive regression sub-models to construct them, the three adaptive regression sub-models being a moving window model based on locally weighted partial least squares, a time difference model based on locally weighted partial least squares, and a just-in-time learning model based on locally weighted partial least squares;

S5. Extract data samples in turn from the correction data set, verify each adaptive regression sub-model, add the current data sample to the sub-training set of the adaptive regression sub-model with the smallest estimation error, and update the corresponding adaptive regression sub-model;

S6. Apply the evaluation data set to each adaptive regression sub-model, calculate the root mean square error of the predicted values of each adaptive regression sub-model, determine the weight coefficient of each adaptive regression sub-model by Bayesian estimation, and build an integrated model;

S7. Using the constructed integrated model, predict the final output value of the main variable based on the input auxiliary variables.

2. The integrated learning online prediction method for key parameters of a continuous catalytic reforming process according to claim 1, characterized in that, in step S1:

The auxiliary variables include recontact temperature, separator tank temperature, hydrocracked naphtha, naphtha, reactor temperature, knockout tank pressure, circulating hydrogen flow, primary paraffin content, compressor pressure, reflux flow, tray temperature, bottom temperature and reactor No. 1 outlet temperature;

The main variable is the octane number, and the corresponding expression is as follows,

where Y_RON and the remaining output terms are output measurement parameters of the continuous catalytic reforming process, and X_Feed is the feed variable of the continuous catalytic reforming process.

3. The integrated learning online prediction method for key parameters of a continuous catalytic reforming process according to claim 1, characterized in that the preprocessing in step S2 further comprises:

Removing outliers and missing values, and removing measurement noise.

4. The integrated learning online prediction method for key parameters of a continuous catalytic reforming process according to claim 1, characterized in that, in step S4, the modeling process of the moving window model based on locally weighted partial least squares is as follows:

S411. Set {X_1, Y_1} as the training data matrix in the initial window, whose size is H, where X_1 consists of the first H input variables in the selected overall data set and Y_1 consists of the first H output variables in the selected overall data set;

S412. Establish a locally weighted partial least squares regression model for the outputs of the sampled data points following the window;

S413. Move the window forward by a step size D and retrain the locally weighted partial least squares regression model on {X_w, Y_w} within the window, where X_w consists of the H input variables in the overall data set after w steps of size D have been performed, and Y_w consists of the H output variables in the overall data set after w steps of size D have been performed.

5. The integrated learning online prediction method for key parameters of a continuous catalytic reforming process according to claim 1, characterized in that, in step S4, the modeling process of the time difference model based on locally weighted partial least squares is as follows:

S421. Calculate the first-order difference Δx(t) between the input variables of adjacent sampled data points and the first-order difference Δy(t) between their output variables;

S422. Use a locally weighted partial least squares regression model to construct a relationship model between Δx(t) and Δy(t);

S423. Calculate the first-order difference Δx_q(t) of the query sample x_q(t);

S424. Input Δx_q(t) into the relationship model and predict the difference Δy_p(t) of the response variable;

S425. Based on the value y_p(t-1) at the adjacent sampled data point and the difference Δy_p(t), calculate the predicted value y_p(t) of the response variable as follows,

y_p(t) = Δy_p(t) + y_p(t-1).

6. The integrated learning online prediction method for key parameters of a continuous catalytic reforming process according to claim 1, characterized in that, in step S4, the modeling process of the just-in-time learning model based on locally weighted partial least squares is as follows:

S431. When a query sample x_q ∈ R^m arrives, traverse the input samples and calculate the distance between x_q and x_i,

where the data set consists of n data sample points (x_i, y_i), i = 1, ..., n; x ∈ R^m and y ∈ R^l denote the input and output training sets, respectively, and x_i denotes the i-th input training sample;

S432. Select the N data samples with the smallest distance to form a relevant sample set;

S433. Construct a locally weighted partial least squares regression model based on the relevant sample set;

S434. Obtain the predicted value of the response variable corresponding to the query sample x_q through the LWPLS regression model.

7. The integrated learning online prediction method for key parameters of a continuous catalytic reforming process according to claim 1, characterized in that, in step S5:

The estimation error is the error between the predicted value and the actual value.

8. The integrated learning online prediction method for key parameters of a continuous catalytic reforming process according to claim 1, characterized in that, in step S6, the root mean square error of the predicted values of each adaptive regression sub-model is calculated as follows:

The root mean square error RMSE_i of the predicted values of the i-th adaptive regression sub-model is

$$\mathrm{RMSE}_i = \sqrt{\frac{1}{D} \sum_{d=1}^{D} \left( y_d - y_{dp,i} \right)^2}$$

where y_d represents the actual value, y_{dp,i} represents the predicted value of the i-th adaptive regression sub-model, and D represents the number of evaluation samples in the evaluation data set.

9. The integrated learning online prediction method for key parameters of a continuous catalytic reforming process according to claim 8, characterized in that, in step S6, the output y_p of the integrated model is expressed as:

$$y_p = \sum_{i=1}^{I} P(M_i \mid C)\, y_{p,i}$$

where y_{p,i} represents the predicted output value of the i-th adaptive regression sub-model, C is the current process state, P(M_i | C) is the probability that the current process state C corresponds to the i-th adaptive regression sub-model M_i, and M_1, M_2, ..., M_I are the I different adaptive regression sub-models.

10. The integrated learning online prediction method for key parameters of a continuous catalytic reforming process according to claim 9, characterized in that, in step S6, the probability P(M_i | C) that the current process state C corresponds to the i-th adaptive regression sub-model M_i is expressed as:

$$P(M_i \mid C) = \frac{P(C \mid M_i)\, P(M_i)}{\sum_{j=1}^{I} P(C \mid M_j)\, P(M_j)}$$

where P(M_i) is the prior probability of the i-th adaptive regression sub-model, P(C | M_i) is the probability of the current process state C given the i-th adaptive regression sub-model M_i, p is the adjustment parameter, and RMSE_i is the root mean square error of the predicted values of the i-th adaptive regression sub-model;

The corresponding expression of P(C | M_i) is,

The corresponding expression of P(M_i) is,

The corresponding expression of r_i is,