CN113449476B

CN113449476B - Stacking-based soft measurement method for butane content in debutanizer

Info

Publication number: CN113449476B
Application number: CN202110771243.5A
Authority: CN
Inventors: 葛志强; 庄新镇; 孔祥印
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2021-07-08
Filing date: 2021-07-08
Publication date: 2022-07-05
Anticipated expiration: 2041-07-08
Also published as: CN113449476A

Abstract

The invention discloses a Stacking-based soft measurement method for butane content in a debutanizer. The method includes the steps of carrying out feature expansion on process variables at the current sampling moment through a time sliding window, solving the problem of process dynamics by using historical process data, enhancing the diversity among first-layer learners in an ensemble learning model by introducing a feature disturbance mechanism and adopting a non-homogeneous learner, and further combining the output of the first-layer learner by using a Stacking integration strategy and a second-layer learner to obtain a final predicted value of butane content.

Description

Stacking-based soft measurement method for butane content in debutanizer

Technical Field

The invention belongs to the field of prediction and soft measurement of industrial processes, and particularly relates to a soft measurement method for butane content in a debutanizer based on Stacking.

Background

Many variables in an actual industrial process are difficult or costly to measure directly, yet they often have a large effect on product quality. Industrial process soft measurement estimates the true value of such a variable by establishing a mathematical model between it and other variables that are easy to measure. Soft measurement techniques fall into two main types, mechanism-based modeling and data-based modeling, and the development of computer technology and machine learning has made data-based soft measurement increasingly widespread. Common data-based soft measurement methods include support vector machines, partial least squares, neural networks, and the like. These methods predict well on simple data sets, but as single models they perform poorly on soft measurement problems that involve process nonlinearity, non-Gaussian data distributions, and process dynamics.

The debutanization process is used in industrial oil refining processes to remove propane and butane from naphtha gases, and monitoring and controlling the butane content at the bottom of the debutanizer is important to maximize the production of liquefied petroleum gas. Gas chromatographs are used in the industry to monitor butane content, but the location of this hardware sensor installation can result in a delay of over 30 minutes in the measurement. Therefore, in order to realize real-time monitoring of the butane content, a soft measurement model of the butane content at the bottom of the debutanizer needs to be established so as to fully utilize other process data which can be monitored in real time in the debutanizing process. The debutanizing process has strong nonlinearity and dynamics, the butane content at the current moment and the process data at the previous sampling moments have close relation, and the traditional single prediction model is difficult to meet the requirements.

Patent publication No. CN103389360A discloses a soft measurement method for the butane content of a debutanizer based on a probabilistic principal component regression model; it introduces a probabilistic modeling method and provides a soft measurement model that can model process data and noise information simultaneously. Patent publication No. CN108647373A discloses "a method for soft measurement of industrial process based on xgboost model", which includes: performing independent repeated sampling and data preprocessing on historical data, building an xgboost model from the training samples, and establishing a soft measurement model for the target variable through cross validation and parameter adjustment. The methods reported in the above patents each rely on a single specific regression model. When applied to butane content prediction in a debutanizer, they cannot handle the dynamics and the nonlinearity of the debutanizer process data at the same time, so the prediction accuracy of the resulting butane content soft measurement model is difficult to guarantee.

Patent publication No. CN110066895A discloses "a method for predicting a blast furnace molten iron mass interval based on Stacking", which comprises: acquiring original historical data of a blast furnace and preprocessing the data; extracting a sample data set from the preprocessed blast furnace original historical data according to the input and output parameters; establishing a Stacking algorithm molten iron quality model based on an N-fold model and calculating a modeling error prediction interval; and predicting the real-time collected blast furnace data according to the Stacking algorithm molten iron quality model of the N-fold model to obtain a predicted value and a predicted interval. In fact, the first-layer learner in the Stacking model reported in the patent only has one random weight neural network, and the second-layer learner adopts the same model as the first-layer learner, so that the advantages of the Stacking model cannot be well utilized.

Disclosure of Invention

The invention aims to overcome the shortcomings of the prior art in handling the nonlinearity and dynamics of debutanization process data, and provides a Stacking-based soft measurement method for butane content in a debutanizer.

The purpose of the invention is achieved by the following technical scheme: a Stacking-based ensemble learning soft measurement method for butane content in a debutanizer, comprising the following steps:

(1) continuously sampling the tower top temperature, the tower top pressure, the reflux quantity, the flow to the next stage, the sixth tower plate temperature, the tower bottom temperature 1 and the tower bottom temperature 2 of the debutanizer n times at a fixed sampling period T, obtaining the butane content value at each sampling moment through off-line laboratory analysis, and obtaining n samples which are used as the original sample set, expressed as D_train＝{(X_i′，y_i′)|i′＝1, 2, …, n}, where X_i′ is the feature vector with seven columns, X_i′∈R⁷, the columns respectively representing the tower top temperature, tower top pressure, reflux quantity, flow to the next stage, sixth tower plate temperature, tower bottom temperature 1 and tower bottom temperature 2 at the i′-th sampling time, and y_i′ is the prediction target with one column, y_i′∈R¹, representing the butane content at the i′-th sampling time.

(2) Respectively carrying out feature screening and feature perturbation on the original training sample set by using an XGBOOST model and a time sliding window mechanism, and constructing two training sets D_train1＝{(X_{i_train1}，y_i)|i＝W₁, W₁+1, …, n} and D_train2＝{(X_{i_train2}，y_i)|i＝W₁, W₁+1, …, n}, where W₁ is the width of the time sliding window used for feature expansion of D_train.

(3) Establishing a two-layer ensemble learning butane content prediction model based on the Stacking learning strategy based on the two training sample sets obtained in the step (2), and specifically comprising the following steps:

(3.1) using a K-fold cross training method, training two different learners, denoted L₁ and L₂, with D_train1, and two different learners, denoted L₃ and L₄, with D_train2, so that four learners in total are obtained as the first-layer learners in the prediction model; and obtaining the predicted value of each learner on the training samples, namely the predicted value of the t-th learner on the i-th training sample, where t＝1, 2, 3, 4 and i＝W₁, W₁+1, …, n.

(3.2) averaging the predicted values of L₁ and L₂ on the training samples to obtain y′_i, averaging the predicted values of L₃ and L₄ on the training samples to obtain y″_i, and using the butane content y_i in the original training set D_train to construct the second-layer learner training set D_constructed＝{(X″_i，y_i)|i＝W₁, W₁+1, …, n}, where X″_i＝[y′_i，y″_i]∈R².

(3.3) training the second-layer learner, denoted L, with D_constructed.

(4) Predicting the real-time butane content of the debutanizer by using the Stacking-based ensemble learning soft measurement model: applying the feature perturbation and expansion of step (2) to the tower top temperature, the tower top pressure, the reflux quantity, the flow to the next stage, the sixth tower plate temperature, the tower bottom temperature 1 and the tower bottom temperature 2 obtained by the sensors, together with historical sample data in a database, to obtain two new samples; using the first sample as the input of the two learners trained on D_train1 in step (3.1) and the second sample as the input of the two learners trained on D_train2 in step (3.1) to obtain four predicted values; averaging the four predicted values in pairs as in step (3.2) to obtain two feature values, which are used as the input of the second-layer learner of step (3.3); and using the output of the second-layer learner as the final real-time predicted value of the butane content.

Further, in the step (2), feature screening and feature disturbance are respectively performed on the original training sample set by using the XGBOOST model and the time sliding window mechanism, and the specific steps are as follows:

(2.1) using the feature vectors X_i′ of the training sample set D_train as the input of the XGBOOST model and the butane contents y_i′ of D_train as its target output, training the XGBOOST model; after training is finished, deleting the features with lower scores according to the feature_importances_ of the XGBOOST model to obtain the feature-screened training sample set D_{train_screened}＝{(X′_i′，y_i′)|i′＝1, 2, …, n}, where each column of X′_i′ represents a retained feature and y_i′∈R¹ still represents the butane content at the i′-th sampling time.

(2.2) determining a first time sliding window width W₁ and expanding the feature variables in D_train through the time sliding window; the training sample set after feature expansion is expressed as

D_train1＝{(X_{i_train1}，y_i)|i＝W₁, W₁+1, …, n}

where

X_{i_train1}＝[X_{i−W₁+1}，X_{i−W₁+2}，…，X_i]

represents the feature variable of the i-th sample after feature expansion through the first time sliding window; X_{i_train1} has 7W₁ columns in total.

(2.3) determining a second time sliding window width W₂ (W₂＜W₁) and expanding the feature variables in D_{train_screened} through the time sliding window; the training sample set after feature expansion is expressed as

D_train2＝{(X_{i_train2}，y_i)|i＝W₁, W₁+1, …, n}

where

X_{i_train2}＝[X′_{i−W₂+1}，X′_{i−W₂+2}，…，X′_i]

represents the feature variable of the i-th sample after feature screening and feature expansion through the second time sliding window; X_{i_train2} has 4W₂ columns in total.
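The time sliding window expansion of steps (2.2) and (2.3) can be illustrated with a minimal sketch. It is not taken from the patent: the function name `expand_with_window`, the oldest-first ordering inside each window, and the toy data are assumptions made for illustration, and plain Python lists are used only to keep the example free of dependencies.

```python
def expand_with_window(samples, width):
    """Concatenate each sample with its (width - 1) predecessors.

    samples: list of equal-length feature lists, ordered by sampling time.
    Returns one expanded row per 1-based index i = width, ..., n, so the
    result has n - width + 1 rows of len(samples[0]) * width values each.
    """
    if width < 1 or width > len(samples):
        raise ValueError("width must be between 1 and the number of samples")
    expanded = []
    for end in range(width - 1, len(samples)):        # 0-based index of the current sample
        window = samples[end - width + 1 : end + 1]   # assumed order: oldest first, current last
        expanded.append([value for row in window for value in row])
    return expanded


# Toy data: n = 5 samples with 2 features each (the method itself uses 7 features).
X = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
W1, W2 = 3, 2                                         # W2 < W1, as required in step (2.3)

X_train1 = expand_with_window(X, W1)

# Stand-in for feature screening: keep only the first feature of every sample.
X_screened = [[row[0]] for row in X]
# The narrower window yields W1 - W2 extra leading rows; drop them so that both
# training sets cover the same sampling instants i = W1, ..., n.
X_train2 = expand_with_window(X_screened, W2)[W1 - W2:]

print(X_train1[0])  # [1, 10, 2, 20, 3, 30]
print(X_train2[0])  # [2, 3]
```

Both expanded sets end up with the same number of rows, so the same targets y_i can be paired with either of them.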

Further, training the first-layer learners based on the K-fold cross training method in step (3.1) specifically includes the following steps:

(3.1.1) dividing D_train1 into K subsets, taking K−1 subsets as the training set to train a learner and the remaining subset as the prediction set; performing K rounds of training and prediction, each round selecting a different subset as the prediction set and storing the learner's prediction output on the prediction set once training is completed, so that after K rounds the learner's predicted values on all X_{i_train1}, i＝W₁, W₁+1, …, n, are obtained; two different learners are trained in this way, which yields the predicted values of both learners on X_{i_train1}, i＝W₁, W₁+1, …, n.

(3.1.2) dividing D_train2 into K subsets, taking K−1 subsets as the training set to train a learner and the remaining subset as the prediction set; performing K rounds of training and prediction, each round selecting a different subset as the prediction set and storing the learner's prediction output on the prediction set once training is completed, so that after K rounds the learner's predicted values on all X_{i_train2}, i＝W₁, W₁+1, …, n, are obtained; two different learners are trained in this way, which yields the predicted values of both learners on X_{i_train2}, i＝W₁, W₁+1, …, n.

Further, in step (3.1), D_train1 and D_train2 are each equally divided into K subsets for training the learners.

Further, the learners include an XGBOOST regression model, an RBF kernel-based support vector machine regression model, a multi-layer perceptron regression model, a Bayesian ridge regression model, an RBF kernel-based ridge regression model, and the like.
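The K-fold cross training of step (3.1) can be sketched as follows. This is a dependency-free illustration under stated assumptions, not the patented implementation: the helper names `k_fold_indices` and `out_of_fold_predictions` are hypothetical, the folds are assumed to be contiguous blocks, and the two toy learners (`fit_mean`, `fit_nearest`) merely stand in for heterogeneous models such as XGBOOST and an RBF kernel-based support vector machine.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def out_of_fold_predictions(fit, X, y, k):
    """Return one prediction per sample, each made by a model that never saw it."""
    predictions = [None] * len(X)
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            predictions[i] = predict(X[i])
    return predictions


# Two deliberately different toy learners; each fit function returns a predictor.
def fit_mean(X, y):
    mean = sum(y) / len(y)
    return lambda x: mean


def fit_nearest(X, y):
    def predict(x):
        distances = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in X]
        return y[distances.index(min(distances))]
    return predict


X_toy = [[0], [1], [2], [3], [4], [5]]
y_toy = [0, 1, 2, 3, 4, 5]
oof_mean = out_of_fold_predictions(fit_mean, X_toy, y_toy, k=3)
oof_nearest = out_of_fold_predictions(fit_nearest, X_toy, y_toy, k=3)
print(oof_mean)     # [3.5, 3.5, 2.5, 2.5, 1.5, 1.5]
print(oof_nearest)  # [2, 2, 1, 4, 3, 3]
```

Because every stored prediction comes from a model trained without that sample, the second-layer learner is trained on values that resemble what the first layer produces on unseen data.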

The invention has the beneficial effects that: the method and the device have the advantages that the process variable at the current sampling moment is subjected to characteristic expansion through the time sliding window, the problem of process dynamics is solved by using historical process data, the diversity among first-layer learners in the ensemble learning model is enhanced by introducing a characteristic disturbance mechanism and adopting a non-homogeneous learner, and then the final predicted value of the butane content is obtained by combining the output of the first-layer learner by using a Stacking integration strategy and a second-layer learner.

Drawings

FIG. 1 is a schematic flow diagram of a debutanizer column;

FIG. 2 is a schematic diagram of feature expansion using a width 3 time sliding window;

FIG. 3 is a flow chart of a soft measurement model for real-time estimation of butane content according to an embodiment of the present invention; wherein, XGBOOST represents XGBOOST regression model, KSVR represents support vector regression model based on RBF kernel, MLP represents multi-layer perceptron regression model, BRR represents bayesian ridge regression model, KRR represents ridge regression model based on RBF kernel;

FIG. 4 is a graph of the results of prediction of butane content in a Stacking-based ensemble learning debutanizer according to an embodiment of the present invention; wherein "+" is the laboratory analysis value of the butane content of each sampling point, and "+" is the predicted value of the butane content of each sampling point.

Detailed Description

The invention is further described in detail below with reference to the figures and examples.

The invention relates to a Stacking-based ensemble learning debutanizer soft measurement method, which comprises the following steps:

(1) continuously sampling the tower top temperature (U1), the tower top pressure (U2), the reflux quantity (U3), the flow to the next stage (U4), the sixth tower plate temperature (U5), the tower bottom temperature 1 (U6) and the tower bottom temperature 2 (U7) of the debutanizer n times at a fixed sampling period T, obtaining the butane content value at each sampling moment through off-line laboratory analysis, and obtaining n samples as the original sample set, expressed as D_train＝{(X_i′，y_i′)|i′＝1, 2, …, n}. Here X_i′ is the feature vector with seven columns, X_i′∈R⁷, the columns respectively representing the tower top temperature, the tower top pressure, the reflux quantity, the flow to the next stage, the sixth tower plate temperature, the tower bottom temperature 1 and the tower bottom temperature 2 at the i′-th sampling moment; y_i′ is the prediction target with one column, y_i′∈R¹, representing the butane content at the i′-th sampling time. FIG. 1 illustrates the seven process variables that need to be collected at each sampling point.

(2) Respectively carrying out feature screening and feature perturbation on the original training sample set by using an XGBOOST model and a time sliding window mechanism, and constructing two training sets D_train1 and D_train2, with the following specific steps:

(2.1) using the feature vectors X_i′ of the training sample set D_train as the input of the XGBOOST model and the butane contents y_i′ of D_train as its target output, training the XGBOOST model; after training is finished, deleting the three features with the lowest scores according to the feature_importances_ of the XGBOOST model to obtain the feature-screened training sample set D_{train_screened}＝{(X′_i′，y_i′)|i′＝1, 2, …, n}, where X′_i′∈R⁴ has four columns, respectively representing the tower top temperature, the reflux quantity, the flow to the next stage and the sixth tower plate temperature at the i′-th sampling time, and y_i′∈R¹ still represents the butane content at the i′-th sampling time.

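(2.2) determining a first time sliding window width W₁ and expanding the feature variables in D_train through the time sliding window; the training sample set after feature expansion is expressed as D_train1＝{(X_{i_train1}，y_i)|i＝W₁, W₁+1, …, n}, where X_{i_train1}＝[X_{i−W₁+1}，X_{i−W₁+2}，…，X_i] represents the feature variable of the i-th sample after feature expansion through the first time sliding window and has 7W₁ columns in total. FIG. 2 shows the case W₁＝3.

(2.3) determining a second time sliding window width W₂ (W₂＜W₁) and expanding the feature variables in D_{train_screened} through the time sliding window; the training sample set after feature expansion is expressed as D_train2＝{(X_{i_train2}，y_i)|i＝W₁, W₁+1, …, n}, where X_{i_train2}＝[X′_{i−W₂+1}，X′_{i−W₂+2}，…，X′_i] represents the feature variable of the i-th sample after feature screening and feature expansion through the second time sliding window and has 4W₂ columns in total.

(3) Establishing a two-layer ensemble learning butane content prediction model based on the Stacking learning strategy from the two training sample sets obtained in step (2), with the following specific steps: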

(3.1) using a K-fold cross training method, training two different learners, denoted L₁ and L₂, with D_train1, and two different learners, denoted L₃ and L₄, with D_train2, so that four learners in total are obtained as the first-layer learners in the prediction model; and obtaining the predicted value of each learner on the training samples, namely the predicted value of the t-th learner on the i-th training sample, where t＝1, 2, 3, 4 and i＝W₁, W₁+1, …, n. The specific steps are as follows:

(3.1.1) equally dividing D_train1 into K subsets, taking K−1 subsets as the training set to train a learner and the remaining subset as the prediction set; performing K rounds of training and prediction, each round selecting a different subset as the prediction set and storing the learner's prediction output on the prediction set once training is completed, so that after K rounds the learner's predicted values on all X_{i_train1}, i＝W₁, W₁+1, …, n, are obtained; two different learners are trained in this way, which yields the predicted values of both learners on X_{i_train1}, i＝W₁, W₁+1, …, n.

(3.1.2) equally dividing D_train2 into K subsets, taking K−1 subsets as the training set to train a learner and the remaining subset as the prediction set; performing K rounds of training and prediction, each round selecting a different subset as the prediction set and storing the learner's prediction output on the prediction set once training is completed, so that after K rounds the learner's predicted values on all X_{i_train2}, i＝W₁, W₁+1, …, n, are obtained; two different learners are trained in this way, which yields the predicted values of both learners on X_{i_train2}, i＝W₁, W₁+1, …, n.

(3.2) averaging the predicted values of L₁ and L₂ on the training samples to obtain y′_i, averaging the predicted values of L₃ and L₄ on the training samples to obtain y″_i, and using the butane content y_i in the original training set D_train to construct the second-layer learner training set D_constructed＝{(X″_i＝[y′_i，y″_i]，y_i)|i＝W₁, W₁+1, …, n}, where X″_i∈R², i＝W₁, W₁+1, …, n.

(3.3) training the second-layer learner, denoted L, with D_constructed.

(4) Predicting the real-time butane content of the debutanizer by using the Stacking-based ensemble learning soft measurement model established in step (3):

applying the feature perturbation and expansion of step (2) to the seven process variables obtained by the sensors in real time, namely the tower top temperature, the tower top pressure, the reflux quantity, the flow to the next stage, the sixth tower plate temperature, the tower bottom temperature 1 and the tower bottom temperature 2, together with historical sample data in a database, to obtain two new samples; taking the first sample as the input of the two learners trained in step (3.1.1) and the second sample as the input of the two learners trained in step (3.1.2) to obtain four predicted values; averaging the four predicted values in pairs as in step (3.2) to obtain two feature values; taking the two feature values as the input of the second-layer learner of step (3.3); and taking the output of the second-layer learner as the final real-time predicted value of the butane content.
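The on-line prediction path of step (4) reduces to a small amount of glue logic once the five learners are trained. The sketch below is an illustration under assumptions: `predict_butane` is a hypothetical name, each learner is modelled as a plain callable, and the constant toy learners exist only so the example can run.

```python
def predict_butane(window1_features, window2_features, first_layer, second_layer):
    """Combine four first-layer predictions into one second-layer prediction.

    first_layer: (L1, L2, L3, L4), each a callable mapping a feature list to a float.
    second_layer: callable mapping the two averaged values to the final estimate.
    """
    l1, l2, l3, l4 = first_layer
    mean_12 = (l1(window1_features) + l2(window1_features)) / 2  # corresponds to y'_i
    mean_34 = (l3(window2_features) + l4(window2_features)) / 2  # corresponds to y''_i
    return second_layer([mean_12, mean_34])


# Toy stand-ins for trained learners (assumptions, not real models).
toy_first_layer = (
    lambda x: 1.0,   # L1, fed with the first expanded sample
    lambda x: 3.0,   # L2, fed with the first expanded sample
    lambda x: 2.0,   # L3, fed with the screened, second expanded sample
    lambda x: 4.0,   # L4, fed with the screened, second expanded sample
)
toy_second_layer = lambda z: 0.5 * z[0] + 0.5 * z[1]

estimate = predict_butane([0.0] * 6, [0.0] * 2, toy_first_layer, toy_second_layer)
print(estimate)  # 2.5
```

The two inputs differ on purpose: L₁ and L₂ see the sample expanded with the first window, while L₃ and L₄ see the screened sample expanded with the second window.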

Examples

The invention is illustrated below with reference to a specific debutanizer butane content prediction example:

The debutanizer was continuously sampled to obtain 2394 samples; the first 1596 samples are used as the training set for training the ensemble learning soft measurement model, and the last 798 samples are used as the test set for verifying its effectiveness. Seven process variables were selected to model the butane content at the bottom of the debutanizer: the tower top temperature, the tower top pressure, the reflux quantity, the flow to the next stage, the sixth tower plate temperature, the tower bottom temperature 1 and the tower bottom temperature 2, as shown in FIG. 1.

The following will explain the implementation steps of the present invention in detail with reference to the specific process, as shown in fig. 3, specifically:

1. An XGBOOST regression model is trained using the 1596 training samples, and the importance scores of the seven process variables are obtained through the feature_importances_ of the XGBOOST model; they are, respectively: 0.151, 0.001, 0.166, 0.095, 0.585, 0, 0.003. According to these scores, the 2nd, 6th and 7th process variables of each sample are deleted in feature screening.
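The screening decision in this step can be reproduced directly from the seven reported scores. The sketch below assumes the scores are already available as a list (it does not train an XGBOOST model), and `screen_features` is a hypothetical helper name; indices are 0-based, so the deleted 2nd, 6th and 7th variables appear as 1, 5 and 6.

```python
# Importance scores reported in the example, in the order of the seven process variables.
SCORES = [0.151, 0.001, 0.166, 0.095, 0.585, 0.0, 0.003]


def screen_features(scores, n_drop):
    """Return the 0-based indices kept after dropping the n_drop lowest-scoring features."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j])  # ascending by score
    dropped = set(ranked[:n_drop])
    return [j for j in range(len(scores)) if j not in dropped]


kept = screen_features(SCORES, n_drop=3)
print(kept)  # [0, 2, 3, 4]

# Applying the screening to one (made-up) seven-variable sample.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
screened_sample = [sample[j] for j in kept]
print(screened_sample)  # [1.0, 3.0, 4.0, 5.0]
```

The retained indices correspond to the tower top temperature, the reflux quantity, the flow to the next stage and the sixth tower plate temperature.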

2. A time sliding window with a width of 50 is used to perform feature expansion on the 1596 original training samples, giving D_train1, which contains 1546 new training samples, each with a feature vector dimension of 350.

A time sliding window with a width of 35 is used to perform feature expansion on the 1596 samples obtained by feature screening, giving D_train2, which contains 1561 new training samples, each with a feature vector dimension of 140. To keep the number of training samples in D_train1 and D_train2 consistent, the first 15 samples in D_train2 are deleted.

3. Establishing a Stacking-based ensemble learning debutanizer butane content soft measurement model according to a detailed method in the implementation steps:

(1) D_train1 is used to train an XGBOOST regression model (XGBOOST) and an RBF kernel-based support vector machine regression model (KSVR), and D_train2 is used to train a multi-layer perceptron regression model (MLP) and a Bayesian ridge regression model (BRR); these four models serve as the first-layer learners.

(2) Using a 3-fold cross training method, D_train1 is divided into three subsets containing 516, 515 and 515 training samples respectively, denoted in order as subset 1, subset 2 and subset 3. In the first round, subset 1 and subset 2 are used as the training set to train the XGBOOST regression model and the RBF kernel-based support vector machine regression model, subset 3 is then used as the prediction set of the two models, and the average of the two models' prediction results is stored. In the second round, subset 1 and subset 3 are used as the training set to train the two models, subset 2 is used as their prediction set, and the average of the two models' prediction results is stored. In the third round, subset 2 and subset 3 are used as the training set to train the two models, subset 1 is used as their prediction set, and the average of the two models' prediction results is stored. After the three rounds of training, the averaged predictions of the XGBOOST regression model and the RBF kernel-based support vector machine regression model on all 1546 samples are obtained. Similarly, D_train2 is divided into three subsets, and the averaged predictions of the multi-layer perceptron regression model and the Bayesian ridge regression model on the 1546 samples are obtained through three rounds of training.

(3) The two averaged predictions are used as the feature values of the second-layer learner's input samples, and the 1546 label values in D_train1 are used as the output labels of the second-layer learner, which constructs the second-layer learner training set D_constructed; an RBF kernel-based ridge regression model (KRR) is trained on it as the second-layer learner.
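Constructing the second-layer training set amounts to pairing two averaged prediction columns with the original labels. The following sketch assumes the four out-of-fold prediction lists are already available and aligned sample by sample; `build_second_layer_set` and the numbers are illustrative only.

```python
def build_second_layer_set(pred_l1, pred_l2, pred_l3, pred_l4, targets):
    """Pair [mean(L1, L2), mean(L3, L4)] with the butane-content label of each sample."""
    lengths = {len(pred_l1), len(pred_l2), len(pred_l3), len(pred_l4), len(targets)}
    if len(lengths) != 1:
        raise ValueError("all prediction lists and the targets must have equal length")
    rows = []
    for p1, p2, p3, p4, y in zip(pred_l1, pred_l2, pred_l3, pred_l4, targets):
        features = [(p1 + p2) / 2, (p3 + p4) / 2]  # two-dimensional input of the second layer
        rows.append((features, y))
    return rows


# Made-up out-of-fold predictions for three samples.
d_constructed = build_second_layer_set(
    [0.2, 0.4, 0.6],   # learner trained on the first expanded set
    [0.4, 0.4, 0.8],   # second learner trained on the first expanded set
    [0.1, 0.5, 0.5],   # learner trained on the second expanded set
    [0.3, 0.5, 0.7],   # second learner trained on the second expanded set
    [0.25, 0.5, 0.75],
)
print(len(d_constructed))      # 3
print(d_constructed[1])        # ([0.4, 0.5], 0.5)
```

Each row has exactly two features regardless of how wide the first-layer inputs were, which keeps the second-layer model small.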

4. 798 test samples were used to verify the validity of the soft measurement model proposed by the present invention:

After each test sample, together with the historical samples, undergoes feature expansion through the first time sliding window, the resulting test sample with a feature vector dimension of 350 is used as the input of the first-layer learners L₁ and L₂; the two predicted values are averaged to give the first feature of the second-layer learner's input sample.

After the test sample undergoes feature screening and feature expansion through the second time sliding window, its feature vector dimension is 140 and it is used as the input of the first-layer learners L₃ and L₄; the two predicted values are averaged to give the second feature of the second-layer learner's input sample.

The output value of the second-layer learner, namely the RBF kernel-based ridge regression model, is taken as the final estimate of the real-time butane content.

FIG. 4 compares the butane content predicted by the soft measurement model of the present invention with the butane content analysis values over the 798 test samples; wherein "+" is the laboratory analysis value of the butane content of each sampling point, and "+" is the model prediction value of the butane content of each sampling point. At most sampling points, the estimate of the butane content obtained by the proposed soft measurement model tracks the off-line analysis value with a small error. The root mean square error (RMSE) between the 798 estimated values and the 798 off-line analysis values is about 0.1. In this embodiment, the RMSE is calculated as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(y_k - \hat{y}_k\right)^2}$$

where N is the number of test samples (798 here), y_k is the off-line analysis value of the butane content at each sampling point, and ŷ_k is the soft measurement estimate of the butane content at each sampling point.
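For reference, the root mean square error between off-line analysis values and soft measurement estimates can be computed with a few lines of standard-library Python. The function name `rmse` and the sample numbers are illustrative and are not data from the embodiment.

```python
import math


def rmse(actual, estimated):
    """Root mean square error between two equally long, non-empty sequences."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Made-up values: every estimate is off by exactly 1, so the RMSE is 1.
print(rmse([0.0, 0.0, 0.0, 0.0], [1.0, -1.0, 1.0, -1.0]))  # 1.0
```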

The above examples are intended to illustrate the invention, but not to limit the invention, and any modifications and variations of the invention within the spirit of the invention and the scope of the claims are within the scope of the invention.

Claims

1. A Stacking-based ensemble learning soft measurement method for butane content in a debutanizer, characterized by comprising the following steps:

(1) continuously sampling the tower top temperature, the tower top pressure, the reflux quantity, the flow to the next stage, the sixth tower plate temperature, the tower bottom temperature 1 and the tower bottom temperature 2 of the debutanizer n times at a fixed sampling period T, obtaining the butane content value at each sampling time through off-line laboratory analysis, and obtaining n samples as the original sample set, expressed as D_train＝{(X_i′，y_i′)|i′＝1, 2, ..., n}, where X_i′ is the feature vector with seven columns, X_i′∈R⁷, the columns respectively representing the tower top temperature, tower top pressure, reflux quantity, flow to the next stage, sixth tower plate temperature, tower bottom temperature 1 and tower bottom temperature 2 at the i′-th sampling time, and y_i′ is the prediction target with one column, y_i′∈R¹, representing the butane content at the i′-th sampling time;

(2) respectively carrying out feature screening and feature perturbation on the original training sample set by using an XGBOOST model and a time sliding window mechanism, and constructing two training sets D_train1＝{(X_{i_train1}，y_i)|i＝W₁, W₁+1, ..., n} and D_train2＝{(X_{i_train2}，y_i)|i＝W₁, W₁+1, ..., n}, where W₁ is the width of the time sliding window used for feature expansion of D_train;

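(3) establishing a two-layer ensemble learning butane content prediction model based on the Stacking learning strategy from the two training sample sets obtained in step (2), specifically comprising the following steps:

(3.1) using a K-fold cross training method, training two different learners, denoted L₁ and L₂, with D_train1, and two different learners, denoted L₃ and L₄, with D_train2, so that four learners in total are obtained as the first-layer learners in the prediction model; and obtaining the predicted value of the t-th learner on the i-th training sample, where t＝1, 2, 3, 4 and i＝W₁, W₁+1, ..., n;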

(3.2) averaging the predicted values of L₁ and L₂ on the training samples to obtain y′_i, averaging the predicted values of L₃ and L₄ on the training samples to obtain y″_i, and using the butane content y_i in the original training set D_train to construct the second-layer learner training set D_constructed＝{(X″_i，y_i)|i＝W₁, W₁+1, ..., n}, where X″_i∈R², X″_i＝[y′_i，y″_i], i＝W₁, W₁+1, ..., n;

(3.3) training the second-layer learner, denoted L, with D_constructed;

(4) predicting the real-time butane content of the debutanizer by using the Stacking-based ensemble learning soft measurement model: applying the feature perturbation and expansion of step (2) to the tower top temperature, the tower top pressure, the reflux quantity, the flow to the next stage, the sixth tower plate temperature, the tower bottom temperature 1 and the tower bottom temperature 2 obtained by the sensors, together with historical sample data in a database, to obtain two new samples; using the first sample as the input of the two learners trained on D_train1 in step (3.1) and the second sample as the input of the two learners trained on D_train2 in step (3.1) to obtain four predicted values; averaging the four predicted values in pairs as in step (3.2) to obtain two feature values, which are used as the input of the second-layer learner of step (3.3); and using the output of the second-layer learner as the final real-time predicted value of the butane content.

2. The Stacking-based ensemble learning soft measurement method for butane content in a debutanizer according to claim 1, wherein respectively carrying out feature screening and feature perturbation on the original training sample set by using the XGBOOST model and the time sliding window mechanism in step (2) comprises the following specific steps:

(2.1) using the feature vectors X_i′ of the training sample set D_train as the input of the XGBOOST model and the butane contents y_i′ of D_train as its target output, training the XGBOOST model; after training is finished, deleting the features with lower scores according to the feature_importances_ of the XGBOOST model to obtain the feature-screened training sample set D_{train_screened}＝{(X′_i′，y_i′)|i′＝1, 2, ..., n}, where each column of X′_i′ represents a retained feature and y_i′∈R¹ still represents the butane content at the i′-th sampling time;

(2.2) determining a first time sliding window width W₁ and expanding the feature variables in D_train through the time sliding window, the training sample set after feature expansion being expressed as D_train1＝{(X_{i_train1}，y_i)|i＝W₁, W₁+1, ..., n}, where X_{i_train1}＝[X_{i−W₁+1}，X_{i−W₁+2}，...，X_i] represents the feature variable of the i-th sample after feature expansion through the first time sliding window and has 7W₁ columns in total;

(2.3) determining a second time sliding window width W₂, W₂＜W₁, and expanding the feature variables in D_{train_screened} through the time sliding window, the training sample set after feature expansion being expressed as D_train2＝{(X_{i_train2}，y_i)|i＝W₁, W₁+1, ..., n}, where X_{i_train2}＝[X′_{i−W₂+1}，X′_{i−W₂+2}，...，X′_i] represents the feature variable of the i-th sample after feature screening and feature expansion through the second time sliding window and has 4W₂ columns in total.

3. The Stacking-based ensemble learning soft measurement method for butane content in a debutanizer according to claim 1, wherein training the first-layer learners based on the K-fold cross training method in step (3.1) comprises the following specific steps:

(3.1.1) dividing D_train1 into K subsets, taking K−1 subsets as the training set to train a learner and the remaining subset as the prediction set; performing K rounds of training and prediction, each round selecting a different subset as the prediction set and storing the learner's prediction output on the prediction set once training is completed, so that after K rounds the learner's predicted values on all X_{i_train1}, i＝W₁, W₁+1, ..., n, are obtained; two different learners are trained in this way, which yields the predicted values of both learners on X_{i_train1}, i＝W₁, W₁+1, ..., n;

(3.1.2) dividing D_train2 into K subsets, taking K−1 subsets as the training set to train a learner and the remaining subset as the prediction set; performing K rounds of training and prediction, each round selecting a different subset as the prediction set and storing the learner's prediction output on the prediction set once training is completed, so that after K rounds the learner's predicted values on all X_{i_train2}, i＝W₁, W₁+1, ..., n, are obtained; two different learners are trained in this way, which yields the predicted values of both learners on X_{i_train2}, i＝W₁, W₁+1, ..., n.

4. The Stacking-based ensemble learning soft measurement method for butane content in a debutanizer according to claim 3, wherein in step (3.1), D_train1 and D_train2 are each equally divided into K subsets for training the learners.

5. The Stacking-based ensemble learning soft measurement method for butane content in a debutanizer according to claim 1, wherein the learners comprise an XGBOOST regression model, an RBF kernel-based support vector machine regression model, a multi-layer perceptron regression model, a Bayesian ridge regression model, and an RBF kernel-based ridge regression model.