CN111291020A

CN111291020A - Dynamic process soft measurement modeling method based on local weighted linear dynamic system

Info

Publication number: CN111291020A
Application number: CN201911094779.7A
Authority: CN
Inventors: 方靖云; 何雨辰; 张丽芳; 曾九孙; 严天宏
Original assignee: China Jiliang University
Current assignee: China Jiliang University
Priority date: 2019-11-11
Filing date: 2019-11-11
Publication date: 2020-06-16

Abstract

The invention discloses a dynamic process soft measurement modeling method based on a local weighted linear dynamic system. Sliding windows are introduced and a linear dynamic system model is established in each window. The similarity between an online sample and the offline samples is calculated in both the original space and the hidden space of each window, which yields the original-space weight and the hidden-space weight of the online sample relative to all offline samples. A weighted linear dynamic system model is then established from the two weights to obtain the hidden variables of all offline samples. Finally, the predicted butane content of the online sample is calculated by local weighted regression. Because the method considers the similarity between the online sample and the offline samples in both the original space and the hidden space, it improves the prediction accuracy of key variables in industrial processes.

Description

Dynamic process soft measurement modeling method based on local weighted linear dynamic system

Technical Field

The invention belongs to the field of industrial and chemical process soft measurement modeling and application, and particularly relates to a dynamic process soft measurement modeling method based on a local weighted linear dynamic system.

Background

Industrial production processes are highly complex and product quality requirements keep rising. Effective monitoring of the production process helps guarantee product quality, so process monitoring is becoming increasingly important. Typical process monitoring methods are either knowledge-based or data-driven. Knowledge-based methods diagnose the process from known knowledge and generally perform well when that knowledge is complete. Modern industrial production is so complex, however, that comprehensive process knowledge is hard to obtain. Data-driven methods avoid this requirement: they monitor the process by analyzing and modeling the relationships among the data it produces.

Key variables in complex industrial processes are often difficult to measure directly with a sensor, so a soft measurement model must be established between process variables that are easy to measure and the key variable that is not. Traditional soft measurement models such as probabilistic principal component analysis are static and consider neither the dynamic relationship between successive samples nor nonlinear characteristics. Linear dynamic system models account for the dynamics of the data but cannot directly describe the nonlinearity of complex industrial processes. The invention therefore extends the traditional linear dynamic system model, under a probabilistic modeling framework, into a linear dynamic system model weighted on the basis of sliding windows. The extended model captures both the dynamic and the nonlinear characteristics of the data and thereby addresses the problem of monitoring key variables in industrial production.

Disclosure of Invention

The invention provides a dynamic process soft measurement modeling method based on a local weighted linear dynamic system, comprising the following steps:

Step 1: Offline data of the debutanizer are collected as a training sample set. The training sample set comprises a plurality of groups of training samples, and each group comprises a plurality of process variables and one key variable observed at the same time. The process variables are the flow, pressure and temperature values at different locations of the debutanizer during operation; the key variable is the butane content at that time, obtained by offline assay analysis.

Step 2: Sliding windows are introduced. A window is moved across the training sample set with a fixed step length to obtain a plurality of sliding windows. Each sliding window contains the process variables of several groups of training samples, which are defined as the sub-training set of that window; the sub-training sets of different windows are not identical. A linear dynamic system model is established for the sub-training set of each window, giving the model parameters calculated from that sub-training set and the hidden variable values of each group of training samples in the hidden space.

The window length and the step length of the sliding window are preset manually. If the window step length is S, the length of the training sample set is N and the window length is T, the total number of windows is $\lfloor (N-T)/S \rfloor + 1$, and the sub-training set in each window comprises T groups of training samples.
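The window construction described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the patented implementation; the function name `sliding_windows`, the toy data, and the choice to discard trailing samples that do not fill a complete window are assumptions made for the example.

```python
import numpy as np

def sliding_windows(X, T, S):
    """Split X (N samples x V variables) into windows of length T taken every S samples.

    Assumption: only complete windows are kept, so trailing samples that do
    not fill a window of length T are not covered.
    """
    N = X.shape[0]
    starts = list(range(0, N - T + 1, S))
    return [X[s:s + T] for s in starts], starts

# Toy training set: N = 20 groups of V = 2 process variables (hypothetical data).
X = np.arange(40, dtype=float).reshape(20, 2)
windows, starts = sliding_windows(X, T=8, S=4)
print(len(windows), starts)
```

Each element of `windows` plays the role of one sub-training set, and `starts` records which training samples each window covers.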

Step 3: Online data of the debutanizer are collected as a test sample set. The test sample set comprises a plurality of groups of online samples, and each group comprises a plurality of process variables observed at the same time. The names and number of the process variables of an online sample are the same as those of the offline samples. The similarity of each group of online samples to the training sample set is calculated in the original space, and the similarity of each group of online samples to each sub-training set is calculated in the hidden space. Finally, the global weight in the original space and the global weight in the hidden space of the online sample relative to the training sample set are obtained.

Step 4: A local weighted linear dynamic system model is established. The process variables of the training sample set, the hidden-space global weights of the online sample and the original-space global weights of the online sample are taken as the input of the model, and the parameters of the model are obtained.

Step 5: The training sample whose sampling time is closest to that of the online sample is extracted from the training sample set, and the hidden variable value of the online sample is calculated by Kalman filtering.

Step 6: A local weighted linear regression model is established from the training sample set and the model parameters obtained from the local weighted linear dynamic system model; its regression parameters are obtained and used to predict the key variable of the online sample, namely the butane content.

Every sliding window in step 2 is processed in the same way. The η-th sliding window is processed as follows:

2.1) The sub-training set in the window is normalized by subtracting the mean and dividing by the variance, and a linear dynamic system model is established on the normalized sub-training set:

The linear dynamic system model is represented as follows:

$$h_t^{\eta} = A_{\eta} h_{t-1}^{\eta} + a_t^{\eta}$$

$$v_t^{\eta} = B_{\eta} h_t^{\eta} + b_t^{\eta}$$

where $h_t^{\eta} \in R^{H}$ is the hidden variable value, in the hidden space, of the t-th group of training samples under the linear dynamic system model in the η-th window;

$v_t^{\eta} \in R^{V}$ is the process variables of the t-th group of training samples in the η-th window;

$A_{\eta} \in R^{H \times H}$ is the state transition probability matrix of the linear dynamic system model in the η-th window;

$B_{\eta} \in R^{V \times H}$ is the emission probability matrix of the linear dynamic system model in the η-th window;

$a_t^{\eta}$ is the noise of the t-th group of hidden variables under the linear dynamic system model in the η-th window;

$b_t^{\eta}$ is the noise of the process variables of the t-th group of training samples in the η-th window;

V is the number of process variables in a group of training samples, H is the number of variables in a group of hidden variables, and t = 1, 2, ..., T.

The noises $a_t^{\eta}$ and $b_t^{\eta}$ both follow Gaussian distributions, $a_t^{\eta} \sim N(0, \Sigma_h^{\eta})$ and $b_t^{\eta} \sim N(0, \Sigma_v^{\eta})$, where $\Sigma_h^{\eta}$ and $\Sigma_v^{\eta}$ are the covariances of the hidden variables and the process variables in the η-th window.
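To make the structure of the per-window model concrete, the sketch below draws a short sequence from a linear dynamic system with a state transition matrix, an emission matrix and Gaussian noise on both the hidden variables and the process variables. It is only an illustration in Python with NumPy: the function name `simulate_lds`, the zero initial state and all numerical values are assumptions, and the sketch generates data rather than fitting the model.

```python
import numpy as np

def simulate_lds(A, B, Sigma_h, Sigma_v, T, rng):
    """Sample T steps of h_t = A h_{t-1} + a_t, v_t = B h_t + b_t with Gaussian noise."""
    H, V = A.shape[0], B.shape[0]
    h = np.zeros((T, H))
    v = np.zeros((T, V))
    h_prev = np.zeros(H)  # assumed initial hidden state
    for t in range(T):
        h[t] = A @ h_prev + rng.multivariate_normal(np.zeros(H), Sigma_h)
        v[t] = B @ h[t] + rng.multivariate_normal(np.zeros(V), Sigma_v)
        h_prev = h[t]
    return h, v

rng = np.random.default_rng(0)
A = 0.9 * np.eye(2)                # H = 2 hidden variables (hypothetical)
B = rng.standard_normal((3, 2))    # V = 3 process variables (hypothetical)
h, v = simulate_lds(A, B, 0.1 * np.eye(2), 0.05 * np.eye(3), T=50, rng=rng)
print(h.shape, v.shape)
```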

2.2) A likelihood function is established for the model and maximized by Kalman filtering, Kalman smoothing and the expectation-maximization (EM) algorithm. When the likelihood function converges, the parameters of the model in the η-th window are recorded, together with the hidden variable values, in the hidden space, of each group of training samples in the window.

In step 3, each group of online samples is processed in the same way in every window. The processing of an online sample in the η-th window is as follows:

3.1) Calculate the global weight of the process variables of the training sample set relative to the online sample in the original space:

where $v_n$ is the process variables of the n-th group of training samples in the training sample set, and $v_{new}$ is the process variables of the online sample;

representing the similarity of the process variables of the n-th group of training samples to the online sample in the original space; N is the number of groups in the training sample set.

Secondly, the global weight of the training sample set relative to the online sample in the original space is calculated:

where $\lambda_n$ is the global weight of the n-th training sample relative to the online sample in the original space, and $\zeta_v$ is the manually set weight control parameter of the original space;
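The sketch below illustrates one common way to turn a similarity measure into weights $\lambda_n$ controlled by $\zeta_v$. The exact similarity and weight formulas are not reproduced in this text, so the Euclidean distance and the exponential kernel used here are assumptions chosen for illustration, as are the function name `original_space_weights` and the toy data.

```python
import numpy as np

def original_space_weights(V_train, v_new, zeta_v):
    """Weights of N training samples relative to one online sample.

    Assumptions (not taken from the source): similarity is measured by the
    Euclidean distance d_n = ||v_n - v_new||, and lambda_n = exp(-d_n / zeta_v).
    """
    d = np.linalg.norm(V_train - v_new, axis=1)
    return np.exp(-d / zeta_v)

V_train = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]])  # hypothetical samples
lam = original_space_weights(V_train, np.array([0.0, 0.0]), zeta_v=1.0)
print(lam)
```

Under this assumed kernel, a larger $\zeta_v$ flattens the weights, while a smaller value concentrates them on the training samples nearest the online sample.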

3.2) Calculate the global weight of the process variables of the training sample set relative to the online sample in the hidden space:

First, the hidden variable of the online sample is calculated in the η-th window; the calculation is the same in every window:

where $v_{new}$ is the online sample; $\mu_\eta$ is the mean of the sub-training set in the η-th window; $\sigma_\eta$ is the variance of the sub-training set in the η-th window; $B_\eta$ and

are model parameters of the linear dynamic system model in the η-th window;

is the hidden variable obtained by projecting the online sample in the η-th window.

Secondly, the similarity of the process variables of the sub-training set in the η-th window to the online sample is calculated in the hidden space:

where

is the similarity between the hidden variable value of the t-th group of training samples in the η-th window and the hidden variable obtained by projecting the online sample in the η-th window.

The similarities obtained for each group of training samples in the different windows are then summed to obtain the global similarity of the training sample set to the online sample in the hidden space:

where $\theta_{n,i}$ is the similarity of the n-th group of training samples to the online sample in the i-th window; i = 1, 2, ..., Γ indexes the windows that contain the n-th group of training samples, and Γ is the total number of sliding windows that contain the n-th group of training samples;

is the global similarity of the n-th group of training samples in the training sample set.

Finally, the global weight of the training sample set relative to the online sample in the hidden space is calculated:

where $\zeta_h$ is a manually set weight control parameter, and $\phi_n$ is the global weight of the n-th training sample relative to the online sample in the hidden space.
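The aggregation of per-window similarities into one global value per training sample can be sketched as below. The formulas themselves are not reproduced in this text, so several choices are assumptions: the per-window similarities are treated as distance-like values (smaller means more similar), they are averaged over the Γ windows that contain a sample, and the weight is an exponential kernel with parameter $\zeta_h$. The function name `hidden_space_weights` and the toy numbers are likewise hypothetical.

```python
import numpy as np

def hidden_space_weights(theta, starts, T, N, zeta_h):
    """Combine per-window similarities into global similarities and weights.

    theta[i][t] : similarity of the t-th sample of window i to the online sample
    starts[i]   : index (in the training set) of the first sample of window i
    Assumptions: average over the windows containing each sample, then
    phi_n = exp(-global_n / zeta_h); samples covered by no window get weight 0.
    """
    total = np.zeros(N)
    count = np.zeros(N)
    for th, s in zip(theta, starts):
        total[s:s + T] += th
        count[s:s + T] += 1
    covered = count > 0
    glob = np.zeros(N)
    glob[covered] = total[covered] / count[covered]
    phi = np.where(covered, np.exp(-glob / zeta_h), 0.0)
    return glob, phi

# Two windows of length T = 4 with step 2 over N = 6 training samples.
theta = [np.array([0.0, 1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0, 7.0])]
glob, phi = hidden_space_weights(theta, starts=[0, 2], T=4, N=6, zeta_h=2.0)
print(glob, phi)
```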

The sliding-window weighted linear dynamic system model in step 4 is as follows:

$$h_n = A h_{n-1} + a_n$$

$$v_n = B h_n + b_n$$

where A and B are the state transition probability matrix and the emission probability matrix of the weighted linear dynamic system model; $h_n$ is the hidden variable value, in the hidden space, of the process variables of the n-th group of training samples in the training sample set; $v_n$ is the process variables of the n-th group of training samples in the training sample set; $a_n$ and $b_n$ are the noises of $h_n$ and $v_n$, respectively.

The noises $a_n$ and $b_n$ both follow Gaussian distributions, $a_n \sim N(0, \Sigma_h)$ and $b_n \sim N(0, \Sigma_v)$, where $\Sigma_h$ and $\Sigma_v$ are the covariances of the hidden variables and the process variables in the model.

The inputs of the model are the process variables of the training sample set, the weights $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_N)$ of the training sample set relative to the online sample in the original space, and the weights $\phi = (\phi_1, \phi_2, \ldots, \phi_N)$ of the training sample set relative to the online sample in the hidden space.

A likelihood function is established for the model and maximized by Kalman filtering, Kalman smoothing and the expectation-maximization (EM) algorithm. When the likelihood function converges, the parameters of the model are recorded, together with the variance of each group of hidden variables obtained in the Kalman filter.

In step 5, every group of online samples is processed in the same way. For one group of online samples, the Kalman filtering formulas of step 5 are:

$$V^* = A F^* A^T + \Sigma_h$$

$$K = V^* B^T (B V^* B^T + \Sigma_v)^{-1}$$

$$h_{new} = A h^* + K (v_{new} - B A h^*)$$

where $h^*$ and $F^*$ are the hidden variable value and its variance, in the hidden space, of the training sample whose sampling time is closest to that of the online sample; K is the Kalman gain matrix; $h_{new}$ is the hidden variable value of the online sample in the hidden space, calculated by Kalman filtering; and $v_{new}$ is the process variables of the online sample.
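The single Kalman filtering step of step 5 can be sketched in Python with NumPy as below. The function name `kalman_project` and the toy matrices are assumptions. The predicted variance is written with the transposed transition matrix, matching the filtering recursion $V_n = A F_{n-1} A^T + \Sigma_h$ given in the detailed description.

```python
import numpy as np

def kalman_project(A, B, Sigma_h, Sigma_v, h_star, F_star, v_new):
    """One Kalman predict/update step from the nearest training sample to the online sample."""
    V_star = A @ F_star @ A.T + Sigma_h                            # predicted variance
    K = V_star @ B.T @ np.linalg.inv(B @ V_star @ B.T + Sigma_v)   # Kalman gain
    h_pred = A @ h_star                                            # predicted hidden variable
    h_new = h_pred + K @ (v_new - B @ h_pred)                      # corrected hidden variable
    return h_new, K

# Toy check (hypothetical values): with an identity emission matrix and almost
# no observation noise, the estimate should follow the observation closely.
A = np.eye(2)
B = np.eye(2)
h_new, K = kalman_project(A, B, np.eye(2), 1e-9 * np.eye(2),
                          h_star=np.zeros(2), F_star=np.eye(2),
                          v_new=np.array([0.3, -1.2]))
print(h_new)
```

For larger models, solving the linear system with `np.linalg.solve` instead of forming the inverse explicitly is usually the more numerically stable choice.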

The key variable of a group of online samples is predicted in step 6 as follows; every group of online samples is processed in the same way.

First, the λ-weighted average of the key variables of the training sample set is calculated and subtracted from each key variable of the training sample set:

where $Y = (y_1, y_2, \ldots, y_N) \in R^{1 \times N}$ is the key variables in the training sample set;

is the weighted average;

is the key variable of the n-th group of the training sample set after the weighted average has been subtracted.

Secondly, the key variable of the online sample is predicted by local weighted linear regression:

$$b = (Y \lambda^* h^T)(h \lambda^* h^T)^{-1}$$

where b is the regression parameter of the local weighted linear regression; $y_{new}$ is the finally predicted key variable of the online sample; and $\lambda^*$ is the diagonal matrix formed from λ, $\lambda^* \in R^{N \times N}$.
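The regression of step 6 can be sketched as below. The expression for b follows the formula above, with Y taken as the key variables after the weighted average has been subtracted. The weighted-average and prediction formulas are not reproduced in this text, so two parts are assumptions: the weighted average is computed as $\sum_n \lambda_n y_n / \sum_n \lambda_n$, and the prediction is $y_{new} = b\,h_{new}$ plus that average. The function name and the toy data are hypothetical.

```python
import numpy as np

def lw_regression_predict(h, Y, lam, h_new):
    """Locally weighted linear regression on hidden variables.

    h     : H x N hidden variables of the training samples
    Y     : length-N key variable of the training samples
    lam   : length-N original-space weights
    h_new : length-H hidden variable of the online sample
    """
    y_bar = np.sum(lam * Y) / np.sum(lam)              # assumed weighted average
    Yc = (Y - y_bar)[None, :]                          # centred key variable, 1 x N
    L = np.diag(lam)                                   # lambda*, N x N diagonal matrix
    b = (Yc @ L @ h.T) @ np.linalg.inv(h @ L @ h.T)    # regression parameter, 1 x H
    return (b @ h_new).item() + y_bar, b               # assumed prediction formula

# Toy data generated from y = 2*h1 + 3*h2 + 5 (hypothetical).
h = np.array([[1.0, -1.0, 2.0, -2.0],
              [1.0, 1.0, -1.0, -1.0]])
Y = 2.0 * h[0] + 3.0 * h[1] + 5.0
y_new, b = lw_regression_predict(h, Y, np.ones(4), np.array([1.0, 2.0]))
print(y_new, b)
```

Because the model has no separate intercept for the hidden variables, the toy data are chosen so that the weighted mean of each hidden variable is zero; only in that case does the sketch recover the generating coefficients exactly.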

Steps 2 to 6 together model one group of online samples and predict its butane content online. The butane content of multiple groups of online samples is predicted by repeating steps 2 to 6.

Drawings

Fig. 1 shows soft measurement results of a linear dynamic system model based on local weighting.

Fig. 2 is a flow chart of the algorithm.

Detailed Description

The invention is further described with reference to the following figures and specific examples.

Aiming at the problem of online monitoring of the butane content in the debutanizer, the invention carries out soft measurement modeling on variables which are easy to directly measure, and estimates the butane content value in the chemical process on line.

The embodiment of the invention and its specific implementation process are as follows:

The first step: The process variables of the debutanizer operation in the chemical process are collected by a distributed control system and stored in a history database system.

The second step: The key variable value corresponding to the process variables at each moment, namely the butane content, is obtained by offline assay analysis and stored in the history database system. This yields the process variables and the key variables. Part of the process variables and their key variables are extracted to form a training sample set.

To obtain the optimal parameter set, Kalman filtering and smoothing are performed in the E-step of the expectation-maximization algorithm. The calculation is the same in every window; the E-step in the η-th window is as follows:

First, Kalman filtering is performed with the following formulas:

where

is the Kalman gain matrix of the t-th group of training samples in the η-th window;

is the estimated variance of the t-th group of hidden variables in the η-th window;

is the finally calculated t-th group of hidden variables in the η-th window;

is the finally calculated variance of the t-th group of hidden variables in the η-th window.

Next, Kalman smoothing is performed with the following formulas:

where

is the gain matrix of the t-th group of training samples in the η-th window in Kalman smoothing;

is the t-th group of hidden variables in the η-th window after Kalman smoothing;

is the variance of the t-th group of hidden variables in the η-th window after Kalman smoothing.

Finally, the estimates of the first-order and second-order statistics of the hidden variables are calculated:

In the M-step of the η-th window, the parameters of the model are updated as follows:

Through successive iterations, the optimal parameter set of the window is recorded when the likelihood function converges.

The fourth step: Easily measured process variables are collected during online operation of the debutanizer to form an online sample, and the global weight λ of the online sample in the original space and its global weight φ in the hidden space are calculated.

The fifth step: The process variables of the training sample set, the hidden-space global weight φ of the online sample and the original-space global weight λ of the online sample are taken as the input of the local weighted linear dynamic system model, and the optimal parameter set is obtained by the expectation-maximization algorithm. The E-step of the expectation-maximization algorithm is calculated as follows:

$$V_n = A F_{n-1} A^T + \Sigma_h \quad (30)$$

$$K_n = V_n B^T (B V_n B^T + \Sigma_v)^{-1} \quad (31)$$

$$h_n = A h_{n-1} + K_n (v_n - B A h_{n-1}) \quad (32)$$

$$F_n = (I - K_n B) V_n \quad (33)$$

where $K_n$ is the Kalman gain matrix of the n-th group of training samples; $V_n$ is the estimated variance of the n-th group of hidden variables; $h_n$ is the finally calculated n-th group of hidden variables; $F_n$ is the finally calculated variance of the n-th group of hidden variables; n = 1, 2, ..., N, where N is the total number of groups in the training sample set. Next, Kalman smoothing is performed with the following formulas:

$$J_n = (A F_n)^T (A F_n A^T + \Sigma_h)^{-1} \quad (34)$$

where $J_n$ is the gain matrix of the n-th group of hidden variables;

is the n-th group of hidden variables after Kalman smoothing;

is the variance of the n-th group of hidden variables after Kalman smoothing.
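The forward filtering recursion and the smoothing gain of equation (34) can be sketched as a single pass over the training sequence. This is an unweighted illustration in Python with NumPy: how the weights λ and φ enter the recursion is not shown in this text and is therefore left out, the hidden-variable update uses the standard Kalman form (the same form as the step 5 update), and the function name, the initial values `h0` and `F0` and the scalar toy model are assumptions.

```python
import numpy as np

def filter_and_gains(A, B, Sigma_h, Sigma_v, v, h0, F0):
    """Forward Kalman filter over v (N x V), also returning the smoothing gains J_n."""
    N, H = v.shape[0], A.shape[0]
    h = np.zeros((N, H))
    F = np.zeros((N, H, H))
    J = np.zeros((N, H, H))
    h_prev, F_prev = h0, F0
    for n in range(N):
        V_n = A @ F_prev @ A.T + Sigma_h                                # eq. (30)
        K_n = V_n @ B.T @ np.linalg.inv(B @ V_n @ B.T + Sigma_v)        # eq. (31)
        h[n] = A @ h_prev + K_n @ (v[n] - B @ A @ h_prev)               # standard update (assumed)
        F[n] = (np.eye(H) - K_n @ B) @ V_n                              # eq. (33)
        J[n] = (A @ F[n]).T @ np.linalg.inv(A @ F[n] @ A.T + Sigma_h)   # eq. (34)
        h_prev, F_prev = h[n], F[n]
    return h, F, J

# Scalar toy model (hypothetical values) so the recursion can be checked by hand.
one = np.array([[1.0]])
v = np.array([[0.5], [1.0]])
h, F, J = filter_and_gains(one, one, one, one, v, h0=np.zeros(1), F0=one)
print(F[:, 0, 0], J[:, 0, 0])
```

In the scalar case the first step gives a predicted variance of 2, a gain of 2/3, a filtered variance of 2/3 and a smoothing gain of 0.4, which is what the sketch reproduces.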

In Kalman smoothing there are also

In the M-step, the parameters of the model are updated as follows:

Through successive iterations, the optimal parameter set is recorded when the likelihood function converges.

The sixth step: The online sample is inserted into the training sample sequence in order of sampling time, immediately after the training sample whose sampling time is closest to that of the online sample. The hidden variable value of the process variables of the online sample in the hidden space is then calculated by Kalman filtering, with the following formulas:

$$V^* = A F^* A^T + \Sigma_h$$

$$K = V^* B^T (B V^* B^T + \Sigma_v)^{-1}$$

$$h_{new} = A h^* + K (v_{new} - B A h^*)$$

The seventh step: The key variable of the online sample is predicted as follows; every group of online samples is processed in the same way:

$$b = (Y \lambda^* h^T)(h \lambda^* h^T)^{-1}$$

where b is the regression parameter of the local weighted linear regression, and $y_{new}$ is the finally predicted key variable of the online sample.

The effectiveness of the invention is illustrated below with a specific debutanizer example. For this process, a total of 2000 groups of process variables of the debutanizer were collected, and the values of the key variable, namely the butane content, were obtained by offline analysis. 400 groups of data were selected for modeling, and a further 200 groups were used as online samples to verify the effectiveness of the method. Seven process variables were selected for soft measurement modeling: tower top pressure, tower top temperature, sensitive plate temperature, next-stage flow, tower bottom temperature, tower bottom pressure and reflux flow. The steps of the invention are described in detail below in conjunction with this process:

The 400 groups of training samples are divided into a plurality of sub-training sets by a sliding window, and each sub-training set is normalized.

A linear dynamic system model is established for each sub-training set, and the parameters of the model are recorded.

1. Following the method given in the implementation steps, the global weight of each group of online samples relative to the training samples is calculated in the original space and in the hidden space.

2. A local weighted linear dynamic system model is established for one group of online samples using its two global weights relative to the training samples, and the parameters of the model are recorded.

3. The hidden variable value of each group of online samples in the hidden space is calculated from the model parameters recorded in step 2, following the method given in the implementation steps.

4. The predicted butane content of each group of online samples is calculated by local weighted linear regression.

5. Steps 2 to 4 are repeated until the butane content of all online samples has been calculated.

Fig. 1 shows the online predicted butane content. The blue line is the butane content obtained by offline assay analysis, and the red scatter points are the online estimates predicted by the model; the closer the red points lie to the blue line, the better the prediction. The root mean square error of the prediction is 0.0307. Compared with traditional soft measurement methods, the method accounts for both the dynamic and the nonlinear characteristics of the data by introducing sliding windows, calculating the weights and weighting the linear dynamic system model. It predicts online the butane content, which is difficult to measure directly, and makes the soft measurement result more reliable.
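The root mean square error used to score the prediction is the square root of the mean squared difference between the assayed and the predicted butane content. A minimal sketch in Python with NumPy follows; the function name `rmse` and the toy numbers are assumptions and are unrelated to the debutanizer data.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical values for illustration only.
print(rmse([0.0, 0.0], [3.0, 4.0]))
```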

Fig. 2 is a flow chart of the method.

The above-described embodiments are intended to illustrate rather than to limit the invention. Any modification or variation made within the spirit of the invention and the scope of the appended claims falls within the protection scope of the invention.

Claims

1. A dynamic process soft measurement modeling method based on a local weighted linear dynamic system, characterized by comprising the following steps:

Step 2: Sliding windows are introduced. A window is moved across the training sample set with a fixed step length to obtain a plurality of sliding windows. Each sliding window contains the process variables of several groups of training samples, which are defined as the sub-training set of that window; the sub-training sets of different windows are not identical. A linear dynamic system model is established for the sub-training set of each window, giving the model parameters calculated from that sub-training set and the hidden variable values of each group of training samples in the hidden space.

Step 3: Online data of the debutanizer are collected as a test sample set. The test sample set comprises a plurality of groups of online samples, and each group comprises a plurality of process variables observed at the same time. The names and number of the process variables of an online sample are the same as those of the offline samples. The similarity of each group of online samples to the training sample set is calculated in the original space, and the similarity of each group of online samples to each sub-training set is calculated in the hidden space. Finally, the global weight in the original space and the global weight in the hidden space of each group of online samples relative to the training sample set are obtained.

2. The dynamic process soft measurement modeling method based on the local weighted linear dynamic system as claimed in claim 1, wherein every sliding window in step 2 is processed in the same way, and the η-th sliding window is processed as follows:

2.1) The sub-training set in the window is normalized, and a linear dynamic system model is established on the normalized sub-training set:

The linear dynamic system model is represented as follows:

$$h_t^{\eta} = A_{\eta} h_{t-1}^{\eta} + a_t^{\eta}$$

$$v_t^{\eta} = B_{\eta} h_t^{\eta} + b_t^{\eta}$$

where $h_t^{\eta}$ is the t-th group of hidden variables under the linear dynamic system model in the η-th window;

$v_t^{\eta}$ is the process variables of the t-th group of training samples in the η-th window;

V is the number of process variables in a group of training samples, H is the number of variables in a group of hidden variables, and t = 1, 2, ..., T.

The noises $a_t^{\eta}$ and $b_t^{\eta}$ both follow Gaussian distributions, $a_t^{\eta} \sim N(0, \Sigma_h^{\eta})$ and $b_t^{\eta} \sim N(0, \Sigma_v^{\eta})$.

2.2) A likelihood function is established for the linear dynamic system model and maximized by Kalman filtering, Kalman smoothing and the expectation-maximization (EM) algorithm. When the likelihood function converges, the parameters of the model in the η-th window are recorded, together with the optimal estimate of the t-th group of hidden variables under the linear dynamic system model in the η-th window.

3. The dynamic process soft measurement modeling method based on the local weighted linear dynamic system as claimed in claim 1, wherein each group of online samples in step 3 is processed in the same way in every window, and an online sample is processed in the η-th window as follows:

3.1) Calculate the similarity of the process variables of the training sample set to the online sample in the original space:

representing the similarity of the process variables of the n-th group of training samples to the online sample in the original space; N is the number of groups in the training sample set.

where $\lambda_n$ is the global weight of the n-th training sample relative to the online sample in the original space, and $\zeta_v$ is the weight control parameter of the original space;

First, the hidden variable of the online sample in the η-th window is calculated by projection; the calculation is the same in every window:

are model parameters obtained by training the linear dynamic system model in the η-th window;

is the hidden variable obtained by projecting the online sample in the η-th window.

Secondly, the global weight of the process variables of the sub-training set in the η-th window relative to the online sample is calculated in the hidden space:

where $\theta_{n,i}$ is the similarity of the n-th group of training samples to the online sample in the i-th window; i = 1, 2, ..., Γ indexes the windows that contain the n-th group of training samples, and Γ is the total number of sliding windows that contain the n-th group of training samples.

where $\zeta_h$ is a weight control parameter, and $\phi_n$ is the global weight of the n-th training sample relative to the online sample in the hidden space.

4. The dynamic process soft measurement modeling method based on the local weighted linear dynamic system as claimed in claim 1, wherein the local weighted linear dynamic system model in step 4 is as follows:

$$h_n = A h_{n-1} + a_n$$

$$v_n = B h_n + b_n$$

where A and B are the state transition probability matrix and the emission probability matrix of the weighted linear dynamic system model; $h_n$ is the hidden variable value, in the hidden space, of the process variables of the n-th group of training samples in the training sample set; $v_n$ is the process variables of the n-th group of training samples in the training sample set; $a_n$ and $b_n$ are the noises of $h_n$ and $v_n$, respectively.

The inputs of the model are the process variables of the training sample set, the global weights in the original space $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_N) \in R^{1 \times N}$, and the global weights in the hidden space $\phi = (\phi_1, \phi_2, \ldots, \phi_N) \in R^{1 \times N}$.

The parameters recorded for the model include the variance of each group of hidden variables.

5. The dynamic process soft measurement modeling method based on the local weighted linear dynamic system as claimed in claim 1, wherein in step 5 the hidden variable value of the online sample in the hidden space is obtained by Kalman filtering with the following formulas:

$$V^* = A F^* A^T + \Sigma_h \quad (9)$$

$$K = V^* B^T (B V^* B^T + \Sigma_v)^{-1} \quad (10)$$

$$h_{new} = A h^* + K (v_{new} - B A h^*) \quad (11)$$

where $h^*$ and $F^*$ are the hidden variable value and its variance, in the hidden space, of the training sample whose sampling time is closest to that of the online sample;

K is the Kalman gain matrix; and $h_{new}$ is the hidden variable value of the online sample in the hidden space, calculated by Kalman filtering.

6. The dynamic process soft measurement modeling method based on the local weighted linear dynamic system as claimed in claim 1, wherein the key variable of a group of online samples is predicted in step 6 as follows, every group of online samples being processed in the same way:

is the weighted average.

$$b = (Y \lambda^* h^T)(h \lambda^* h^T)^{-1} \quad (14)$$

where b is the regression parameter of the local weighted linear regression; $y_{new}$ is the finally predicted key variable of the online sample; and $\lambda^*$ is the diagonal matrix formed from λ, $\lambda^* \in R^{N \times N}$.