CN113254874B

CN113254874B - Uncertainty non-stationary industrial process oriented anomaly monitoring method

Info

Publication number: CN113254874B
Application number: CN202110397942.8A
Authority: CN
Inventors: 周东华; 吴德浩; 陈茂银; 纪洪泉; 钟麦英
Original assignee: Shandong University of Science and Technology
Current assignee: Shandong University of Science and Technology
Priority date: 2021-04-14
Filing date: 2021-04-14
Publication date: 2022-04-15
Anticipated expiration: 2041-04-14
Also published as: CN113254874A

Abstract

The invention discloses an anomaly monitoring method for uncertain non-stationary industrial processes and relates in particular to the field of industrial process anomaly monitoring. Taking process uncertainty into account, the invention builds on stationary subspace analysis to provide a probabilistic stationary subspace analysis method. The method models the uncertainty explicitly and effectively separates non-stationary trends from process uncertainty. Because the model parameters are mutually coupled, the method uses the expectation-maximization algorithm to decouple them and derives closed-form iterative update equations. Based on the model, two detection indices are provided for anomaly monitoring within a probabilistic framework. Compared with existing anomaly monitoring methods for non-stationary processes, the method eliminates the influence of process uncertainty and improves the detection of minor faults in non-stationary processes; it also avoids overfitting of the model parameters and establishes a more accurate generative model for non-stationary data.

Description

Uncertainty non-stationary industrial process oriented anomaly monitoring method

Technical Field

The invention belongs to the field of industrial process anomaly monitoring, and particularly relates to an anomaly monitoring method for uncertain non-stationary industrial processes.

Background

Anomaly monitoring is important for ensuring the normal and efficient operation of industrial processes and equipment. Practical industrial processes often exhibit significant non-stationary characteristics, that is, the statistical properties of the process data change over time. Non-stationary characteristics may be caused by a variety of factors, including raw material variations, load fluctuations, and equipment aging. They hamper the application of traditional anomaly monitoring methods such as principal component analysis, which then tend to commit two types of error. The first type is the missed alarm: the influence of a fault is masked by the non-stationary trend, resulting in a large number of missed alarms. The second type is the false alarm: conventional methods have difficulty tracking changes in the non-stationary trend, resulting in a high false alarm rate.

Over the past decade, anomaly monitoring methods for non-stationary industrial processes have developed considerably. To our knowledge, these methods fall mainly into four categories: adaptive updating methods, cointegration analysis-based methods, subspace decomposition-based methods, and trend analysis-based methods. Adaptive updating methods, of which recursive principal component analysis is an example, establish a data model from training data and then update the model parameters using normal test data. Cointegration analysis-based methods seek stationary linear combinations of the non-stationary variables. Subspace decomposition-based methods typically separate the stationary subspace from the full space; stationary subspace analysis is representative of this category. Finally, trend analysis-based methods typically extract trend information from non-stationary processes to accomplish the monitoring task.

The above methods have been applied successfully to anomaly monitoring of non-stationary industrial processes; however, they do not take into account the uncertainty present in such processes. Actual industrial processes often suffer from various uncertainties, which may be caused by factors such as random noise and unknown disturbances. Process uncertainty presents a challenge to anomaly monitoring of non-stationary processes. On the one hand, the actual non-stationary trend is partially masked by process uncertainty, which may degrade the monitoring performance of the algorithm, especially for minor faults. On the other hand, it is difficult to distinguish true non-stationary trends from changes due to uncertainty, which leads to overfitting of the model parameters. Probabilistic models offer a new perspective on these challenges, striking a good balance between the actual trends of interest and the changes due to uncertainty. In fact, the measurement data of an industrial process are naturally statistical rather than deterministic. Some probabilistic latent variable models have already been applied to anomaly monitoring of industrial processes, but they focus on stationary rather than non-stationary processes.

Disclosure of Invention

To solve the above problems, the invention provides an anomaly monitoring method for uncertain non-stationary industrial processes, which eliminates the influence of process uncertainty and improves the detection of minor faults in non-stationary processes.

The technical scheme of the invention is as follows:

An anomaly monitoring method for uncertain non-stationary industrial processes comprises an off-line training stage and an on-line monitoring stage, wherein:

A. the off-line training stage comprises the following steps:

A1. collecting historical operating data of the equipment under normal working conditions of the non-stationary process, wherein N is the number of samples in the historical data set and m is the number of measured variables;

A2. standardizing the historical operating data so that each variable has zero mean and unit variance, the standardization being shown in equation (1):

wherein Λ is a diagonal matrix whose diagonal elements are the standard deviations of the m variables, 1 is a column vector consisting of m ones, and the remaining term is the sample mean vector of the historical data;

A3. performing the Johansen test on the standardized data matrix X to determine the number d of stationary components, wherein 1 ≤ d ≤ m−1;

A4. decomposing the k-th sample x(k) of the standardized data matrix X according to the probabilistic stationary subspace analysis model shown in equation (2):

wherein A is an invertible mixing matrix whose first d columns form the mixing matrix of the stationary component and whose last m−d columns form the mixing matrix of the non-stationary component; s^s(k) is the stationary component and s^n(k) is the non-stationary component, which together make up s(k); the remaining term represents the process uncertainty, which is independent of s(k); and I_m is the m × m identity matrix;

A5. dividing the samples of the standardized data matrix X into n consecutive, non-overlapping data segments, the number of samples in the i-th data segment being denoted N_i, with N_1 + N_2 + … + N_n = N;

A6. assuming that the stationary component s^s(k) and the non-stationary component s^n(k) are uncorrelated, that s^s(k) obeys a d-dimensional Gaussian distribution that is invariant over time, and that s^n(k) obeys an (m−d)-dimensional Gaussian distribution that varies across data segments;

A7. combining A4 and A6, the prior distribution of s(k) is given by equation (3):

and the probability density function of x(k) is then given by equation (4):

A8. estimating the parameters Θ of the probabilistic stationary subspace analysis model using the expectation-maximization algorithm;

A9. for the k-th sample x(k), calculating its corresponding local components by equation (5):

wherein H_i = I − Σ_i (σ²I + A^T A Σ_i)^(-1) A^T A, i = 1, 2, …, n;

A10. calculating the similarity between sample x(k) and the samples belonging to the i-th data segment, as shown in equation (6):

wherein the probability density of x(k) for the i-th data segment is as shown in equation (4), and the prior probability of each data segment is set to 1/n;

A11. calculating the estimated contribution of sample x(k) by equation (7):

A12. combining equations (2) and (7), obtaining the estimated process uncertainty as shown in equation (8):

A13. because the estimated process uncertainty is a stationary time series, designing a T_e² statistic to monitor changes in the process uncertainty, the T_e² statistic of sample x(k) being shown in equation (9):

A14. for the estimated local components, calculating the mean vector and covariance matrix by equations (10) and (11), respectively:

wherein R_i = A Σ_i A^T + σ² I_m, i = 1, 2, …, n;

A15. introducing a shorthand notation for the estimated local components, and calculating the mean vector and covariance matrix of the corresponding stationary components by equations (12) and (13), respectively:

wherein W₁ consists of the first d columns of the identity matrix I_m;

A16. based on the quantities obtained in step A15, defining a local Mahalanobis distance index as shown in equation (14):

A17. designing, on the basis of the weighted Mahalanobis distance, the T_s² statistic shown in equation (15) to monitor changes in the stationary components:

A18. given a confidence level α, determining the control limits of the T_e² and T_s² statistics by the kernel density estimation method, one control limit being recorded for each statistic;
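Steps A2 and A5 (and the standardization reused on-line in step B1) can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the raw data are assumed to be stored as an N × m array, the sample standard deviation (`ddof=1`) is assumed, the on-line sample is assumed to be scaled with the training mean and standard deviation, and the near-equal segment lengths produced by `np.array_split` are an assumption, since the text only requires consecutive, non-overlapping segments. All function names are hypothetical.

```python
import numpy as np

def standardize(Y):
    """Scale each column of Y (N samples x m variables) to zero mean and unit variance."""
    Y = np.asarray(Y, dtype=float)
    mean = Y.mean(axis=0)
    std = Y.std(axis=0, ddof=1)  # assumption: sample standard deviation
    X = (Y - mean) / std
    return X, mean, std

def split_segments(X, n):
    """Split the rows of X into n consecutive, non-overlapping data segments."""
    # assumption: segments are as equal in length as possible
    return np.array_split(X, n, axis=0)

def standardize_online(y, mean, std):
    """Standardize one real-time sample with the training mean and standard deviation."""
    return (np.asarray(y, dtype=float) - mean) / std
```

With N = 1200 samples and n = 19 segments, this particular splitting rule yields three segments of 64 samples followed by sixteen segments of 63 samples.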

B. the on-line monitoring stage comprises the following steps:

B1. acquiring real-time operating data y(t) of the non-stationary process of the equipment and standardizing y(t) as shown in equation (16):

wherein y_i and x_i are the i-th variable of the real-time data before and after standardization, respectively, and the two remaining quantities are the mean and the standard deviation of the i-th variable;

B2. for the standardized real-time sample x(t), calculating its corresponding local components by equation (17):

wherein H_i = I − Σ_i (σ²I + A^T A Σ_i)^(-1) A^T A, i = 1, 2, …, n;

B3. calculating the similarity between sample x(t) and the samples belonging to the i-th data segment, as shown in equation (18):

wherein the prior probability of each data segment is set to 1/n;

B4. calculating the estimated contribution of sample x(t) by equation (19):

B5. combining equations (2) and (19), obtaining the estimated process uncertainty as shown in equation (20):

B6. calculating the T_e² statistic corresponding to the real-time sample x(t) by equation (21):

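B7. based on the mean vector and covariance matrix of the stationary components obtained in the off-line training stage, calculating the local Mahalanobis distance index as shown in equation (22):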

B8. calculating the T_s² statistic corresponding to the real-time sample x(t) by equation (23):

B9. comparing the T_e² and T_s² statistics with their respective control limits; if both statistics are within their control limits, judging that the industrial process is currently operating normally; otherwise, judging that the operation is abnormal.
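The segment similarity of steps A10 and B3 and the decision rule of step B9 can be sketched as follows. This is a hedged illustration only: it assumes that a sample from the i-th data segment is Gaussian with mean A μ_i and covariance R_i = A Σ_i A^T + σ² I_m, where μ_i and Σ_i denote the (assumed) prior mean and covariance of s(k) in segment i, and that the similarity is the Bayes posterior probability under equal priors 1/n. Whether the comparison in step B9 is strict or non-strict is not recoverable from the text, so `<=` is an assumption. The function and argument names are hypothetical.

```python
import numpy as np

def segment_similarity(x, A, mus, Sigmas, sigma2):
    """Posterior probability that sample x belongs to each of the n data segments.

    Assumes x | segment i ~ N(A mu_i, R_i) with R_i = A Sigma_i A^T + sigma2 * I
    and an equal prior probability of 1/n for every segment.
    """
    m = x.shape[0]
    log_lik = []
    for mu_i, Sigma_i in zip(mus, Sigmas):
        R_i = A @ Sigma_i @ A.T + sigma2 * np.eye(m)
        r = x - A @ mu_i
        _, logdet = np.linalg.slogdet(R_i)
        maha = r @ np.linalg.solve(R_i, r)
        log_lik.append(-0.5 * (m * np.log(2 * np.pi) + logdet + maha))
    log_lik = np.array(log_lik)
    # Equal priors cancel out; subtract the maximum for numerical stability.
    w = np.exp(log_lik - log_lik.max())
    return w / w.sum()

def is_normal(te2, ts2, te2_limit, ts2_limit):
    """Normal operation only if both statistics are within their control limits."""
    return bool(te2 <= te2_limit and ts2 <= ts2_limit)
```

Working in the log domain avoids underflow when a sample lies far from most segments, which is the usual situation when the non-stationary trend has drifted.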

Preferably, step A8 specifically includes:

A801. initializing the model parameters Θ by arbitrarily assigning initial values;

A802. calculating the first and second moments of s(k) from the parameter values Θ obtained in the previous step, as shown in equations (24) and (25):

⟨s(k)s(k)^T⟩ = H_i Σ_i + ⟨s(k)⟩⟨s(k)⟩^T (25)

wherein H_i = I − Σ_i (σ²I + A^T A Σ_i)^(-1) A^T A, i = 1, 2, …, n;

A803. updating the model parameters Θ using equations (24) and (25):

wherein W₁ consists of the first d columns of the identity matrix I_m, and W₂ consists of the last m−d columns of the identity matrix I_m;

A804. if the absolute value of the difference between the norms of Θ obtained in two successive iterations is less than 10^-5, stopping the iteration and outputting the estimate of the model parameters Θ; otherwise, returning to step A802 for the next iteration.
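The expectation step A802 and the stopping rule A804 can be sketched as follows. This is a sketch under assumptions, not the patent's own derivation: the first moment is written in the standard posterior-mean form of a linear Gaussian model, which is assumed here because equation (24) is not reproduced in the text; μ_i and Σ_i denote the assumed prior mean and covariance of s(k) in segment i; and Θ is assumed to be stacked into a single vector before its norm is taken. The second moment follows equation (25) and H_i follows the definition given above. Function names are hypothetical.

```python
import numpy as np

def e_step_moments(x, A, mu_i, Sigma_i, sigma2):
    """First and second posterior moments of s(k) for a sample x(k) in segment i."""
    m = A.shape[1]
    AtA = A.T @ A
    # H_i = I - Sigma_i (sigma2 I + A^T A Sigma_i)^(-1) A^T A, as defined in the text
    H_i = np.eye(m) - Sigma_i @ np.linalg.solve(sigma2 * np.eye(m) + AtA @ Sigma_i, AtA)
    # Assumed form of equation (24): posterior mean of a linear Gaussian model
    s_mean = H_i @ (mu_i + Sigma_i @ A.T @ x / sigma2)
    # Equation (25): <s s^T> = H_i Sigma_i + <s><s>^T
    s_second = H_i @ Sigma_i + np.outer(s_mean, s_mean)
    return H_i, s_mean, s_second

def has_converged(theta_old, theta_new, tol=1e-5):
    """Step A804: compare the norms of the parameter vectors of two successive iterations."""
    return abs(np.linalg.norm(theta_new) - np.linalg.norm(theta_old)) < tol
```

Under these assumptions H_i Σ_i equals the posterior covariance (Σ_i^(-1) + A^T A / σ²)^(-1), which is why it appears as the first term of equation (25).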

The invention has the following beneficial technical effects:

the method provided by the invention eliminates the influence of process uncertainty and improves the detection capability of tiny faults in a non-stationary process; and the problem of overfitting of model parameters can be avoided, and a more accurate generative model is established for non-stationary data.

Drawings

FIG. 1 is a flow chart of the anomaly monitoring method for uncertain non-stationary industrial processes according to the present invention;

FIG. 2 is a schematic diagram of the closed-loop controlled continuous stirred tank reactor used in the embodiments of the present invention;

FIG. 3 is a graph of the monitoring results of the T² statistic of cointegration analysis in Example 1 of the present invention;

FIG. 4 is a graph of the monitoring results of the Q statistic of recursive principal component analysis in Example 1 of the present invention;

FIG. 5 is a graph of the monitoring results of the T² statistic of recursive principal component analysis in Example 1 of the present invention;

FIG. 6 is a graph of the monitoring results of the Mahalanobis distance of stationary subspace analysis in Example 1 of the present invention;

FIG. 7 is a graph of the monitoring results of the T_e² statistic of probabilistic stationary subspace analysis in Example 1 of the present invention;

FIG. 8 is a graph of the monitoring results of the T_s² statistic of probabilistic stationary subspace analysis in Example 1 of the present invention;

FIG. 9 is a graph of the monitoring results of the T² statistic of cointegration analysis in Example 2 of the present invention;

FIG. 10 is a graph of the monitoring results of the Q statistic of recursive principal component analysis in Example 2 of the present invention;

FIG. 11 is a graph of the monitoring results of the T² statistic of recursive principal component analysis in Example 2 of the present invention;

FIG. 12 is a graph of the monitoring results of the Mahalanobis distance of stationary subspace analysis in Example 2 of the present invention;

FIG. 13 is a graph of the monitoring results of the T_e² statistic of probabilistic stationary subspace analysis in Example 2 of the present invention;

FIG. 14 is a graph of the monitoring results of the T_s² statistic of probabilistic stationary subspace analysis in Example 2 of the present invention.

Detailed Description

The invention is described in further detail below with reference to the following figures and detailed description:

As shown in FIG. 1, an anomaly monitoring method for uncertain non-stationary industrial processes comprises an off-line training stage and an on-line monitoring stage. The off-line training stage comprises steps A1 to A18, the on-line monitoring stage comprises steps B1 to B9, and step A8 comprises sub-steps A801 to A804, all as set out in the Disclosure of Invention above.

To aid understanding of the invention and to illustrate the effect of the anomaly monitoring method for uncertain non-stationary industrial processes, two embodiments are described below. The data for both embodiments come from a closed-loop controlled continuous stirred tank reactor, a standard benchmark model widely used in the field of industrial process anomaly monitoring. A schematic of the reactor is shown in FIG. 2; a first-order exothermic reaction takes place in it. Its dynamic model is shown in equation (32):

wherein C is the reaction concentration, T is the reaction temperature, T_c is the cooling water temperature, Q_c is the cooling water flow rate, v_i is the measurement noise, and C_i, T_i and T_ci constitute the system inputs.

Since the original continuous stirred tank reactor is a stationary process, the system inputs are modified to simulate non-stationary characteristics. In the modified version, T_ci is held at a steady state, while a non-stationary drift term λ is added to C_i and T_i:

wherein the modified inputs are built from the nominal values of C_i and T_i, the drift term λ, and two Gaussian random variables. Thus, the continuous stirred tank reactor exhibits non-stationary characteristics.

In the off-line training phase, 1200 normal samples were collected at a sampling interval of 1 minute. Each sample consists of 7 measured variables, namely x = [C_i, T_i, C, T, Q_c, T_ci, T_c]^T. In addition to the method proposed by the present invention, three other monitoring algorithms were included for comparison: a cointegration analysis-based method, recursive principal component analysis, and a stationary subspace analysis-based method. The cointegration analysis-based method combines cointegration analysis with the T² statistic, with the cointegration order set to r = 5. For recursive principal component analysis, the principal components are selected according to a cumulative contribution rate of 85%. For stationary subspace analysis and probabilistic stationary subspace analysis, the number of stationary components is set to d = 5 and the number of data segments to n = 19. In addition, the iteration error threshold of the probabilistic stationary subspace analysis method is set to ε = 1 × 10^-6. Given a confidence level of α = 0.99, the control limits of all monitoring statistics are determined by the kernel density estimation method.
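Determining a control limit by kernel density estimation can be sketched as follows. This is a minimal illustration under assumptions: a Gaussian kernel and the normal-reference rule-of-thumb bandwidth 1.06·σ̂·n^(-1/5) are assumed, since the text names neither a kernel nor a bandwidth, and the control limit is taken as the α-quantile of the estimated density. The function name and the grid size are hypothetical.

```python
import numpy as np

def kde_control_limit(stats, alpha=0.99, grid_size=2048):
    """Control limit as the alpha-quantile of a Gaussian kernel density estimate."""
    stats = np.asarray(stats, dtype=float)
    n = stats.size
    # Assumption: normal-reference rule-of-thumb bandwidth; the text does not specify one.
    h = 1.06 * stats.std(ddof=1) * n ** (-0.2)
    grid = np.linspace(stats.min() - 4 * h, stats.max() + 4 * h, grid_size)
    # Density on the grid: average of Gaussian kernels centred on the training statistics.
    u = (grid[:, None] - stats[None, :]) / h
    pdf = np.exp(-0.5 * u ** 2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    # Numerical cumulative distribution, renormalized so that it ends at exactly 1.
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]
    # First grid point at which the cumulative probability reaches alpha.
    return grid[np.searchsorted(cdf, alpha)]
```

Applied to the training values of a monitoring statistic with `alpha=0.99`, the returned limit leaves roughly 1% of the normal training samples above it, without assuming any particular distribution for the statistic.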

Table 1. Two typical faults in the continuous stirred tank reactor

In the two embodiments of the invention, two typical faults in the continuous stirred tank reactor, one multiplicative and one additive, are considered separately, as shown in Table 1. For each fault, 1200 samples were collected to compare the performance of the algorithms. The fault is introduced at the 601st sample and persists until the end.

Example 1

The heat transfer coefficient degradation fault was tested in Example 1, and the monitoring results of the different algorithms are summarized in FIGS. 3-8. This is a typical minor fault. The three methods based on cointegration analysis, stationary subspace analysis and probabilistic stationary subspace analysis all show missed alarms to varying degrees in the early stage of the fault; among them, the T_e² statistic of probabilistic stationary subspace analysis has the shortest detection delay, as shown in FIG. 7. This statistic has the advantage of containing non-stationary information. In addition, as shown in FIGS. 4 and 5, the recursive principal component analysis method raises alarms in the early stage of the fault but leaves many missed alarms as the fault magnitude increases. This reflects the inability of recursive principal component analysis to distinguish minor faults from non-stationary trends.
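The detection delay compared above can be computed from a sequence of statistic values as in the following sketch. It assumes, hypothetically, that the delay is counted from fault onset to the first sample whose statistic exceeds the control limit; the text does not state its exact definition, and some studies require several consecutive alarms instead. The function name is hypothetical.

```python
import numpy as np

def detection_delay(stat, limit, fault_start):
    """Samples elapsed between fault onset and the first alarm, or None if no alarm occurs."""
    stat = np.asarray(stat, dtype=float)
    # Indices (relative to fault onset) at which the statistic exceeds the control limit.
    alarms = np.flatnonzero(stat[fault_start:] > limit)
    return int(alarms[0]) if alarms.size else None
```

Because the fault is introduced at the 601st sample, `fault_start` would be 600 with zero-based indexing.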

Example 2

An additive fault, a constant sensor bias of magnitude 1 in the reaction temperature variable, was tested in Example 2. This is a minor fault, small in magnitude compared with the nominal reaction temperature of 430 c. In this example, the monitoring results of the four methods are shown in FIGS. 9-14. As can be seen from FIGS. 9 and 12, the methods based on cointegration analysis and on stationary subspace analysis detect the fault only partially and leave a certain degree of missed alarms. Neither method models the non-stationary components, which results in unsatisfactory monitoring performance for minor faults. In addition, the recursive principal component analysis method can hardly detect the fault (see FIGS. 10 and 11) because the fault magnitude is so small that it is masked by the non-stationary trend. For probabilistic stationary subspace analysis, the detection performance of its T_s² statistic is similar to that of cointegration analysis and stationary subspace analysis, whereas its T_e² statistic detects the fault effectively, with a detection rate of 97.8%. This reflects that separating the true non-stationary trend from the process uncertainty effectively improves the algorithm's ability to detect minor faults.
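An additive sensor bias and the resulting detection rate can be sketched as follows. This is an illustration under assumptions rather than the evaluation code behind the reported figures: the detection rate is assumed to be the fraction of post-fault samples whose statistic exceeds the control limit, the false alarm rate the corresponding fraction of pre-fault samples, and the reaction temperature T is assumed to sit at zero-based column 3 of the data array, following the variable order x = [C_i, T_i, C, T, Q_c, T_ci, T_c]^T. Function names are hypothetical.

```python
import numpy as np

def add_sensor_bias(Y, var_index, bias, start):
    """Return a copy of Y with a constant bias added to one variable from row `start` onward."""
    Y_fault = np.array(Y, dtype=float, copy=True)
    Y_fault[start:, var_index] += bias
    return Y_fault

def detection_rate(stat, limit, fault_start):
    """Fraction of samples after fault onset whose statistic exceeds the control limit."""
    stat = np.asarray(stat, dtype=float)
    return float(np.mean(stat[fault_start:] > limit))

def false_alarm_rate(stat, limit, fault_start):
    """Fraction of fault-free samples whose statistic exceeds the control limit."""
    stat = np.asarray(stat, dtype=float)
    return float(np.mean(stat[:fault_start] > limit))
```

For the fault of this example, a call such as `add_sensor_bias(Y, var_index=3, bias=1.0, start=600)` would reproduce the setting of a bias of magnitude 1 introduced at the 601st sample.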

It is to be understood that the above description is not intended to limit the present invention, and the present invention is not limited to the above examples, and those skilled in the art may make modifications, alterations, additions or substitutions within the spirit and scope of the present invention.

Claims

1. An anomaly monitoring method for uncertain non-stationary industrial processes, characterized by comprising an off-line training stage and an on-line monitoring stage, wherein:

A. the off-line training stage comprises steps A1 to A18 as set out in the Disclosure of Invention;

B. the on-line monitoring stage comprises steps B1 to B9 as set out in the Disclosure of Invention.

2. The anomaly monitoring method for uncertain non-stationary industrial processes as claimed in claim 1, characterized in that step A8 specifically comprises steps A801 to A804 as set out in the Disclosure of Invention.