CN116125922A

CN116125922A - Complex industrial process monitoring method and system based on parallel dictionary learning

Info

Publication number: CN116125922A
Application number: CN202310023849.XA
Authority: CN
Inventors: 阳春华; 张娇娇; 黄科科; 吴德浩; 李勇刚; 朱红求; 桂卫华
Original assignee: Central South University
Current assignee: Central South University
Priority date: 2023-01-09
Filing date: 2023-01-09
Publication date: 2023-05-16

Abstract

The invention discloses a complex industrial process monitoring method and system based on parallel dictionary learning. The method comprises the following steps: extracting a collinear variable set of the industrial process according to the variance inflation factors (VIF) of all monitored variables; partitioning linear variable subsets from the collinear variable set by a linear maximization method, with the remaining monitored variables forming a nonlinear variable subset; establishing a linear monitoring model based on dictionary learning for each linear variable subset, and a nonlinear monitoring model based on kernel dictionary learning for the nonlinear variable subset; calculating the reconstruction errors under the established linear and nonlinear monitoring models, and calculating the control limit of each error; acquiring real-time monitoring sample data of the complex industrial process online, and calculating the reconstruction error on each variable subset; and, based on the reconstruction errors and control limits, fusing them into a global index for the current monitoring sample data and judging from the global index whether the industrial process is faulty. The invention enables monitoring of complex industrial processes in which linear and nonlinear relationships coexist.

Description

Complex industrial process monitoring method and system based on parallel dictionary learning

Technical Field

The invention belongs to the field of industrial process monitoring, and particularly relates to a complex industrial process monitoring method and system based on parallel dictionary learning.

Background

As modern industrial systems become increasingly integrated and intelligent, even a minor fault in the industrial process may cause serious disruption to production. Therefore, to ensure safe and stable operation of industrial systems, higher requirements are placed on process monitoring. With a proper process monitoring method, operators can obtain the current running state of the industrial process in real time; once a fault is detected, an alarm is triggered so that the operators can respond in time, avoiding larger economic losses and eliminating potential safety hazards.

Data-driven methods have demonstrated great potential in the field of industrial process monitoring, and have been widely studied. The multivariate statistical process monitoring method (MSPM) is one of the most typical data driven methods that capture important features of data mainly by way of dimension reduction, with PCA, PLS, ICA, etc. being widely used. With the development of artificial intelligence technology, some machine learning methods are also applied to the field of process monitoring, such as Support Vector Machines (SVMs) and random forest methods.

Dictionary learning is a common machine learning method, and the core idea is to directly learn from original data, train a group of dictionary atoms reflecting essential characteristics of the data, and reconstruct the data by linear combination of a small number of atoms in a sparse way. Dictionary learning is widely studied and applied in the field of industrial process monitoring by virtue of excellent data reconstruction capability. Meanwhile, in order to adapt to different requirements of industrial actual scenes, a series of improved dictionary learning methods are proposed. Huang proposes distributed dictionary learning for efficient monitoring of complex, high-dimensional modern industrial systems. Yang has proposed a robust dictionary learning method to monitor performance of industrial processes with multiple modalities.

Because the underlying mechanisms of an industrial system are complex, measurement data often exhibit nonlinear characteristics, which poses challenges for process monitoring: a dictionary learning method based on a linear reconstruction assumption can hardly capture the nonlinear characteristics of the data in full. To solve this nonlinearity problem, Nguyen proposed kernel dictionary learning based on kernel methods. Kernel dictionary learning maps nonlinear data to a high-dimensional feature space, where the data become linearly separable, and then seeks a linear solution in that feature space. However, since the kernel function and its parameters determine the nature of the feature space, kernel dictionary learning requires the selection of appropriate parameters, which is sometimes difficult. Furthermore, after the original data are projected into the high-dimensional feature space, not only are they difficult to express and interpret mathematically, but changes in linear features are also easily overlooked.

However, as modern industrial processes become more integrated and complex, it has become common for linear and nonlinear correlations to coexist among the process variables of a system, so establishing only a single linear or a single nonlinear monitoring model is not ideal. Dictionary learning rests on a linear reconstruction assumption and cannot effectively monitor a nonlinear system, while kernel dictionary learning is sensitive to parameter selection and tends to overlook changes in linear features. Since linear and nonlinear features are equally significant for process monitoring, both should be fully considered when a monitoring method is selected. Most current research nevertheless selects a single linear or nonlinear method when establishing a process monitoring model, which cannot fully extract the data features or achieve the best monitoring performance.

Disclosure of Invention

The invention provides a complex industrial process monitoring method based on parallel dictionary learning, which realizes the monitoring of complex industrial processes with linear and nonlinear coexistence.

In order to achieve the technical purpose, the invention adopts the following technical scheme:

a complex industrial process monitoring method based on parallel dictionary learning, comprising:

extracting a collinear variable set from the monitored variable set according to the variance inflation factors (VIF) of all monitored variables of the industrial process;

identifying and partitioning linear variable subsets from the collinear variable set based on a linear maximization method, wherein the remaining monitored variables in the monitored variable set form a nonlinear variable subset;

establishing a linear monitoring model based on dictionary learning for each linear variable subset, and establishing a nonlinear monitoring model based on kernel dictionary learning for the nonlinear variable subset;

based on the established linear and nonlinear monitoring models, calculating the reconstruction errors of the training data on each variable subset, and calculating the control limits of the linear and nonlinear monitoring models based on a kernel density estimation method;

acquiring real-time monitoring sample data of the complex industrial process online, and calculating the reconstruction error on each variable subset using the corresponding linear or nonlinear monitoring model; based on each reconstruction error and control limit, calculating the normal likelihood probability, the abnormal likelihood probability and the abnormal posterior probability of each variable subset part of the current monitoring sample data, fusing them to obtain a global index of the current monitoring sample data, and judging from the global index whether the current complex industrial process is faulty.

Further, the variance inflation factor of each monitored variable is calculated as:

$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \quad j = 1, 2, \ldots, m \tag{1}$$

wherein $\mathrm{VIF}_j$ is the variance inflation factor of any monitored variable $x_j$, $m$ is the number of monitored variables, and $R_j^2$ is the coefficient of determination obtained when $x_j$ is regressed on all the remaining monitored variables.

Further, the collinear variable set is extracted from the monitored variable set as follows: the monitored variables whose variance inflation factors exceed a given threshold are combined into the collinear variable set.

Further, the linear variable subsets are identified and partitioned from the collinear variable set based on the linear maximization method, specifically:

(1) Denote the collinear set as $X_P = [x_1, x_2, \ldots, x_P] \in R^{N \times P}$, where $P$ is the number of monitored variables whose variance inflation factor exceeds the threshold, and $x_{\max}$ is the variable in $X_P$ with the largest variance inflation factor;

(2) Maximize the correlation between $x_{\max}$ and a linear combination of the remaining monitored variables in the collinear set $X_P$:

$$\max_{a} \; \rho = \frac{\operatorname{cov}(x_{\max}, X_{P-1}a)}{\sqrt{D(x_{\max})\, D(X_{P-1}a)}} \tag{2}$$

wherein $X_{P-1} \in R^{N \times (P-1)}$ is the collinear set $X_P$ with $x_{\max}$ removed, $a \in R^{P-1}$ is the combination coefficient vector of $X_{P-1}$, $D(x_{\max})$ and $D(X_{P-1}a)$ are the variances of the respective variables, and $\operatorname{cov}(x_{\max}, X_{P-1}a)$ is their covariance;

(3) Let $\Sigma_{11}$, $\Sigma_{12}$, $\Sigma_{21}$, $\Sigma_{22}$ be $\operatorname{cov}(x_{\max}, x_{\max})$, $\operatorname{cov}(x_{\max}, X_{P-1})$, $\operatorname{cov}(X_{P-1}, x_{\max})$ and $\operatorname{cov}(X_{P-1}, X_{P-1})$ respectively; then:

$$D(x_{\max}) = \Sigma_{11}, \quad D(X_{P-1}a) = a^T \Sigma_{22}\, a, \quad \operatorname{cov}(x_{\max}, X_{P-1}a) = \Sigma_{12}\, a \tag{3}$$

(4) Substituting formula (3) into the optimization problem (2) and fixing the denominator converts the optimization problem (2) into:

$$\max_{a} \; \Sigma_{12}\, a \quad \text{s.t.} \quad a^T \Sigma_{22}\, a = 1 \tag{4}$$

(5) The optimization problem (4) is converted into an eigenvalue decomposition problem by the Lagrangian multiplier method, giving:

$$\Sigma_{22}^{-1} \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\, a = \lambda^2 a \tag{5}$$

(6) The number of nonzero eigenvalues is $\min(1, P-1)$; solving for the corresponding eigenvector yields the optimal coefficient vector $a$;

(7) All coefficients greater than a given weight threshold are selected from the optimal coefficient vector; the corresponding variables form the subset of variables linearly related to $x_{\max}$, denoted a linear variable subset;

(8) The linear variable subset is removed from the collinear set $X_P$; if at least two variables still remain in the collinear set, the above steps are repeated to extract a new linear variable subset; otherwise, the partition of linear variable subsets is complete.

Further, a linear monitoring model is established for each linear variable subset based on dictionary learning, specifically:

let the raw data of the industrial process over the monitored variable set be $Y = [y_1, y_2, \ldots, y_N] \in R^{m \times N}$, where $m$ is the number of variables in the monitored variable set (the sample dimension) and $N$ is the number of samples; the original data sub-block of the $n$-th linear variable subset is $Y_n \in R^{m_n \times N}$, where $m_n$ is the number of variables in the $n$-th linear variable subset; the dictionary learning optimization problem for establishing the linear monitoring model on the sub-block $Y_n$ is then expressed as:

$$\min_{D_n, X_n} \; \|Y_n - D_n X_n\|_F^2 \quad \text{s.t.} \quad \|x_i^n\|_0 \le T, \; i = 1, \ldots, N \tag{6}$$

in the formula, $D_n = [d_1^n, \ldots, d_K^n] \in R^{m_n \times K}$ is the dictionary to be learned for the $n$-th original data sub-block $Y_n$, composed of $K$ dictionary atoms $d_1^n, \ldots, d_K^n$; $X_n = [x_1^n, x_2^n, \ldots, x_N^n] \in R^{K \times N}$ is the corresponding sparse coding matrix; $T$ is the sparsity constraint, limiting the number of nonzero elements in each sparse coding vector $x_i^n$ to at most $T$; $n = 1, \ldots, l$, where $l$ is the number of linear variable subsets obtained by the partition;

and a nonlinear monitoring model is established for the nonlinear variable subset based on kernel dictionary learning, specifically:

let the original data sub-block of the nonlinear variable subset be $Y_{l+1} \in R^{m_{l+1} \times N}$, where $m_{l+1}$ is the number of variables in the nonlinear variable subset; the kernel dictionary learning optimization problem for establishing the nonlinear monitoring model on the sub-block $Y_{l+1}$ is then expressed as:

$$\min_{D, X_{l+1}} \; \|\Phi(Y_{l+1}) - D X_{l+1}\|_F^2 \quad \text{s.t.} \quad \|x_i^{l+1}\|_0 \le T, \; i = 1, \ldots, N \tag{7}$$

where $\Phi(\cdot)$ is the mapping function.

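Further, the sparse coding matrix and the dictionary in the optimization problem (6) are solved by alternating optimization; when solving the optimization problem (7), it is first converted into:

$$\min_{A, X_{l+1}} \; \|\Phi(Y_{l+1}) - \Phi(Y_{l+1}) A X_{l+1}\|_F^2 \quad \text{s.t.} \quad \|x_i^{l+1}\|_0 \le T, \; i = 1, \ldots, N \tag{8}$$

wherein $A$ is a pseudo dictionary which, together with the base dictionary $\Phi(Y_{l+1})$, replaces the original dictionary $D$;

the optimization problem (8) is then converted, according to the relation between the matrix norm and the trace, into:

$$\min_{A, X_{l+1}} \; \operatorname{tr}\!\left((I - A X_{l+1})^T K(Y_{l+1}, Y_{l+1}) (I - A X_{l+1})\right) \quad \text{s.t.} \quad \|x_i^{l+1}\|_0 \le T, \; i = 1, \ldots, N \tag{9}$$

wherein $\operatorname{tr}(\cdot)$ is the trace of a matrix, $I$ is the identity matrix, and $K(Y_{l+1}, Y_{l+1}) \in R^{N \times N}$ is a kernel matrix whose elements are $k(y_i, y_j) = \Phi(y_i)^T \Phi(y_j)$;

finally, the sparse coding matrix $X_{l+1}$ and the pseudo dictionary $A$ in the optimization problem (9) are updated alternately.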

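Further, the normal likelihood probability, the abnormal likelihood probability and the abnormal posterior probability are calculated as:

$$P_n(y_{new,n} \mid N) = \exp\!\left(-\frac{DER_n}{TH_n}\right), \qquad P_n(y_{new,n} \mid F) = \exp\!\left(-\frac{TH_n}{DER_n}\right)$$

$$P_n(F \mid y_{new,n}) = \frac{P_n(y_{new,n} \mid F)\, P_n(F)}{P_n(y_{new,n} \mid N)\, P_n(N) + P_n(y_{new,n} \mid F)\, P_n(F)}$$

wherein $y_{new,n}$ is the part of the real-time monitoring sample data $y_{new}$ belonging to the $n$-th variable subset; $P_n(y_{new,n} \mid N)$ and $P_n(y_{new,n} \mid F)$ are respectively the normal and abnormal likelihood probabilities of $y_{new,n}$, where $N$ and $F$ denote normal and abnormal; $DER_n$ is the reconstruction error of $y_{new,n}$, and $TH_n$ is the control limit of the $n$-th variable subset; $P_n(F \mid y_{new,n})$ is the abnormal posterior probability of $y_{new,n}$; $P_n(F)$ and $P_n(N)$ are respectively the abnormal and normal prior probabilities of the sample, determined by the confidence level $\alpha$, namely $P_n(N) = 1 - \alpha$ and $P_n(F) = \alpha$;

The global index of the current monitoring sample data obtained by fusion is:

$$GFI_{new} = \frac{\sum_{n=1}^{l+1} P_n(y_{new,n} \mid F)\, P_n(F \mid y_{new,n})}{\sum_{n=1}^{l+1} P_n(y_{new,n} \mid F)}$$

wherein $GFI_{new}$ is the global index of the real-time monitoring sample data $y_{new}$; here the abnormal likelihood probabilities $P_n(y_{new,n} \mid F)$ on the variable subsets serve as weights with which the abnormal posterior probabilities $P_n(F \mid y_{new,n})$ on all variable subsets $n$ ($n = 1, 2, \ldots, l, l+1$) are fused to establish the global monitoring index $GFI_{new}$.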

A complex industrial process monitoring system based on parallel dictionary learning comprises a memory and a processor, wherein a computer program is stored in the memory, characterized in that, when the computer program is executed by the processor, the processor implements the complex industrial process monitoring method based on parallel dictionary learning according to any one of the above technical schemes.

Advantageous effects

The complex industrial process monitoring method based on parallel dictionary learning can be applied to industrial processes whose monitored process variables have complex correlations. The method overcomes the blindness of conventional monitoring-method selection and the limitation of constructing a single monitoring model: it partitions the variable set by exploring the correlations among the system process variables, and selects a suitable linear or nonlinear monitoring method for each subset according to those correlations so that the models are established in parallel, thereby effectively improving the fault monitoring performance for complex industrial processes.

Drawings

FIG. 1 is a general framework of the method of the present invention;

FIG. 2 is a comparison of the evaluation index of the variable division results of the present invention;

FIG. 3 compares the monitoring results of the parallel dictionary learning method of the present invention with those of conventional dictionary learning and kernel dictionary learning, wherein (a)-(c) are the monitoring results of the three methods in the first scenario, and (d)-(f) are the monitoring results of the three methods in the second scenario.

Detailed Description

The embodiments of the present invention are described in detail below. They are developed on the basis of the technical scheme of the invention and give detailed implementations and specific operating procedures to further explain that technical scheme.

The invention provides a complex industrial process monitoring method based on parallel dictionary learning, which comprises two stages: offline modeling and online monitoring. In the offline modeling stage, the collinearity of the variables is first evaluated using the variance inflation factor (VIF); a linear maximization method is then used to identify the linear variable subsets, and the partition into linear and nonlinear subsets is realized by maximizing the linear correlation between one variable and the remaining variables. Next, according to the characteristics of each variable subset, dictionary learning or kernel dictionary learning is selected to establish the monitoring models in parallel, so that the linear and nonlinear features of the data are fully extracted. Finally, the error control limit under each sub-model is obtained by analyzing the training data. In the online monitoring stage, the system acquires online data of the industrial process, calculates the local monitoring statistics of the test sample on all subsets using the monitoring models trained in the offline stage, and establishes a global monitoring statistic by Bayesian inference fusion, thereby assessing the real-time state of the industrial process and ensuring its stable and healthy operation. By exploring the complex correlations among the system process variables, the method selects a suitable monitoring method according to the linear and nonlinear characteristics of the system and overcomes the limitation of constructing a single linear or nonlinear monitoring model, which makes it advantageous for monitoring industrial processes with complex correlations among process variables.

1. Extraction of the collinear variable set based on the variance inflation factor

In the initial offline stage, normal training data $Y = [y_1, y_2, \ldots, y_N] \in R^{m \times N}$ of the complex industrial process are collected, where $m$ and $N$ are the sample dimension and the number of samples, respectively. First, to evaluate the degree of collinearity between each process variable and the remaining variables, the variance inflation factor (VIF) is calculated for all variables $X = [x_1, x_2, \ldots, x_m] \in R^{N \times m}$:

$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \quad j = 1, 2, \ldots, m \tag{1}$$

wherein $\mathrm{VIF}_j$ is the variance inflation factor of any variable $x_j$, and $R_j^2$ is the coefficient of determination obtained when the variable $x_j$ ($j = 1, 2, \ldots, m$) is regressed on all the remaining variables.

$\mathrm{VIF}_j$ reflects the degree of collinearity between the variable $x_j$ and the remaining variables: the greater its value, the more significant the collinearity. Given a variance inflation factor threshold $\theta$, all variables with $\mathrm{VIF}_j > \theta$ form the collinear set.
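The VIF screening of this step can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `vif` and `collinear_set`, the threshold value `theta=10.0`, and the demo data are assumptions introduced here, and each regression is fitted by ordinary least squares with an intercept.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (samples in rows)."""
    X = np.asarray(X, dtype=float)
    N, m = X.shape
    out = np.empty(m)
    for j in range(m):
        y = X[:, j]
        # Regress x_j on all remaining variables plus an intercept.
        A = np.column_stack([np.ones(N), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out

def collinear_set(X, theta=10.0):
    """Indices of variables with VIF_j > theta (theta is a hypothetical choice)."""
    return np.flatnonzero(vif(X) > theta)

# Demo: two variables driven by the same factor, one independent variable.
rng = np.random.default_rng(0)
u = rng.uniform(0.01, 2.0, size=500)
X_demo = np.column_stack([
    u + 0.05 * rng.standard_normal(500),
    2.0 * u + 0.05 * rng.standard_normal(500),
    rng.standard_normal(500),
])
vif_demo = vif(X_demo)
idx_demo = collinear_set(X_demo, theta=10.0)
```

In the demo, the first two columns share the factor `u` and receive large VIF values, while the independent third column stays close to 1 and is left out of the collinear set.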

2. Identifying and partitioning subsets of linear variables based on a linear maximization method, and constructing subsets of nonlinear variables

In order to explore the correlations among process variables, the present invention proposes a linear maximization method to identify and separate the linear variable subsets. The core idea is to maximize the correlation between one variable and a linear combination of the remaining variables, and to identify the linear variable subset related to that variable from the combination coefficients.

Let the collinear set be $X_P = [x_1, x_2, \ldots, x_P] \in R^{N \times P}$, where $P$ is the number of variables whose variance inflation factor exceeds the threshold. $x_{\max}$ is the variable with the largest variance inflation factor in the collinear set, which indicates that the linear variable subset containing $x_{\max}$ has the most variables or the strongest linear correlation. To identify the linear variable subset to which $x_{\max}$ belongs, the correlation between $x_{\max}$ and a linear combination of the remaining variables is maximized:

$$\max_{a} \; \rho = \frac{\operatorname{cov}(x_{\max}, X_{P-1}a)}{\sqrt{D(x_{\max})\, D(X_{P-1}a)}} \tag{2}$$

wherein $X_{P-1} \in R^{N \times (P-1)}$ is the collinear set $X_P$ with $x_{\max}$ removed, $a \in R^{P-1}$ is the coefficient vector, $D(x_{\max})$ and $D(X_{P-1}a)$ are the variances of the respective variables, and $\operatorname{cov}(x_{\max}, X_{P-1}a)$ is their covariance. Let $\Sigma_{11}$, $\Sigma_{12}$, $\Sigma_{21}$, $\Sigma_{22}$ be $\operatorname{cov}(x_{\max}, x_{\max})$, $\operatorname{cov}(x_{\max}, X_{P-1})$, $\operatorname{cov}(X_{P-1}, x_{\max})$ and $\operatorname{cov}(X_{P-1}, X_{P-1})$ respectively; it is easy to obtain:

$$D(x_{\max}) = \Sigma_{11}, \quad D(X_{P-1}a) = a^T \Sigma_{22}\, a, \quad \operatorname{cov}(x_{\max}, X_{P-1}a) = \Sigma_{12}\, a \tag{3}$$

Substituting (3) into (2) and fixing the denominator, the optimization problem (2) is converted into:

$$\max_{a} \; \Sigma_{12}\, a \quad \text{s.t.} \quad a^T \Sigma_{22}\, a = 1 \tag{4}$$

By the Lagrangian multiplier method, (4) can be converted into an eigenvalue decomposition problem, giving:

$$\Sigma_{22}^{-1} \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\, a = \lambda^2 a \tag{5}$$

Here, the number of nonzero eigenvalues is $\min(1, P-1)$, and the optimal coefficient vector $a$ is obtained by solving for the corresponding eigenvector.

The optimization problem (2) essentially maximizes the linear correlation between $x_{\max}$ and the remaining variables by finding the optimal coefficient vector $a$. For this correlation to be maximal, the coefficient weights in $a$ of the variables linearly related to $x_{\max}$ should be as large as possible. Therefore, by setting a combination coefficient weight threshold $\beta$, the subset of variables linearly related to $x_{\max}$ can be separated from the variable set $X_{P-1}$ according to the combination coefficient weights.

To further identify and partition all remaining linear variable subsets in the collinear set, the partitioned linear variable subset is removed from the collinear set. The collinearity of the variables remaining in the collinear set is then evaluated again by the variance inflation factor. If at least two variables with $\mathrm{VIF}_j > \theta$ still exist, other linear variable subsets remain, and the linear maximization method is iterated to identify and separate them. Otherwise, all linear variable subsets have been partitioned, and the variables of the original variable set other than those in the partitioned linear variable subsets form the nonlinear variable subset.
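One pass of the linear maximization step can be sketched as below. Because the matrix on the left of (5) has rank one, its only eigenvector with a nonzero eigenvalue is proportional to Σ₂₂⁻¹Σ₂₁, so no general eigen-solver is needed. Several choices here are assumptions made for illustration and are not stated in the text: the function name `linear_subset`, working on the correlation matrix (that is, on standardized variables), comparing the normalized absolute weights |a|/max|a| against β, the value `beta=0.5`, and returning x_max together with the selected variables.

```python
import numpy as np

def linear_subset(Xp, beta=0.5):
    """One pass of linear maximization over a collinear set Xp of shape (N, P).

    Returns (index of x_max, coefficient vector a, sorted indices of the subset).
    """
    R = np.corrcoef(np.asarray(Xp, dtype=float), rowvar=False)
    # VIF_j equals the j-th diagonal entry of the inverse correlation matrix.
    vifs = np.diag(np.linalg.inv(R))
    k = int(np.argmax(vifs))
    rest = np.array([j for j in range(R.shape[0]) if j != k])
    S22 = R[np.ix_(rest, rest)]
    S21 = R[rest, k]
    # Rank-one structure of (5): the eigenvector is proportional to S22^{-1} S21.
    a = np.linalg.solve(S22, S21)
    a = a / np.sqrt(a @ S22 @ a)          # enforce a^T S22 a = 1 as in (4)
    weights = np.abs(a) / np.abs(a).max()  # assumed weight definition
    members = rest[weights > beta]
    return k, a, np.sort(np.append(members, k))

# Demo: variables 0-2 share factor u1, variables 3-4 share factor u2.
rng = np.random.default_rng(1)
u1 = rng.uniform(0.01, 2.0, 500)
u2 = rng.uniform(0.01, 2.0, 500)
noise = lambda: 0.05 * rng.standard_normal(500)
Xp_demo = np.column_stack([
    u1 + noise(), 2.0 * (u1 + noise()), -(u1 + noise()),
    u2 + noise(), 3.0 * (u2 + noise()),
])
k_demo, a_demo, subset_demo = linear_subset(Xp_demo, beta=0.5)
```

After a subset is found, its columns would be dropped from the collinear set and the pass repeated while at least two variables still exceed the VIF threshold, as described above.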

3. Establishing a monitoring model

3.1 Linear monitoring model establishment based on dictionary learning

Dictionary learning learns and trains a series of atoms containing data features from original data, and the reconstruction expression of the original data is realized through linear combination of a small number of dictionary atoms. Since the data points are distributed on one linear manifold in each linear variable set, the original data can be well reconstructed through a dictionary learning method. Thus, on each linear subset of the partitions, a linear monitoring model is built based on dictionary learning.

Assuming that all process variables are ultimately divided into $l$ linear variable subsets and 1 nonlinear variable subset, the raw data $Y = [y_1, y_2, \ldots, y_N] \in R^{m \times N}$ are correspondingly decomposed into $l+1$ sub-blocks, i.e. $Y = [Y_1^T, Y_2^T, \ldots, Y_l^T, Y_{l+1}^T]^T$, wherein $Y_1 \in R^{m_1 \times N}$ is the first linear sub-block, which contains $m_1$ linear variables and still $N$ samples, and $Y_{l+1} \in R^{m_{l+1} \times N}$ is the nonlinear sub-block. Accordingly, the $n$-th sub-block contains $m_n$ variables in total, with $m = \sum_{n} m_n$, $n = 1, 2, \ldots, l, l+1$.

To build a monitoring model on the $n$-th ($n = 1, 2, \ldots, l$) linear sub-block, the dictionary learning optimization problem can be expressed as:

$$\min_{D_n, X_n} \; \|Y_n - D_n X_n\|_F^2 \quad \text{s.t.} \quad \|x_i^n\|_0 \le T, \; i = 1, \ldots, N \tag{6}$$

Here, $D_n = [d_1^n, \ldots, d_K^n] \in R^{m_n \times K}$ is the dictionary trained on the $n$-th sub-block, composed of $K$ dictionary atoms; $X_n = [x_1^n, x_2^n, \ldots, x_N^n] \in R^{K \times N}$ is the corresponding sparse coding matrix; and $T$ is the sparsity constraint, limiting the number of nonzero elements in each vector $x_i^n$ to at most $T$. Problem (6) can be solved by alternating optimization, which consists of two parts: sparse coding and dictionary updating.

First, $D_n$ is fixed and the corresponding sparse codes $X_n$ are solved:

$$\min_{x_i^n} \; \|y_i^n - D_n x_i^n\|_2^2 \quad \text{s.t.} \quad \|x_i^n\|_0 \le T, \; i = 1, \ldots, N \tag{7}$$

Here, $y_i^n$ is the $i$-th column vector of $Y_n$ ($i = 1, 2, \ldots, N$), and $x_i^n$ is the sparse code of $y_i^n$ under the dictionary $D_n$. The optimization problem (7) can be solved directly by the orthogonal matching pursuit (OMP) algorithm.

Once $X_n$ has been obtained, $X_n$ is fixed in turn and the dictionary atoms of $D_n$ are updated sequentially, column by column:

$$\min_{D_n} \; \sum_{i=1}^{N} \|y_i^n - D_n x_i^n\|_2^2 \tag{8}$$

Here, $d_k^n$ is the $k$-th dictionary atom of $D_n$, and $x_i^n$ is the $i$-th column vector of $X_n$. The optimization problem (8) can be solved by the K-SVD algorithm. After $D_n$ has been updated, the sparse codes $X_n$ are solved again, and the two steps are alternated until the iteration stopping condition is met.
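The alternation between OMP sparse coding and K-SVD atom updates can be sketched in a few lines of NumPy. This is a simplified illustration rather than the implementation used in the invention: the function names `omp` and `ksvd`, the initialization from random data columns, the fixed iteration count `n_iter`, and the demo data are all assumptions made here.

```python
import numpy as np

def omp(D, y, T):
    """Greedy sparse code of y over dictionary D with at most T non-zeros."""
    residual, support = y.copy(), []
    x, coef = np.zeros(D.shape[1]), np.zeros(0)
    for _ in range(T):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Least-squares refit on the current support, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x

def ksvd(Y, K, T, n_iter=10, seed=0):
    """Alternate OMP sparse coding and K-SVD atom updates on Y of shape (m_n, N)."""
    rng = np.random.default_rng(seed)
    D = Y[:, rng.choice(Y.shape[1], K, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0) + 1e-12
    for _ in range(n_iter):
        X = np.column_stack([omp(D, y, T) for y in Y.T])
        for k in range(K):
            used = np.flatnonzero(X[k])
            if used.size == 0:
                continue
            X[k, used] = 0.0
            # Residual without atom k, restricted to the samples that use it.
            E = Y[:, used] - D @ X[:, used]
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]
            X[k, used] = s[0] * Vt[0]
    X = np.column_stack([omp(D, y, T) for y in Y.T])
    return D, X

# Demo: a 3-variable linear sub-block driven by one factor plus small noise.
rng = np.random.default_rng(0)
direction = np.array([1.0, 2.0, -1.0])
Y_demo = np.outer(direction, rng.uniform(0.01, 2.0, 200))
Y_demo += 0.01 * rng.standard_normal((3, 200))
D_demo, X_demo = ksvd(Y_demo, K=4, T=1)
rel_err = np.linalg.norm(Y_demo - D_demo @ X_demo) / np.linalg.norm(Y_demo)
```

Because the demo data lie close to a one-dimensional linear manifold, a single atom per sample (T = 1) already reconstructs them with a small relative error.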

3.2 building nonlinear monitoring model based on kernel dictionary learning

On the nonlinear sub-block, the variables are strongly nonlinearly related, and the original data can hardly be expressed directly by a linear reconstruction from dictionary atoms in the original space. Kernel dictionary learning is a kernel-based nonlinear extension of dictionary learning: after the nonlinear data are mapped into a high-dimensional feature space, a linear solution can be sought in that feature space. Therefore, a monitoring model is built on the nonlinear sub-block by kernel dictionary learning, and the optimization problem is expressed as:

$$\min_{D, X_{l+1}} \; \|\Phi(Y_{l+1}) - D X_{l+1}\|_F^2 \quad \text{s.t.} \quad \|x_i^{l+1}\|_0 \le T, \; i = 1, \ldots, N \tag{9}$$

wherein $\Phi(\cdot)$ is a mapping function $R^m \to R^n$ ($n > m$), with $n$ and $m$ the dimensions of the feature space and the original space, respectively, and $\Phi(Y_{l+1}) = [\Phi(y_1^{l+1}), \ldots, \Phi(y_N^{l+1})] \in R^{n \times N}$ is the original data mapped into the feature space. Since the feature space dimension $n$ is often unknown and may be infinite, conventional K-SVD cannot solve (9) directly, so (9) is converted into:

$$\min_{A, X_{l+1}} \; \|\Phi(Y_{l+1}) - \Phi(Y_{l+1}) A X_{l+1}\|_F^2 \quad \text{s.t.} \quad \|x_i^{l+1}\|_0 \le T, \; i = 1, \ldots, N \tag{10}$$

Here, the original dictionary is replaced by the base dictionary $\Phi(Y_{l+1})$ and the pseudo dictionary $A$. Based on the relationship between the matrix norm and the trace, (10) can be converted into:

$$\min_{A, X_{l+1}} \; \operatorname{tr}\!\left((I - A X_{l+1})^T K(Y_{l+1}, Y_{l+1}) (I - A X_{l+1})\right) \quad \text{s.t.} \quad \|x_i^{l+1}\|_0 \le T, \; i = 1, \ldots, N \tag{11}$$

Here, $\operatorname{tr}(\cdot)$ is the trace of a matrix, $I$ is the identity matrix, and $K(Y_{l+1}, Y_{l+1}) \in R^{N \times N}$ is a kernel matrix whose elements are $k(y_i, y_j) = \Phi(y_i)^T \Phi(y_j)$.

Similar to the solution process for dictionary learning, problem (11) is solved by alternately updating the sparse codes and the dictionary. First, the pseudo dictionary $A$ is fixed and the sparse coding matrix $X_{l+1}$ is updated column by column. According to (11), one obtains:

$$\min_{x_i^{l+1}} \; \|\Phi(y_i^{l+1}) - \Phi(Y_{l+1}) A\, x_i^{l+1}\|_2^2 \quad \text{s.t.} \quad \|x_i^{l+1}\|_0 \le T \tag{12}$$

Here, problem (12) can be solved directly by the kernel OMP (KOMP) algorithm.

After the sparse matrix $X_{l+1}$ has been obtained, $X_{l+1}$ is fixed in turn and the pseudo dictionary $A$ is updated column by column:

$$\min_{a_k} \; \left\|\Phi(Y_{l+1})\Big(I - \sum_{j \ne k} a_j x_T^j\Big) - \Phi(Y_{l+1})\, a_k x_T^k\right\|_F^2 \tag{13}$$

wherein $a_k$ is the $k$-th column atom of the pseudo dictionary $A$, and $x_T^k$ is the $k$-th row of $X_{l+1}$. The optimization problem (13) can be solved by taking the derivative with respect to $a_k$, or by the kernel K-SVD (KK-SVD) algorithm.
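The step from (10) to (11) relies on the identity ‖M‖²_F = tr(MᵀM). Written out under the definitions above, as a worked sketch in which I denotes the N×N identity matrix (a symbol introduced here for illustration):

```latex
\begin{aligned}
\|\Phi(Y_{l+1}) - \Phi(Y_{l+1}) A X_{l+1}\|_F^2
&= \|\Phi(Y_{l+1})\,(I - A X_{l+1})\|_F^2 \\
&= \operatorname{tr}\!\big((I - A X_{l+1})^T\, \Phi(Y_{l+1})^T \Phi(Y_{l+1})\, (I - A X_{l+1})\big) \\
&= \operatorname{tr}\!\big((I - A X_{l+1})^T\, K(Y_{l+1}, Y_{l+1})\, (I - A X_{l+1})\big).
\end{aligned}
```

The last line uses only inner products Φ(y_i)ᵀΦ(y_j) = k(y_i, y_j), which is why the mapping Φ never has to be evaluated explicitly.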

4. Calculating an error control limit

After the monitoring model on each sub-block has been established and trained, the reconstruction error of the training data on each sub-block is obtained. For any training sample $y_i = [y_{i,1}^T, y_{i,2}^T, \ldots, y_{i,l}^T, y_{i,l+1}^T]^T \in R^m$, $i = 1, 2, \ldots, N$, the reconstruction error under the $n$-th ($n = 1, 2, \ldots, l$) linear sub-block is expressed as:

$$R_i^n = \|y_i^n - D_n x_i^n\|_2^2 \tag{14}$$

Here, $x_i^n$ is the sparse code of the sample sub-block $y_i^n$ under the dictionary $D_n$, solved by the OMP algorithm.

The reconstruction error of sample $y_i$ under the nonlinear sub-block is:

$$R_i^{l+1} = \|\Phi(y_i^{l+1}) - \Phi(Y_{l+1}) A\, x_i^{l+1}\|_2^2 \tag{15}$$

Here, $x_i^{l+1}$ is the sparse code of the nonlinear sample sub-block $y_i^{l+1}$, solved by the KOMP algorithm.

Finally, the error control limit of the samples under each sub-block is calculated based on the kernel density estimation (KDE) method. KDE is a non-parametric method for estimating an unknown density function in probability theory, and is expressed as:

$$f(R) = \frac{1}{N h} \sum_{i=1}^{N} K\!\left(\frac{R - R_i^n}{h}\right)$$

Here, $r = [R_1^n, R_2^n, \ldots, R_N^n]$, $n = 1, 2, \ldots, l, l+1$, are the reconstruction errors of the training data on each sub-block, $h$ is the bandwidth, $N$ is the number of training samples, $f(R)$ is the probability density function, and $K(x)$ is a non-negative kernel function satisfying:

$$\int_{-\infty}^{+\infty} K(x)\, dx = 1, \qquad K(x) \ge 0$$

There are many kinds of kernel functions; all kernel functions in the present invention are the most widely used Gaussian kernel function. From the KDE function, the error control limit $TH_n$ of each sub-block at a given confidence level is obtained, which provides the decision basis for online testing.
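The feature-space reconstruction error can be evaluated with kernel values only, since ‖Φ(y) − Φ(Y)Ax‖² = k(y, y) − 2·K(y, Y)Ax + (Ax)ᵀK(Y, Y)(Ax). The sketch below assumes a Gaussian kernel with a hypothetical bandwidth `sigma`; the function names and the demo pseudo dictionary (an identity matrix) are illustrative assumptions, not values from the text.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Kernel matrix between the columns of A (m, Na) and B (m, Nb)."""
    sq = (A**2).sum(0)[:, None] + (B**2).sum(0)[None, :] - 2.0 * A.T @ B
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def kernel_recon_error(y, Y, A, x, sigma=1.0):
    """Squared error between Phi(y) and Phi(Y) A x, computed via the kernel trick."""
    kyy = gaussian_kernel(y[:, None], y[:, None], sigma)[0, 0]
    kyY = gaussian_kernel(y[:, None], Y, sigma)[0]   # shape (N,)
    KYY = gaussian_kernel(Y, Y, sigma)               # shape (N, N)
    c = A @ x                                        # combination of training samples
    return float(kyy - 2.0 * kyY @ c + c @ KYY @ c)

# Demo: with A = I and x selecting training sample 3, the reconstruction of that
# same sample is exact, while a different sample leaves a positive error.
rng = np.random.default_rng(0)
Y_demo = rng.standard_normal((4, 30))
A_demo = np.eye(30)
x_demo = np.zeros(30)
x_demo[3] = 1.0
err_same = kernel_recon_error(Y_demo[:, 3], Y_demo, A_demo, x_demo)
err_other = kernel_recon_error(Y_demo[:, 7], Y_demo, A_demo, x_demo)
```

In practice `A` and `x` would come from the trained pseudo dictionary and a KOMP sparse code; the identity matrix is used here only so that the expected result is easy to check by hand.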
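A control limit can be read off the KDE as the point where the estimated cumulative distribution reaches 1 − α. The sketch below is one possible realization under stated assumptions: the function name `kde_control_limit`, Silverman's rule of thumb for the bandwidth `h`, numerical integration on a uniform grid, and the exponentially distributed demo errors are all choices made here, not details given in the text.

```python
import numpy as np

def kde_control_limit(errors, alpha=0.01, h=None, grid_size=2000):
    """Control limit TH such that the Gaussian-KDE CDF reaches 1 - alpha."""
    r = np.asarray(errors, dtype=float)
    N = r.size
    if h is None:
        # Silverman's rule of thumb (an assumption; the text does not fix h).
        h = 1.06 * r.std(ddof=1) * N ** (-0.2)
    grid = np.linspace(r.min() - 4 * h, r.max() + 4 * h, grid_size)
    z = (grid[:, None] - r[None, :]) / h
    # f(R) = 1/(N h) * sum_i K((R - R_i)/h) with a Gaussian kernel K.
    pdf = np.exp(-0.5 * z**2).sum(axis=1) / (N * h * np.sqrt(2.0 * np.pi))
    cdf = np.cumsum(pdf) * (grid[1] - grid[0])
    cdf /= cdf[-1]                      # remove small numerical integration error
    return float(grid[np.searchsorted(cdf, 1.0 - alpha)])

# Demo with synthetic, hypothetical training reconstruction errors.
rng = np.random.default_rng(0)
R_train = rng.exponential(scale=1.0, size=1000)
TH = kde_control_limit(R_train, alpha=0.01)
coverage = float((R_train <= TH).mean())
```

With α = 0.01, roughly 99% of the training errors fall below the returned limit, which is the behaviour expected of a control limit at that confidence level.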

5. On-line real-time monitoring of industrial processes

When a new sample y to be measured _new To be temporary, the same decomposition is performed on the offline variable partition first. Assume thatThere are l linear and 1 nonlinear sub-blocks altogether, and the test sample is decomposed into y _new ＝[y _new,1 ^T ,y _new,2 ^T ,...,y _new,l ^T ,y _new,l+1 ^T ]∈R ^m . Then, according to the local dictionary D learned in the off-line stage _n And A, the test sample y can be calculated by the formulas (14) and (15) _new Reconstruction error DER under each sub-block _n ，n＝1,2,...,l,l+1。

However, establishing block-level monitoring statistics alone is insufficient to monitor the entire process, so bayesian reasoning is introduced, the monitoring results on the sub-blocks are fused, and global monitoring statistics are established. Firstly, the reconstruction errors of the samples on each sub-block are respectively converted into normal likelihood probability and abnormal likelihood probability:

then, according to bayesian reasoning, the abnormal posterior probability of the samples in each sub-block can be obtained according to the following formula:

wherein F and N represent abnormality and normality, respectively, P _n(F) and P_n (N) represents the normal and abnormal prior probabilities of the sample, respectively, and they are determined by the confidence level alpha. I.e. there is P _n (N)＝1-α，P _n (F)＝α。

After the normal likelihood probability, the abnormal likelihood probability and the abnormal posterior probability of the sample under all the sub-blocks are obtained, the monitoring results of the sample under each sub-block are fused according to the following steps to establish a global index GFI _new ：

Here, the likelihood of abnormality

As weights, sub-blocks n (n=1, 2), abnormality posterior probability on l, l+1)>

Thereby establishing global index GFI _new . The state of the sample can be based on the global indicator GFI _new And confidence α assessment. If GFI _new < alpha, can be considered as the sample y to be measured _new A normal sample, and otherwise, a fault sample.
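The Bayesian fusion of the block-level results can be sketched as follows. The exponential form of the two likelihoods is an assumption: it is the form commonly used in Bayesian-fusion process monitoring and is consistent with the decision rule above, because a block whose error equals its control limit receives a posterior of exactly α. The function name `global_fault_index` and the demo numbers are hypothetical.

```python
import numpy as np

def global_fault_index(der, th, alpha=0.01):
    """Fuse block-level reconstruction errors into one global index.

    der, th: reconstruction errors and control limits of the l+1 sub-blocks.
    Assumes exponential likelihoods exp(-DER/TH) and exp(-TH/DER).
    """
    der = np.maximum(np.asarray(der, dtype=float), 1e-12)  # avoid division by zero
    th = np.asarray(th, dtype=float)
    p_y_given_N = np.exp(-der / th)      # normal likelihood
    p_y_given_F = np.exp(-th / der)      # abnormal likelihood
    p_F, p_N = alpha, 1.0 - alpha        # priors set by the confidence level
    post_F = p_y_given_F * p_F / (p_y_given_N * p_N + p_y_given_F * p_F)
    # Posterior probabilities weighted by the abnormal likelihoods.
    return float((p_y_given_F * post_F).sum() / p_y_given_F.sum())

alpha = 0.01
gfi_at_limit = global_fault_index([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], alpha)
gfi_normal = global_fault_index([0.2, 0.3, 0.1], [1.0, 1.0, 1.0], alpha)
gfi_fault = global_fault_index([0.1, 0.1, 5.0], [1.0, 1.0, 1.0], alpha)
is_fault = gfi_fault >= alpha
```

In the demo, the sample with all errors well below their limits yields an index below α, while a large error on a single block is enough to push the index above α, since that block dominates the weights.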

6. Test data and experimental verification

In order to verify the effectiveness of the proposed method, a numerical simulation system, adapted from a typical three-variable nonlinear system, is designed to simulate real industrial process data. According to the relationships among the process variables, the system can be decomposed into four subsets $X = [X^1, X^2, X^3, X^4]$. In each of the first three subsets $X^m$ ($m = 1, 2, 3$), the variables within the same subset are linearly related, while in the fourth subset $X^4$ the variables have only nonlinear relationships. The simulation system is represented as follows:

Here, the factors $u_1$, $u_2$ and $u_3$ are independent variables obeying the uniform distribution $U(0.01, 2)$; $e_1^1 \sim e_4^4$ are a total of 13 independent noise variables obeying the Gaussian distribution $N(0, 0.01)$; and the variables in each subset are denoted $x_n^m$ ($n = 1, 2, 3$; $m = 1, 2, 3, 4$), where $n$ and $m$ denote the variable dimension and the variable subset category, respectively.

To fully demonstrate the effectiveness of the method, two test scenarios are designed:

Scenario 1: 500 test samples in total; starting from sample 201, a constant sensor bias fault of 0.6 is introduced into variable $x_2^1$ of the linear variable subset $X^1$.

Scenario 2: 500 test samples in total; starting from sample 201, a constant sensor bias fault of 0.8 is introduced into variable $x_1^4$ of the nonlinear variable subset $X^4$.
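The two fault scenarios can be sketched as a constant offset added to one column of the test data. This is a minimal illustration: the helper name `inject_constant_bias` is hypothetical, the random matrix is only a placeholder for data from the simulation system (whose equations are not reproduced in this text), and the column indices assume that the 13 variables are ordered subset by subset.

```python
import numpy as np

def inject_constant_bias(samples, column, bias, start_sample=201):
    """Add a constant sensor bias to one variable from `start_sample` (1-based) on.

    Returns the faulty copy of the data and a 0/1 fault label per sample.
    """
    faulty = np.array(samples, dtype=float, copy=True)
    faulty[start_sample - 1:, column] += bias
    labels = np.zeros(len(faulty), dtype=int)
    labels[start_sample - 1:] = 1
    return faulty, labels

rng = np.random.default_rng(0)
# Placeholder test set: 500 samples of 13 variables (NOT the patent's system).
test_data = rng.normal(size=(500, 13))

# Scenario 1: bias of 0.6 on x_2^1 (assumed to be column 1).
scenario_1, labels_1 = inject_constant_bias(test_data, column=1, bias=0.6)
# Scenario 2: bias of 0.8 on x_1^4 (assumed to be column 9).
scenario_2, labels_2 = inject_constant_bias(test_data, column=9, bias=0.8)
```

The returned labels mark samples 201 to 500 as faulty, which is the ground truth needed later to score the monitoring results.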

In order to quantify the accuracy of the variable division provided by the invention, a variable division evaluation index p is constructed as follows:

where m is the number of linear variable subsets obtained by the division and n is the number of variables contained in all the linear variable subsets; the smaller the number of subsets m and the larger the number of linear variables n, the closer the corresponding term is to 1. $c_i$ (i = 1, 2, ..., m) is the contribution rate of the first principal component of each linear variable subset, and r is the average of these contribution rates over all the linear subsets. $c_i$ reflects the degree of linear correlation among the variables in each linear subset: when the linear variable subsets are properly divided, each $c_i$ approaches 1 and, correspondingly, r approaches 1. Thus, in general, when the variable division is appropriate, the evaluation index p approaches 1.
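The quantities $c_i$ and r can be illustrated with a short sketch. It assumes that the first principal contribution is the share of total variance explained by the first principal component of the standardized subset; the formula for p itself is not reproduced in this text and is therefore not computed. The function names are hypothetical.

```python
import numpy as np

def first_pc_contribution(block):
    """Share of variance explained by the first principal component.

    block: array of shape (samples, variables) with at least two variables.
    The correlation matrix is used, i.e. the variables are standardized.
    """
    corr = np.corrcoef(np.asarray(block, dtype=float), rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)  # ascending order
    return float(eigvals[-1] / eigvals.sum())

def mean_contribution(blocks):
    """Average first-component contribution r over all linear subsets."""
    return float(np.mean([first_pc_contribution(b) for b in blocks]))

rng = np.random.default_rng(0)
u = rng.uniform(0.01, 2.0, size=1000)
# Three variables driven by the same factor: a well-formed linear subset.
linear_block = np.column_stack([
    u + rng.normal(0, 0.01, 1000),
    2 * u + rng.normal(0, 0.01, 1000),
    -u + rng.normal(0, 0.01, 1000),
])
# Three unrelated variables: a badly formed "linear" subset.
unrelated_block = rng.normal(size=(1000, 3))
print(first_pc_contribution(linear_block), first_pc_contribution(unrelated_block))
```

A subset of truly linearly related variables gives a contribution close to 1, while unrelated variables give a value near 1 divided by the number of variables, which is why r rewards correct divisions.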

Meanwhile, to quantify the monitoring performance of the method, two widely used monitoring indexes are adopted, the false alarm rate (FAR) and the fault detection rate (FDR), defined as follows:

$$\mathrm{FAR} = \frac{\#\{J > J_{th} \mid f = 0\}}{\#\{f = 0\}}, \qquad \mathrm{FDR} = \frac{\#\{J > J_{th} \mid f \neq 0\}}{\#\{f \neq 0\}}$$

Here, $\#\{J > J_{th} \mid f = 0\}$ and $\#\{J > J_{th} \mid f \neq 0\}$ denote the number of false alarms among the normal samples and the number of detected fault samples, respectively, and $\#\{f = 0\}$ and $\#\{f \neq 0\}$ denote the total numbers of normal samples and fault samples, respectively.
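The two indexes follow directly from the counts described above. The sketch below uses a hypothetical function name `far_fdr` and assumes that the data contain at least one normal and one fault sample.

```python
import numpy as np

def far_fdr(J, J_th, f):
    """False alarm rate and fault detection rate of a monitoring statistic.

    J:    monitoring statistic of each sample
    J_th: control limit (an alarm is raised when J > J_th)
    f:    fault label of each sample (0 = normal, non-zero = fault)
    """
    alarm = np.asarray(J, dtype=float) > J_th
    normal = np.asarray(f) == 0
    faulty = ~normal
    far = alarm[normal].sum() / normal.sum()  # alarms among normal samples
    fdr = alarm[faulty].sum() / faulty.sum()  # alarms among fault samples
    return float(far), float(fdr)

far, fdr = far_fdr(J=[0.1, 0.2, 1.5, 2.0, 0.3, 3.0], J_th=1.0, f=[0, 0, 0, 1, 1, 1])
```

In this toy example one of the three normal samples raises an alarm and two of the three fault samples are detected, so FAR = 1/3 and FDR = 2/3.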

First, to illustrate the effectiveness of the linear maximization analysis method proposed by the invention, the VIFs of all variables are calculated from equation (1): VIF = [3349, 11773, 12132, 2847, 9879, 10492, 3393, 10358, 12386, 5, 90, 5, 82]^T.

Variable $x_3^3$ has the maximum VIF value. Maximizing the linear correlation between $x_3^3$ and the remaining variables $x_{other}$ according to equation (2) yields the following linear combination under the optimal coefficients:

Here, the coefficient weights of $x_1^3$ and $x_2^3$, which belong to the same linear variable set as $x_3^3$, are significantly larger than those of the other variables, which verifies the above statement.
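The idea of reading linear partners off the coefficient weights can be sketched as follows. Equation (2) is not reproduced in this text, so the sketch assumes that it maximizes the Pearson correlation between the target variable and a linear combination of the remaining variables; for a single target this direction is given by the least-squares coefficients. Normalizing the absolute coefficients to sum to 1 is an assumption of this sketch, and the function names are hypothetical.

```python
import numpy as np

def max_correlation_weights(target, others):
    """Coefficients of the combination of `others` most correlated with `target`.

    Both inputs are standardized first (columns must not be constant), so the
    coefficients are comparable across variables.
    """
    t = np.asarray(target, dtype=float)
    X = np.asarray(others, dtype=float)
    t = (t - t.mean()) / t.std()
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # Least-squares fit: its fitted values have maximal correlation with t.
    coef, *_ = np.linalg.lstsq(X, t, rcond=None)
    weights = np.abs(coef) / np.abs(coef).sum()  # ASSUMED normalization
    return coef, weights

def linear_partners(target, others, beta=0.1):
    """Indices of the variables whose weight exceeds the threshold beta."""
    _, weights = max_correlation_weights(target, others)
    return np.flatnonzero(weights > beta)

rng = np.random.default_rng(0)
u = rng.uniform(0.01, 2.0, size=2000)
target = 3 * u + rng.normal(0, 0.1, 2000)
others = np.column_stack([
    u + rng.normal(0, 0.1, 2000),      # shares the factor u
    2 * u + rng.normal(0, 0.1, 2000),  # shares the factor u
    rng.normal(size=(2000, 4)),        # unrelated variables
])
print(linear_partners(target, others, beta=0.1))
```

In this toy data only the two variables that share the factor u with the target receive weights above β, mirroring the observation that $x_1^3$ and $x_2^3$ dominate the combination found for $x_3^3$.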

Then, the VIF threshold θ is set to 10, and the variable set is divided iteratively by the linear maximization method with the coefficient weight threshold β ranging from 0.05 to 0.5. The evaluation index p of each variable division result is shown in Fig. 2.
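The thresholding on θ can be illustrated with a short sketch. Equation (1) is not reproduced in this section, so the code assumes the textbook definition $\mathrm{VIF}_j = 1/(1 - R_j^2)$, which equals the j-th diagonal element of the inverse correlation matrix. The function names are hypothetical.

```python
import numpy as np

def variance_inflation_factors(data):
    """Textbook VIF of every column of `data` (samples x variables)."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    # The diagonal of the inverse correlation matrix equals 1 / (1 - R_j^2).
    return np.diag(np.linalg.inv(corr))

def collinear_variables(data, theta=10.0):
    """Indices of the variables whose VIF exceeds the threshold theta."""
    return np.flatnonzero(variance_inflation_factors(data) > theta)

rng = np.random.default_rng(0)
u = rng.uniform(0.01, 2.0, size=1000)
data = np.column_stack([
    u,
    2 * u + rng.normal(0, 0.01, 1000),  # nearly collinear with column 0
    rng.normal(size=(1000, 2)),         # unrelated variables
])
print(collinear_variables(data, theta=10.0))
```

Only the two nearly collinear columns exceed θ = 10 here; the unrelated columns have a VIF close to 1 and stay out of the collinear set.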

When β = 0.05, 0.1, and 0.15, the evaluation index p takes the same, maximum value, and the variable division result coincides with the actual one. Moreover, when β > 0.2, the result essentially no longer changes, which demonstrates the effectiveness and robustness of the variable division method proposed by the invention.

After the weight threshold β is set to 0.1 and the linear variable sets are divided by the iterative linear maximization method, monitoring experiments are carried out under the two designed scenarios, and the parallel dictionary learning method proposed by the invention is compared with the traditional dictionary learning and kernel dictionary learning methods. The experimental results are shown in Fig. 3.

In Fig. 3, (a)-(c) show the monitoring results of traditional dictionary learning, kernel dictionary learning, and the proposed parallel dictionary learning method in scenario 1, and (d)-(f) show the monitoring results of the three methods in scenario 2. The experimental results show that in scenario 1, where the fault occurs in a linear subset, the traditional dictionary learning method is more sensitive to the fault than the kernel dictionary learning method. This is because kernel dictionary learning is solved through a nonlinear mapping, and linear feature changes in the original data are easily overlooked in the high-dimensional feature space. In scenario 2, where the fault occurs in the nonlinear subset, the traditional dictionary learning method cannot monitor the nonlinear fault effectively because it assumes linear reconstruction, whereas the kernel dictionary learning method detects nonlinear feature changes in the original data promptly by seeking a linear solution in the high-dimensional feature space. As shown in Fig. 3(c) and (f), the proposed parallel dictionary learning method explores the correlations among the process variables to divide the variable set and builds linear and nonlinear monitoring models on the subsets in parallel. It thereby combines the advantages of dictionary learning and kernel dictionary learning and improves the fault detection rate in the different scenarios.

The above experiments verify that, for industrial processes with complex correlations among the process variables, the proposed method monitors better than the traditional methods.

The above embodiments are preferred embodiments of the present application, and various changes or modifications may be made on the basis thereof by those skilled in the art, and such changes or modifications should be included within the scope of the present application without departing from the general inventive concept.

References:

[1] Aharon M, Elad M, Bruckstein A. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation[J]. IEEE Transactions on Signal Processing, 2006, 54(11): 4311-4322.

[2] Yang C, Zhou L, Huang K, et al. Multimode process monitoring based on robust dictionary learning with application to aluminium electrolysis process[J]. Neurocomputing, 2019, 332: 305-319.

[3] Huang K, Wu Y, Wen H, et al. Distributed dictionary learning for high-dimensional process monitoring[J]. Control Engineering Practice, 2020, 98: 104386.

[4] Pati Y C, Rezaiifar R, Krishnaprasad P S. Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition[C]. Proceedings of 27th Asilomar Conference on Signals, Systems and Computers. IEEE, 1993: 40-44.

[5] Huang K, Wen H, Ji H, et al. Nonlinear process monitoring using kernel dictionary learning with application to aluminum electrolysis process[J]. Control Engineering Practice, 2019, 89: 94-102.

[6] Van Nguyen H, Patel V M, Nasrabadi N M, et al. Kernel dictionary learning[C]. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2012: 2021-2024.

Claims

1. A complex industrial process monitoring method based on parallel dictionary learning, comprising:

2. The complex industrial process monitoring method according to claim 1, wherein the variance inflation factor (VIF) of each monitored variable is calculated as follows:

3. The complex industrial process monitoring method according to claim 1, wherein the collinear variable set is extracted from the monitored variable set as follows: the monitored variables whose variance inflation factors are greater than a given variance inflation factor threshold are combined into the collinear variable set.

4. The complex industrial process monitoring method according to claim 1, wherein the linear variable subsets are identified and divided from the collinear variable set based on the linear maximization method, in particular:

$$\Sigma_{22}^{-1} \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}\, a = \lambda^{2} a \qquad (5)$$

5. The complex industrial process monitoring method of claim 1, wherein the linear monitoring model is built based on dictionary learning for each subset of linear variables, in particular:

in the formula, D denotes the dictionary to be learned; the dictionary obtained by training on the n-th original data sub-block $Y_n$ consists of K dictionary atoms $d_1^n, \ldots, d_K^n$; $X_n = [x_1^n, x_2^n, \ldots, x_N^n] \in R^{K \times N}$ is the corresponding sparse coding matrix; T is the sparsity constraint, which limits the number of non-zero elements in each sparse coding vector $x_i^n$ to less than T; n = 1, ..., l, where l is the number of linear variable subsets obtained by the division;

let the original data blocks of the linear variable subset be expressed as

where Φ (·) is the mapping function.

6. The complex industrial process monitoring method according to claim 5, wherein the sparse coding matrix and dictionary in the optimization problem (6) are solved with alternating optimization;

when solving the optimization problem (7), converting the optimization problem (7) into:

7. The complex industrial process monitoring method of claim 5, wherein the calculation methods of the normal likelihood probability, the abnormal likelihood probability and the abnormal posterior probability are:

wherein the data considered are the part of the real-time monitoring sample data $y_{new}$ that belongs to the n-th (n = 1, 2, ..., l, l+1) variable subset; the two likelihood terms are, respectively, the normal likelihood probability and the abnormal likelihood probability of this partial data, where N and F denote normal and abnormal, respectively; the error term is the reconstruction error of this partial data; $TH_n$ is the control limit of the n-th variable subset; the posterior term is the abnormal posterior probability of this partial data; $P_n(F)$ and $P_n(N)$ denote the prior probabilities of abnormal and normal samples, respectively, each of which is determined by the confidence level $\alpha$, i.e., $P_n(N) = 1 - \alpha$ and $P_n(F) = \alpha$;

wherein $\mathrm{GFI}_{new}$ is the global index of the real-time monitoring sample data $y_{new}$; here, the abnormal likelihood probabilities on the variable subsets act as weights, and the abnormal posterior probabilities on all variable subsets n (n = 1, 2, ..., l, l+1) are fused to establish the global index $\mathrm{GFI}_{new}$.

8. A complex industrial process monitoring system based on parallel dictionary learning, comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to implement the method of any one of claims 1 to 7.