CN109407652B - Multivariable industrial process fault detection method based on main and auxiliary PCA models - Google Patents

Info

Publication number
CN109407652B
Authority
CN
China
Prior art keywords
data set
formula
spe
variable
statistics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811503665.9A
Other languages
Chinese (zh)
Other versions
CN109407652A (en
Inventor
邓晓刚
邓佳伟
曹玉苹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Petroleum East China
Original Assignee
China University of Petroleum East China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Petroleum East China filed Critical China University of Petroleum East China
Priority to CN201811503665.9A priority Critical patent/CN109407652B/en
Publication of CN109407652A publication Critical patent/CN109407652A/en
Application granted granted Critical
Publication of CN109407652B publication Critical patent/CN109407652B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00 Testing or monitoring of control systems or parts thereof
    • G05B23/02 Electric testing or monitoring
    • G05B23/0205 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0218 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults
    • G05B23/0243 Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults model based detection method, e.g. first-principles knowledge model
    • G PHYSICS
    • G05 CONTROLLING; REGULATING
    • G05B CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00 Program-control systems
    • G05B2219/20 Pc systems
    • G05B2219/24 Pc safety
    • G05B2219/24065 Real time diagnostics

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention relates to a multivariable industrial process fault detection method based on main and auxiliary PCA models, comprising the following steps: standardizing the normal data set and the prior fault data sets; building a PCA model on the normal data set as the main monitoring model; calculating the relative mutual information between the prior fault data and the normal data; grouping the variables by means of the generalized Dice coefficient; building PCA models on the grouped data sets as auxiliary monitoring models; standardizing the test data set and projecting it onto the main and auxiliary monitoring models; calculating the statistics of the test data set under both models; fusing the information of the variable groups by Bayesian inference to obtain the total monitoring statistics; and judging whether the test data set is faulty according to whether the total monitoring statistics exceed the control limit. The invention reduces the omission and waste of important prior fault information, and improves the fault detection rate by mining local variable information within the variable groups.

Description

Multivariable industrial process fault detection method based on main and auxiliary PCA models
Technical Field
The invention belongs to the technical field of industrial process fault detection, and relates to a multivariable industrial process fault detection method based on primary-auxiliary principal component analysis (PA-PCA).
Background
As modern industrial systems grow more complex, process safety and product quality receive increasing attention, and fault diagnosis plays an increasingly important role in industrial production. With the development of storage technology, large volumes of production process data are collected and recorded, so data-driven fault diagnosis methods are widely used. Classical fault detection methods include principal component analysis (PCA), independent component analysis (ICA), and Fisher discriminant analysis (FDA). PCA has been a research focus in the control field in recent years and is widely used, but it still has problems that merit further study. The conventional PCA method uses only normal data for statistical modeling and ignores known prior fault information, so important information is omitted and wasted and fault detection performance degrades. How to exploit known prior fault data to improve the fault detection performance of PCA has therefore become a challenging subject.
Disclosure of Invention
To address the low fault detection performance that results from the inability of the conventional PCA method to mine fault-related local information, the invention provides a multivariable industrial process fault detection method based on main and auxiliary PCA models. The method uses prior fault information and mines local variable information in depth, thereby improving the fault detection rate.
In order to achieve the purpose, the invention provides a multivariable industrial process fault detection method based on a main and auxiliary PCA model, which comprises the following steps:
(I) Collect a normal data set $X$ and $C$ classes of known fault data sets $F_c$, $c = 1, 2, \ldots, C$, from the historical database as training data sets, and standardize $X$ and $F_c$ using the mean $\mu$ and standard deviation $\sigma$ of the normal data set $X$ to obtain the standardized training data sets $\bar{X}$ and $\bar{F}_c$;
(II) Build a PCA model on the data set $\bar{X}$ as the main monitoring model;
(III) Calculate the relative mutual information matrix $\Delta R_c$, $c = 1, 2, \ldots, C$, of each fault data set with respect to the normal data set;
(IV) Group the process variables according to the relative mutual information matrix $\Delta R_c$ using the generalized Dice coefficient to obtain the grouped data sets $\bar{X}_{c,b}$, $b = 1, 2, \ldots, B_c$, where $B_c$ is the number of variable groups;
(V) Build a PCA model on each grouped data set as an auxiliary monitoring model;
(VI) Collect the test data set $x_{new}$ and standardize it using the mean $\mu$ and standard deviation $\sigma$ of the normal data set $X$ to obtain the standardized test data set $\bar{x}_{new}$;
(VII) Project the data set $\bar{x}_{new}$ onto the main monitoring model and the auxiliary monitoring models respectively, and calculate the statistics $T^2$ and $SPE$ of $\bar{x}_{new}$ projected onto the main monitoring model and the statistics $T^2_{c,b}$ and $SPE_{c,b}$ of $\bar{x}_{new}$ projected onto the auxiliary monitoring models; the control limit $T^2_{lim}$ of statistic $T^2$, the control limit $SPE_{lim}$ of statistic $SPE$, the control limit $[T^2_{c,b}]_{lim}$ of statistic $T^2_{c,b}$, and the control limit $[SPE_{c,b}]_{lim}$ of statistic $SPE_{c,b}$ are all calculated by kernel density estimation;
(VIII) Integrate all monitoring results to obtain the total monitoring statistics $BIC_{T^2}$ and $BIC_{SPE}$, and determine whether the data set $\bar{x}_{new}$ is faulty according to whether $BIC_{T^2}$ or $BIC_{SPE}$ exceeds the control limit.
Further, in step (I), the training data sets $X$ and $F_c$ are standardized by formula (1) using the mean $\mu$ and standard deviation $\sigma$ of the normal data set; formula (1) is expressed as:
$$\bar{X} = \frac{X - \mu}{\sigma}, \qquad \bar{F}_c = \frac{F_c - \mu}{\sigma} \tag{1}$$
After standardization by formula (1), the standardized training data sets $\bar{X}$ and $\bar{F}_c$ are obtained.
Further, in step (II), PCA decomposition is performed on the training data set $\bar{X}$, and the loading matrix $P$ of the main monitoring model is calculated by formula (2), which is expressed as:
$$\bar{X} = T P^{T} + E \tag{2}$$
where $T$ is the score matrix of the data set $\bar{X}$ and $E$ is the model residual matrix of the data set $\bar{X}$.
Further, in step (III), the relative mutual information matrix $\Delta R_c$ is calculated as follows:
the mutual information matrix $R$ of the data set $\bar{X}$ is calculated by formula (3), and the mutual information matrix $R_c$ of the data set $\bar{F}_c$ is calculated by formula (4); formulas (3) and (4) are expressed as:
Figure GDA00022766106800000221
Figure GDA0002276610680000031
where $m$ is the number of variables, $R_{ij}$ is the mutual information between the $i$-th and $j$-th columns of the data set $\bar{X}$, and $R_{c,ij}$ is the mutual information between the $i$-th and $j$-th columns of the data set $\bar{F}_c$;
the relative mutual information matrix $\Delta R_c$ is then expressed as:
Figure GDA0002276610680000034
Further, in step (IV), the specific steps of variable grouping are as follows:
(1) The relative mutual information vector is defined as:
$$r_i = [\Delta R_{c,i1}, \Delta R_{c,i2}, \ldots, \Delta R_{c,im}]^{T} \tag{6}$$
The generalized Dice coefficient is used to measure how similar two variables are in their relative mutual information with all variables, and is defined as:
Figure GDA0002276610680000035
where $0 \le S_{i,j} \le 1$;
the variable that maximizes $\|r_i\|$ is selected as the first variable group, and the number of variable groups is initialized to $B_c = 1$;
(2) Select the next vector $r_j$ in variable order, where $j \ne i$ and $j \le m$, and calculate by formula (8) the mean similarity between $r_j$ and the vectors in each existing variable group; formula (8) is expressed as:
Figure GDA0002276610680000036
where $b$ denotes the $b$-th variable group and $n_b$ denotes the number of variables in the $b$-th variable group;
(3) Determine the maximum of the mean similarities
Figure GDA0002276610680000037
and judge whether it exceeds the threshold $\gamma$; if it does, the variable $x_j$ corresponding to the vector is assigned to variable group $b$; otherwise, the variable $x_j$ forms a new variable group, i.e. $B_c = B_c + 1$;
(4) Repeat steps (2) and (3) until all variables are grouped, i.e.
Figure GDA0002276610680000038
Further, in step (V), PCA decomposition is performed on each grouped data set $\bar{X}_{c,b}$, and the loading matrix $P_{c,b}$ of the auxiliary monitoring model is calculated by formula (9), which is expressed as:
$$\bar{X}_{c,b} = T_{c,b} P_{c,b}^{T} + E_{c,b} \tag{9}$$
where $T_{c,b}$ is the score matrix of the data set $\bar{X}_{c,b}$ and $E_{c,b}$ is the model residual matrix of the data set $\bar{X}_{c,b}$.
Further, in step (VI), the test data set $x_{new}$ is standardized by formula (10) using the mean $\mu$ and standard deviation $\sigma$ of the normal data set $X$; formula (10) is expressed as:
$$\bar{x}_{new} = \frac{x_{new} - \mu}{\sigma} \tag{10}$$
After standardization by formula (10), the standardized test data set $\bar{x}_{new}$ is obtained.
Further, in step (VII), the statistics $T^2$ and $SPE$ of the data set $\bar{x}_{new}$ projected onto the main monitoring model are calculated by formula (11) and formula (12), which are expressed as:
Figure GDA0002276610680000049
Figure GDA00022766106800000410
where $\Sigma$ denotes the diagonal matrix formed by the eigenvalues of the main monitoring model;
the statistics $T^2_{c,b}$ and $SPE_{c,b}$ of the data set $\bar{x}_{new}$ projected onto the auxiliary monitoring models are calculated by formula (13) and formula (14), which are expressed as:
Figure GDA00022766106800000413
Figure GDA00022766106800000414
where $\Sigma_{c,b}$ denotes the diagonal matrix formed by the eigenvalues of the auxiliary monitoring model, and $\bar{x}_{new,c,b}$ denotes the $b$-th variable group of $\bar{x}_{new}$ obtained from the class-$c$ fault information.
Further, in step (VIII), Bayesian inference is used to integrate all monitoring results, with the following specific steps:
the probability that the sample $\bar{x}_{new}$ is faulty under the $b$-th statistic is defined as:
Figure GDA00022766106800000418
where $S$ denotes the statistic $T^2$, $SPE$, $T^2_{c,b}$ or $SPE_{c,b}$;
Figure GDA0002276610680000052
denotes the posterior probability of the sample under fault conditions, and
Figure GDA0002276610680000053
denotes the posterior probability under normal conditions; formula (16) and formula (17) are used to obtain
Figure GDA0002276610680000054
Formulas (16) and (17) are expressed as:
Figure GDA0002276610680000055
Figure GDA0002276610680000056
where $S_{lim}$ denotes the control limit of the statistic $T^2$, $SPE$, $T^2_{c,b}$ or $SPE_{c,b}$; if $P(F)$ is set to the confidence level $\alpha$, then $P(N) = 1 - \alpha$, and the total monitoring statistics obtained by fusing all monitoring results are:
Figure GDA0002276610680000058
Figure GDA0002276610680000059
Further, in step (VIII), whether the data set $\bar{x}_{new}$ is fault data is determined according to whether the fused total monitoring statistic $BIC_{T^2}$ or $BIC_{SPE}$ exceeds the control limit; when $BIC_{T^2}$ or $BIC_{SPE}$ is greater than 0.01, a fault is considered to have occurred in the process; otherwise, no fault is considered to have occurred.
Compared with the prior art, the invention has the beneficial effects that:
The multivariable industrial process fault detection method provided by the invention measures the change in the correlation structure between variables caused by a fault by calculating the relative mutual information between prior fault data and normal data, and groups the variables by means of the generalized Dice coefficient. Known prior fault information is thus fully used, waste and omission of useful fault information are avoided as far as possible, and local variable information is extracted through the variable grouping. On this basis, a PCA model is built on the normal data set containing all variables as the main monitoring model, and PCA sub-models are built on the data sets of the different variable groups as auxiliary monitoring models. Bayesian inference is applied to fuse the information of the variable groups into total monitoring statistics, and whether the test data set is faulty is judged according to whether the fused statistics exceed the control limit, which improves the fault detection rate.
Drawings
FIG. 1 is a flow chart of a multivariate industrial process fault detection method based on primary and secondary PCA models of the present invention;
FIG. 2 is a block diagram of a CSTR control system according to an embodiment of the present invention;
FIG. 3a is a mutual information comparison graph of normal test data and standard normal data in a CSTR control system by using the multi-variable industrial process fault detection method based on the primary and secondary PCA models according to the embodiment of the present invention;
FIG. 3b is a diagram showing the mutual information comparison between the fault 1 and the standard normal data in the CSTR control system by using the multivariate industrial process fault detection method based on the primary and secondary PCA models according to the embodiment of the present invention;
FIG. 3c is a diagram showing the mutual information comparison between the fault 4 and the standard normal data in the CSTR control system by using the multivariate industrial process fault detection method based on the primary and secondary PCA models according to the embodiment of the present invention;
FIG. 4a is a schematic diagram of a result of a prior fault information variable grouping for a CSTR control system using a fault 1 by using the multivariate industrial process fault detection method based on primary and secondary PCA models according to the embodiment of the present invention;
FIG. 4b is a schematic diagram showing a result of grouping prior fault information variables of a fault 4 for a CSTR control system by the multivariate industrial process fault detection method based on primary and secondary PCA models according to the embodiment of the present invention;
FIG. 5a is a schematic diagram of the monitoring result of CSTR control system fault 3 by using the existing PCA method according to the embodiment of the present invention;
FIG. 5b is a schematic diagram showing the monitoring result of the fault 3 of the CSTR control system by using the multivariate industrial process fault detection method based on the primary and secondary PCA models according to the embodiment of the present invention;
FIG. 6a is a schematic diagram of the monitoring result of CSTR control system fault 6 by using the existing PCA method according to the embodiment of the present invention;
FIG. 6b is a schematic diagram of a monitoring result of a fault 6 of the CSTR control system by using the multivariate industrial process fault detection method based on the primary and secondary PCA models according to the embodiment of the present invention.
Detailed Description
The invention is described in detail below by way of exemplary embodiments. It should be understood, however, that elements, structures and features of one embodiment may be beneficially incorporated in other embodiments without further recitation.
Referring to fig. 1, the invention discloses a multivariate industrial process fault detection method based on primary and secondary PCA models, comprising the following steps:
(I) Collect a normal data set $X$ and $C$ classes of known fault data sets $F_c$, $c = 1, 2, \ldots, C$, from the historical database as training data sets, and standardize $X$ and $F_c$ by formula (1) using the mean $\mu$ and standard deviation $\sigma$ of the normal data set; formula (1) is expressed as:
$$\bar{X} = \frac{X - \mu}{\sigma}, \qquad \bar{F}_c = \frac{F_c - \mu}{\sigma} \tag{1}$$
After standardization by formula (1), the standardized training data sets $\bar{X}$ and $\bar{F}_c$ are obtained.
(II) Build a PCA model on the data set $\bar{X}$ as the main monitoring model. Specifically, PCA decomposition is performed on the data set $\bar{X}$, and the loading matrix $P$ of the main monitoring model is calculated by formula (2), which is expressed as:
$$\bar{X} = T P^{T} + E \tag{2}$$
where $T$ is the score matrix of the data set $\bar{X}$ and $E$ is the model residual matrix of the data set $\bar{X}$.
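Steps (I) and (II) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the patented implementation: the data are synthetic, the function names `standardize` and `fit_pca` are hypothetical, and the 80% cumulative-variance rule for choosing the number of components is taken from the simulation settings reported later in the embodiment.

```python
import numpy as np

def standardize(data, mu, sigma):
    """Scale data with the mean and standard deviation of the normal set."""
    return (data - mu) / sigma

def fit_pca(x_std, cpv=0.80):
    """Fit PCA on standardized data (rows are samples, columns are variables).

    Returns the loading matrix P and the retained eigenvalues. The number of
    components is the smallest one whose cumulative variance share reaches cpv.
    """
    cov = np.cov(x_std, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigh returns ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    share = np.cumsum(eigvals) / np.sum(eigvals)
    k = min(int(np.searchsorted(share, cpv)) + 1, len(eigvals))
    return eigvecs[:, :k], eigvals[:k]

# Synthetic stand-in for the normal data set X (not data from the patent).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 10))
X = X + 0.1 * rng.normal(size=X.shape)

mu, sigma = X.mean(axis=0), X.std(axis=0, ddof=1)
X_std = standardize(X, mu, sigma)
P, eigvals = fit_pca(X_std)
T = X_std @ P            # score matrix
E = X_std - T @ P.T      # model residual matrix
```

The same `mu` and `sigma` would be reused to scale the fault data sets and the test data, so that every data set is expressed relative to normal operation.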
(III) Calculate the relative mutual information matrix $\Delta R_c$, $c = 1, 2, \ldots, C$, of each fault data set with respect to the normal data set. The specific steps are as follows:
the mutual information matrix $R$ of the data set $\bar{X}$ is calculated by formula (3), and the mutual information matrix $R_c$ of the data set $\bar{F}_c$ is calculated by formula (4); formulas (3) and (4) are expressed as:
Figure GDA00022766106800000710
Figure GDA00022766106800000711
where $m$ is the number of variables, $R_{ij}$ is the mutual information between the $i$-th and $j$-th columns of the data set $\bar{X}$, and $R_{c,ij}$ is the mutual information between the $i$-th and $j$-th columns of the data set $\bar{F}_c$;
the relative mutual information matrix $\Delta R_c$ is then expressed as:
Figure GDA00022766106800000714
Because different faults alter the mutual information between variables differently, the mutual information matrix $R$ of the normal data set is taken as the reference, and the difference between the mutual information of each fault data set and this reference is measured separately, so that different variable grouping results can be obtained. In the relative mutual information matrix, each row represents the change in mutual information between one variable and all variables. If the rows of two variables are similar, the fault changes the correlation structure of those variables in a similar way, so the two variables can be placed in the same variable group.
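The following sketch shows one way a mutual information matrix and its change under a fault could be computed. Two points are assumptions, since the corresponding formulas are only available as images here: the histogram (plug-in) estimator of mutual information, and the use of the absolute difference between the fault and normal matrices as the relative mutual information. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def mutual_information(a, b, bins=10):
    """Histogram estimate of the mutual information (in nats) of two 1-D samples."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)      # marginal of a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)      # marginal of b, shape (1, bins)
    mask = p_ab > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a @ p_b)[mask])))

def mi_matrix(data, bins=10):
    """Symmetric m x m matrix of pairwise mutual information between columns."""
    m = data.shape[1]
    r = np.zeros((m, m))
    for i in range(m):
        for j in range(i, m):
            r[i, j] = r[j, i] = mutual_information(data[:, i], data[:, j], bins)
    return r

# Synthetic example: the "fault" couples variable 1 to variable 0.
rng = np.random.default_rng(1)
normal = rng.normal(size=(1000, 3))
fault = normal.copy()
fault[:, 1] = fault[:, 0] + 0.1 * rng.normal(size=1000)

R = mi_matrix(normal)
R_c = mi_matrix(fault)
delta_R = np.abs(R_c - R)      # assumed form of the relative mutual information
```

In this toy case the entry of `delta_R` for variables 0 and 1 is large, while the entry for variables 0 and 2 is zero because those columns are unchanged by the fault.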
(IV) Group the process variables according to the relative mutual information matrix $\Delta R_c$ using the generalized Dice coefficient to obtain the grouped data sets $\bar{X}_{c,b}$, $b = 1, 2, \ldots, B_c$, where $B_c$ is the number of variable groups;
the specific steps of variable grouping are as follows:
(1) The relative mutual information vector is defined as:
$$r_i = [\Delta R_{c,i1}, \Delta R_{c,i2}, \ldots, \Delta R_{c,im}]^{T} \tag{6}$$
The generalized Dice coefficient is used to measure how similar two variables are in their relative mutual information with all variables, and is defined as:
Figure GDA0002276610680000082
where $0 \le S_{i,j} \le 1$. The closer $S_{i,j}$ is to 1, the more similar the two vectors are and the more similar the fault-induced change in the correlation structure of the two variables; the two variables then have a certain internal relationship and are placed in the same variable group;
the variable that maximizes $\|r_i\|$ is selected as the first variable group, and the number of variable groups is initialized to $B_c = 1$;
(2) Select the next vector $r_j$ in variable order, where $j \ne i$ and $j \le m$, and calculate by formula (8) the mean similarity between $r_j$ and the vectors in each existing variable group; formula (8) is expressed as:
Figure GDA0002276610680000083
where $b$ denotes the $b$-th variable group and $n_b$ denotes the number of variables in the $b$-th variable group;
(3) Determine the maximum of the mean similarities
Figure GDA0002276610680000084
and judge whether it exceeds the threshold $\gamma$; if it does, the variable $x_j$ corresponding to the vector is assigned to variable group $b$; otherwise, the variable $x_j$ forms a new variable group, i.e. $B_c = B_c + 1$;
(4) Repeat steps (2) and (3) until all variables are grouped, i.e.
Figure GDA0002276610680000085
To limit computational complexity, variable groups containing no more than 2 variables are merged into a single variable group. This grouping method makes effective use of known prior fault information, reduces the amount of known fault information that is wasted, and further mines the local information of the variables, which helps improve fault detection performance. In this step, different prior fault information yields different variable grouping results.
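The grouping procedure of step (IV) can be sketched as below. The Dice formula used here, $2 r_i^T r_j / (\|r_i\|^2 + \|r_j\|^2)$, is the commonly cited generalized Dice coefficient and is assumed to match the formula shown as an image above. Merging every group of at most two variables into one shared group is likewise one reading of the text. The function names and the small matrix are hypothetical, and 0.65 is the threshold reported in the embodiment.

```python
import numpy as np

def dice(u, v):
    """Generalized Dice coefficient of two vectors (assumed standard form)."""
    return 2.0 * float(u @ v) / float(u @ u + v @ v)

def group_variables(delta_r, gamma=0.65):
    """Group variables by the Dice similarity of their rows in delta_r."""
    m = delta_r.shape[0]
    first = int(np.argmax(np.linalg.norm(delta_r, axis=1)))
    groups = [[first]]                       # B_c = 1
    for j in range(m):
        if j == first:
            continue
        # Mean similarity of r_j to the members of each existing group.
        mean_sim = [np.mean([dice(delta_r[j], delta_r[i]) for i in g])
                    for g in groups]
        best = int(np.argmax(mean_sim))
        if mean_sim[best] > gamma:
            groups[best].append(j)           # join the most similar group
        else:
            groups.append([j])               # open a new group, B_c += 1
    # Merge all groups with at most two variables into one group.
    small = sorted(i for g in groups if len(g) <= 2 for i in g)
    large = [g for g in groups if len(g) > 2]
    return large + ([small] if small else [])

# Hypothetical relative mutual information matrix with two clear blocks.
delta_r = np.array([
    [0.9, 0.8, 0.9, 0.0, 0.1, 0.0],
    [0.8, 0.9, 0.8, 0.1, 0.0, 0.0],
    [0.9, 0.8, 1.0, 0.0, 0.0, 0.1],
    [0.0, 0.1, 0.0, 0.7, 0.8, 0.7],
    [0.1, 0.0, 0.0, 0.8, 0.7, 0.8],
    [0.0, 0.0, 0.1, 0.7, 0.8, 0.7],
])
groups = group_variables(delta_r)
```

Because assignment is sequential, the result can depend on the order in which variables are visited; the sketch follows the variable order, as the text describes.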
(V) Build a PCA model on each grouped data set as an auxiliary monitoring model. Specifically, PCA decomposition is performed on each grouped data set $\bar{X}_{c,b}$, and the loading matrix $P_{c,b}$ of the auxiliary monitoring model is calculated by formula (9), which is expressed as:
$$\bar{X}_{c,b} = T_{c,b} P_{c,b}^{T} + E_{c,b} \tag{9}$$
where $T_{c,b}$ is the score matrix of the data set $\bar{X}_{c,b}$ and $E_{c,b}$ is the model residual matrix of the data set $\bar{X}_{c,b}$.
(VI) Collect the test data set $x_{new}$ and standardize it by formula (10) using the mean $\mu$ and standard deviation $\sigma$ of the normal data set $X$; formula (10) is expressed as:
$$\bar{x}_{new} = \frac{x_{new} - \mu}{\sigma} \tag{10}$$
After standardization by formula (10), the standardized test data set $\bar{x}_{new}$ is obtained.
(VII) Project the data set $\bar{x}_{new}$ onto the main monitoring model and the auxiliary monitoring models respectively. The statistics $T^2$ and $SPE$ of the data set $\bar{x}_{new}$ projected onto the main monitoring model are calculated by formula (11) and formula (12), which are expressed as:
Figure GDA00022766106800000910
Figure GDA00022766106800000911
where $\Sigma$ denotes the diagonal matrix formed by the eigenvalues of the main monitoring model;
the statistics $T^2_{c,b}$ and $SPE_{c,b}$ of the data set $\bar{x}_{new}$ projected onto the auxiliary monitoring models are calculated by formula (13) and formula (14), which are expressed as:
Figure GDA00022766106800000914
Figure GDA00022766106800000915
where $\Sigma_{c,b}$ denotes the diagonal matrix formed by the eigenvalues of the auxiliary monitoring model, and $\bar{x}_{new,c,b}$ denotes the $b$-th variable group of $\bar{x}_{new}$ obtained from the class-$c$ fault information;
the control limit $T^2_{lim}$ of statistic $T^2$, the control limit $SPE_{lim}$ of statistic $SPE$, the control limit $[T^2_{c,b}]_{lim}$ of statistic $T^2_{c,b}$, and the control limit $[SPE_{c,b}]_{lim}$ of statistic $SPE_{c,b}$ are each calculated by kernel density estimation.
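The monitoring statistics and their kernel density control limits can be sketched as follows. The forms used are the textbook ones, assumed to match the formulas shown as images above: $T^2$ is the sum of squared scores scaled by the eigenvalues, and $SPE$ is the squared norm of the residual. The Gaussian kernel, the rule-of-thumb bandwidth, the function names and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def t2_spe(x_std, P, eigvals):
    """T^2 and SPE of standardized samples (rows) for a PCA model (P, eigvals)."""
    scores = x_std @ P
    t2 = np.sum(scores ** 2 / eigvals, axis=1)
    residual = x_std - scores @ P.T
    spe = np.sum(residual ** 2, axis=1)
    return t2, spe

def kde_limit(values, confidence=0.99, grid_size=2000):
    """Control limit as the `confidence` quantile of a Gaussian kernel density fit."""
    values = np.asarray(values, dtype=float)
    # Normal-reference rule-of-thumb bandwidth (an assumption, not from the source).
    h = 1.06 * values.std(ddof=1) * values.size ** (-0.2)
    grid = np.linspace(values.min() - 3 * h, values.max() + 3 * h, grid_size)
    z = (grid[:, None] - values[None, :]) / h
    density = np.exp(-0.5 * z ** 2).mean(axis=1) / (h * np.sqrt(2 * np.pi))
    cdf = np.cumsum(density)
    cdf /= cdf[-1]                      # normalize the numerical integral to 1
    return float(grid[np.searchsorted(cdf, confidence)])

# Synthetic normal-operation data (not data from the patent).
rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 3)) @ rng.normal(size=(3, 8))
X = X + 0.2 * rng.normal(size=X.shape)
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

eigvals, eigvecs = np.linalg.eigh(np.cov(X_std, rowvar=False))
P, lam = eigvecs[:, ::-1][:, :3], eigvals[::-1][:3]    # keep 3 components

t2, spe = t2_spe(X_std, P, lam)
t2_lim, spe_lim = kde_limit(t2), kde_limit(spe)
```

The same two functions would apply unchanged to each auxiliary model by passing only the columns of that variable group together with its own loadings and eigenvalues.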
(VIII) Integrate all monitoring results by Bayesian inference to obtain the total monitoring statistics $BIC_{T^2}$ and $BIC_{SPE}$. The specific steps are as follows:
the probability that the sample $\bar{x}_{new}$ is faulty under the $b$-th statistic is defined as:
Figure GDA0002276610680000101
where $S$ denotes the statistic $T^2$, $SPE$, $T^2_{c,b}$ or $SPE_{c,b}$;
Figure GDA0002276610680000103
denotes the posterior probability of the sample under fault conditions, and
Figure GDA0002276610680000104
denotes the posterior probability under normal conditions; formula (16) and formula (17) are used to obtain
Figure GDA0002276610680000105
and
Figure GDA0002276610680000106
respectively. Formulas (16) and (17) are expressed as:
Figure GDA0002276610680000107
Figure GDA0002276610680000108
where $S_{lim}$ denotes the control limit of the statistic $T^2$, $SPE$, $T^2_{c,b}$ or $SPE_{c,b}$; if $P(F)$ is set to the confidence level $\alpha$, then $P(N) = 1 - \alpha$, and the total monitoring statistics obtained by fusing all monitoring results are:
Figure GDA00022766106800001010
Figure GDA00022766106800001011
Whether the data set $\bar{x}_{new}$ is fault data is determined according to whether the fused total monitoring statistic $BIC_{T^2}$ or $BIC_{SPE}$ exceeds the control limit; when $BIC_{T^2}$ or $BIC_{SPE}$ is greater than 0.01, a fault is considered to have occurred in the process; otherwise, no fault is considered to have occurred.
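The fusion step can be illustrated with the Bayesian inference combination widely used in multiblock process monitoring. The exponential likelihoods $\exp(-S/S_{lim})$ and $\exp(-S_{lim}/S)$ and the weighted sum below are assumptions taken from that literature, since the formulas above are only available as images; the function name `bic` is hypothetical, and `alpha = 0.01` mirrors the 0.01 threshold stated in the text.

```python
import numpy as np

def bic(stats, limits, alpha=0.01):
    """Fuse one kind of statistic (all T^2 or all SPE) over several models.

    stats  : values of the statistic, one per model or variable group
    limits : matching control limits
    alpha  : prior fault probability P(F); P(N) = 1 - alpha
    """
    stats = np.maximum(np.asarray(stats, dtype=float), 1e-12)   # avoid division by 0
    limits = np.asarray(limits, dtype=float)
    p_x_given_n = np.exp(-stats / limits)        # likelihood under normal operation
    p_x_given_f = np.exp(-limits / stats)        # likelihood under fault
    p_f_given_x = (p_x_given_f * alpha) / (
        p_x_given_n * (1 - alpha) + p_x_given_f * alpha)
    # Weight each fault probability by its fault likelihood.
    return float(np.sum(p_x_given_f * p_f_given_x) / np.sum(p_x_given_f))

limits = [1.0, 1.0]
normal_case = bic([0.2, 0.2], limits)    # every statistic well below its limit
local_fault = bic([0.2, 5.0], limits)    # only one variable group is disturbed
at_limit = bic([1.0], [1.0])             # a statistic exactly on its limit
fault_flag = local_fault > 0.01
```

With these likelihoods, a statistic sitting exactly on its control limit yields a fault probability equal to `alpha`, which is why the fused statistic can be compared with 0.01 directly.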
In the method, steps (I) to (V) constitute the off-line modeling stage, and steps (VI) to (VIII) constitute the on-line testing stage.
In this fault detection method, a PCA model built on normal process data serves as the main monitoring model. In addition, the variables are grouped according to the relative mutual information between normal process data and fault data, and PCA models built from this prior fault information serve as auxiliary monitoring models. The results of the main and auxiliary monitoring models are fused to monitor process changes. The method uses prior fault information, mines local variable information in depth, reduces waste and omission of useful fault information, and improves the fault detection rate.
In order to more clearly illustrate the beneficial effects of the above-mentioned fault detection method of the present invention, the following further describes the above-mentioned fault detection method of the present invention with reference to the following embodiments.
Example: The continuous stirred tank reactor (CSTR) is a chemical reactor with low cost, strong heat exchange capability and stable product quality, and is widely used in industrial process reactions. During the reaction, reactant A undergoes a first-order irreversible exothermic reaction in the reactor to form substance B. Ten variables are measured in the CSTR control system, comprising 4 state variables and 6 input variables; the variables are detailed in Table 1.
TABLE 1
Variable  Description
Ca        Concentration of reactant A flowing out of the reactor
T         Reactor temperature
Tc        Temperature of the jacket outlet coolant
h         Liquid level in the reactor
Q         Flow rate of the reactor effluent
Qc        Flow rate of the jacket coolant
Qf        Flow rate of feed A
Caf       Concentration of feed A
Tf        Temperature of feed A
Tcf       Temperature of the jacket inlet coolant
In the simulation of the above CSTR control system, 1000 normal samples are collected as the training set, and the 6 kinds of fault data listed in Table 2 are generated. Each fault data set contains 1000 samples, and each fault is introduced at the 161st sampling point.
TABLE 2
Fault  Description
1      Step change of feed flow
2      Ramp change of feed concentration
3      Reduction of catalyst activity
4      Decrease of heat exchange rate
5      Deviation of the reactor temperature sensor
6      Deviation of the cooling water temperature sensor
Fault detection is performed on the CSTR control system of this embodiment using the fault detection method of the invention (hereinafter referred to as the PA-PCA method). To evaluate the performance of different fault detection methods, their results are compared using the fault detection rate (FDR), defined as the percentage of fault samples that are detected out of the total number of fault samples. The larger the FDR, the better the fault detection performance of the method; the smaller the FDR, the worse the performance.
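The FDR definition translates directly into code. In the sketch below the function name and the alarm sequence are hypothetical. The only detail borrowed from the embodiment is that faults start at the 161st sample of 1000, so the rate is computed over the remaining 840 samples.

```python
import numpy as np

def fault_detection_rate(stat, limit, fault_start):
    """Percentage of fault samples whose statistic exceeds the control limit.

    fault_start is the zero-based index of the first faulty sample.
    """
    faulty = np.asarray(stat, dtype=float)[fault_start:]
    return 100.0 * float(np.mean(faulty > limit))

# Synthetic monitoring statistic: 1000 samples, fault from the 161st sample
# (index 160), with the last 269 samples above a control limit of 1.0.
stat = np.zeros(1000)
stat[731:] = 2.0
fdr = fault_detection_rate(stat, limit=1.0, fault_start=160)
```

Samples before the fault onset are excluded from the denominator; alarms raised there would count toward a false alarm rate instead.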
In the simulation of the CSTR control system of this embodiment, the conventional PCA method and the PA-PCA method of the invention are both used to monitor process changes. Two different types of faults, fault 1 (a step fault) and fault 4 (a ramp fault), are selected as prior fault information. In both methods, the number of principal components is chosen so that the cumulative variance contribution rate reaches 80%, the threshold γ for variable grouping is set to 0.65, and the control limits are calculated at a 99% confidence level. Faults 3 and 6 are taken as examples to illustrate the fault detection results.
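As a rough illustration of choosing the number of principal components by an 80% cumulative variance contribution rate, the following sketch fits a PCA model through an eigendecomposition of the sample covariance matrix. The function name `fit_pca` and the synthetic data are assumptions; the patent does not specify an implementation.

```python
import numpy as np

def fit_pca(X_std, cpv=0.80):
    """Fit PCA on standardized data and keep the fewest components whose
    cumulative variance contribution reaches `cpv`."""
    n, m = X_std.shape
    cov = X_std.T @ X_std / (n - 1)              # sample covariance of standardized data
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()   # cumulative variance contribution
    k = min(int(np.searchsorted(ratio, cpv)) + 1, m)
    return eigvecs[:, :k], eigvals[:k], k        # load matrix, retained eigenvalues, count

# Synthetic correlated data standing in for the 10 measured CSTR variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10)) @ rng.normal(size=(10, 10))
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
P, lam, k = fit_pca(X_std, cpv=0.80)
```

The retained eigenvalues are returned alongside the load matrix because a T² statistic needs them for scaling the scores.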
Fig. 3a compares the mutual information of normal test data with that of the standard normal data, Fig. 3b compares the mutual information of fault 1 in the CSTR control system with that of the standard normal data, and Fig. 3c does the same for fault 4. Figs. 3a-3c show the mutual information between variable 1 and the remaining variables. As can be seen from Fig. 3a, the mutual information of the two different normal data sets essentially coincides, indicating that the correlation structure among the process variables does not change substantially under normal operating conditions. Figs. 3b and 3c show large differences between the mutual information of each of the two faults and that of the standard normal data set, indicating that the correlation structure among the process variables changes under abnormal conditions. This confirms the need for the invention to take prior fault information into account.
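The pairwise mutual information behind such a comparison can be sketched as follows. The text here does not say which mutual information estimator is used, so the histogram-based estimate, the bin count and the function names `mutual_information` and `mi_matrix` are all assumptions chosen for simplicity.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram-based estimate of the mutual information (in nats) of two series."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                    # joint probability table
    px = pxy.sum(axis=1, keepdims=True)          # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)          # marginal of y, shape (1, bins)
    mask = pxy > 0                               # skip empty cells to avoid log(0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

def mi_matrix(X, bins=16):
    """Symmetric m x m matrix whose (i, j) entry is the MI of columns i and j."""
    m = X.shape[1]
    R = np.zeros((m, m))
    for i in range(m):
        for j in range(i, m):
            R[i, j] = R[j, i] = mutual_information(X[:, i], X[:, j], bins)
    return R

# Hypothetical data: column 1 depends strongly on column 0, column 2 is independent.
rng = np.random.default_rng(1)
x0 = rng.normal(size=2000)
X = np.column_stack([x0, x0 + 0.1 * rng.normal(size=2000), rng.normal(size=2000)])
R = mi_matrix(X)
```

Row 0 of `R` corresponds to the kind of curve plotted in Figs. 3a-3c: the mutual information between variable 1 and each of the other variables.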
Fault 3 is caused by a ramp change in catalyst activity. Fig. 4a shows the variable grouping result obtained with fault 1 as prior information, and Fig. 4b shows the variable grouping result obtained with fault 4 as prior information. As can be seen from Figs. 4a and 4b, different prior fault information yields different variable grouping results. The fault monitoring charts of the PCA method and the PA-PCA method of the invention are shown in Figs. 5a and 5b. According to Fig. 5a, the T² and SPE statistics of the PCA method raise alarms at the 760th and 639th sampling instants, respectively, and their fault detection rates are 32.02% and 39.88%, respectively, which is low. In Fig. 5b, the two statistics of the PA-PCA method raise alarms 285 and 106 sampling instants earlier than those of the conventional PCA method, with fault detection rates of 46.43% and 58.81%, respectively, which improves the monitoring performance compared with the conventional PCA method.
Fault 6 is caused by a bias in the cooling water temperature sensor. The monitoring results of the two methods for this fault are shown in Figs. 6a and 6b. As can be seen from Fig. 6a, although the T² and SPE statistics of the PCA method detect the fault at the 413th and 239th sampling instants, respectively, they fluctuate around the control limit, so that most of their values fall below it, and the fault detection rates are only 26.07% and 40.6%. By contrast, as shown in Fig. 6b, the SPE statistic of the PA-PCA method performs essentially the same as that of the conventional PCA method, detecting the fault one sampling instant earlier with a fault detection rate of 43.45%, whereas the T² statistic of the PA-PCA method raises an alarm promptly at the 161st sampling instant, 252 sampling instants earlier than the T² statistic of the PCA method, with a high fault detection rate of 77.5%. The PA-PCA method of the invention therefore improves the detection performance for fault 6 of the CSTR control system.
Table 3 lists the fault detection rates of the PCA method and the PA-PCA method of the invention for the 6 faults of the CSTR control system.
TABLE 3
[Table 3 appears as an image in the original publication and is not reproduced in this text.]
As can be seen from Table 3, the PA-PCA method of the invention achieves the best monitoring results for the 6 faults and the highest average fault detection rate, and the improvement in monitoring performance is most pronounced for faults 3 and 6. Taken together, the above analysis shows that the fault detection performance of the PA-PCA method is superior to that of the conventional PCA method.
The above embodiments are provided merely to illustrate the invention and do not limit its scope of protection. Simple variations and modifications made by those skilled in the art within the technical scope of the invention fall within the scope of the claims.

Claims (8)

1. A multivariable industrial process fault detection method based on main and auxiliary PCA models, comprising the following steps:
(I) collecting a normal data set $X$ and $C$ classes of known fault data sets $F_c$, $c = 1, 2, \ldots, C$, from the historical database as training data sets, and standardizing the training data sets $X$ and $F_c$ with the mean $\mu$ and standard deviation $\sigma$ of the normal data set $X$ to obtain the standardized training data sets $\bar{X}$ and $\bar{F}_c$;
(II) establishing a PCA model on the standardized normal data set $\bar{X}$ as the main monitoring model;
(III) calculating the relative mutual information matrix $\Delta R_c$, $c = 1, 2, \ldots, C$, of each fault data set with respect to the normal data set, the relative mutual information matrix $\Delta R_c$ being calculated as follows:
the mutual information matrix $R$ of the standardized normal data set $\bar{X}$ is calculated by formula (3), and the mutual information matrix $R_c$ of the standardized fault data set $\bar{F}_c$ is calculated by formula (4), formula (3) and formula (4) being expressed as:
$$R = \begin{bmatrix} R_{11} & \cdots & R_{1m} \\ \vdots & \ddots & \vdots \\ R_{m1} & \cdots & R_{mm} \end{bmatrix} \tag{3}$$
$$R_c = \begin{bmatrix} R_{c,11} & \cdots & R_{c,1m} \\ \vdots & \ddots & \vdots \\ R_{c,m1} & \cdots & R_{c,mm} \end{bmatrix} \tag{4}$$
where $m$ is the number of variables, $R_{ij}$ is the mutual information between the $i$-th and $j$-th columns of the data set $\bar{X}$, and $R_{c,ij}$ is the mutual information between the $i$-th and $j$-th columns of the data set $\bar{F}_c$;
the relative mutual information matrix $\Delta R_c$ is then expressed as:
[Formula (5): image not reproduced in this text]
(IV) grouping the process variables according to the relative mutual information matrix $\Delta R_c$ on the basis of the generalized Dice coefficient, to obtain the grouped data sets $\bar{X}_{c,b}$, $b = 1, 2, \ldots, B_c$, where $B_c$ is the number of variable groups;
(V) establishing a PCA model for each grouped data set as an auxiliary monitoring model;
(VI) collecting a test data set $x_{new}$, and standardizing the test data set $x_{new}$ with the mean $\mu$ and standard deviation $\sigma$ of the normal data set $X$ to obtain the standardized test data set $\bar{x}_{new}$;
(VII) projecting the standardized test data set $\bar{x}_{new}$ onto the main monitoring model and the auxiliary monitoring models respectively, calculating the statistics $T^2$ and $\mathrm{SPE}$ of $\bar{x}_{new}$ projected onto the main monitoring model and the statistics $T^2_{c,b}$ and $\mathrm{SPE}_{c,b}$ of $\bar{x}_{new}$ projected onto the auxiliary monitoring models, and calculating by kernel density estimation the control limit $T^2_{lim}$ of the statistic $T^2$, the control limit $\mathrm{SPE}_{lim}$ of the statistic $\mathrm{SPE}$, the control limit $[T^2_{c,b}]_{lim}$ of the statistic $T^2_{c,b}$, and the control limit $[\mathrm{SPE}_{c,b}]_{lim}$ of the statistic $\mathrm{SPE}_{c,b}$;
(VIII) integrating all monitoring results by Bayesian inference to obtain the total monitoring statistics $BIC_{T^2}$ and $BIC_{SPE}$, and determining whether a fault has occurred in the standardized test data set $\bar{x}_{new}$ according to whether the statistic $BIC_{T^2}$ or the statistic $BIC_{SPE}$ exceeds its control limit, the specific steps of integrating all monitoring results by Bayesian inference being as follows:
the probability that the sample $\bar{x}_{new}$ is faulty under the $b$-th statistic is defined as:
[Formula (15): image not reproduced in this text]
where $S$ denotes the statistic $T^2$, the statistic $\mathrm{SPE}$, the statistic $T^2_{c,b}$ or the statistic $\mathrm{SPE}_{c,b}$; the posterior probability of the sample under fault conditions and the posterior probability under normal conditions that appear in formula (15) are obtained by formula (16) and formula (17), respectively, which are expressed as:
[Formula (16): image not reproduced in this text]
[Formula (17): image not reproduced in this text]
where $S_{lim}$ denotes the control limit corresponding to the statistic $T^2$, the statistic $\mathrm{SPE}$, the statistic $T^2_{c,b}$ or the statistic $\mathrm{SPE}_{c,b}$, $P(F)$ is the confidence level $\alpha$, and $P(N) = 1 - \alpha$; the total monitoring statistic obtained by fusing all monitoring results is:
[Formula (18): image not reproduced in this text]
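Step (VII) of claim 1 obtains the control limits by kernel density estimation. A minimal sketch of one way to do this is given below, assuming a Gaussian kernel, a rule-of-thumb bandwidth and a numerically integrated density; none of these choices, nor the function name `kde_control_limit`, are stated in the claims.

```python
import numpy as np

def kde_control_limit(stats, confidence=0.99, grid_size=1024):
    """Control limit = value below which the estimated density holds `confidence` mass."""
    s = np.asarray(stats, dtype=float)
    h = 1.06 * s.std(ddof=1) * s.size ** (-0.2)        # rule-of-thumb bandwidth (assumption)
    grid = np.linspace(s.min() - 3 * h, s.max() + 3 * h, grid_size)
    z = (grid[:, None] - s[None, :]) / h
    # Gaussian kernel density evaluated at every grid point
    pdf = np.exp(-0.5 * z ** 2).sum(axis=1) / (s.size * h * np.sqrt(2 * np.pi))
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]                                      # normalize the numerical CDF to 1
    return float(grid[np.searchsorted(cdf, confidence)])

# Hypothetical statistic values computed on normal training data.
rng = np.random.default_rng(2)
train_stats = rng.chisquare(3, size=1000)
limit = kde_control_limit(train_stats, confidence=0.99)
```

Estimating the limit from the empirical density avoids assuming that the monitoring statistic follows a particular parametric distribution.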
2. The multivariable industrial process fault detection method based on main and auxiliary PCA models as claimed in claim 1, wherein in step (I), the training data sets $X$ and $F_c$ are standardized by formula (1) using the mean $\mu$ and standard deviation $\sigma$ of the normal data set $X$, formula (1) being expressed as:
$$\bar{X} = \frac{X - \mu}{\sigma}, \qquad \bar{F}_c = \frac{F_c - \mu}{\sigma} \tag{1}$$
after the training data sets $X$ and $F_c$ are standardized by formula (1), the standardized training data sets $\bar{X}$ and $\bar{F}_c$ are obtained.
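The key point of this standardization is that the fault data (and later the test data) are scaled with the mean and standard deviation of the normal data set, not with their own. A minimal sketch, with the function names `fit_scaler` and `standardize` and the toy numbers chosen for illustration:

```python
import numpy as np

def fit_scaler(X_normal):
    """Column-wise mean and standard deviation of the normal data set."""
    mu = X_normal.mean(axis=0)
    sigma = X_normal.std(axis=0, ddof=1)   # sample standard deviation (assumption)
    return mu, sigma

def standardize(data, mu, sigma):
    """Scale any data set (normal, fault or test) with the normal-data statistics."""
    return (np.asarray(data, dtype=float) - mu) / sigma

# Toy normal data with 2 variables and one hypothetical fault sample.
X = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
F = np.array([[5.0, 6.0]])
mu, sigma = fit_scaler(X)
X_bar = standardize(X, mu, sigma)
F_bar = standardize(F, mu, sigma)
```

Scaling every data set against the same reference keeps deviations from normal operation visible in the standardized fault and test data.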
3. The multivariable industrial process fault detection method based on main and auxiliary PCA models as claimed in claim 2, wherein in step (II), PCA decomposition is performed on the standardized training data set $\bar{X}$, and the load matrix $P$ of the training data set is calculated through the main monitoring model of formula (2), formula (2) being expressed as:
$$\bar{X} = T P^{T} + E \tag{2}$$
where $T$ is the score matrix of the data set $\bar{X}$ and $E$ is the model residual matrix of the data set $\bar{X}$.
4. The multivariable industrial process fault detection method based on main and auxiliary PCA models as claimed in claim 1, wherein in step (IV), the variable grouping comprises the following specific steps:
(1) defining the relative mutual information vector as:
$$r_i = [\Delta R_{c,i1}, \Delta R_{c,i2}, \ldots, \Delta R_{c,im}]^{T} \tag{6}$$
measuring, with the generalized Dice coefficient, the similarity of the relative mutual information correlation between a given variable and the remaining variables, the coefficient being defined as:
[Formula (7): image not reproduced in this text]
where $0 \le S_{i,j} \le 1$;
selecting the variable that maximizes $\|r_i\|$ as the first variable group, and initializing the number of variable groups $B_c = 1$;
(2) selecting the next vector $r_j$ in the order of the variables, where $j \ne i$ and $j \le m$, and calculating by formula (8) the mean similarity between the vector $r_j$ and the vectors in each existing variable group, formula (8) being expressed as:
$$\bar{S}_{j,b} = \frac{1}{n_b} \sum_{i \in \text{group } b} S_{i,j} \tag{8}$$
where $b$ denotes the $b$-th variable group and $n_b$ denotes the number of variables in the $b$-th variable group;
(3) determining the maximum value of $\bar{S}_{j,b}$ and judging whether it exceeds the threshold $\gamma$; if it exceeds the threshold $\gamma$, the variable $x_j$ corresponding to the vector is assigned to variable group $b$; otherwise, the variable $x_j$ forms a new variable group, i.e. $B_c = B_c + 1$;
(4) repeating steps (2) and (3) until all variables have been grouped, i.e. [formula image not reproduced in this text].
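The grouping procedure of claim 4 can be sketched as below. Because formula (7) is not reproduced in this text, the sketch assumes the common vector form of the generalized Dice coefficient, 2·aᵀb / (‖a‖² + ‖b‖²); the function names, the example matrix and the handling of zero vectors are likewise assumptions.

```python
import numpy as np

def dice(a, b):
    """Generalized Dice coefficient of two vectors (assumed form, see lead-in)."""
    denom = a @ a + b @ b
    return 2.0 * (a @ b) / denom if denom > 0 else 0.0

def group_variables(delta_R, gamma=0.65):
    """Group variables by the similarity of their relative mutual information rows."""
    m = delta_R.shape[0]
    first = int(np.argmax(np.linalg.norm(delta_R, axis=1)))   # largest ||r_i|| seeds group 1
    groups = [[first]]
    for j in range(m):                                        # remaining variables, in order
        if j == first:
            continue
        # mean similarity of r_j to the members of each existing group
        means = [np.mean([dice(delta_R[j], delta_R[i]) for i in g]) for g in groups]
        best = int(np.argmax(means))
        if means[best] > gamma:
            groups[best].append(j)                            # join the most similar group
        else:
            groups.append([j])                                # otherwise open a new group
    return groups

# Hypothetical 4-variable relative mutual information matrix with two clear blocks.
delta_R = np.array([[1.0, 0.9, 0.0, 0.0],
                    [0.9, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.5, 0.4],
                    [0.0, 0.0, 0.4, 0.5]])
groups = group_variables(delta_R, gamma=0.65)
```

With the embodiment's threshold of 0.65, variables 0 and 1 end up in one group and variables 2 and 3 in another, because rows from different blocks have zero similarity.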
5. The multivariable industrial process fault detection method based on main and auxiliary PCA models as claimed in claim 4, wherein in step (V), PCA decomposition is performed on each data set $\bar{X}_{c,b}$ obtained after variable grouping, and the load matrix $P_{c,b}$ of the grouped data set $\bar{X}_{c,b}$ is calculated through the auxiliary monitoring model of formula (9), formula (9) being expressed as:
$$\bar{X}_{c,b} = T_{c,b} P_{c,b}^{T} + E_{c,b} \tag{9}$$
where $T_{c,b}$ is the score matrix of the data set $\bar{X}_{c,b}$ and $E_{c,b}$ is the model residual matrix of the data set $\bar{X}_{c,b}$.
6. The multivariable industrial process fault detection method based on main and auxiliary PCA models as claimed in claim 5, wherein in step (VI), the test data set $x_{new}$ is standardized by formula (10) using the mean $\mu$ and standard deviation $\sigma$ of the normal data set $X$, formula (10) being expressed as:
$$\bar{x}_{new} = \frac{x_{new} - \mu}{\sigma} \tag{10}$$
after the test data set $x_{new}$ is standardized by formula (10), the standardized test data set $\bar{x}_{new}$ is obtained.
7. The multivariable industrial process fault detection method based on main and auxiliary PCA models as claimed in claim 6, wherein in step (VII), the statistics $T^2$ and $\mathrm{SPE}$ of the standardized test data set $\bar{x}_{new}$ projected onto the main monitoring model are calculated by formula (11) and formula (12), which are expressed as:
$$T^2 = \bar{x}_{new}^{T} P \Sigma^{-1} P^{T} \bar{x}_{new} \tag{11}$$
$$\mathrm{SPE} = \bar{x}_{new}^{T} (I - P P^{T}) \bar{x}_{new} \tag{12}$$
where $\Sigma$ denotes the diagonal matrix formed by the eigenvalues of the main monitoring model;
the statistics $T^2_{c,b}$ and $\mathrm{SPE}_{c,b}$ of the data set $\bar{x}_{new}$ projected onto the auxiliary monitoring models are calculated by formula (13) and formula (14), which are expressed as:
$$T^2_{c,b} = \bar{x}_{new,c,b}^{T} P_{c,b} \Sigma_{c,b}^{-1} P_{c,b}^{T} \bar{x}_{new,c,b} \tag{13}$$
$$\mathrm{SPE}_{c,b} = \bar{x}_{new,c,b}^{T} (I - P_{c,b} P_{c,b}^{T}) \bar{x}_{new,c,b} \tag{14}$$
where $\Sigma_{c,b}$ denotes the diagonal matrix formed by the eigenvalues of the auxiliary monitoring model, and $\bar{x}_{new,c,b}$ denotes the $b$-th group of variables of $\bar{x}_{new}$ obtained from the class-$c$ fault information.
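For a single standardized sample, the two statistics of a PCA monitoring model can be computed as in the following sketch. It uses the standard textbook definitions of T² and SPE; the function name `t2_spe` and the toy model are assumptions. The same function would apply to an auxiliary model by passing only the variables of the corresponding group.

```python
import numpy as np

def t2_spe(x, P, eigvals):
    """T^2 and SPE of one standardized sample x for a PCA model (P, eigvals)."""
    t = P.T @ x                       # scores in the principal subspace
    t2 = float(t @ (t / eigvals))     # scores scaled by the retained eigenvalues
    resid = x - P @ t                 # part of x not explained by the model
    spe = float(resid @ resid)        # squared prediction error
    return t2, spe

# Toy model (hypothetical): 3 variables, 2 retained components along the first two axes.
P = np.eye(3)[:, :2]
eigvals = np.array([2.0, 0.5])
x = np.array([2.0, 1.0, 3.0])
t2, spe = t2_spe(x, P, eigvals)
```

T² measures variation inside the principal subspace, while SPE measures what is left in the residual subspace, which is why both are monitored.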
8. The multivariable industrial process fault detection method based on main and auxiliary PCA models as claimed in claim 1, wherein in step (VIII), whether the standardized test data set $\bar{x}_{new}$ is fault data is determined according to whether the fused total monitoring statistic $BIC_{T^2}$ or the fused total monitoring statistic $BIC_{SPE}$ exceeds the control limit: when $BIC_{T^2}$ or $BIC_{SPE}$ is greater than 0.01, a fault is considered to have occurred in the process; otherwise, no fault is considered to have occurred in the process.
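Since formulas (15) to (18) are not reproduced in this text, the following sketch uses a form of Bayesian fusion that is common in the distributed process monitoring literature: exponential conditional probabilities built from the ratio of each statistic to its control limit, combined by a weighted sum. This form, the function name `bayesian_fusion` and the example values are assumptions and may differ from the claimed formulas.

```python
import numpy as np

def bayesian_fusion(stats, limits, alpha=0.01):
    """Fuse several monitoring statistics of the same type into one index."""
    s = np.maximum(np.asarray(stats, dtype=float), 1e-12)   # guard against division by zero
    lim = np.asarray(limits, dtype=float)
    p_x_n = np.exp(-s / lim)                 # assumed P(x | normal)
    p_x_f = np.exp(-lim / s)                 # assumed P(x | fault)
    # Bayes rule with prior P(F) = alpha and P(N) = 1 - alpha
    p_f_x = p_x_f * alpha / (p_x_n * (1.0 - alpha) + p_x_f * alpha)
    # weight each posterior by its share of the fault likelihood
    return float(np.sum(p_x_f * p_f_x) / np.sum(p_x_f))

# Hypothetical T^2 values from one main model and two auxiliary models.
limits = np.array([10.0, 8.0, 6.0])
bic_at_limit = bayesian_fusion(limits, limits)        # every statistic exactly on its limit
bic_fault = bayesian_fusion(5.0 * limits, limits)     # every statistic well above its limit
bic_normal = bayesian_fusion(0.2 * limits, limits)    # every statistic well below its limit
```

Under these assumptions, the fused index equals alpha when every statistic sits exactly on its control limit, which agrees with the 0.01 decision threshold stated in claim 8.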
CN201811503665.9A 2018-12-10 2018-12-10 Multivariable industrial process fault detection method based on main and auxiliary PCA models Active CN109407652B (en)


Publications (2)

Publication Number Publication Date
CN109407652A CN109407652A (en) 2019-03-01
CN109407652B true CN109407652B (en) 2020-03-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant