CN116661410A - Large-scale industrial process fault detection and diagnosis method based on weighted directed graph - Google Patents

Publication number
CN116661410A
Authority
CN
China
Prior art keywords
module
variables
variable
directed graph
monitoring
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310505573.9A
Other languages
Chinese (zh)
Inventor
徐琛
王竞志
陶洪峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Application filed by Jiangnan University filed Critical Jiangnan University
Priority to CN202310505573.9A priority Critical patent/CN116661410A/en
Publication of CN116661410A publication Critical patent/CN116661410A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00Testing or monitoring of control systems or parts thereof
    • G05B23/02Electric testing or monitoring
    • G05B23/0205Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0259Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterized by the response to fault detection
    • G05B23/0262Confirmation of fault detection, e.g. extra checks to confirm that a failure has indeed occurred
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05BCONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00Program-control systems
    • G05B2219/20Pc systems
    • G05B2219/24Pc safety
    • G05B2219/24065Real time diagnostics
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Abstract

The invention discloses a fault detection and diagnosis method for large-scale industrial processes based on a weighted directed graph, belonging to the field of industrial process monitoring and production safety. The method uses the process flow diagram and the normalized mutual information between variables to construct a weighted directed graph model of the process, and divides the variables into modules according to the modularity maximization principle. Kernel principal component analysis is introduced to build the monitoring models so as to handle nonlinearity in the process. Newly acquired data are standardized and processed by the monitoring models; the monitoring statistics of each module are calculated, fused, and compared with the statistical control limits to judge whether the current process state is normal. Once a fault is detected, fault-related variables are identified from the weighted contribution rates of the variables, and root-cause variable identification and propagation path analysis are carried out in combination with the process directed graph.

Description

Large-scale industrial process fault detection and diagnosis method based on weighted directed graph
Technical Field
The invention relates to a large-scale industrial process fault detection and diagnosis method based on a weighted directed graph, belonging to the field of industrial process monitoring and safety production.
Background
Driven by modern industry's urgent demand for higher production efficiency and resource utilization, industrial production is becoming larger in scale and more complex, and large-scale industrial processes have become one of the main modes of modern industry. Examples include the large-scale continuous production of refined adipic acid, chlor-alkali production, and the industrial preparation of acids such as nitric acid and sulfuric acid. Such processes generally comprise multiple operating units; different units often run in different environments with different mechanisms and control strategies, and they interact and cooperate with one another, so large-scale industrial processes usually have complicated process characteristics. When the operating state of such a process is analyzed, the limitations of traditional global monitoring strategies become apparent: a strategy that models the whole process globally captures the information of local variables poorly and neglects local process behavior, so monitoring performance degrades and the process state cannot be judged accurately. Developing monitoring strategies suited to large-scale industrial processes is therefore a pressing need with significant research value and practical importance.
In recent years, research on large-scale industrial process monitoring has focused mainly on module division. Module division is the key step of these methods, and its quality directly determines monitoring performance. Division methods fall into two main categories: process-knowledge-based and data-driven. Process-knowledge-based methods divide the process variables according to prior knowledge and expert experience, such as the composition of the process units or the structure of a mechanistic model. Process knowledge describes the connections between variables qualitatively, which helps to interpret monitoring results and to judge how faults propagate; however, these methods generally require complete process knowledge, which limits their use in highly complex large-scale industrial processes. Data-driven methods, by contrast, divide modules by mining the process data and thus avoid the dependence on process knowledge; common approaches include division by Pearson correlation coefficient, mutual information, KL divergence, and the Jarque-Bera test. However, these data-driven methods consider only the statistical information of the process variables and ignore the direct connections between variables in the process, so spurious correlations easily arise among module variables and degrade the final monitoring performance.
In addition, for root-cause diagnosis of faults, the commonly used contribution rate method is easily affected by the smearing effect (that is, the influence of a faulty variable on normal variables), which masks the root-cause variable. Without process knowledge, such a method also cannot analyze the fault propagation path, which makes it difficult to resolve faults at their source in actual production.
Disclosure of Invention
In order to realize the identification of the fault source variable and improve the fault detection accuracy and efficiency, the application provides a large-scale industrial process fault detection and diagnosis method based on a weighted directed graph, which aims at a large-scale industrial process and comprises the following steps:
Step one: for the industrial process, construct a process directed graph model $G=(v,E)$ and obtain its adjacency matrix $E$ from the directed graph. The directed graph comprises a node set $\{v_i \mid i=1,2,\dots,n\}$ and a directed edge set $\{e_{ij} \mid i,j=1,2,\dots,n\}$; the node set is the set of the n process variables of the industrial process, and the directed edge set is the set of causal relationships between those n process variables. The adjacency matrix $E$ describes the directed graph model: its rows and columns represent nodes, and the element in the i-th row and j-th column indicates whether a directed edge from node i to node j exists;
Step two: acquiring sample data under normal operation conditions of an industrial process to form training data, wherein the sample data are data under normal operation conditions of the n process variables;
Step three: process the adjacency matrix of the directed graph according to the mutual information values MI(i, j) between any two variables i, j in the sample data under normal operating conditions, obtaining a weighted adjacency matrix and the corresponding weighted directed graph;
Step four: divide the weighted directed graph obtained in step three into modules, using modularity as the division criterion;
Step five: based on the module division result of step four, select a kernel function for each module, use each module's training data to establish a KPCA-based local monitoring model within that module, and obtain the monitoring-index control limits $T^2_{th,b}$ and $SPE_{th,b}$ from the training data;
Step six: collect the sample $x_{new}$ at the current moment and standardize it to obtain standardized test data; divide the current sample into B modules according to the module division result of step four, and obtain the monitoring statistics $T^2_{new,b}$ and $SPE_{new,b}$ of each module for the current test data from the local monitoring model of each module in step five;
Step seven: using Bayesian inference, fuse the monitoring statistics $T^2_{new,b}$ and $SPE_{new,b}$ of the modules into the probability indices $BIC_{T^2}$ and $BIC_{SPE}$ to obtain the final detection result of the current process state, where the subscript b denotes the b-th of the B modules;
Step eight: if the final detection result indicates a fault, use the weighted contribution rate method to obtain the contribution rate of each variable to the two monitoring statistics $T^2$ and SPE; the corresponding contribution rate thresholds are denoted $CT^2_{th}$ and $CSPE_{th}$. Variables exceeding a contribution rate threshold are placed in the fault-related variable set, the local directed graph structure of the fault is obtained from this set, and the root-cause variable and propagation path of the fault are determined.
Optionally, step two includes:
collecting sample data under normal operating conditions of the industrial process to form a training data set $X \in R^{m\times n}$;
standardizing the sample corresponding to each row of $X$ according to formula (1) to obtain new training data:
$$\tilde{x} = (x-\mu)\,\Delta^{-1} \qquad (1)$$
where m is the number of samples, n is the number of process variables, $x \in R^{1\times n}$ is any sample, $\tilde{x}$ is the standardized sample, $\mu=[\mu_1,\mu_2,\dots,\mu_n]$ is the mean row vector, and $\Delta \in R^{n\times n}$ is a diagonal matrix whose diagonal elements are the standard deviations $\delta_1,\delta_2,\dots,\delta_n$.
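As a minimal sketch of the standardization in formula (1), the snippet below estimates the column means and standard deviations from a toy training set and applies them row by row. The function names and the use of the sample standard deviation (`ddof=1`) are assumptions made for illustration; the source does not specify the estimator.

```python
import numpy as np

def fit_standardizer(X):
    """Return column means and standard deviations of training data X (m x n)."""
    mu = X.mean(axis=0)
    delta = X.std(axis=0, ddof=1)  # assumption: sample standard deviation
    return mu, delta

def standardize(x, mu, delta):
    """Subtract the mean row vector and divide each variable by its std."""
    return (x - mu) / delta

# Toy training set: m = 4 samples, n = 2 process variables (made-up values)
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
mu, delta = fit_standardizer(X)
X_tilde = standardize(X, mu, delta)
```

The same `mu` and `delta` would be reused to standardize samples collected online, so that test data are scaled consistently with the training data.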
Optionally, the third step includes:
step 3.1, calculating the mutual information value MI(i, j) between any two variables i, j from the standardized training data according to formula (2);
normalizing the mutual information according to formula (3);
where NMI(i, j) is the normalized mutual information between variables i and j, H(·) is the information entropy, p(i, j) is the joint probability distribution function, and p(i), p(j) are the marginal probability distribution functions of variables i and j, respectively;
step 3.2, processing the adjacency matrix of the process directed graph to obtain a weighted adjacency matrix and a corresponding weighted directed graph:
1) Based on the process directed graph model G, construct the adjacency matrix $E \in R^{n\times n}$. The element $e_{ij}$ of the adjacency matrix describes the causal relationship between variables i and j in the process and takes the value 0 or 1:
if $e_{ij}=0$, there is no directed edge from variable i to variable j;
if $e_{ij}=1$, there is a directed edge from variable i to variable j;
2) Based on the normalized mutual information, obtain the correlation matrix $U \in R^{n\times n}$ of the process variables, whose element $u_{ij}=NMI(i,j)$ describes the correlation between variables i and j from the information-entropy perspective;
3) Construct the weighted adjacency matrix from the adjacency matrix E and the correlation matrix U. The weighted adjacency matrix, denoted $E_U$, is the Hadamard product of the original adjacency matrix E and the correlation matrix U, i.e. $E_U = E \circ U$, which expands to formula (4). The elements of $E_U$ describe the weights of the directed edges contained in the original directed graph, and a new weighted directed graph is constructed from $E_U$:
$$E_U = \begin{bmatrix} e_{11}u_{11} & \cdots & e_{1n}u_{1n} \\ \vdots & \ddots & \vdots \\ e_{n1}u_{n1} & \cdots & e_{nn}u_{nn} \end{bmatrix} \qquad (4)$$
where $u_{ij}$ is an element of the correlation matrix U.
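The weighting in step 3.2 can be illustrated with a small sketch: the 0/1 adjacency matrix keeps only the edges present in the process directed graph, and the element-wise product assigns each surviving edge its normalized mutual information. The three-variable graph and the NMI values below are hypothetical.

```python
import numpy as np

# Hypothetical 3-variable process with directed edges 0 -> 1 and 1 -> 2
E = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])

# Hypothetical correlation matrix U of NMI values (symmetric, in [0, 1])
U = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.8],
              [0.3, 0.8, 1.0]])

# Hadamard (element-wise) product: edges absent from E get weight 0
E_U = E * U
```

Note that variables 0 and 2 have a nonzero NMI of 0.3, yet no edge appears between them in `E_U`, because the process directed graph contains no direct connection between them.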
Optionally, the fourth step includes:
step 4.1, initialize each variable as its own module; for each variable in turn, tentatively move it into the module of each variable that its outgoing edges point to and calculate the resulting modularity; merge the variable into the module that gives the largest modularity increase;
the modularity is calculated as shown in formula (5);
where Q is the modularity; $A_{ij}$ is the weight of the directed edge from variable i to variable j; the remaining three quantities in the formula are the sum of the weights of all edges of the directed graph, the sum of the in-edge weights of variable i, and the sum of the out-edge weights of variable j; $c_i$ is the module to which variable i belongs during optimization; and $\delta(c_i,c_j)$ is the Kronecker delta, equal to 1 when $c_i=c_j$ and 0 otherwise;
step 4.2, iterate step 4.1 until the module to which each variable belongs no longer changes, since a variable may be connected by multiple directed edges in the graph;
step 4.3, aggregate each module into a new node, and update the directed edge weights within and between modules;
step 4.4, repeat steps 4.1 to 4.3 until the modularity no longer increases, obtaining the optimal division result $[V_1,V_2,\dots,V_B]$ of the current weighted directed graph, where $V_b$ is the variable set of the b-th module, $b=1,2,\dots,B$ is the module index, and $n_b$ is the number of variables in the b-th module;
based on the division result obtained in the above steps, the training data set is divided into the corresponding B modules, giving one training data set per module, the b-th of which contains the $n_b$ variables of the b-th module.
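The quantity maximized in steps 4.1 to 4.4 can be sketched as follows. This is a minimal evaluation of the modularity of a given partition, assuming the standard directed-graph form Q = (1/W) Σ [A_ij − k_i^out k_j^in / W] δ(c_i, c_j); because the Kronecker delta is symmetric, swapping the in and out roles of i and j in the sum gives the same Q. It does not implement the iterative merging itself, and the toy graph is hypothetical.

```python
def directed_modularity(A, labels):
    """Modularity Q of a partition of a weighted directed graph.

    A[i][j] is the weight of the edge i -> j; labels[i] is the module of node i.
    """
    n = len(A)
    W = sum(sum(row) for row in A)                               # total edge weight
    k_out = [sum(A[i]) for i in range(n)]                        # out-edge weight sums
    k_in = [sum(A[i][j] for i in range(n)) for j in range(n)]    # in-edge weight sums
    Q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:                           # Kronecker delta
                Q += A[i][j] - k_out[i] * k_in[j] / W
    return Q / W

# Hypothetical chain 0 -> 1 -> 2 -> 3 with a weak middle edge
A = [[0.0, 0.9, 0.0, 0.0],
     [0.0, 0.0, 0.1, 0.0],
     [0.0, 0.0, 0.0, 0.8],
     [0.0, 0.0, 0.0, 0.0]]

q_good = directed_modularity(A, [0, 0, 1, 1])   # cut at the weak edge
q_bad = directed_modularity(A, [0, 1, 0, 1])    # split strongly linked variables
q_single = directed_modularity(A, [0, 0, 0, 0]) # everything in one module
```

Cutting the graph at the weakly weighted edge gives the highest modularity of the three partitions, which is the behavior the division criterion relies on.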
Optionally, the fifth step includes:
step 5.1, consider the covariance matrix C of the module data after mapping to the high-dimensional feature space; because the nonlinear mapping φ is too complex to handle directly, a kernel function k(·) is introduced, and the problem of solving for the eigenvalues λ and eigenvectors v of C is converted into an eigenvalue problem on the kernel matrix K;
step 5.2, further center the kernel matrix K, i.e., replace K in the above formula with $K - EK - KE + EKE$, where E is a square matrix of the same order as K whose elements all equal the reciprocal of that order; obtain the eigenvalues λ and eigenvectors α from the above, retain the $k_b$ largest eigenvalues, and normalize the corresponding eigenvectors $\alpha_i$, $i=1,2,\dots,k_b$, where $k_b$ is determined from the cumulative variance contribution rate;
step 5.3, extract the nonlinear principal components of a training data sample x by calculating its projections onto the retained eigenvectors, giving the score vector $t=[t_1,t_2,\dots,t_k]^T$, and from it calculate the monitoring statistics of the sample: $T^2 = t^T \Lambda^{-1} t$ and $SPE = k(x,x) - t^T t$;
step 5.4, repeat step 5.3 for the m training data samples to obtain the monitoring statistics of the training data under normal operating conditions, and obtain the monitoring-index control limits $T^2_{th,b}$ and $SPE_{th,b}$ by kernel density estimation.
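Steps 5.1 to 5.3 can be sketched for a single module as below. This is a minimal illustration under stated assumptions, not the patent's implementation: the class name `LocalKPCA`, the Gaussian kernel width of 10, the division by m − 1 when estimating the score variances, and the use of the centered value of k(x, x) in the SPE are all choices made here for the example.

```python
import numpy as np

def gaussian_kernel(A, B, width):
    """k(a, b) = exp(-||a - b||^2 / width) for all row pairs of A and B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / width)

class LocalKPCA:
    """KPCA monitoring model for one module (hypothetical helper class)."""

    def __init__(self, width=10.0, cum_var=0.85):
        self.width, self.cum_var = width, cum_var

    def fit(self, X):
        self.X, m = X, X.shape[0]
        K = gaussian_kernel(X, X, self.width)
        self.K_col_mean, self.K_mean = K.mean(axis=0), K.mean()
        one = np.full((m, m), 1.0 / m)
        Kc = K - one @ K - K @ one + one @ K @ one      # centered kernel matrix
        vals, vecs = np.linalg.eigh(Kc)                 # ascending eigenvalues
        vals, vecs = vals[::-1], vecs[:, ::-1]          # sort descending
        keep = vals > 1e-10 * vals[0]                   # drop numerically zero modes
        vals, vecs = vals[keep], vecs[:, keep]
        ratio = np.cumsum(vals) / vals.sum()
        k = int(np.searchsorted(ratio, self.cum_var) + 1)
        self.alpha = vecs[:, :k] / np.sqrt(vals[:k])    # unit-norm feature-space axes
        self.lam = vals[:k] / (m - 1)                   # variances of the scores
        return self

    def statistics(self, x):
        """Return (T2, SPE) for one sample or for each row of a sample matrix."""
        x = np.atleast_2d(x)
        k = gaussian_kernel(x, self.X, self.width)
        kc = k - k.mean(axis=1, keepdims=True) - self.K_col_mean + self.K_mean
        t = kc @ self.alpha                             # score vectors
        T2 = (t**2 / self.lam).sum(axis=1)
        kxx = 1.0 - 2.0 * k.mean(axis=1) + self.K_mean  # centered k(x, x); Gaussian k(x, x) = 1
        return T2, kxx - (t**2).sum(axis=1)

rng = np.random.default_rng(0)
X_b = rng.normal(size=(50, 3))          # made-up standardized data for one module
model = LocalKPCA().fit(X_b)
T2, spe = model.statistics(X_b)
```

Applying `statistics` to every training sample yields the T² and SPE values from which the control limits in step 5.4 are estimated.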
Optionally, step seven includes:
using Bayesian inference, fusing the monitoring statistics $T^2_{new,b}$ and $SPE_{new,b}$ of the modules into the probability indices $BIC_{T^2}$ and $BIC_{SPE}$, and obtaining the final detection result of the current process state according to formula (7);
where β=0.01 is the significance level.
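Formula (7) did not survive extraction, so the sketch below uses a form of Bayesian fusion that is common in multiblock process monitoring and is an assumption here: the likelihoods are exp(−statistic/limit) under normal operation and exp(−limit/statistic) under fault, the prior fault probability is β, and the per-module posteriors are combined with weights proportional to the fault likelihoods. The function name is hypothetical.

```python
import math

def bayesian_fusion(stats, limits, beta=0.01):
    """Fuse per-module statistics into one index; values above beta signal a fault.

    stats[b] and limits[b] are the statistic (T2 or SPE) and control limit of module b.
    Assumes every statistic is strictly positive.
    """
    posteriors, fault_likelihoods = [], []
    for s, lim in zip(stats, limits):
        p_x_normal = math.exp(-s / lim)       # assumed P(x_b | normal)
        p_x_fault = math.exp(-lim / s)        # assumed P(x_b | fault)
        p_x = p_x_normal * (1.0 - beta) + p_x_fault * beta
        posteriors.append(p_x_fault * beta / p_x)   # P(fault | x_b)
        fault_likelihoods.append(p_x_fault)
    total = sum(fault_likelihoods)
    return sum(w * p for w, p in zip(fault_likelihoods, posteriors)) / total

limits = [2.0, 3.0]
bic_at_limit = bayesian_fusion([2.0, 3.0], limits)   # every module exactly at its limit
bic_normal = bayesian_fusion([1.0, 1.0], limits)     # all modules below their limits
bic_fault = bayesian_fusion([10.0, 1.0], limits)     # first module far above its limit
```

With these likelihoods a module sitting exactly on its control limit has a posterior fault probability of β, which is why β also serves as the detection threshold for the fused index.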
Optionally, step eight includes:
step 8.1, calculating the contribution rate of variable i to the two statistics for the sample acquired at time t according to formulas (8) and (9):
where $x_{t,i}$ is the value of variable i in the sample acquired at time t, $K_t$ is the kernel matrix of the sample acquired at time t, and θ can be estimated approximately from data under normal operating conditions;
step 8.2, calculating the weighted contribution rate of variable i in module b to the two monitoring statistics $T^2$ and SPE according to formulas (10) and (11):
where the weighting factor is the contribution weight of the b-th module to the current monitoring statistic;
step 8.3, setting the contribution rate thresholds $CT^2_{th}$ and $CSPE_{th}$ of the monitoring statistics; variables whose contribution rate exceeds the threshold are identified as fault-related variables, and the resulting set is $V_F$;
step 8.4, based on the set $V_F$, obtaining the local directed graph formed by the variables in the set, qualitatively determining the fault propagation path from the causal connections of the local directed graph, and identifying the root node variable as the root-cause variable of the fault.
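Steps 8.3 and 8.4 can be sketched as follows: variables above a contribution threshold form the fault-related set, the process adjacency matrix is restricted to that set, and nodes without a parent inside the restricted graph are taken as root-cause candidates. The contribution values, the threshold of 0.1, and the five-variable graph are hypothetical, and the function names are introduced only for this example.

```python
def fault_subgraph(E, contrib, threshold):
    """Return (fault-related variables, root variables, edges) of the local directed graph."""
    n = len(E)
    V_F = [i for i in range(n) if contrib[i] > threshold]
    edges = [(i, j) for i in V_F for j in V_F if E[i][j]]
    has_parent = {j for _, j in edges}
    roots = [i for i in V_F if i not in has_parent]   # no incoming edge inside V_F
    return V_F, roots, edges

def propagation_paths(root, edges):
    """Enumerate simple paths from root to the leaves of the local directed graph."""
    children = {}
    for i, j in edges:
        children.setdefault(i, []).append(j)
    paths, stack = [], [[root]]
    while stack:
        path = stack.pop()
        nxt = [c for c in children.get(path[-1], []) if c not in path]
        if not nxt:
            paths.append(path)
        for c in nxt:
            stack.append(path + [c])
    return paths

# Hypothetical process graph: 0 -> 1, 1 -> 2, 1 -> 3, 3 -> 4
E = [[0, 1, 0, 0, 0],
     [0, 0, 1, 1, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]
contrib = [0.40, 0.30, 0.05, 0.20, 0.02]   # made-up weighted contribution rates

V_F, roots, edges = fault_subgraph(E, contrib, threshold=0.1)
paths = propagation_paths(roots[0], edges)
```

If the fault-related variables form a cycle, no node is free of parents and `roots` is empty; such cases would need additional process knowledge to resolve.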
Optionally, the kernel function k (·) in the fifth step is a linear kernel function, a polynomial kernel function, or a gaussian kernel function.
Optionally, the cumulative variance contribution rate threshold is 85%.
The application has the beneficial effects that:
1) The method constructs the weighted directed graph from the process flow diagram and the normalized mutual information, so it accounts for both the causal relationships between variables in the process structure and the strength of their statistical correlation in the data. A community detection algorithm then yields a more reasonable module division. The method does not require complete prior knowledge or expert experience, which reduces the dependence on process knowledge, yet it still respects the direct connections between variables in the process, avoids spurious correlations among module variables, and improves the final monitoring performance;
2) The application takes into account the differences between the control limits of the monitoring statistics in the different modules and improves the variable contribution rate calculation of conventional kernel principal component analysis by assigning a contribution rate weight to each module. This improves the accuracy of the contribution rates, and the contribution rate threshold automatically separates fault-related variables from irrelevant ones, which narrows the scope of the subsequent root-cause identification and improves diagnosis efficiency;
3) The application extracts a local directed graph from the process weighted directed graph using the fault-related variable set. The causal relationships expressed by the directed edges allow the root-cause variable of a fault to be identified and its propagation path to be analyzed accurately, making it possible to resolve faults at their source in actual production.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for large scale industrial process fault detection and diagnosis based on weighted directed graph provided by the present application.
FIG. 2 is a flow diagram of the Tennessee Eastman (TE) process used in one embodiment of the application.
FIG. 3 is the process directed graph established for the TE process in the present application.
FIG. 4A shows the monitoring results of the existing large-scale process monitoring method DPCA on TE process fault 1;
FIG. 4B shows the monitoring results of the existing large-scale process monitoring method MBKPCA on TE process fault 1;
FIG. 4C shows the monitoring results of the method of the present invention, WDG-MBKPCA, on TE process fault 1.
FIG. 5A shows the weighted contribution rates obtained by the method of the present invention for TE process fault 1;
FIG. 5B is the local directed graph of the fault obtained by the method of the present invention for TE process fault 1.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
Embodiment one:
the present embodiment provides a method for detecting and diagnosing faults in a large-scale industrial process based on weighted directed graphs, referring to fig. 1, the method includes:
Step one: construct the process directed graph model $G=(v,E)$ from the process flow diagram (PFD) of the industrial process, and obtain the adjacency matrix $E$ of the directed graph;
The directed graph contains a node set $\{v_i \mid i=1,2,\dots,n\}$ and a directed edge set $\{e_{ij} \mid i,j=1,2,\dots,n\}$. The node set is the set of process variables, the industrial process being assumed to contain n process variables; the directed edge set is the set of causal relationships between nodes. Each directed edge points from a parent node to a child node, and the variable represented by the child node is affected by the variable represented by the parent node.
The adjacency matrix E describes the directed graph model: its rows and columns represent nodes, and the element in the i-th row and j-th column, which takes the value 0 or 1, indicates whether a directed edge from node i to node j exists.
Here a process variable is a variable that can be measured by sensors or other devices in the industrial process; in blast furnace ironmaking, for example, the process variables include tuyere flow, tuyere temperature, tuyere pressure, top gas CO content, and top gas CO2 content. A directed edge $e_{ij}$ means that the i-th variable influences the j-th variable. In blast furnace ironmaking, for example, the top pressure influences the CO and CO2 contents of the top gas: an increase in top pressure raises both contents, so the process directed graph contains two directed edges, one from the top pressure variable to the top gas CO content and one from the top pressure variable to the top gas CO2 content.
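The blast furnace example can be turned into an adjacency matrix with a short sketch. The variable identifiers and the helper function are hypothetical; only the two edges from top pressure to the CO and CO2 contents come from the text.

```python
# Hypothetical identifiers for three of the blast furnace variables
variables = ["top_pressure", "top_gas_CO", "top_gas_CO2"]

# Directed edges (parent, child): top pressure influences both gas contents
causal_edges = [("top_pressure", "top_gas_CO"),
                ("top_pressure", "top_gas_CO2")]

def adjacency_matrix(variables, edges):
    """Build the 0/1 matrix whose (i, j) entry marks a directed edge i -> j."""
    index = {name: k for k, name in enumerate(variables)}
    n = len(variables)
    E = [[0] * n for _ in range(n)]
    for parent, child in edges:
        E[index[parent]][index[child]] = 1
    return E

E_bf = adjacency_matrix(variables, causal_edges)
```

Row 0 of `E_bf` (top pressure) carries the two outgoing edges, while the rows of the two gas contents stay zero because they influence no other variable in this fragment.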
Step two: acquiring sample data under the normal operation working condition of an industrial process;
Sample data under normal operating conditions of the industrial process are collected to form a training data set $X \in R^{m\times n}$. To eliminate the effect of differing units, the mean $\mu_1,\mu_2,\dots,\mu_n$ and standard deviation $\delta_1,\delta_2,\dots,\delta_n$ of the variable in each column are calculated, and the sample corresponding to each row of $X$ is standardized according to formula (1) to obtain new training data:
$$\tilde{x} = (x-\mu)\,\Delta^{-1} \qquad (1)$$
where m is the number of samples, n is the number of process variables, $x \in R^{1\times n}$ is any sample, $\tilde{x}$ is the standardized sample, $\mu=[\mu_1,\mu_2,\dots,\mu_n]$ is the mean row vector, and $\Delta \in R^{n\times n}$ is a diagonal matrix whose diagonal elements are the standard deviations $\delta_1,\delta_2,\dots,\delta_n$.
Step three: processing the adjacency matrix of the directed graph according to mutual information value MI (i, j) between any two variables i, j in sample data under the normal operation working condition of the industrial process to obtain a weighted directed graph;
The mutual information value MI(i, j) between any two variables i, j is calculated from the standardized training data;
So that the process directed graph from step one accurately describes the data correlation between variables, the mutual information is normalized according to formula (3), and the adjacency matrix of the process directed graph is processed to obtain a weighted directed graph;
where NMI(i, j) is the normalized mutual information between variables i and j, H(·) is the information entropy, p(i, j) is the joint probability distribution function, and p(i), p(j) are the marginal probability distribution functions of variables i and j, respectively.
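Formulas (2) and (3) did not survive extraction, so the following is only a sketch of one common way to estimate normalized mutual information from data. It assumes a histogram estimate of the joint distribution with 10 bins and the normalization NMI = 2·MI / (H(i) + H(j)); both choices are assumptions, and the patent's formula (3) may use a different normalization.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a probability vector, ignoring empty bins."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def normalized_mutual_information(a, b, bins=10):
    """Histogram-based NMI of two non-constant 1-D samples (assumed normalization)."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()                       # joint distribution p(i, j)
    p_a, p_b = p_ab.sum(axis=1), p_ab.sum(axis=0)    # marginal distributions
    h_a, h_b, h_ab = entropy(p_a), entropy(p_b), entropy(p_ab.ravel())
    mi = h_a + h_b - h_ab                            # MI = H(i) + H(j) - H(i, j)
    return 2.0 * mi / (h_a + h_b)

rng = np.random.default_rng(0)
x1 = rng.normal(size=5000)
x2 = x1 + 0.1 * rng.normal(size=5000)    # strongly dependent on x1
x3 = rng.normal(size=5000)               # independent of x1

nmi_self = normalized_mutual_information(x1, x1)
nmi_dep = normalized_mutual_information(x1, x2)
nmi_indep = normalized_mutual_information(x1, x3)
```

Evaluating the function for every pair of columns of the standardized training data would fill the correlation matrix U used to weight the directed edges.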
Step four: taking the modularity as a division judgment standard to carry out module division on the weighted directed graph obtained in the step three;
In the division, the process variables are generally divided into several modules such that variables within a module are strongly correlated and variables in different modules are weakly correlated. Modularity is therefore chosen as the division criterion for the directed graph: the larger the modularity, the better the division matches this characteristic. Based on the modularity maximization principle, the weighted directed graph is divided using a community detection algorithm; the modularity is calculated as shown in formula (5);
where Q is the modularity; $A_{ij}$ is the weight of the directed edge from variable i to variable j; the remaining three quantities in the formula are the sum of the weights of all edges of the directed graph, the sum of the in-edge weights of variable i, and the sum of the out-edge weights of variable j; $c_i$ is the module to which variable i belongs during optimization; and $\delta(c_i,c_j)$ is the Kronecker delta, equal to 1 when $c_i=c_j$ and 0 otherwise. From the division of the weighted directed graph, the module division of the process variables is obtained as $[V_1,V_2,\dots,V_B]$, where $V_b$ is the variable set of the b-th module, $b=1,2,\dots,B$ is the module index, and $n_b$ is the number of variables in the b-th module.
Step five: based on the module division result of step four, select a kernel function for each module and reduce the feature dimension of the module's training data with the kernel principal component analysis algorithm; according to the cumulative variance contribution rate, retain the diagonal matrix of the $k_b$ largest eigenvalues and the matrix of the corresponding eigenvectors; obtain the monitoring-index control limits $T^2_{th,b}$ and $SPE_{th,b}$ from the training data by kernel density estimation;
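The control limit obtained by kernel density estimation can be sketched as the quantile of a Gaussian kernel density fitted to the training statistics. The 99% confidence level, the rule-of-thumb bandwidth 1.06·σ·m^(−1/5), and the function name are assumptions for this example; the source does not state which bandwidth or confidence level is used.

```python
import numpy as np
from math import erf, sqrt

def kde_control_limit(values, confidence=0.99):
    """Quantile of a Gaussian KDE fitted to the training statistics."""
    v = np.asarray(values, dtype=float)
    m = v.size
    h = 1.06 * v.std(ddof=1) * m ** (-0.2)          # assumed rule-of-thumb bandwidth
    grid = np.linspace(v.min() - 4 * h, v.max() + 4 * h, 1000)
    # The CDF of a Gaussian KDE is the average of normal CDFs centered on the samples
    z = (grid[:, None] - v[None, :]) / h
    cdf = (0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))).mean(axis=1)
    return grid[np.searchsorted(cdf, confidence)]   # first grid point reaching the level

rng = np.random.default_rng(1)
train_stats = rng.chisquare(3, size=300)    # made-up stand-in for training T2 values
limit = kde_control_limit(train_stats)
```

Unlike a limit derived from an F or chi-square distribution, this estimate makes no Gaussian assumption about the scores, which suits statistics computed in a kernel feature space.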
Step six: collect the sample $x_{new}$ at the current moment and standardize it with the means, standard deviations, and formula (1) from step two to obtain standardized test data; divide the test data into B modules according to the module variables from step four, and obtain the monitoring statistics $T^2_{new,b}$ and $SPE_{new,b}$ of each module for the current test data from the monitoring model of each module in step five;
Step seven: using Bayesian inference, fuse the monitoring statistics $T^2_{new,b}$ and $SPE_{new,b}$ of the modules into the probability indices $BIC_{T^2}$ and $BIC_{SPE}$, and obtain the final detection result of the current process state according to formula (7);
where β=0.01 is the significance level.
Step eight: after a process fault is detected, use the weighted contribution rate method to obtain the contribution rate of each variable to the two monitoring statistics $T^2$ and SPE; the corresponding contribution rate thresholds are denoted $CT^2_{th}$ and $CSPE_{th}$. Variables exceeding a contribution rate threshold are placed in the fault-related variable set, and the local directed graph structure of the fault is obtained from this set to identify the root-cause variable of the fault and analyze its propagation path.
Embodiment two:
the embodiment provides a large-scale industrial process fault detection and diagnosis method based on a weighted directed graph, which comprises the following steps:
Step one: construct the process directed graph model $G=(v,E)$ from the process flow diagram (PFD) of the industrial process, and obtain the adjacency matrix $E$ of the directed graph;
The directed graph contains a node set $\{v_i \mid i=1,2,\dots,n\}$ and a directed edge set $\{e_{ij} \mid i,j=1,2,\dots,n\}$. The node set is the set of process variables, the industrial process being assumed to contain n process variables; the directed edge set is the set of causal relationships between nodes. Each directed edge points from a parent node to a child node, and the variable represented by the child node is affected by the variable represented by the parent node. The adjacency matrix E describes the directed graph model: its rows and columns represent nodes, and the element in the i-th row and j-th column, which takes the value 0 or 1, indicates whether a directed edge from node i to node j exists.
Here a process variable is a variable that can be measured by sensors or other devices in the industrial process. A directed edge $e_{ij}$ means that the i-th variable influences the j-th variable.
Step two: acquiring sample data under the normal operation working condition of an industrial process;
Sample data under normal operating conditions of the industrial process are collected to form a training data set $X \in R^{m\times n}$. To eliminate the effect of differing units, the mean $\mu_1,\mu_2,\dots,\mu_n$ and standard deviation $\delta_1,\delta_2,\dots,\delta_n$ of the variable in each column are calculated, and the sample corresponding to each row of $X$ is standardized according to formula (1) to obtain new training data:
$$\tilde{x} = (x-\mu)\,\Delta^{-1} \qquad (1)$$
where m is the number of samples, n is the number of process variables, $x \in R^{1\times n}$ is any sample, $\tilde{x}$ is the standardized sample, $\mu=[\mu_1,\mu_2,\dots,\mu_n]$ is the mean row vector, and $\Delta \in R^{n\times n}$ is a diagonal matrix whose diagonal elements are the standard deviations $\delta_1,\delta_2,\dots,\delta_n$.
Step three: processing the adjacency matrix of the directed graph according to the mutual information value MI(i, j) between any two variables i, j in the sample data under the normal operating condition of the industrial process, to obtain a weighted directed graph;
Step 3.1, calculating the mutual information value MI(i, j) between any two variables i, j from the standardized training data;
Because the process directed graph of step one needs to describe the data correlation between variables accurately, the mutual information is normalized according to formula (3);
where NMI(i, j) is the normalized mutual information between variables i and j, H(·) is the information entropy, p(i, j) is the joint probability distribution function, and p(i), p(j) are the marginal probability distribution functions of variables i and j, respectively.
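A histogram-based estimate of the normalized mutual information can be sketched as follows. Formulas (2) and (3) are not reproduced in this text, so the normalization by $\sqrt{H(i)H(j)}$, the number of bins, and the function names are assumptions chosen for illustration; other normalizations exist.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution, ignoring empty bins."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))


def normalized_mutual_information(xi, xj, bins=10):
    """Estimate NMI between two sampled variables from a 2-D histogram.

    MI is computed as H(i) + H(j) - H(i, j); dividing by sqrt(H(i) * H(j))
    is an assumed normalization, not necessarily the one in formula (3).
    """
    joint, _, _ = np.histogram2d(xi, xj, bins=bins)
    p_ij = joint / joint.sum()          # joint distribution p(i, j)
    p_i = p_ij.sum(axis=1)              # marginal p(i)
    p_j = p_ij.sum(axis=0)              # marginal p(j)
    h_i, h_j = entropy(p_i), entropy(p_j)
    mi = h_i + h_j - entropy(p_ij.ravel())
    denom = np.sqrt(h_i * h_j)
    return mi / denom if denom > 0 else 0.0
```

With this normalization a variable compared with itself gives a value of 1, and two independent variables give a value close to 0 (not exactly 0, because a finite-sample histogram estimate is biased upward).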
Step 3.2, processing the adjacency matrix of the process directed graph to obtain a weighted directed graph, comprising the following steps:
1) Based on the process directed graph model G, an adjacency matrix $E\in R^{n\times n}$ is constructed. Element $e_{ij}$ of the adjacency matrix describes the causal relationship between variables i and j in the process and takes the value 0 or 1:
if $e_{ij}=0$, no directed edge exists from variable i to variable j;
if $e_{ij}=1$, a directed edge exists from variable i to variable j;
2) Based on the normalized mutual information, a correlation matrix $U\in R^{n\times n}$ of the process variables is obtained. Element $u_{ij}=NMI(i,j)$ of the correlation matrix describes the correlation of variables i and j from the information entropy perspective;
3) A weighted adjacency matrix, denoted $E_U$, is constructed from the adjacency matrix E and the correlation matrix U. It is obtained as the Hadamard (element-wise) product of the original adjacency matrix E and the correlation matrix U, i.e. $E_U=E\circ U$, which expands to formula (4). The elements of $E_U$ describe the weights of the directed edges contained in the original directed graph, and a new weighted directed graph is constructed based on $E_U$:
$$E_U=E\circ U=\begin{bmatrix} e_{11}u_{11} & \cdots & e_{1n}u_{1n}\\ \vdots & \ddots & \vdots\\ e_{n1}u_{n1} & \cdots & e_{nn}u_{nn}\end{bmatrix} \qquad (4)$$
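The Hadamard product of step 3.2 amounts to masking the correlation matrix with the causal structure. A minimal sketch, with a hypothetical three-variable example and a hypothetical helper name `weighted_adjacency`:

```python
import numpy as np


def weighted_adjacency(E, U):
    """Element-wise (Hadamard) product of adjacency matrix E and correlation matrix U."""
    E = np.asarray(E, dtype=float)
    U = np.asarray(U, dtype=float)
    if E.shape != U.shape:
        raise ValueError("E and U must have the same shape")
    return E * U


# Hypothetical causal structure: 1 -> 2, 1 -> 3, 2 -> 3.
E = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])
# Hypothetical symmetric NMI values.
U = np.array([[1.0, 0.6, 0.2],
              [0.6, 1.0, 0.5],
              [0.2, 0.5, 1.0]])
E_U = weighted_adjacency(E, U)
```

Correlations without a causal edge (for example the entry for 2 to 1) are zeroed out, so the symmetric matrix U becomes a directed, weighted structure.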
Step four: performing module division on the weighted directed graph obtained in step three, taking the modularity as the division criterion;
The division of the weighted directed graph comprises the following steps:
Step 4.1, initializing each variable as a module, attempting in turn to move each variable into the module containing a variable that one of its outgoing edges points to, and calculating the modularity at that point; the variable is merged into the module that yields the largest modularity increment;
The modularity is calculated as shown in formula (5):
$$Q=\frac{1}{w}\sum_{i}\sum_{j}\left[A_{ij}-\frac{k_i^{in}k_j^{out}}{w}\right]\delta(c_i,c_j) \qquad (5)$$
where Q is the modularity, $A_{ij}$ is the weight of the directed edge from variable i to variable j, $w$ is the sum of the weights of all edges of the directed graph, $k_i^{in}$ is the sum of the weights of the incoming edges of variable i, $k_j^{out}$ is the sum of the weights of the outgoing edges of variable j, $c_i$ is the module to which variable i belongs during the optimization, and $\delta(c_i,c_j)$ is the Kronecker function, equal to 1 when $c_i=c_j$ and 0 otherwise;
Step 4.2, because a variable in the graph may be connected by multiple directed edges, iterating step 4.1 until the module to which each variable belongs no longer changes;
Step 4.3, aggregating each module into a new node, and updating the directed edge weights within and between the modules;
Step 4.4, repeating steps 4.1 to 4.3 until the modularity no longer increases, and obtaining the optimal division result $[V_1,V_2,\dots,V_B]$ of the current weighted directed graph, where $V_b$ is the variable set of the b-th module, $b=1,2,\dots,B$ is the module index, and $n_b$ is the number of variables in the b-th module;
Based on the division result obtained in the above steps, the training data set is divided into the corresponding B modules, specifically $\tilde{X}=[\tilde{X}_1,\tilde{X}_2,\dots,\tilde{X}_B]$, where $\tilde{X}_b\in R^{m\times n_b}$ represents the training data set corresponding to the b-th module.
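The quantity maximized in step four can be illustrated by evaluating the modularity of a given partition. The sketch below assumes the usual directed, weighted form of modularity built from the terms described for formula (5); it only scores a partition and does not implement the full iterative merging procedure. The function name and the toy graph are hypothetical.

```python
def directed_modularity(A, modules):
    """Modularity Q of a partition of a weighted directed graph.

    A[i][j] is the weight of the edge from node i to node j (0 if absent);
    modules[i] is the module label of node i.
    Assumed form: Q = (1/w) * sum_ij [A_ij - k_in[i] * k_out[j] / w] * delta(c_i, c_j).
    """
    n = len(A)
    w = sum(sum(row) for row in A)              # total edge weight
    if w == 0:
        return 0.0
    k_out = [sum(A[i]) for i in range(n)]       # outgoing weight of each node
    k_in = [sum(A[i][j] for i in range(n)) for j in range(n)]  # incoming weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if modules[i] == modules[j]:
                q += A[i][j] - k_in[i] * k_out[j] / w
    return q / w


# Toy graph: two separate 2-node loops (0 <-> 1 and 2 <-> 3), unit weights.
A = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]
q_split = directed_modularity(A, [0, 0, 1, 1])   # each loop is its own module
q_merged = directed_modularity(A, [0, 0, 0, 0])  # everything in one module
```

Grouping each loop into its own module scores higher than putting all nodes into one module, which is the behaviour the greedy merging in steps 4.1 to 4.4 exploits.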
Step five: based on the module division result of step four, establishing a KPCA-based local monitoring model in each module using the training data $\tilde{X}_b$ of that module, specifically as follows:
Step 5.1, calculating the covariance matrix C of the module data $\tilde{X}_b$ after mapping to a high-dimensional space. Considering the complexity of the nonlinear mapping Φ, a Gaussian radial basis kernel function k(·) is introduced, with $k(x_i,x_j)=\langle\Phi(x_i),\Phi(x_j)\rangle$, so that the problem of solving the eigenvalue λ and eigenvector v of the covariance matrix C of the data mapped to the high-dimensional space can be converted into:
It should be noted that, in addition to the Gaussian radial basis kernel function, the kernel function k(·) may also be a linear kernel function or a polynomial kernel function.
Step 5.2, further centering the kernel matrix K, i.e., replacing K in the above formula with $K-KE-EK+EKE$, where E is an $n_b$-order matrix with coefficient $1/n_b$; obtaining the eigenvalues λ and eigenvectors α from the above formula, retaining the $k_b$ largest eigenvalues and the corresponding eigenvectors, and normalizing the eigenvectors $\alpha_p$, where $p=1,2,\dots,k_b$. $k_b$ can be determined from the cumulative variance contribution rate, the threshold of which is typically 85%;
Step 5.3, extracting the nonlinear principal components of a training data sample x by calculating its projection onto the retained eigenvectors to obtain its score vector $t=[t_1,t_2,\dots,t_k]^T$, and thereby calculating the monitoring statistics $T^2=t^T\Lambda^{-1}t$ and $SPE=k(x,x)-t^Tt$ of the training data sample x;
Step 5.4, repeating step 5.3 for the m training data samples to obtain the monitoring statistics of the training data under the normal operating condition, and obtaining the control limits $T^2_{th,b}$ and $SPE_{th,b}$ of the monitoring indices by kernel density estimation.
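Steps 5.1 to 5.3 can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the applicant's implementation: the class name `LocalKPCA`, the kernel width, the scaling of Λ by 1/m, the use of the centered kernel value in SPE, and centering over the m training samples are all assumptions.

```python
import numpy as np


def rbf_kernel(A, B, width):
    """Gaussian kernel k(a, b) = exp(-||a - b||^2 / width) between rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / width)


class LocalKPCA:
    """KPCA monitoring model for one module (hypothetical helper class)."""

    def __init__(self, width, cum_var=0.85):
        self.width = width
        self.cum_var = cum_var

    def fit(self, X):
        m = X.shape[0]
        K = rbf_kernel(X, X, self.width)
        self.X = X
        self.row_mean = K.mean(axis=1)
        self.all_mean = K.mean()
        # Centering: K - 1K - K1 + 1K1, with 1 the m x m matrix of entries 1/m.
        Kc = K - self.row_mean[:, None] - self.row_mean[None, :] + self.all_mean
        lam, alpha = np.linalg.eigh(Kc)
        order = np.argsort(lam)[::-1]
        lam, alpha = lam[order], alpha[:, order]
        keep = lam > 1e-10 * lam[0]              # drop numerically zero directions
        lam, alpha = lam[keep], alpha[:, keep]
        ratio = np.cumsum(lam) / lam.sum()       # cumulative variance contribution
        k = int(np.searchsorted(ratio, self.cum_var) + 1)
        self.alpha = alpha[:, :k] / np.sqrt(lam[:k])  # unit-norm feature-space axes
        self.var = lam[:k] / m                   # score variances (diagonal of Lambda)
        return self

    def statistics(self, x):
        """Return (T2, SPE) for each row of x."""
        x = np.atleast_2d(x)
        kx = rbf_kernel(x, self.X, self.width)
        kx_mean = kx.mean(axis=1, keepdims=True)
        kc = kx - kx_mean - self.row_mean[None, :] + self.all_mean
        t = kc @ self.alpha                      # score vectors
        t2 = np.sum(t**2 / self.var, axis=1)
        # Centered k(x, x); the Gaussian kernel gives k(x, x) = 1.
        kxx_c = 1.0 - 2.0 * kx_mean[:, 0] + self.all_mean
        spe = kxx_c - np.sum(t**2, axis=1)
        return t2, spe


# Synthetic module data: 60 samples of 4 standardized variables.
rng = np.random.default_rng(0)
Xb = rng.normal(size=(60, 4))
model = LocalKPCA(width=8.0).fit(Xb)
t2_train, spe_train = model.statistics(Xb)
```

On the training data the mean of $T^2$ equals the number of retained components, which is a convenient sanity check before the control limits are estimated.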
Step six: collecting the sample $x_{new}$ at the current moment and obtaining the standardized test data $\tilde{x}_{new}$ based on the mean, the standard deviation and the standardization formula (1) of step two; correspondingly dividing the test data $\tilde{x}_{new}$ into B modules according to the module division result of step four, and obtaining the monitoring statistics $T^2_{new,b}$ and $SPE_{new,b}$ of each module of the current test data according to the monitoring model of each module of step five.
Step seven: fusing the monitoring statistics $T^2_{new,b}$ and $SPE_{new,b}$ of the modules into the probability indices $BIC_{T^2}$ and $BIC_{SPE}$ by Bayesian inference, and obtaining the final detection result of the current process state according to formula (7);
where β = 0.01 is the significance level.
Step eight: after a fault is detected in the process, obtaining the contribution rate of each variable to the two monitoring statistics $T^2$ and SPE by the weighted contribution rate method, the corresponding contribution rate thresholds being denoted $CT^2_{th}$ and $CSPE_{th}$; assigning the variables that exceed a contribution rate threshold to the fault-related variable set, and obtaining the local directed graph structure of the fault from the fault-related variable set, so as to identify the root variable of the fault and analyze its propagation path.
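Formula (7) is not reproduced in this text. The sketch below therefore assumes the Bayesian fusion form that is commonly used in multiblock process monitoring, in which each module's statistic is converted to a fault probability and the probabilities are combined by a weighted average. The function name `bayesian_fusion_index` and the exponential likelihoods are assumptions, not a statement of the applicant's formula.

```python
import math


def bayesian_fusion_index(stats, limits, beta=0.01):
    """Fuse per-module statistics into one probability index (assumed BIC form).

    stats[b]  : monitoring statistic (T2 or SPE) of module b for the new sample
    limits[b] : control limit of that statistic for module b (must be > 0)
    beta      : significance level, used as the prior fault probability
    """
    num = den = 0.0
    for s, lim in zip(stats, limits):
        p_x_normal = math.exp(-s / lim)                   # P(x_b | normal)
        p_x_fault = math.exp(-lim / s) if s > 0 else 0.0  # P(x_b | fault)
        evidence = p_x_normal * (1.0 - beta) + p_x_fault * beta
        p_fault_x = p_x_fault * beta / evidence           # P(fault | x_b)
        num += p_x_fault * p_fault_x
        den += p_x_fault
    return num / den if den > 0 else 0.0


def is_faulty(stats, limits, beta=0.01):
    """Flag a fault when the fused index exceeds the significance level."""
    return bayesian_fusion_index(stats, limits, beta) > beta
```

Under this assumed form, a sample whose statistics sit exactly on every control limit yields an index equal to β, so the decision threshold for the fused index is β itself.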
The fault diagnosis comprises the following steps:
1) The contribution rates of variable i in the sample acquired at time t to the two statistics are given by formulas (8) and (9):
where $x_{t,i}$ is the value of variable i in the sample acquired at time t, $K_t$ is the kernel matrix of the sample acquired at time t, and θ can be estimated approximately from data under the normal operating condition;
2) Because the detection result involves several modules, the contribution rates are weighted; the weighted contribution rates of variable i in module b to the two monitoring statistics $T^2$ and SPE are given by formulas (10) and (11), respectively:
where the weighting factor is the contribution weight of the b-th module to the current monitoring statistic;
3) The contribution rate thresholds $CT^2_{th}$ and $CSPE_{th}$ of the monitoring statistics are determined; the variables whose contribution rate exceeds the threshold are identified as fault-related variables, and the resulting set is $V_F$;
4) Based on the set $V_F$, a local directed graph composed of the variables in the set is obtained, and the fault propagation path is determined qualitatively from the causal connections of this local directed graph. A variable on the path that is not affected by any other abnormal variable is taken as the root variable; that is, in the local directed graph composed of the fault-related variables, the root node variable is identified as the root variable of the fault.
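Item 4) can be sketched as extracting the subgraph induced by the fault-related variables and listing the nodes that have no parent inside it. The function name and the toy graph are hypothetical, and the sketch assumes the induced subgraph is acyclic; if the fault-related variables form a cycle, no root is returned.

```python
def local_graph_roots(E, fault_vars):
    """Return (edges, roots) of the local directed graph induced by fault_vars.

    E[i][j] != 0 means there is a directed edge from variable i to variable j
    (0-based indices). A root is a fault-related variable that is not the
    child of any other fault-related variable.
    """
    vf = sorted(set(fault_vars))
    edges = [(i, j) for i in vf for j in vf if i != j and E[i][j]]
    children = {j for _, j in edges}
    roots = [v for v in vf if v not in children]
    return edges, roots


# Toy process graph: 0 -> 1 -> 2 and 3 -> 2.
E = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 1, 0]]
edges, roots = local_graph_roots(E, fault_vars=[2, 0, 1])
```

Here variable 0 is the only fault-related variable without an abnormal parent, so it is reported as the root, and the edge list gives the propagation path 0 to 1 to 2.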
Embodiment III:
This embodiment provides a large-scale industrial process fault detection and diagnosis method based on a weighted directed graph, and uses the Tennessee Eastman (TE) benchmark process to explain the specific implementation of the invention.
The TE process is a simulation system designed from an actual production process of an American chemical company (Eastman Chemical Company); its process flow chart is shown in fig. 2. The process comprises 22 process measurement variables, 12 manipulated variables and 19 composition measurement variables. The 33 process variables consisting of the 22 process measurement variables and 11 manipulated variables (excluding the stirring speed manipulated variable XMV) are selected for modeling and monitoring; the variables are listed in Table 1.
Table 1: TE process variables
No. | Description | No. | Description | No. | Description
1 | A feed flow rate | 12 | Separator liquid level | 23 | D feed valve
2 | D feed flow rate | 13 | Separator pressure | 24 | E feed valve
3 | E feed flow rate | 14 | Separator bottom flow rate | 25 | A feed valve
4 | A, C feed flow rate | 15 | Stripper liquid level | 26 | A, C feed valve
5 | Circulation flow rate | 16 | Stripper pressure | 27 | Compressor recirculation valve
6 | Reactor feed rate | 17 | Stripper bottom flow rate | 28 | Discharge valve
7 | Reactor pressure | 18 | Stripper temperature | 29 | Separator tank flow rate
8 | Reactor liquid level | 19 | Stripper steam flow rate | 30 | Stripper liquid product flow rate
9 | Reactor temperature | 20 | Compressor power | 31 | Stripper flow rate valve
10 | Discharge rate | 21 | Reactor cooling water outlet temperature | 32 | Reactor cooling water flow rate
11 | Separator temperature | 22 | Condenser cooling water outlet temperature | 33 | Compressor cooling water flow rate
Step one: building a directed graph.
Specifically, the process variables collected by the sensors and the process variables handled by the controllers, as listed in Table 1, are used as the node set $\{v_i \mid i=1,2,\dots,33\}$ of the directed graph. The causal relationships between the variables are then analyzed in connection with fig. 1. For example, an increase in the reactor temperature (variable 9) leads to an increase in the associated reactor cooling water outlet temperature (variable 21), so the directed graph contains a directed edge $e_{9,21}$ from variable 9 to variable 21. The relationships between the variables are analyzed in turn and used as the edge set $\{e_{ij} \mid i,j=1,2,\dots,33\}$ of the directed graph. A process directed graph is constructed from the node set and the edge set, as shown in fig. 3, and the adjacency matrix $E\in R^{33\times 33}$ describing the process directed graph is obtained. Element $e_{ij}$ ($i,j=1,2,\dots,33$) of the adjacency matrix describes the causal relationship between variables i and j in the process and takes the value 0 or 1: 1 indicates that variable i has an effect on variable j; 0 indicates that variable i has no effect on variable j.
Step two: collecting 500 samples under the normal operating condition to form the training data set $X\in R^{500\times 33}$. To eliminate the influence of dimension, the samples in the training set are standardized to obtain the new training data $\tilde{X}$. The normalized mutual information NMI(i, j) between any two variables i, j is calculated from the standardized training data to obtain the correlation matrix $U\in R^{33\times 33}$ between the process variables.
Step three: from the adjacency matrix E obtained in step one and the correlation matrix U obtained in step two, the Hadamard product of the two matrices is calculated to construct the weighted adjacency matrix, i.e. $E_U=E\circ U$, expanded as shown in formula (12), and a new weighted directed graph is constructed based on the weighted adjacency matrix $E_U$:
$$E_U=E\circ U=\begin{bmatrix} e_{1,1}u_{1,1} & \cdots & e_{1,33}u_{1,33}\\ \vdots & \ddots & \vdots\\ e_{33,1}u_{33,1} & \cdots & e_{33,33}u_{33,33}\end{bmatrix} \qquad (12)$$
Step four: dividing the weighted directed graph with the Louvain algorithm, based on the principle of maximizing the modularity Q. Each variable is initialized as a module; each variable is tentatively moved, in turn, into the module containing a variable that one of its outgoing edges points to, the modularity Q is calculated, and the variable is merged into the module with the largest modularity increment. Because a variable in the directed graph does not always have a unique outgoing edge, the traversal is repeated after all variables have been visited once, until the module to which each variable belongs no longer changes. Each module is then aggregated into a new node, and the directed edge weights within and between the modules are updated. Treating each new node as a variable, the above steps are repeated until the modularity Q no longer increases. The resulting module division and the variables contained in each module are shown in Table 2.
Table 2: module variable
Step five: training the kernel principal component analysis (KPCA) monitoring model of each module, comprising the following steps:
1) According to the module division result, the standardized data $\tilde{X}$ are divided into six sub-data sets $X_1$, $X_2$, $X_3$, $X_4$, $X_5$, $X_6$, and the following steps are carried out in sequence, starting from $X_1$;
2) Assuming that a nonlinear mapping Φ maps the original data to a high-dimensional space, the covariance matrix C of the data mapped to the high-dimensional space is calculated first. Considering the complexity of the nonlinear mapping Φ, a Gaussian radial basis kernel function k(·) is introduced, with $k(x_i,x_j)=\langle\Phi(x_i),\Phi(x_j)\rangle$, so that the problem of solving the eigenvalue λ and eigenvector v of the covariance matrix C of the data mapped to the high-dimensional space can be converted into:
The kernel matrix K is further centered, i.e., K in the above formula is replaced with $K-KE-EK+EKE$, where E is an n-order matrix with coefficient 1/n, and n = 33. The eigenvalues λ and eigenvectors α are obtained from the above formula, and the k largest eigenvalues and the corresponding eigenvectors are retained and the eigenvectors $\alpha_p$ normalized, where $p=1,2,\dots,k$. k can be determined from the cumulative variance contribution rate, the threshold of which is typically 85%;
3) The nonlinear principal components of a training data sample x are extracted by calculating its projection onto the retained eigenvectors to obtain its score vector $t=[t_1,t_2,\dots,t_k]^T$, from which the monitoring statistics $T^2=t^T\Lambda^{-1}t$ and $SPE=k(x,x)-t^Tt$ of the training data sample x are calculated;
4) Step 3) is repeated for the 500 training data samples to obtain the monitoring statistics of the training data under the normal operating condition, and the control limits $T^2_{th}$ and $SPE_{th}$ of the monitoring indices are obtained by kernel density estimation.
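The control limit obtained by kernel density estimation can be sketched as the quantile of a Gaussian-kernel density fitted to the training statistics. The function names, the rule-of-thumb bandwidth and the bisection search are assumptions for illustration; the text does not specify the kernel or bandwidth used.

```python
import math


def rule_of_thumb_bandwidth(values):
    """Normal-reference bandwidth h = 1.06 * std * n^(-1/5) (an assumed choice)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    h = 1.06 * std * n ** (-0.2)
    if h <= 0:
        raise ValueError("values must not all be identical")
    return h


def kde_cdf(z, values, h):
    """Cumulative distribution at z of a Gaussian-kernel density estimate."""
    scale = h * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((z - v) / scale)) for v in values) / len(values)


def kde_control_limit(values, confidence=0.99, h=None):
    """Smallest z (found by bisection) whose estimated CDF reaches `confidence`."""
    h = rule_of_thumb_bandwidth(values) if h is None else h
    lo, hi = min(values) - 10.0 * h, max(values) + 10.0 * h
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, values, h) < confidence:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical training statistics of one module (500 values).
train_stats = [i / 100.0 for i in range(1, 501)]
limit_99 = kde_control_limit(train_stats, confidence=0.99)
```

A confidence of 0.99 corresponds to the significance level of 0.01 used in this embodiment; a lower confidence gives a tighter limit and more alarms.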
Step six: collecting the sample $x_{new}\in R^{1\times 33}$ at the current moment and obtaining the standardized test data $\tilde{x}_{new}$ based on the mean and standard deviation of the training data set. According to the variables of the different modules in Table 2, $\tilde{x}_{new}$ is correspondingly divided into six modules, and the monitoring statistics $T^2$ and SPE of each module of the current test data are obtained according to the monitoring model of each module of step five.
Step seven: fusing the $T^2$ and SPE statistics of the modules into the probability indices $BIC_{T^2}$ and $BIC_{SPE}$ by Bayesian inference, and judging the current process state according to formula (14);
where α = 0.01 is the significance level.
Step eight: if the process state at time t is detected to be normal, sampling and monitoring continue at the next moment; if the sample is a fault sample, the subsequent fault diagnosis is carried out. First, the weighted contribution rates $CT^2(x_{t,i})$ and $CSPE(x_{t,i})$, $i=1,2,\dots,33$, of the 33 variables in the sample to the fault are calculated. The variables that exceed the contribution rate thresholds are assigned to the fault-related variable set $V_F$. A local directed graph composed of the variables in the set is obtained, and the fault propagation path is analyzed qualitatively from the causal connections of this local directed graph. A variable on the path that is not affected by any other abnormal variable is taken as the root variable; that is, in the local directed graph composed of the fault-related variables, the root node variable is identified as the root variable of the fault.
To verify the effectiveness of the method of the present application, a corresponding simulation experiment was performed in this embodiment, and the monitoring performance of the method of the present application (denoted WDG-MBKPCA) was compared with that of two currently typical large-scale process monitoring methods, DPCA and MBKPCA. For DPCA, refer to "Ge, Z., & Song, Z. (2013). Distributed PCA model for plant-wide process monitoring. Industrial & Engineering Chemistry Research, 52(5)."; for MBKPCA, refer to "Li Jinbing, Han Bing, Feng Shoubo, Zhang Jiadong, Li Yu, Zhong Kai, et al. (2020).".
The specific results are shown in Table 3, which gives the monitoring results of the above process monitoring methods for the 21 faults of the TE process at a significance level α = 0.01 (confidence β = 0.99). For ease of comparison, the faults on which the method of the invention gives the best result are shown in bold in Table 3.
As Table 3 shows, the monitoring performance of WDG-MBKPCA is better than that of the other monitoring methods, and the advantage is most evident on faults 1, 11, 17, 19 and 20. For faults 3, 9 and 15, which are difficult to detect, WDG-MBKPCA still achieves a relatively high fault detection rate. This is because, while taking the nonlinear relations among variables into account, the method combines the causal relationships and the correlations of the process variables to obtain a more reasonable module division, so that hidden local information in the process can be mined effectively and better monitoring performance is obtained.
Further, TE process faults 1 and 20 are selected as examples to show the monitoring process and the comparison of results of the above three methods in detail. Fault 1 is a step change in the A/C feed flow ratio. As shown in figs. 4A, 4B and 4C, the WDG-MBKPCA method detects the fault at the 161st sampling point, with a fault detection rate of 100%, whereas the other methods detect the fault one or two sampling points later. In the fault diagnosis stage, from the average weighted contribution graph of the variables over the 200th to 205th samples shown in fig. 5A, the variables $x_1$, $x_4$, $x_{16}$, $x_{19}$, $x_{25}$, $x_{26}$ and $x_{31}$ are selected as the fault-related variable set $V_F$. The variables are connected by the directed edges $e_{1,25}$, $e_{4,26}$, $e_{4,16}$ and $e_{19,31}$ of the process directed graph; the shortest paths between the nodes after aggregation are then searched, and the variables $x_6$ and $x_{18}$ are added to complete the local directed graph of the fault, as shown in fig. 5B. From this figure, variable $x_4$ is the root variable of the fault, which matches the mechanism by which the fault arises, and the propagation path of the fault is shown by the directed edges in fig. 5B. However, judged from the contribution graph alone, the final weighted contribution of variable $x_4$ is not the largest. Conventional methods may therefore produce a false diagnosis because of the smearing effect, which also demonstrates the effectiveness and superiority of the proposed method in fault detection and diagnosis.
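The completion of the local directed graph by shortest-path search can be sketched with a breadth-first search over the process directed graph. The text does not state how the paths are selected, so adding every intermediate node on a shortest path between any ordered pair of fault-related variables is an assumption, as are the function names and the toy graph.

```python
from collections import deque


def shortest_path(E, src, dst):
    """Shortest directed path (fewest edges) from src to dst, or None if unreachable."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in range(len(E)):
            if E[u][v] and v not in prev:
                prev[v] = u
                queue.append(v)
    return None


def complete_local_graph(E, fault_vars):
    """Add the intermediate nodes lying on shortest paths between fault-related variables."""
    fault = list(fault_vars)
    nodes = set(fault)
    for s in fault:
        for d in fault:
            if s != d:
                path = shortest_path(E, s, d)
                if path:
                    nodes.update(path)
    return sorted(nodes)


# Toy process graph: 0 -> 1 -> 2 and 0 -> 3 (0-based indices).
E = [[0, 1, 0, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
local_nodes = complete_local_graph(E, fault_vars=[0, 2])
```

In the toy graph, variable 1 is not flagged by the contribution analysis but lies on the only path from 0 to 2, so it is added to the local graph, analogous to how $x_6$ and $x_{18}$ are added in the example above.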
Table 3: monitoring performance comparison
Some steps in the embodiments of the present invention may be implemented by using software, and the corresponding software program may be stored in a readable storage medium, such as an optical disc or a hard disk.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.

Claims (9)

1. A method of large scale industrial process fault detection and diagnosis based on weighted directed graphs, the method being directed to a large scale industrial process, the method comprising:
step one: for the industrial process, constructing a process directed graph model $G=(v,E)$, and obtaining its adjacency matrix E from the directed graph; the directed graph comprises a node set $\{v_i \mid i=1,2,\dots,n\}$ and a directed edge set $\{e_{ij} \mid i,j=1,2,\dots,n\}$, the node set being the set of n process variables of the industrial process, and the directed edge set being the set of causal relationships between the n process variables of the industrial process; the adjacency matrix E describes the directed graph model, the rows and columns of the matrix represent nodes, and the element in the i-th row and j-th column of the matrix indicates whether a directed edge pointing from node i to node j exists;
step two: acquiring sample data under the normal operating condition of the industrial process to form training data, the sample data being data of the n process variables under the normal operating condition;
step three: processing the adjacency matrix of the directed graph according to the mutual information value MI(i, j) between any two variables i, j in the sample data under the normal operating condition of the industrial process, to obtain a weighted adjacency matrix and the corresponding weighted directed graph;
step four: performing module division on the weighted directed graph obtained in step three, taking the modularity as the division criterion;
step five: based on the module division result of step four, selecting a kernel function for each module, establishing a KPCA-based local monitoring model in each module using the training data $\tilde{X}_b$ of that module, and obtaining the monitoring index control limits $T^2_{th,b}$ and $SPE_{th,b}$ from the training data;
step six: collecting the sample $x_{new}$ at the current moment and standardizing it to obtain the standardized test data $\tilde{x}_{new}$; correspondingly dividing the current sample into B modules according to the module division result obtained in step four, and obtaining the monitoring statistics $T^2_{new,b}$ and $SPE_{new,b}$ of each module of the current test data according to the local monitoring model of each module of step five;
step seven: fusing the monitoring statistics $T^2_{new,b}$ and $SPE_{new,b}$ of the modules into the probability indices $BIC_{T^2}$ and $BIC_{SPE}$ by Bayesian inference, and obtaining the final detection result of the current process state, where the subscript b denotes the b-th of the B modules;
step eight: if the final detection result is that a fault has occurred, obtaining the contribution rate of each variable to the two monitoring statistics $T^2$ and SPE by the weighted contribution rate method, the corresponding contribution rate thresholds being denoted $CT^2_{th}$ and $CSPE_{th}$; assigning the variables that exceed a contribution rate threshold to the fault-related variable set, obtaining the local directed graph structure of the fault from the fault-related variable set, and determining the source variable and the propagation path of the fault.
2. The method according to claim 1, wherein the second step comprises:
collecting sample data under the normal operating condition of the industrial process to form the training data set $X\in R^{m\times n}$;
standardizing the sample in each row of X according to formula (1) to obtain the new training data $\tilde{X}$:
$$\tilde{x}=(x-\mu)\Delta^{-1} \qquad (1)$$
where m is the number of samples, n is the number of process variables, $x\in R^{1\times n}$ is any sample, $\tilde{x}$ is the standardized sample, $\mu=[\mu_1,\mu_2,\dots,\mu_n]$ is the mean row vector, and $\Delta\in R^{n\times n}$ is a diagonal matrix whose diagonal elements are the standard deviations $\delta_1,\delta_2,\dots,\delta_n$.
3. The method according to claim 2, wherein the step three comprises:
step 3.1, calculating the mutual information value MI(i, j) between any two variables i, j from the standardized training data according to formula (2);
normalizing the mutual information according to formula (3);
where NMI(i, j) is the normalized mutual information between variables i and j, H(·) is the information entropy, p(i, j) is the joint probability distribution function, and p(i), p(j) are the marginal probability distribution functions of variables i and j, respectively;
step 3.2, processing the adjacency matrix of the process directed graph to obtain a weighted adjacency matrix and the corresponding weighted directed graph:
1) based on the process directed graph model G, constructing an adjacency matrix $E\in R^{n\times n}$, element $e_{ij}$ of which describes the causal relationship between variables i and j in the process and takes the value 0 or 1:
if $e_{ij}=0$, no directed edge exists from variable i to variable j;
if $e_{ij}=1$, a directed edge exists from variable i to variable j;
2) based on the normalized mutual information, obtaining a correlation matrix $U\in R^{n\times n}$ of the process variables, element $u_{ij}=NMI(i,j)$ of which describes the correlation of variables i and j from the information entropy perspective;
3) constructing a weighted adjacency matrix, denoted $E_U$, from the adjacency matrix E and the correlation matrix U, obtained as the Hadamard product of the original adjacency matrix E and the correlation matrix U, i.e. $E_U=E\circ U$, which expands to formula (4); the elements of $E_U$ describe the weights of the directed edges contained in the original directed graph, and a new weighted directed graph is constructed based on the weighted adjacency matrix $E_U$:
$$E_U=E\circ U=\begin{bmatrix} e_{1,1}u_{1,1} & \cdots & e_{1,n}u_{1,n}\\ \vdots & \ddots & \vdots\\ e_{n,1}u_{n,1} & \cdots & e_{n,n}u_{n,n}\end{bmatrix} \qquad (4)$$
where $u_{i,j}$ is an element of the correlation matrix U.
4. A method according to claim 3, wherein said step four comprises:
step 4.1, initializing each variable as a module, attempting in turn to move each variable into the module containing a variable that one of its outgoing edges points to, and calculating the modularity at that point; merging the variable into the module that yields the largest modularity increment;
the modularity being calculated as shown in formula (5):
$$Q=\frac{1}{w}\sum_{i}\sum_{j}\left[A_{ij}-\frac{k_i^{in}k_j^{out}}{w}\right]\delta(c_i,c_j) \qquad (5)$$
where Q is the modularity, $A_{ij}$ is the weight of the directed edge from variable i to variable j, $w$ is the sum of the weights of all edges of the directed graph, $k_i^{in}$ is the sum of the weights of the incoming edges of variable i, $k_j^{out}$ is the sum of the weights of the outgoing edges of variable j, and $c_i$ is the module to which variable i belongs during the optimization; $\delta(c_i,c_j)$ is the Kronecker function, equal to 1 when $c_i=c_j$ and 0 otherwise;
step 4.2, because a variable in the graph may be connected by multiple directed edges, iterating step 4.1 until the module to which each variable belongs no longer changes;
step 4.3, aggregating each module into a new node, and updating the directed edge weights within and between the modules;
step 4.4, repeating steps 4.1 to 4.3 until the modularity no longer increases, and obtaining the optimal division result $[V_1,V_2,\dots,V_B]$ of the current weighted directed graph, where $V_b$ is the variable set of the b-th module, $b=1,2,\dots,B$ is the module index, and $n_b$ is the number of variables in the b-th module;
based on the division result obtained in the above steps, dividing the training data set into the corresponding B modules, specifically $\tilde{X}=[\tilde{X}_1,\tilde{X}_2,\dots,\tilde{X}_B]$, where $\tilde{X}_b\in R^{m\times n_b}$ represents the training data set corresponding to the b-th module.
5. The method of claim 4, wherein the fifth step comprises:
step 5.1, calculating the covariance matrix C of the module data $\tilde{X}_b$ after mapping to a high-dimensional space; considering the complexity of the nonlinear mapping Φ, introducing a kernel function k(·) so that the problem of solving the eigenvalue λ and eigenvector v of the covariance matrix C of the data mapped to the high-dimensional space is converted into:
step 5.2, further centering the kernel matrix K, i.e., replacing K in the above formula with $K-KE-EK+EKE$, where E is an $n_b$-order matrix with coefficient $1/n_b$; obtaining the eigenvalues λ and eigenvectors α from the above formula, retaining the $k_b$ largest eigenvalues and the corresponding eigenvectors, and normalizing the eigenvectors $\alpha_p$, where $p=1,2,\dots,k_b$, $k_b$ being determined from the cumulative variance contribution rate;
step 5.3, extracting the nonlinear principal components of a training data sample x by calculating its projection onto the retained eigenvectors to obtain its score vector $t=[t_1,t_2,\dots,t_k]^T$, and thereby calculating the monitoring statistics $T^2=t^T\Lambda^{-1}t$ and $SPE=k(x,x)-t^Tt$ of the training data sample x;
step 5.4, repeating step 5.3 for the m training data samples to obtain the monitoring statistics of the training data under the normal operating condition, and obtaining the monitoring index control limits $T^2_{th,b}$ and $SPE_{th,b}$ by kernel density estimation.
6. The method of claim 5, wherein the step seven comprises:
fusing the monitoring statistics $T^2_{new,b}$ and $SPE_{new,b}$ of the modules into the probability indices $BIC_{T^2}$ and $BIC_{SPE}$ by Bayesian inference, and obtaining the final detection result of the current process state according to formula (7);
where β = 0.01 is the significance level.
7. The method of claim 6, wherein the step eight comprises:
step 8.1, calculating the contribution rates of variable i in the sample acquired at time t to the two statistics according to formulas (8) and (9):
where $x_{t,i}$ is the value of variable i in the sample acquired at time t, $K_t$ is the kernel matrix of the sample acquired at time t, and θ can be estimated approximately from data under the normal operating condition;
step 8.2, calculating the weighted contribution rates of variable i in module b to the two monitoring statistics $T^2$ and SPE according to formulas (10) and (11):
where the weighting factor is the contribution weight of the b-th module to the current monitoring statistic;
step 8.3, determining the contribution rate thresholds $CT^2_{th}$ and $CSPE_{th}$ of the monitoring statistics, identifying the variables whose contribution rate exceeds the threshold as fault-related variables, and denoting the resulting set $V_F$;
step 8.4, based on the set $V_F$, obtaining a local directed graph composed of the variables in the set, qualitatively determining the fault propagation path from the causal connections of the obtained local directed graph, and identifying the root node variable as the root variable of the fault.
8. The method of claim 5, wherein the kernel function k (·) in step five is a linear kernel function, a polynomial kernel function, or a gaussian kernel function.
9. The method of claim 5, wherein the cumulative variance contribution threshold is 85%.
CN202310505573.9A 2023-05-06 2023-05-06 Large-scale industrial process fault detection and diagnosis method based on weighted directed graph Pending CN116661410A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310505573.9A CN116661410A (en) 2023-05-06 2023-05-06 Large-scale industrial process fault detection and diagnosis method based on weighted directed graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310505573.9A CN116661410A (en) 2023-05-06 2023-05-06 Large-scale industrial process fault detection and diagnosis method based on weighted directed graph

Publications (1)

Publication Number Publication Date
CN116661410A true CN116661410A (en) 2023-08-29

Family

ID=87727029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310505573.9A Pending CN116661410A (en) 2023-05-06 2023-05-06 Large-scale industrial process fault detection and diagnosis method based on weighted directed graph

Country Status (1)

Country Link
CN (1) CN116661410A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117151196A (en) * 2023-10-26 2023-12-01 湘江实验室 Layer-by-layer increment expected propagation-based interpretable fault diagnosis attribution method
CN117151196B (en) * 2023-10-26 2024-01-30 湘江实验室 Layer-by-layer increment expected propagation-based interpretable fault diagnosis attribution method

Similar Documents

Publication Publication Date Title
CN108062565B (en) Double-principal element-dynamic core principal element analysis fault diagnosis method based on chemical engineering TE process
Kariwala et al. A branch and bound method for isolation of faulty variables through missing variable analysis
Auret et al. Empirical comparison of tree ensemble variable importance measures
CN111079836B (en) Process data fault classification method based on pseudo label method and weak supervised learning
CN109407652B (en) Multivariable industrial process fault detection method based on main and auxiliary PCA models
Chen et al. Probabilistic contribution analysis for statistical process monitoring: A missing variable approach
Cai et al. A new fault detection method for non-Gaussian process based on robust independent component analysis
CN110880024B (en) Nonlinear process fault identification method and system based on discrimination kernel slow characteristic analysis
Xiang et al. Multimode process monitoring based on fuzzy C-means in locality preserving projection subspace
CN110244692B (en) Chemical process micro-fault detection method
CN112904810B (en) Process industry nonlinear process monitoring method based on effective feature selection
CN108830006B (en) Linear-nonlinear industrial process fault detection method based on linear evaluation factor
CN111949012A (en) Intermittent process fault detection method based on double-weight multi-neighborhood preserving embedding algorithm
CN116661410A (en) Large-scale industrial process fault detection and diagnosis method based on weighted directed graph
CN110766173A (en) Chemical process fault diagnosis method based on mechanism correlation analysis Bayesian network
Wu et al. Fault detection and diagnosis in process data using support vector machines
CN107122611A (en) Penicillin fermentation process quality dependent failure detection method
CN110362063B (en) Fault detection method and system based on global maintenance unsupervised kernel extreme learning machine
CN109683594B (en) Method for accurately identifying and positioning abnormal variable
CN110244690B (en) Multivariable industrial process fault identification method and system
CN111914886B (en) Nonlinear chemical process monitoring method based on online brief kernel learning
Ding et al. Deep forest-based fault diagnosis method for chemical process
CN113253682B (en) Nonlinear chemical process fault detection method
Zhang et al. A comparison of different statistics for detecting multiplicative faults in multivariate statistics-based fault detection approaches
Yang et al. A novel decentralized weighted ReliefF-PCA method for fault detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination