CN115310837A - Complex electromechanical system fault detection method based on causal graph attention neural network - Google Patents
- Publication number
- CN115310837A (application CN202210975693.0A)
- Authority
- CN
- China
- Prior art keywords
- causal
- node
- neural network
- attention
- graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/20—Administration of product repair or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
Abstract
The invention provides a fault detection method for complex electromechanical systems based on a causal graph attention neural network, comprising the following steps: S1: combining known causal relationships with constraint-based causal discovery, and taking monitoring data of the complex electromechanical system as input to construct a causal graph; S2: extracting node features in the causal graph using the causal graph attention neural network; S3: adding together the features extracted by each causal graph attention neural network layer, and calculating the independent support score of each node representation under the different attention mechanisms; S4: concatenating the features of all nodes, feeding them into a fully connected neural network, and outputting the fault detection result of the system. The method fuses known causal relationships with a constraint-based method to obtain the causal graph, adaptively generates the embedded representation of each result variable according to the differing importance of its cause variables, and uses the independent support score as a constraint to extract features with the causal separation property, thereby realizing system fault detection and improving detection performance.
Description
Technical Field
The invention relates to the field of fault prediction and health management of a complex electromechanical system, in particular to a fault detection method of the complex electromechanical system based on a causal graph attention neural network.
Background
As representative of modern advanced technology, various complex electromechanical systems are continuously developed. These systems are based on the comprehensive integration of various mechanical, electronic and hydraulic (pneumatic) subsystems, ultimately achieving complex system functions. The structural and functional complexity inside complex electromechanical systems is significantly increased compared to traditional mechanical or electronic systems: the coupling relationship of each module is more complicated, and the boundary between subsystems is more blurred. For these reasons, complex electromechanical systems are also more sensitive to operating conditions. Minor anomalies or faults can cause chain reactions through cascading and propagation, compromising the operation of the entire system. Therefore, how to timely and effectively detect the fault and discover the abnormal operation state is one of the keys for ensuring the healthy operation of the system and improving the safety and the usability of the system.
Current data-driven fault detection methods generally model the correlation between input variables and faults directly, ignoring the causal relationships and spatial structure relationships among the variables. Graph neural networks (GNNs) have been highly successful in processing spatially structured data. GNNs can mine information about nodes (feature variables) and their edges (relationships) using the non-Euclidean features provided by graph structures. GNNs have achieved major breakthroughs in image and video classification tasks, which has also stimulated their application to fault detection. However, although GNN-based methods improve fault detection performance to some extent, they have several limitations: current GNN-based fault detection methods aggregate all neighboring nodes with the same weight, ignoring the differing contributions of different nodes; most current GNNs use graphs constructed from domain-specific knowledge, but for complex electromechanical systems with complex failure mechanisms and numerous monitoring variables, the spatial structure is difficult to obtain; GNN-based fault detection methods mostly assume only correlation among variables, which limits the performance and interpretability of fault detection; and as the number of GNN layers increases, the features of the nodes tend to converge, causing over-smoothing.
Causal discovery can mine the causal relationships among things. Applied to fault detection, it can analyze the complex causal mechanisms among monitoring variables and reveal how faults occur and propagate, which helps improve the performance of a fault detection model. However, for a complex electromechanical system, expert experience alone is generally insufficient to fully mine the causal relationships among the variables of the system, while a causal graph constructed purely by data-driven methods tends to have an unstable graph structure and obvious errors in the result.
Disclosure of Invention
To overcome the technical defects of data-driven fault detection for complex electromechanical systems in the prior art, the invention provides a fault detection method for complex electromechanical systems based on a causal graph attention neural network. The method constructs a causal graph of the monitoring variables of the complex electromechanical system by combining known causal relationships with a constraint-based causal discovery method, which overcomes the problems that a complex system is difficult to analyze using expert experience alone and that constraint-based causal discovery results may contain obvious errors. Then, using the proposed causal graph attention neural network, the weights of the parent nodes are calculated adaptively through a multi-head causal attention mechanism, and the embedded representations of the child nodes are generated. Further, an independent support score is calculated for the features extracted at each node and used as a constraint term of the loss function, so that node representations with the causal separation property are extracted. Finally, the representations of all nodes are mapped through a flatten layer and a fully connected neural network, and the fault detection result of the target system is output.
The fault detection method provided by the invention mines the complex relationships among the high-dimensional monitoring variables of a complex electromechanical system from the perspective of causality, and overcomes the defects of causal discovery methods that rely only on expert experience or only on data. With the causality-based multi-head attention mechanism, the weights of parent nodes (cause variables) are calculated adaptively and the embedded representations of child nodes (result variables) are generated. Independent support scores are calculated for the features extracted at each node to constrain the causal separation property of the node representations. Finally, the representations of all nodes are mapped through the flatten layer and the fully connected neural network, and the fault detection result of the target system is output, which effectively improves fault detection performance for complex electromechanical systems.
Specifically, the invention provides a complex electromechanical system fault detection method based on a causal graph attention neural network, which comprises the following steps:
S1: combining known causal relationships with a constraint-based causal discovery method, and taking the monitoring variable data of the complex electromechanical system as input, constructing a causal graph of the monitoring variables of the system, which specifically comprises the following substeps:
S11: determining causal path constraints and causal direction constraints according to existing knowledge;
S12: generating a causal graph skeleton, and adding or deleting the corresponding edges according to the causal path constraints;
S13: preprocessing the data to convert all data into numerical types, then using a constraint-based causal discovery algorithm to continue searching and constructing the causal graph, and adding the corresponding edges according to the causal direction constraints;
S14: verifying whether the causal discovery result satisfies the known relationship constraints; if so, outputting the resulting causal graph; if not, adjusting the parameter threshold of the PC algorithm and returning to step S12; the PC algorithm is a classic constraint-based causal discovery algorithm;
S2: extracting and learning the node representations of the causal graph using the proposed causal graph attention neural network, which specifically comprises the following substeps:
S21: calculating the attention coefficients based on causal relationships. The input feature of each node is transformed to a higher dimension by the trainable parameter $W \in \mathbb{R}^{M \times F}$, an attention mechanism is applied to each causal node pair (parent-child node pair, i.e., cause-result variable pair), and the attention coefficient of each causal pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where $X_j \in X_{Pa(i)}$ denotes a parent node of the child node $X_i$, $\|$ denotes the feature concatenation operation, and LeakyReLU is a nonlinear activation function; for a node $X_i$ without parent nodes, its attention coefficients are defined as $a_{ij} = 0$ ($j \neq i$) and $a_{ii} = 1$;
In the formula, $W_a$ is a trainable parameter; the variable $X_j$ is a cause variable of $X_i$; the input features of nodes $X_i$ and $X_j$ are each transformed to $M$ dimensions by the trainable parameter $W$ and concatenated, and then transformed to 1 dimension by the trainable parameter $W_a$;
S22: generating the node representations using a multi-head causal attention mechanism. The representation of each node is a linear weighting of the features of all its parent nodes (cause variables), and a multi-head attention mechanism is adopted to fully extract the features, as shown in formula (2), which yields the newly generated feature of node $X_i$; $\sigma$ denotes the activation function; the formula uses, for the $k$-th attention head, the attention coefficient between the child node (result variable) $X_i$ and one of its parent nodes (cause variable) $X_j$, together with the trainable transformation parameter $W_k$ of that head; $K$ denotes the total number of attention heads; the newly generated representation of each node thus has $M \times K$ dimensions in total;
S3: adding together the features extracted in step S2 by each causal graph attention neural network layer, and calculating, for each node representation, its independent support score (IOSS) under the different attention mechanisms, as shown in formula (3), where $N$ denotes the number of all nodes, each node $X_i$ enters through its representation, and the quantile operator in the formula takes the $\beta$ quantile over the variable $i$:
In the formula, $U_s$ denotes the $s$-th of $S$ random samples drawn from the theoretical independent-support joint distribution of the node features; it is an $M \times K$-dimensional vector, and this joint distribution is obtained from the distributions of the individual dimensions (indexed by $m$) of the max-min normalized representation of node $X_i$; $\beta'$ and $\beta''$ denote specific values of $\beta$; the newly generated feature of node $X_i$ has $M \times K$ dimensions;
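Formula (3) is likewise missing from this text. A reconstruction consistent with the two-stage quantile procedure explained in the detailed description might read as follows. It assumes that $\beta'$ applies to the inner quantile over the $N$ nodes and $\beta''$ to the outer quantile over the $S$ samples; the operator $Q_{\beta}$ (empirical $\beta$ quantile) and the symbol $\tilde{h}_i$ (max-min normalized representation of node $X_i$) are notation introduced here, not the original's.

```latex
\mathrm{IOSS} \;=\; Q_{\beta''}\Big( \Big\{\, Q_{\beta'}\big( \big\{ \lVert U_s - \tilde{h}_i \rVert_2^{2} \big\}_{i=1}^{N} \big) \Big\}_{s=1}^{S} \Big) \tag{3}
```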
S4: feeding the features extracted at all nodes into a flatten layer for concatenation, then feeding the concatenated features into a fully connected neural network comprising hidden layers and one output layer; the activation functions in the fully connected neural network apply nonlinear processing to the features, and the output layer outputs the fault detection result of the complex electromechanical system.
Preferably, the causal path constraint in step S11 refers to determining from prior knowledge whether a direct causal relationship exists between variables $X_i$ and $X_j$, i.e., constraining the presence or absence of an edge between the causal graph nodes $X_i$ and $X_j$; the causal direction constraint refers to determining from prior knowledge that variable $X_i$ is a cause of variable $X_j$, i.e., constraining the causal graph node $X_i$ to be an ancestor node of $X_j$;
preferably, step S13 adopts the PC algorithm as the constraint-based causal discovery algorithm;
preferably, in the model training process of step S4, the loss function comprises two parts: the cross-entropy (CE) loss between the fault detection result output by the model and the real result, and the independent support score (IOSS) loss calculated from the node features extracted by the model. The cross entropy is calculated as shown in formula (4), and the total loss of model training is shown in formula (5). Using the independent support score as a constraint restricts the causal separation property of the features extracted at each node, which alleviates the over-smoothing problem and improves the feature extraction performance at each node. Formula (4) compares the real system state with the predicted system state $p_{ij}$, and $\alpha$ in formula (5) is the balance coefficient between the two losses.
$$L = L_{CE} + \alpha \cdot \mathrm{IOSS} \qquad (5)$$
Preferably, the causal graph attention neural network provided by the invention is optimized using the Adam algorithm, and hyperparameters such as the high-dimensional space dimension $M$ in step S21, the number of heads $K$ of the causal attention mechanism in step S22, the quantile $\beta$ in step S3, and the number of hidden layers and neurons in step S4 are determined by grid search.
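The knowledge constraints of steps S11, S12 and S14 can be illustrated with a small sketch. It is not the patented implementation: the function names, the edge representation (undirected skeleton edges as unordered pairs, discovered edges as `(parent, child)` tuples) and the toy variables are all assumptions made for illustration.

```python
from collections import defaultdict

def apply_path_constraints(skeleton, required, forbidden):
    """Step S12 (sketch): add edges known to exist, delete edges known to be absent."""
    edges = {frozenset(e) for e in skeleton}
    edges |= {frozenset(e) for e in required}
    edges -= {frozenset(e) for e in forbidden}
    return edges

def is_ancestor(directed_edges, src, dst):
    """Return True if dst is reachable from src along directed (parent, child) edges."""
    children = defaultdict(set)
    for parent, child in directed_edges:
        children[parent].add(child)
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for nxt in children[node]:
            if nxt == dst:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def satisfies_constraints(directed_edges, required, forbidden, directions):
    """Step S14 (sketch): check a discovered graph against path and direction constraints."""
    undirected = {frozenset(e) for e in directed_edges}
    if any(frozenset(e) not in undirected for e in required):
        return False
    if any(frozenset(e) in undirected for e in forbidden):
        return False
    # A direction constraint (a, b) requires a to be an ancestor of b.
    return all(is_ancestor(directed_edges, a, b) for a, b in directions)

# Hypothetical toy example with four variables.
skeleton = [("X1", "X2"), ("X2", "X3"), ("X1", "X3")]
required = [("X3", "X4")]      # path constraint: this edge must exist
forbidden = [("X1", "X3")]     # path constraint: this edge must not exist
edges = apply_path_constraints(skeleton, required, forbidden)

graph = [("X1", "X2"), ("X2", "X3"), ("X3", "X4")]   # a discovered directed graph
ok = satisfies_constraints(graph, required, forbidden, directions=[("X1", "X4")])
```

If `ok` were false, the procedure described above would adjust the threshold of the causal discovery algorithm and return to step S12.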
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention provides a fault detection method for complex electromechanical systems based on a causal graph attention neural network. It combines known causal relationships with a constraint-based causal discovery method to construct a causal graph reflecting the causal relationships among the monitoring variables of the complex electromechanical system; adaptively aggregates the features of parent nodes (cause variables) through the causal graph attention neural network to generate the embedded representations of child nodes (result variables); calculates an independent support score for the features extracted at each node and uses it as a constraint term of the loss function to constrain the causal separability of the node representations; and finally maps the representations of all nodes through a flatten layer and a fully connected neural network to output the fault detection result of the complex electromechanical system. The fault detection performance for current complex electromechanical systems can thereby be effectively improved;
(2) Exploiting the property that child nodes in a causal graph are influenced by their parent nodes, the invention provides a causal graph attention neural network in which the weights of parent nodes are calculated adaptively through an attention mechanism to generate the embedded representations of child nodes, so that the cause variables are aggregated according to their importance to generate the embedded representations of the result variables, which improves the extraction of fault features of complex electromechanical systems;
(3) Considering that a graph neural network may suffer from node feature convergence and over-smoothing as the number of layers increases, the method calculates an independent support score for the features of each node and uses it as a constraint term of the loss function to encourage the extracted node features to have the causal separation property, which helps improve the effectiveness of node feature extraction;
(4) The method uses known causal relationships and monitoring data to construct the causal graph of the monitoring variables of the complex electromechanical system and, based on the causal information relevant to the occurrence and propagation of system faults, extracts features by incorporating the causal influence mechanism and considering the differing importance of the cause variables, and finally realizes fault detection, thereby improving the fault detection accuracy and performance for complex electromechanical systems and offering high economic and social benefit.
Drawings
FIG. 1 is a schematic flow chart illustrating the steps of a method for detecting faults of a complex electromechanical system based on a causal graph attention neural network according to the present invention;
FIG. 2 is a block diagram of exemplary steps of a complex electromechanical system fault detection method based on a causal graph attention neural network proposed by the present invention;
FIG. 3 is a simplified block diagram of a high speed rail braking system according to an embodiment of the present invention;
FIG. 4 is a causal graph constructed in accordance with an embodiment of the present invention, incorporating known causal relationships and constraint-based causal discovery methods.
Detailed Description
Exemplary embodiments, features and aspects of the present invention will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
Specifically, the invention provides a complex electromechanical system fault detection method based on a causal graph attention neural network, as shown in fig. 1, which comprises the following steps:
S1: combining known causal relationships with a constraint-based causal discovery method, and taking the monitoring variable data of the complex electromechanical system as input, constructing a causal graph of the system monitoring variables, which specifically comprises the following substeps:
S11: determining causal path constraints and causal direction constraints according to existing knowledge;
S12: generating a causal graph skeleton, and adding or deleting the corresponding edges according to the causal path constraints;
S13: preprocessing the data to convert all data into numerical types, then using a constraint-based causal discovery algorithm to continue searching and constructing the causal graph, and adding the corresponding edges according to the causal direction constraints;
S14: verifying whether the causal discovery result satisfies the known relationship constraints; if so, outputting the resulting causal graph; if not, adjusting the parameter threshold of the PC algorithm and returning to step S12;
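The "parameter threshold of the PC algorithm" in step S14 is not specified further in this text. In common implementations of PC it is the significance level of the conditional-independence test that decides whether an edge is removed. The sketch below assumes Fisher's z-test on (partial) correlations with a single conditioning variable, a frequent choice for continuous data; the function names and the toy data are illustrative assumptions.

```python
import math
from statistics import NormalDist

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def independent(x, y, z=None, alpha=0.05):
    """Fisher z-test: True if (conditional) independence is NOT rejected at level alpha."""
    n = len(x)
    k = 0 if z is None else 1                       # size of the conditioning set
    r = pearson(x, y) if z is None else partial_corr(x, y, z)
    r = max(min(r, 0.999999), -0.999999)            # keep the log finite
    stat = math.sqrt(n - k - 3) * abs(0.5 * math.log((1 + r) / (1 - r)))
    return stat <= NormalDist().inv_cdf(1 - alpha / 2)

# Deterministic toy data: Z is a common cause of X and Y (X = Z + e1, Y = Z + e2),
# with e1, e2 and Z mutually orthogonal.
z = [1, 1, 1, 1, -1, -1, -1, -1] * 10
e1 = [1, 1, -1, -1, 1, 1, -1, -1] * 10
e2 = [1, -1, 1, -1, 1, -1, 1, -1] * 10
x = [a + b for a, b in zip(z, e1)]
y = [a + b for a, b in zip(z, e2)]

keep_edge_marginal = not independent(x, y)        # X and Y are correlated
keep_edge_given_z = not independent(x, y, z)      # the correlation vanishes given Z
```

Under this assumption, raising `alpha` makes the test reject independence more readily, so more edges survive; lowering it prunes more edges.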
S2: extracting and learning the node representations of the causal graph using the proposed causal graph attention neural network, which specifically comprises the following substeps:
S21: calculating the attention coefficients based on causal relationships. The input feature of each node is transformed to $M$ dimensions by the trainable parameter $W \in \mathbb{R}^{M \times F}$, where $M \geq 2$ and $F$ denotes the dimension of the original feature; an attention mechanism is applied to each causal node pair (parent-child node pair, i.e., cause-result variable pair), and the attention coefficient of each causal pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where $X_j \in X_{Pa(i)}$ denotes that node $X_j$ is a parent node of node $X_i$, i.e., node $X_j$ is a cause variable of node $X_i$, $\|$ denotes the feature concatenation operation, and LeakyReLU is a nonlinear activation function. For nodes without parent nodes, the attention coefficients are defined as $a_{ij} = 0$ ($j \neq i$) and $a_{ii} = 1$.
In the formula, $W_a$ and $W$ are trainable parameters; node $X_j$ is a cause variable of $X_i$; the input features of nodes $X_i$ and $X_j$ are each transformed to $M$ dimensions by the trainable parameter $W$ and concatenated, and then transformed to 1 dimension by the trainable parameter $W_a$;
S22: generating the node representations using a multi-head causal attention mechanism. The representation of each node is a linear weighting of the features of all its parent nodes (cause variables), and a multi-head attention mechanism is adopted to fully extract the features, as shown in formula (2), which yields the newly generated feature of node $X_i$; $\sigma$ denotes the activation function; the formula uses, for the $k$-th attention head, the attention coefficient between the child node (result variable) $X_i$ and one of its parent nodes (cause variable) $X_j$, together with the trainable transformation parameter $W_k$ of that head; $K$ denotes the total number of attention heads. The newly generated representation of each node thus has $M \times K$ dimensions in total.
S3: adding together the features extracted in step S2 by each causal graph attention neural network layer, and calculating, for each node representation, its independent support score (IOSS) under the different attention mechanisms, as shown in formula (3), where $N$ denotes the number of all nodes, each node $X_i$ enters through its representation, and the quantile operator in the formula takes the $\beta$ quantile over the variable $i$:
In the formula, $U_s$ denotes the $s$-th of $S$ random samples drawn from the theoretical independent-support joint distribution of the node features; it is an $M \times K$-dimensional vector, and this joint distribution is obtained from the distributions of the individual dimensions (indexed by $m$) of the max-min normalized representation of node $X_i$; $\beta'$ and $\beta''$ denote specific values of $\beta$. The newly generated feature of node $X_i$ has $M \times K$ dimensions.
A quantile is a numerical point that divides the probability distribution range of a random variable into several equal parts; common examples are the median (i.e., the 2-quantile), quartiles, and percentiles. Here, for each $U_s$, the square of its distance to the representation of each node is first calculated. Since there are $N$ nodes in total, each $U_s$ yields a series of $N$ one-dimensional real values, and the $\beta'$ quantile of these $N$ values is taken, so that each $U_s$ yields one quantile and all $U_s$ together yield $S$ one-dimensional real values. The $\beta''$ quantile of these $S$ values is then taken as the final result, which is a one-dimensional real number, namely the value of IOSS, and is used to represent the causal coupling loss among the dimensions of the node representations.
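The two-stage quantile procedure just described can be sketched as follows. Several details are assumptions: each $U_s$ is drawn by picking every dimension independently from the normalized values observed for that dimension, the quantile uses linear interpolation, and the default values of `beta_inner` ($\beta'$), `beta_outer` ($\beta''$) and `n_samples` ($S$) are placeholders rather than values from the source.

```python
import random

def quantile(values, q):
    """Empirical quantile with linear interpolation, q in [0, 1]."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

def min_max_normalize(reps):
    """Scale every dimension to [0, 1] across the N node representations."""
    dims = range(len(reps[0]))
    lo = [min(r[d] for r in reps) for d in dims]
    hi = [max(r[d] for r in reps) for d in dims]
    return [[(r[d] - lo[d]) / (hi[d] - lo[d]) if hi[d] > lo[d] else 0.0
             for d in dims] for r in reps]

def ioss(reps, n_samples=200, beta_inner=0.05, beta_outer=0.95, seed=0):
    """Independent support score of N representations (each of M*K dimensions)."""
    rng = random.Random(seed)
    z = min_max_normalize(reps)
    dims = range(len(z[0]))
    per_sample = []
    for _ in range(n_samples):
        # U_s: every dimension is drawn independently from its own marginal.
        u = [rng.choice(z)[d] for d in dims]
        sq_dists = [sum((u[d] - r[d]) ** 2 for d in dims) for r in z]
        per_sample.append(quantile(sq_dists, beta_inner))   # beta' quantile over N nodes
    return quantile(per_sample, beta_outer)                 # beta'' quantile over S samples

score = ioss([[0.1, 0.9], [0.4, 0.2], [0.8, 0.7]])
```

With `beta_inner=0` and `beta_outer=1` the score reduces to the largest distance from a sampled point to its nearest representation, so it is zero when the representations already cover the full product of their per-dimension values and grows when the dimensions are entangled.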
S4: feeding the features extracted at all nodes into a flatten layer for concatenation, then feeding the concatenated features into a fully connected neural network comprising hidden layers and one output layer; the activation functions in the fully connected neural network apply nonlinear processing to the features, and the output layer outputs the fault detection result of the complex electromechanical system.
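A minimal forward pass for step S4 might look like the sketch below. The ReLU hidden activation, the softmax output, the layer sizes and the random (untrained) weights are assumptions chosen for illustration; the source only states that a flatten layer is followed by a fully connected network with activation functions.

```python
import math
import random

def dense(x, weights, bias):
    """Fully connected layer: weights is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    top = max(x)
    exps = [math.exp(v - top) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def detect(node_reps, hidden, output):
    """Flatten all node representations, apply hidden layers, return class probabilities."""
    flat = [v for rep in node_reps for v in rep]      # flatten layer
    for W, b in hidden:
        flat = relu(dense(flat, W, b))
    W, b = output
    return softmax(dense(flat, W, b))

def init(n_out, n_in, rng):
    """Random weights standing in for trained parameters."""
    return ([[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)

rng = random.Random(0)
node_reps = [[rng.random() for _ in range(8)] for _ in range(5)]   # N = 5 nodes, M*K = 8
hidden = [init(16, 40, rng)]                                       # one hidden layer
output = init(2, 16, rng)                                          # normal vs. fault
probs = detect(node_reps, hidden, output)
```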
Preferably, the causal path constraint in step S11 refers to determining from prior knowledge whether a direct causal relationship exists between variables $X_i$ and $X_j$, i.e., constraining the presence or absence of an edge between the causal graph nodes $X_i$ and $X_j$; the causal direction constraint refers to determining from prior knowledge that variable $X_i$ is a cause of variable $X_j$, i.e., constraining the causal graph node $X_i$ to be an ancestor node of $X_j$;
preferably, step S13 adopts the PC algorithm as the constraint-based causal discovery algorithm;
preferably, in the model training process of step S4, the loss function comprises two parts: the cross-entropy (CE) loss between the fault detection result output by the model and the real result, and the independent support score (IOSS) loss calculated from the node features extracted by the model. The cross entropy is calculated as shown in formula (4), and the total loss of model training is shown in formula (5). Using the independent support score as a constraint restricts the causal separation property of the features extracted at each node, which alleviates the over-smoothing problem and improves the feature extraction performance at each node. Formula (4) compares the real system state with the predicted system state $p_{ij}$, and $\alpha$ in formula (5) is the balance coefficient between the two losses.
$$L = L_{CE} + \alpha \cdot \mathrm{IOSS} \qquad (5)$$
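Formula (5) can be illustrated numerically. Formula (4) is not reproduced in this text, so the sketch assumes the standard multi-class cross entropy averaged over samples with one-hot labels; the function names, the `eps` guard and the example numbers are hypothetical.

```python
import math

def cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean cross entropy over samples; rows of y_true are one-hot labels."""
    total = 0.0
    for y_row, p_row in zip(y_true, p_pred):
        total -= sum(y * math.log(p + eps) for y, p in zip(y_row, p_row))
    return total / len(y_true)

def total_loss(y_true, p_pred, ioss_value, alpha):
    """Formula (5): L = L_CE + alpha * IOSS."""
    return cross_entropy(y_true, p_pred) + alpha * ioss_value

y_true = [[1, 0], [0, 1]]            # real system states
p_pred = [[0.5, 0.5], [0.5, 0.5]]    # predicted system states
loss = total_loss(y_true, p_pred, ioss_value=0.2, alpha=0.5)
```

A larger `alpha` puts more weight on the causal separation constraint relative to classification accuracy.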
Preferably, the causal graph attention neural network provided by the invention is optimized using the Adam algorithm, and hyperparameters such as the high-dimensional space dimension $M$ in step S21, the number of heads $K$ of the causal attention mechanism in step S22, the quantile $\beta$ in step S3, and the number of hidden layers and neurons in step S4 are determined by grid search.
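The grid search over these hyperparameters can be sketched as an exhaustive loop over the Cartesian product of candidate values. The candidate values and the `evaluate` function below are placeholders: in practice `evaluate` would train the network with Adam for one configuration and return a validation score, which is not shown here.

```python
from itertools import product

def grid_search(grid, evaluate):
    """Return the best configuration and its score over all combinations in the grid."""
    names = list(grid)
    best_cfg, best_score = None, float("-inf")
    for values in product(*(grid[n] for n in names)):
        cfg = dict(zip(names, values))
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Hypothetical candidate values for M, K, beta, hidden layer count and neuron count.
grid = {
    "M": [4, 8, 16],
    "K": [1, 2, 4],
    "beta": [0.05, 0.1],
    "hidden_layers": [1, 2],
    "neurons": [32, 64],
}

def evaluate(cfg):
    # Placeholder score standing in for "train the model, return validation accuracy".
    return (-abs(cfg["M"] - 8) - abs(cfg["K"] - 2) - cfg["beta"]
            - cfg["hidden_layers"] - cfg["neurons"] / 64)

best_cfg, best_score = grid_search(grid, evaluate)
```

The number of training runs is the product of the list lengths (72 here), so the grid is usually kept coarse.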
The fault detection process of the present invention is further described in detail below using operating-state monitoring data collected from a high-speed rail brake system. Fig. 3 is a simplified structure diagram of the high-speed rail brake system, which includes 39 monitoring variables (covering information such as brake valve state, line voltage and line current, denoted X1, X2, …, X39 respectively). The specific implementation steps of the disclosed complex electromechanical system fault detection method based on the multi-source causal graph path convolution are shown in Fig. 2:
S1: Combining the known causal relationships with a constraint-based causal discovery method, and taking the data of the 39 monitoring variables as input, construct a causal relationship graph of the monitoring variables of the high-speed rail brake system. This specifically comprises the following sub-steps:
S11: Determine the causal path constraints and causal direction constraints of the monitoring variables in the high-speed rail brake system according to existing knowledge, as shown in Table 1;
S12: Generate a causal graph skeleton, and add or delete the corresponding edges according to the causal path constraints;
S13: Perform data preprocessing to convert all data into numerical types. Specifically, categorical variables are converted into numerical types using encoding methods such as label encoding and dummy variables, and the data are normalized using the max-min method. For example, the train operation mode is a categorical variable whose values are states rather than numbers; after label encoding it is converted into numerical codes representing the corresponding states, such as 0, 1, 2. Then continue searching and constructing the causal graph using the constraint-based causal discovery algorithm, and add the corresponding edges according to the causal direction constraints. The PC algorithm is adopted as the constraint-based causal discovery algorithm;
S14: Verify whether the causal discovery result satisfies the known relationship constraints. If so, output the resulting causal graph; if not, adjust the parameter threshold of the PC algorithm and return to S12.
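The constraint handling in S12 and the verification in S14 can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the representation of edges as (cause, effect) tuples, and the example edges are all assumptions made for illustration, and the PC search itself is left out.

```python
from collections import defaultdict

def apply_path_constraints(skeleton, required, forbidden):
    """S12 (sketch): add edges that prior knowledge requires, delete edges it forbids.
    Skeleton edges are undirected and stored as frozensets of two node names."""
    edges = {frozenset(e) for e in skeleton}
    edges |= {frozenset(e) for e in required}
    edges -= {frozenset(e) for e in forbidden}
    return edges

def is_ancestor(directed_edges, src, dst):
    """Return True if a directed path src -> ... -> dst exists."""
    children = defaultdict(set)
    for cause, effect in directed_edges:
        children[cause].add(effect)
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child == dst:
                return True
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return False

def satisfies_constraints(directed_edges, required, forbidden, directions):
    """S14 (sketch): check path and direction constraints on a discovered graph."""
    undirected = {frozenset(e) for e in directed_edges}
    if any(frozenset(e) not in undirected for e in required):
        return False
    if any(frozenset(e) in undirected for e in forbidden):
        return False
    return all(is_ancestor(directed_edges, c, e) for c, e in directions)

# Hypothetical edges, not taken from Table 1 or Fig. 4.
graph = [("X36", "X1"), ("X1", "X2")]
ok = satisfies_constraints(
    graph,
    required=[("X1", "X2")],      # path constraint: an edge must exist
    forbidden=[("X36", "X2")],    # path constraint: no direct edge allowed
    directions=[("X36", "X2")],   # direction constraint: X36 is an ancestor of X2
)
```

If `ok` were false, the loop described in S14 would adjust the PC threshold and rerun from S12.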
Preferably, the causal path constraint in step S11 means using prior knowledge to determine whether a direct causal relationship exists between variables X_i and X_j, i.e. constraining whether an edge exists between causal graph nodes X_i and X_j; the causal direction constraint means using prior knowledge to determine that variable X_i is a cause of variable X_j, i.e. constraining causal graph node X_i to be an ancestor node of X_j. Adding the known causal relationships suppresses errors produced by the data-driven causal discovery algorithm and improves the reliability of the result.
Preferably, the PC algorithm is adopted as the constraint-based causal discovery algorithm in step S13, and the resulting causal relationship graph of the monitoring variables of the high-speed rail brake system is shown in Fig. 4.
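The preprocessing described in S13 (label encoding of categorical variables followed by max-min normalization) can be sketched as follows. The helper names and the sample values are hypothetical; the real monitoring data are not given in the text, and the integer codes assigned here simply follow sorted category order.

```python
import numpy as np

def label_encode(values):
    """Map each distinct category to an integer code 0, 1, 2, ... (sorted order)."""
    categories = sorted(set(values))
    mapping = {c: k for k, c in enumerate(categories)}
    return np.array([mapping[v] for v in values], dtype=float), mapping

def min_max_normalize(column):
    """Scale a numeric column to [0, 1]; a constant column maps to all zeros."""
    column = np.asarray(column, dtype=float)
    span = column.max() - column.min()
    if span == 0:
        return np.zeros_like(column)
    return (column - column.min()) / span

# Hypothetical samples: a categorical operation mode and a numeric line voltage.
mode = ["traction", "coasting", "braking", "coasting"]
voltage = [24.8, 25.1, 25.0, 24.9]

codes, mapping = label_encode(mode)
data = np.column_stack([min_max_normalize(codes), min_max_normalize(voltage)])
```

The resulting `data` matrix (one row per sample, one column per variable) is the kind of all-numeric input a constraint-based discovery algorithm such as PC expects.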
TABLE 1 establishment of causal path constraints and causal direction constraints using known causal relationships
S2: Extract and learn the node characterizations of the causal graph obtained in step S14 using the proposed causal graph attention neural network, specifically comprising the following sub-steps:
S21: Calculate the attention coefficients based on the causal relationships. The input feature of each node is transformed into a 2-dimensional vector by the trainable parameter W ∈ R^{2×1}; an attention mechanism is applied to each causal node pair (parent-child node pair, i.e. cause-result variable pair), and the attention coefficient between the causal pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where X_j ∈ X_Pa(i) indicates that X_j is a parent node of child node X_i, || represents the feature concatenation operation, and LeakyReLU is a nonlinear activation function calculated as shown in formula (2). For the node without a parent node, namely X36, the attention coefficients are defined as a_{36,j} = 0 (j ≠ 36) and a_{36,36} = 1.
S22: Generate the node representations using a multi-head causal attention mechanism. The representation of each node is a linear weighting of the features of all its parent nodes (cause variables), and a multi-head attention mechanism is adopted to fully extract the features, as shown in formula (3), which gives the newly generated feature of node X_i. ELU is adopted as the activation function, calculated as shown in formula (4). The formula uses, for the k-th attention head, the attention coefficient between child node (result variable) X_i and one of its parent nodes (cause variable) X_j, and the trainable transformation parameter W^k of that head. K = 8 attention heads are adopted, so the new representation generated for each node has 2 × 8 = 16 dimensions.
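Steps S21 and S22 can be sketched in NumPy as below. This is an illustrative reading of the verbal description, not the patent's formulas (1) to (4), which did not survive in this text: the softmax-over-parents normalization, the LeakyReLU slope of 0.2, applying ELU per head before concatenation, and all function and variable names are assumptions. The parent structure in the usage lines is made up.

```python
import numpy as np

def leaky_relu(x, slope=0.2):          # slope value is an assumption
    return np.where(x > 0, x, slope * x)

def elu(x):
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)

def causal_attention_head(h, parents, W, w_a):
    """One attention head.
    h: (N, F) input features; parents: dict node index -> list of parent indices;
    W: (M, F) feature transform; w_a: (2M,) vector mapping a concatenated pair to a score."""
    z = h @ W.T                         # (N, M) transformed features
    n = h.shape[0]
    a = np.zeros((n, n))
    for i in range(n):
        pa = parents.get(i, [])
        if not pa:                      # node without parents: a_ii = 1, others 0
            a[i, i] = 1.0
            continue
        scores = np.array([leaky_relu(w_a @ np.concatenate([z[i], z[j]])) for j in pa])
        weights = np.exp(scores - scores.max())
        a[i, pa] = weights / weights.sum()   # normalize over parents only
    return a, z

def multi_head_layer(h, parents, Ws, w_as):
    """Concatenate K heads; each head weights parent features by its own coefficients."""
    outputs = []
    for W, w_a in zip(Ws, w_as):
        a, z = causal_attention_head(h, parents, W, w_a)
        outputs.append(elu(a @ z))
    return np.concatenate(outputs, axis=1)   # (N, M * K)

# Toy graph with 4 nodes; node 0 has no parent, like X36 in the text.
rng = np.random.default_rng(0)
N, F, M, K = 4, 1, 2, 8
h = rng.normal(size=(N, F))
parents = {1: [0], 2: [0, 1], 3: [2]}
Ws = [rng.normal(size=(M, F)) for _ in range(K)]
w_as = [rng.normal(size=2 * M) for _ in range(K)]
out = multi_head_layer(h, parents, Ws, w_as)  # shape (4, 16), i.e. M x K per node
```

Restricting the normalization to parent nodes is what distinguishes this from a conventional graph attention layer, which attends over all neighbors regardless of causal direction.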
S3: The features extracted by the 3 causal graph attention neural network layers are summed, and for the characterization of each node, its independent support score (IOSS) under the different attention mechanisms is calculated as shown in formula (5), where N represents the number of all nodes and the β quantile for variable i is used in the calculation.
In the formula, U_s represents the s-th of S random samples drawn from the theoretical joint distribution of independent support of the node features; the theoretical joint distribution is obtained from the m-th dimension of the max-min normalized feature of node X_i; β′ and β″ represent specific values of β. The newly generated feature of node X_i comprises 2 × 8 dimensions.
S4: The features extracted from all nodes are input into a flatten layer for concatenation; the concatenated result is input into a two-layer fully connected neural network with 8 and 2 neurons respectively; finally, the fault detection result of the high-speed rail brake system is output through a sigmoid activation function, where 0 indicates normal and 1 indicates fault.
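The forward pass of step S4 can be sketched as follows. The layer sizes (39 nodes, 16 dimensions per node, 8 and 2 neurons) come from the text; the ReLU hidden activation, the random weights, and taking the larger of the two sigmoid outputs as the predicted class are assumptions made only so that the example runs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def detect_fault(node_features, W1, b1, W2, b2):
    """node_features: (N, D) node representations for one sample.
    Returns (label, scores) with label 0 = normal and 1 = fault."""
    flat = node_features.reshape(-1)           # flatten layer: (N * D,)
    hidden = np.maximum(flat @ W1 + b1, 0.0)   # 8 hidden neurons; ReLU is an assumption
    scores = sigmoid(hidden @ W2 + b2)         # 2 output neurons through sigmoid
    return int(np.argmax(scores)), scores      # argmax decision rule is an assumption

rng = np.random.default_rng(0)
N, D = 39, 16                                  # 39 nodes, 2 x 8 dimensions each
features = rng.normal(size=(N, D))             # placeholder for the extracted features
W1, b1 = rng.normal(scale=0.05, size=(N * D, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.05, size=(8, 2)), np.zeros(2)
label, scores = detect_fault(features, W1, b1, W2, b2)
```

In training, the weights would be learned with Adam rather than drawn at random as they are here.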
Preferably, in the model training process of step S4, the loss function comprises two parts: the cross-entropy (CE) loss between the fault detection result output by the model and the real result, and the independent support score (IOSS) loss calculated from the node features extracted by the model, as shown in formula (5). The cross entropy is calculated as shown in formula (6), and the total loss of model training is shown in formula (7). Using the independent support score as a constraint improves the causal separation of the features extracted for each node, which alleviates the over-smoothing problem and improves node feature extraction:
$$L_{CE} = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{m} y_{ij}\log p_{ij} \tag{6}$$
$$L = L_{CE} + \alpha \cdot \mathrm{IOSS} \tag{7}$$
where n represents the number of input data samples, m = 2 represents the number of system state categories, y_ij and p_ij represent the true system state and the predicted system state respectively, α = 1e-4 is the balance coefficient of the two losses, L_CE is the cross-entropy loss, and L is the loss function of model training.
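A small numerical sketch of the combined loss is given below. It assumes one-hot true states, a cross entropy averaged over the n samples, and an IOSS value computed elsewhere and passed in as a number, since the IOSS formula itself is not recoverable from this text. The function names and the sample values are illustrative.

```python
import numpy as np

def cross_entropy(y_true, p_pred, eps=1e-12):
    """y_true: (n, m) one-hot true states; p_pred: (n, m) predicted probabilities."""
    p = np.clip(p_pred, eps, 1.0)               # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(p), axis=1)))

def total_loss(y_true, p_pred, ioss, alpha=1e-4):
    """L = L_CE + alpha * IOSS, with the IOSS value supplied by the caller."""
    return cross_entropy(y_true, p_pred) + alpha * ioss

y = np.array([[1, 0], [0, 1]])                   # one normal sample, one fault sample
p = np.array([[0.9, 0.1], [0.2, 0.8]])
loss = total_loss(y, p, ioss=3.0)                # ioss=3.0 is a placeholder value
```

With α = 1e-4 the IOSS term acts as a light regularizer: the cross entropy dominates unless the IOSS value is several orders of magnitude larger.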
Preferably, the attention neural network based on the causal graph provided by the invention is optimized by using an Adam algorithm, and hyper-parameters such as the high-dimensional space dimension M in step S21, the head number K of the causal attention mechanism in step S22, the quantile β in step S3, the number of neurons in the hidden layer in step S4 and the like are determined by a grid search method.
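A grid search over these hyper-parameters can be sketched as an exhaustive loop over all combinations. The candidate values and the `evaluate` function below are hypothetical stand-ins: in practice `evaluate` would train the network with Adam for the given setting and return a validation score such as F1.

```python
from itertools import product

def grid_search(grid, evaluate):
    """Try every combination in `grid` and return the best (score, params)."""
    names = list(grid)
    best_score, best_params = float("-inf"), None
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params

# Candidate values are hypothetical; the text reports only the chosen M = 2, K = 8
# and the 8-neuron hidden layer.
grid = {"M": [2, 4], "K": [4, 8], "beta": [0.05, 0.1], "hidden": [8, 16]}

def evaluate(params):
    # Toy score standing in for "train the model, return validation F1".
    return (-abs(params["M"] - 2) - abs(params["K"] - 8)
            - abs(params["hidden"] - 8) - params["beta"])

best_score, best_params = grid_search(grid, evaluate)
```

The cost grows multiplicatively with the number of candidate values per hyper-parameter (here 2 × 2 × 2 × 2 = 16 training runs).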
To further verify the effectiveness and highlight the performance of the method, it is compared with a support vector machine (SVM), an artificial neural network (ANN), a convolutional neural network (CNN), a traditional graph convolutional neural network (GCN) and a traditional graph attention neural network (GAT). Two common performance evaluation indexes for fault detection on imbalanced data, the F1 score and the G-mean score, are used as the criteria for comparison; they are calculated as shown in formulas (8) and (9):
$$F1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \tag{8}$$
$$G\text{-}mean = \sqrt{TPR \cdot TNR} \tag{9}$$
where precision = TP/(TP + FP), recall = TPR = TP/(TP + FN), TNR = TN/(TN + FP); TP, FP, TN and FN respectively represent the number of samples correctly classified as faulty, incorrectly classified as faulty, correctly classified as normal, and incorrectly classified as normal. Both F1 and G-mean lie in the interval [0, 1], and higher values indicate better performance.
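Both scores follow directly from the four confusion-matrix counts using the standard definitions, as the short sketch below shows. The function name and the example counts are illustrative and are not taken from the experiment.

```python
import math

def f1_and_gmean(tp, fp, tn, fn):
    """Return (F1, G-mean) from confusion-matrix counts; empty denominators give 0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0      # recall = TPR
    tnr = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return f1, math.sqrt(recall * tnr)

# Hypothetical imbalanced test set: 50 fault samples and 950 normal samples.
f1, gmean = f1_and_gmean(tp=40, fp=10, tn=940, fn=10)
```

Because G-mean multiplies the fault-class and normal-class recall, a detector that labels everything as normal scores 0 on it even though its plain accuracy on such a data set would be 95%.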
The comparison results are shown in Table 2 below. The proposed complex electromechanical system fault detection method achieves the highest F1 score (0.8473) and G-mean score (0.9634) among the compared methods. The constraint-based causal discovery method combined with existing knowledge can effectively extract the causal relationships among high-dimensional monitoring variables in a complex electromechanical system, so that fault detection modeling can follow the causal effects among the components of the system. By combining the nature of causal relationships with the attention mechanism, the causal graph attention neural network adaptively aggregates the features of parent nodes (cause variables) according to their different degrees of importance to generate the embedded features of child nodes (result variables), which improves node feature extraction. In addition, using the independent support score as a constraint on the extracted node representations encourages the features under different attention mechanisms to have the causal separation property, which greatly improves the performance of the fault detection model.
TABLE 2 Evaluation indexes of the fault detection results of the method of the invention and the comparison methods
Method | F1 score | G-mean score |
---|---|---|
The method of the invention | 0.8473 | 0.9634 |
Traditional graph attention neural network | 0.7974 | 0.9574 |
Traditional graph convolutional neural network | 0.7951 | 0.8396 |
Support vector machine | 0.6892 | 0.7451 |
Artificial neural network | 0.5454 | 0.6849 |
Convolutional neural network | 0.7470 | 0.7952 |
Finally, it should be noted that: the above-mentioned embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (5)
1. A complex electromechanical system fault detection method based on a causal graph attention neural network, characterized by comprising the following steps:
S1: combining known causal relationships with a constraint-based causal discovery method, taking the monitoring variable data of the complex electromechanical system as input, and constructing a causal relationship graph of the system monitoring variables, specifically comprising the following sub-steps:
S11: determining causal path constraints and causal direction constraints according to existing knowledge;
S12: generating a causal graph skeleton, and adding or deleting corresponding edges according to the causal path constraints;
S13: performing data preprocessing to convert all data into numerical types, then continuing to search and construct the causal graph using a constraint-based causal discovery algorithm, and adding corresponding edges according to the causal direction constraints;
S14: verifying whether the causal discovery result satisfies the known relationship constraints; if so, outputting the resulting causal graph; if not, adjusting the parameter threshold of the constraint-based causal discovery algorithm and returning to step S12;
S2: extracting and learning the causal graph node characterizations using a causal graph attention neural network, specifically comprising the following sub-steps:
S21: calculating causal attention coefficients: the input feature of each node is transformed to M dimensions by the trainable parameter W ∈ R^{M×F}, where M ≥ 2 is a positive integer and F represents the dimension of the original feature; a causality-based attention mechanism is applied to the cause-result node pairs under each causal mechanism, and the attention coefficient between each causal pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where X_j ∈ X_Pa(i) indicates that node X_j is a parent node of node X_i, i.e. node X_j is a cause variable of node X_i, || represents the feature concatenation operation, and LeakyReLU is the nonlinear activation function; for a node without a parent node, the attention coefficients are defined as a_ij = 0 (j ≠ i) and a_ii = 1;
in the formula, W_a is a trainable parameter; the input features of nodes X_i and X_j, where node X_j is a cause variable of X_i, are transformed to M dimensions by the trainable parameter W, concatenated, and then transformed to 1 dimension by the trainable parameter W_a;
S22: generating node representations using a multi-head causal attention mechanism: the representation of each node is a linear weighting of the features of all its parent nodes, i.e. its cause variables, and a multi-head attention mechanism is adopted to fully extract the features, as shown in formula (2), which gives the newly generated feature of node X_i; σ represents the activation function; the formula uses, for the k-th attention head, the attention coefficient between child node X_i and one of its parent nodes X_j, and the trainable transformation parameter W^k of that head; K represents the total number of attention heads; the new representation generated for each node thus comprises M × K dimensions;
S3: summing the node representations extracted in step S2 by each causal graph attention neural network layer, and calculating, for each node representation, its independent support score IOSS under the different attention mechanisms, as shown in formula (3), where N represents the number of all nodes and the value corresponding to the β quantile of variable i is used in the calculation;
in the formula, U_s represents the s-th of S random samples drawn from the theoretical joint distribution of independent support of the node features, which is an M × K dimensional vector; the theoretical joint distribution is obtained from the m-th dimension of the max-min normalized feature of node X_i; β′ and β″ represent specific values of β; the newly generated feature of node X_i comprises M × K dimensions;
S4: inputting the features extracted from all nodes into a flatten layer for concatenation, inputting the concatenated features into a fully connected neural network comprising a hidden layer and one output layer, applying nonlinear processing to the features through the activation function in the fully connected neural network, and outputting the fault detection result of the complex electromechanical system from the output layer.
2. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: the causal path constraint in step S11 means using existing knowledge to determine whether a direct causal relationship exists between variables X_i and X_j, i.e. constraining whether an edge exists between causal graph nodes X_i and X_j; the causal direction constraint means using prior knowledge to determine that variable X_i is a cause of variable X_j, i.e. constraining causal graph node X_i to be an ancestor node of X_j.
3. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: the constraint-based causal discovery algorithm in step S13 is specifically the PC algorithm.
4. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: in the model training process of step S4, the loss function of the model comprises two parts, namely the cross-entropy (CE) loss between the fault detection result output by the model and the real result, and the independent support score (IOSS) loss calculated from the node features extracted by the model, as shown in formula (3); the cross entropy is calculated as shown in formula (4), and the total loss of model training is shown in formula (5); using the independent support score as a constraint improves the causal separability of the features extracted for each node, which alleviates the over-smoothing problem and improves node feature extraction:
$$L_{CE} = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{m} y_{ij}\log p_{ij} \tag{4}$$
$$L = L_{CE} + \alpha \cdot \mathrm{IOSS} \tag{5}$$
wherein n is the number of input data samples, m is the number of system state categories, y_ij and p_ij respectively represent the real system state and the predicted system state, α is the balance coefficient of the two losses, L_CE is the cross-entropy loss, and L is the loss function of model training.
5. The method for detecting the faults of the complex electromechanical system based on the causal graph attention neural network as claimed in claim 1, wherein: the attention neural network based on the causal graph is optimized by using an Adam algorithm, and hyper-parameters such as the dimension M in the step S21, the number K of heads of the causal attention mechanism in the step S22, the quantile beta in the step S3, the number of hidden layers in the step S4, the number of neurons and the like are determined by a grid search method.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210975693.0A CN115310837A (en) | 2022-08-15 | 2022-08-15 | Complex electromechanical system fault detection method based on causal graph attention neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115310837A true CN115310837A (en) | 2022-11-08 |
Family
ID=83862085
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210975693.0A Pending CN115310837A (en) | 2022-08-15 | 2022-08-15 | Complex electromechanical system fault detection method based on causal graph attention neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115310837A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115771165A (en) * | 2022-12-06 | 2023-03-10 | 华中科技大学 | Industrial robot fault detection and positioning method and system under fault-free sample |
CN115771165B (en) * | 2022-12-06 | 2024-06-04 | 华中科技大学 | Industrial robot fault detection and positioning method and system under fault-free sample |
CN115793590A (en) * | 2023-01-30 | 2023-03-14 | 江苏达科数智技术有限公司 | Data processing method and platform suitable for system safety operation and maintenance |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112508085B (en) | Social network link prediction method based on perceptual neural network | |
CN115310837A (en) | Complex electromechanical system fault detection method based on causal graph attention neural network | |
CN102520341B (en) | Analog circuit fault diagnosis method based on Bayes-KFCM (Kernelized Fuzzy C-Means) algorithm | |
CN111353373B (en) | Related alignment domain adaptive fault diagnosis method | |
JPWO2008114863A1 (en) | Diagnostic equipment | |
CN116628597B (en) | Heterogeneous graph node classification method based on relationship path attention | |
CN116821776B (en) | Heterogeneous graph network node classification method based on graph self-attention mechanism | |
CN113505655A (en) | Bearing fault intelligent diagnosis method for digital twin system | |
CN113361559A (en) | Multi-mode data knowledge information extraction method based on deep width joint neural network | |
CN110851654A (en) | Industrial equipment fault detection and classification method based on tensor data dimension reduction | |
CN114580934A (en) | Early warning method for food detection data risk based on unsupervised anomaly detection | |
CN115879505A (en) | Self-adaptive correlation perception unsupervised deep learning anomaly detection method | |
CN116743555A (en) | Robust multi-mode network operation and maintenance fault detection method, system and product | |
Zhang et al. | An intrusion detection method based on stacked sparse autoencoder and improved gaussian mixture model | |
Pranavan et al. | Contrastive predictive coding for anomaly detection in multi-variate time series data | |
CN114861778A (en) | Method for rapidly classifying rolling bearing states under different loads by improving width transfer learning | |
CN116935128A (en) | Zero sample abnormal image detection method based on learning prompt | |
Cheddadi et al. | Improving equity and access to higher education using artificial intelligence | |
CN116467930A (en) | Transformer-based structured data general modeling method | |
CN115426194A (en) | Data processing method and device, storage medium and electronic equipment | |
CN115168864A (en) | Intelligent cross contract vulnerability detection method based on feature cross | |
CN111797732B (en) | Video motion identification anti-attack method insensitive to sampling | |
Wang et al. | Hierarchical multimodal fusion network with dynamic multi-task learning | |
CN114022739A (en) | Zero sample learning method based on combination of alignment variational self-encoder and triple | |
CN113159976A (en) | Identification method for important users of microblog network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||