CN115310837A - Complex electromechanical system fault detection method based on causal graph attention neural network - Google Patents

Publication number
CN115310837A
Authority
CN
China
Prior art keywords
causal
node
neural network
attention
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210975693.0A
Other languages
Chinese (zh)
Inventor
刘杰
郑舒文
王冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN202210975693.0A
Publication of CN115310837A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/20 Administration of product repair or maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The invention provides a fault detection method for complex electromechanical systems based on a causal graph attention neural network, comprising the following steps. S1: combining known causal relationships with constraint-based causal discovery, and taking monitoring data of the complex electromechanical system as input, to construct a causal graph; S2: extracting node features from the causal graph using the causal graph attention neural network; S3: adding together the features extracted by each layer of the causal graph attention neural network, and calculating the independent support score of each node representation under the different attention heads; S4: concatenating the features of all nodes, feeding them into a fully connected neural network, and outputting the fault detection result of the system. The method fuses known causal relationships with a constraint-based method to obtain the causal graph, adaptively generates the embedded representation of each effect variable according to the differing importance of its cause variables, and uses the independent support score as a constraint to extract features with causal separation properties, thereby achieving system fault detection and improving detection performance.

Description

Complex electromechanical system fault detection method based on causal graph attention neural network
Technical Field
The invention relates to the field of fault prediction and health management of a complex electromechanical system, in particular to a fault detection method of the complex electromechanical system based on a causal graph attention neural network.
Background
Complex electromechanical systems, representative of modern advanced technology, continue to develop. These systems integrate mechanical, electronic, and hydraulic (pneumatic) subsystems to achieve complex system functions. Compared with traditional mechanical or electronic systems, their internal structural and functional complexity is significantly higher: the coupling between modules is more intricate, and the boundaries between subsystems are more blurred. Complex electromechanical systems are therefore also more sensitive to operating conditions. Minor anomalies or faults can trigger chain reactions through cascading and propagation, compromising the operation of the entire system. Detecting faults and discovering abnormal operating states in a timely and effective manner is thus one of the keys to ensuring healthy operation and improving system safety and availability.
Current data-driven fault detection methods generally model the correlation between input variables and faults directly, ignoring the causal and spatial structural relationships among the variables. Graph neural networks (GNNs) have been highly successful in processing spatially structured data. GNNs can mine information about nodes (feature variables) and their edges (relationships) using the non-Euclidean features provided by graph structures. GNNs have achieved major breakthroughs in image and video classification tasks, which has also stimulated their application to fault detection. However, although GNN-based methods improve fault detection performance to some extent, they have several limitations. First, current GNN-based fault detection methods aggregate all neighboring nodes with the same weight, ignoring the differing contributions of different nodes. Second, most current GNNs use graphs constructed from domain-specific knowledge, but for complex electromechanical systems with complex failure mechanisms and numerous monitoring variables, the spatial structure is difficult to obtain. Third, GNN-based fault detection methods mostly assume correlations among variables, which limits both the performance and the interpretability of fault detection. Finally, as the number of GNN layers increases, the features of the nodes tend to converge, producing the over-smoothing phenomenon.
Causal discovery can mine the causal relationships among things. Applied to fault detection, it can analyze the complex causal mechanisms among monitoring variables and reveal how faults occur and propagate, which helps improve the performance of a fault detection model. However, for a complex electromechanical system with many variables, it is generally difficult to fully mine the causal relationships from expert experience alone, while constructing a causal graph with a purely data-driven method is prone to unstable graph structures and obvious errors in the result.
Disclosure of Invention
To overcome the technical shortcomings of data-driven fault detection for complex electromechanical systems in the prior art, the invention provides a fault detection method for complex electromechanical systems based on a causal graph attention neural network. The method constructs a causal graph of the monitoring variables of the complex electromechanical system by combining known causal relationships with a constraint-based causal discovery method, which addresses two problems: a complex system is difficult to analyze from expert experience alone, and constraint-based causal discovery results may contain obvious errors. Then, using the proposed causal graph attention neural network, a multi-head causal attention mechanism adaptively calculates the weights of parent nodes and generates the embedded representations of child nodes. Further, an independent support score is calculated on the features extracted for each node and used as a constraint term of the loss function, so that node representations with causal separation properties are extracted. Finally, the representations of all nodes are mapped through a flatten layer and a fully connected neural network, and the fault detection result of the target system is output.
The fault detection method provided by the invention mines the complex relationships among the high-dimensional monitoring variables of a complex electromechanical system from a causal perspective, overcoming the shortcomings of causal discovery methods that rely on expert experience alone or on data alone. Using a causality-based multi-head attention mechanism, it adaptively calculates the weights of parent nodes (cause variables) and generates the embedded representations of child nodes (effect variables). It calculates an independent support score on the features extracted for each node to constrain the causal separation property of the node representations. Finally, the representations of all nodes are mapped through the flatten layer and the fully connected neural network, and the fault detection result of the target system is output, effectively improving fault detection performance for complex electromechanical systems.
Specifically, the invention provides a complex electromechanical system fault detection method based on a causal graph attention neural network, which comprises the following steps:
S1: combining known causal relationships with a constraint-based causal discovery method, and taking the monitoring variable data of the complex electromechanical system as input, to construct a causal graph of the system's monitoring variables, specifically comprising the following substeps:
S11: determining causal path constraints and causal direction constraints from existing knowledge;
S12: generating a causal graph skeleton, and adding or deleting the corresponding edges according to the causal path constraints;
S13: preprocessing the data to convert all data to numerical types, then using a constraint-based causal discovery algorithm to iteratively search for and construct the causal graph, and adding the corresponding edges according to the causal direction constraints;
S14: verifying whether the causal discovery result satisfies the known relationship constraints; if so, outputting the resulting causal graph; if not, adjusting the parameter threshold of the PC algorithm and returning to step S12; the PC algorithm is a classic constraint-based causal discovery algorithm;
S2: extracting and learning the node representations of the causal graph using the proposed causal graph attention neural network, specifically comprising the following substeps:
S21: calculating the attention coefficients based on the causal relationships. The input feature $h_i \in \mathbb{R}^{F}$ of each node is transformed to a higher dimension by a trainable parameter $W \in \mathbb{R}^{M \times F}$; an attention mechanism is applied to each causal node pair (parent-child node pair, or cause-effect variable pair), and the attention coefficient between the causal pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where $X_j \in X_{Pa(i)}$ denotes a parent node of child node $X_i$, $\|$ denotes the feature concatenation operation, and LeakyReLU is a nonlinear activation function; for a node without a parent (say $X_i$), its attention coefficients are defined as $a_{ij} = 0$ ($j \neq i$) and $a_{ii} = 1$;
$$a_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(W_a \left[ W h_i \,\|\, W h_j \right]\right)\right)}{\sum_{X_l \in X_{Pa(i)}} \exp\left(\mathrm{LeakyReLU}\left(W_a \left[ W h_i \,\|\, W h_l \right]\right)\right)} \tag{1}$$
where $W_a$ is a trainable parameter; $h_i$ and $h_j$ denote the input features of nodes $X_i$ and $X_j$ respectively, with variable $X_j$ being a cause variable of $X_i$; $W_a \left[ W h_i \,\|\, W h_j \right]$ denotes that the input features $h_i$ and $h_j$ of nodes $X_i$ and $X_j$ are transformed to $M$ dimensions by the trainable parameter $W$, concatenated, and then transformed to 1 dimension by the trainable parameter $W_a$;
S22: generating the node representations using a multi-head causal attention mechanism. The representation of each node is a linear weighting of the features of all its parent nodes (cause variables), and a multi-head attention mechanism is used to extract the features fully, as shown in formula (2), where $h_i'$ denotes the newly generated feature of node $X_i$; $\sigma$ denotes the activation function; $a_{ij}^{k}$ and $W^{k}$ denote, respectively, the attention coefficient between child node (effect variable) $X_i$ and one of its parent nodes (cause variable) $X_j$ in the $k$-th attention head, and the trainable transformation parameter in the $k$-th attention head; $K$ denotes the total number of attention heads; the new representation generated for each node thus has $M \times K$ dimensions;
$$h_i' = \Big\Vert_{k=1}^{K} \, \sigma\Big( \sum_{X_j \in X_{Pa(i)}} a_{ij}^{k} \, W^{k} h_j \Big) \tag{2}$$
S3: adding together the features extracted in step S2 by each causal graph attention neural network layer, and calculating, for each node representation, its independent support score (IOSS) under the different attention heads, as shown in formula (3), where $N$ denotes the total number of nodes, $h_i'$ denotes the representation of node $X_i$, and $Q^{\beta}_{i}$ denotes taking the $\beta$ quantile over the variable $i$:
$$\mathrm{IOSS} = Q^{\beta''}_{s \in \{1,\dots,S\}} \Big( Q^{\beta'}_{i \in \{1,\dots,N\}} \big\| U_s - \tilde{h}_i' \big\|^2 \Big) \tag{3}$$
where $U_s$ denotes the $s$-th of $S$ random samples drawn from the theoretical joint distribution of the node features under independent support; it is an $M \times K$-dimensional vector obtained from the joint distribution of the $\tilde{h}_{i,m}'$, where $\tilde{h}_{i,m}'$ denotes the $m$-th dimension of the representation of node $X_i$ after max-min normalization; $\beta'$ and $\beta''$ denote specific values of $\beta$; $h_i'$ denotes the newly generated feature of node $X_i$, with $M \times K$ dimensions;
S4: feeding the features extracted for all nodes into a flatten layer for concatenation, and feeding the concatenated features into a fully connected neural network comprising a hidden layer and one output layer; the activation functions in the fully connected neural network apply nonlinear processing to the features, and the output layer outputs the fault detection result of the complex electromechanical system.
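Steps S21 and S22 above can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the softmax normalization over each node's parents, the LeakyReLU slope of 0.2, the `tanh` activation for σ, the reuse of one `W` per head for both scoring and aggregation, and the toy three-node graph are all assumptions made for illustration.

```python
import math
import random


def leaky_relu(x, slope=0.2):
    # The slope 0.2 is an assumed value; the source does not specify it.
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply an M x F matrix (list of rows) by an F-dimensional vector."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def attention_coefficients(h, parents, W, w_a):
    """Formula (1): attention of each node over its parent nodes only.

    h: one F-dim feature list per node; parents[i]: parent indices of node i;
    W: M x F matrix; w_a: vector of length 2M.
    """
    z = [matvec(W, h_i) for h_i in h]
    coeffs = []
    for i, pa in enumerate(parents):
        if not pa:
            coeffs.append({i: 1.0})  # no parent: a_ii = 1, a_ij = 0 otherwise
            continue
        # z[i] + z[j] is list concatenation, i.e. the splicing [W h_i || W h_j]
        scores = {j: leaky_relu(sum(a * v for a, v in zip(w_a, z[i] + z[j])))
                  for j in pa}
        top = max(scores.values())
        exps = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        coeffs.append({j: e / total for j, e in exps.items()})
    return coeffs


def multi_head_layer(h, parents, heads, act=math.tanh):
    """Formula (2): per-head weighted sum of parent features, heads concatenated."""
    out = [[] for _ in h]
    for W, w_a in heads:
        coeffs = attention_coefficients(h, parents, W, w_a)
        z = [matvec(W, h_i) for h_i in h]
        for i, a_i in enumerate(coeffs):
            for m in range(len(W)):
                out[i].append(act(sum(a * z[j][m] for j, a in a_i.items())))
    return out


# Toy graph (hypothetical): X0 -> X1, X0 -> X2, X1 -> X2
rng = random.Random(0)
F, M, K = 1, 2, 3
h = [[0.5], [-1.0], [2.0]]
parents = [[], [0], [0, 1]]
heads = [([[rng.uniform(-1, 1) for _ in range(F)] for _ in range(M)],
          [rng.uniform(-1, 1) for _ in range(2 * M)]) for _ in range(K)]
new_h = multi_head_layer(h, parents, heads)  # each node now has M * K dimensions
```

Restricting the softmax to `parents[i]` is what distinguishes this sketch from a standard graph attention layer, which would attend over all neighbors regardless of edge direction.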
Preferably, the causal path constraint in step S11 means determining from prior knowledge whether a direct causal relationship exists between variables $X_i$ and $X_j$, i.e., constraining whether an edge exists between causal graph nodes $X_i$ and $X_j$; the causal direction constraint means determining from prior knowledge that variable $X_i$ is a cause of variable $X_j$, i.e., constraining causal graph node $X_i$ to be an ancestor node of $X_j$;
preferably, step S13 adopts the PC algorithm as the constraint-based causal discovery algorithm;
preferably, in the model training process of step S4, the loss function comprises two parts: the cross-entropy (CE) loss between the fault detection result output by the model and the true result, and the independent support score (IOSS) loss calculated from the node features extracted by the model, as shown in formula (3). The cross entropy is calculated as shown in formula (4), and the total loss of model training is shown in formula (5). Using the independent support score as a constraint restrains the causal separation property of the features extracted for each node, which alleviates the over-smoothing problem and improves the quality of the extracted node features. Here $y_{ij}$ and $p_{ij}$ denote the true system state and the predicted system state, respectively, and $\alpha$ is the balance coefficient between the two losses.
$$L_{CE} = -\sum_{i} \sum_{j} y_{ij} \log p_{ij} \tag{4}$$
$$L = L_{CE} + \alpha \, \mathrm{IOSS} \tag{5}$$
Preferably, the proposed causal graph attention neural network is optimized using the Adam algorithm, and hyperparameters such as the high-dimensional space dimension $M$ in step S21, the number of heads $K$ of the causal attention mechanism in step S22, the quantile $\beta$ in step S3, and the number of hidden layers and number of neurons in step S4 are determined by grid search.
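The grid search mentioned above amounts to evaluating every combination of candidate hyperparameter values and keeping the best one. A minimal sketch follows; the candidate values and the `evaluate` callable (which in practice would train the model with Adam and return a validation score) are hypothetical placeholders, not values from the source.

```python
from itertools import product


def grid_search(grid, evaluate):
    """Try every combination in `grid` and return the best configuration.

    grid: dict mapping hyperparameter name -> list of candidate values.
    evaluate: callable taking a config dict and returning a score (higher is better).
    """
    names = list(grid)
    best_cfg, best_score = None, float("-inf")
    for combo in product(*(grid[name] for name in names)):
        cfg = dict(zip(names, combo))
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


# Hypothetical candidate values for M (S21), K (S22) and the quantile beta (S3).
grid = {"M": [2, 4, 8], "K": [1, 2, 3], "beta": [0.05, 0.5, 0.95]}


def toy_score(cfg):
    # Stand-in for "train the model and return validation accuracy".
    return -(cfg["M"] - 4) ** 2 - (cfg["K"] - 2) ** 2 - abs(cfg["beta"] - 0.5)


best_cfg, best_score = grid_search(grid, toy_score)
```

The number of evaluations is the product of the list lengths (27 here), so the candidate lists for the number of hidden layers and neurons should be kept short.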
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention provides a fault detection method for complex electromechanical systems based on a causal graph attention neural network. It combines known causal relationships with a constraint-based causal discovery method to construct a causal graph reflecting the causal relationships among the monitoring variables of the complex electromechanical system; adaptively aggregates the features of parent nodes (cause variables) through the causal graph attention neural network to generate the embedded representations of child nodes (effect variables); calculates an independent support score on the features extracted for each node and uses it as a constraint term of the loss function to constrain the causal separability of the node representations; and finally maps the representations of all nodes through a flatten layer and a fully connected neural network to output the fault detection result of the complex electromechanical system. The method can effectively improve the fault detection performance of current complex electromechanical systems;
(2) Exploiting the property that child nodes in a causal graph are influenced by their parent nodes, the invention proposes a causal graph attention neural network in which an attention mechanism adaptively calculates the weights of parent nodes to generate the embedded representations of child nodes. The cause variables are thus aggregated according to their importance to generate the embedded representations of the effect variables, which improves the extraction of fault features of complex electromechanical systems;
(3) Considering that a graph neural network may cause node features to converge and become over-smoothed as the number of layers increases, the invention calculates an independent support score on the features of each node and uses it as a constraint term of the loss function, encouraging the extracted node features to have causal separation properties and thereby improving the effectiveness of node feature extraction;
(4) The method uses known causal relationships and monitoring data to construct a causal graph of the monitoring variables of the complex electromechanical system. Because the occurrence and propagation of system faults carry causal information, the method extracts features by incorporating the causal influence mechanism and accounting for the differing importance of the cause variables, and finally performs fault detection, thereby improving the fault detection accuracy and performance for complex electromechanical systems and offering high economic and social benefits.
Drawings
FIG. 1 is a schematic flow chart illustrating the steps of a method for detecting faults of a complex electromechanical system based on a causal graph attention neural network according to the present invention;
FIG. 2 is a block diagram of exemplary steps of a complex electromechanical system fault detection method based on a causal graph attention neural network proposed by the present invention;
FIG. 3 is a simplified block diagram of a high speed rail braking system according to an embodiment of the present invention;
FIG. 4 is a causal graph constructed in accordance with an embodiment of the present invention, incorporating known causal relationships and constraint-based causal discovery methods.
Detailed Description
Exemplary embodiments, features and aspects of the present invention will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
Specifically, the invention provides a complex electromechanical system fault detection method based on a causal graph attention neural network, as shown in fig. 1, which comprises the following steps:
S1: combining known causal relationships with a constraint-based causal discovery method, and taking the monitoring variable data of the complex electromechanical system as input, to construct a causal graph of the system's monitoring variables, specifically comprising the following substeps:
S11: determining causal path constraints and causal direction constraints from existing knowledge;
S12: generating a causal graph skeleton, and adding or deleting the corresponding edges according to the causal path constraints;
S13: preprocessing the data to convert all data to numerical types, then using a constraint-based causal discovery algorithm to iteratively search for and construct the causal graph, and adding the corresponding edges according to the causal direction constraints;
S14: verifying whether the causal discovery result satisfies the known relationship constraints; if so, outputting the resulting causal graph; if not, adjusting the parameter threshold of the PC algorithm and returning to step S12;
S2: extracting and learning the node representations of the causal graph using the proposed causal graph attention neural network, specifically comprising the following substeps:
S21: calculating the attention coefficients based on the causal relationships. The input feature $h_i \in \mathbb{R}^{F}$ of each node is transformed to $M$ dimensions by a trainable parameter $W \in \mathbb{R}^{M \times F}$, where $M \geq 2$ and $F$ denotes the dimension of the original feature; an attention mechanism is applied to each causal node pair (parent-child node pair, or cause-effect variable pair), and the attention coefficient between the causal pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where $X_j \in X_{Pa(i)}$ denotes that node $X_j$ is a parent node of node $X_i$, i.e., node $X_j$ is a cause variable of node $X_i$, $\|$ denotes the feature concatenation operation, and LeakyReLU is a nonlinear activation function; for a node $X_i$ without parent nodes, the attention coefficients are defined as $a_{ij} = 0$ ($j \neq i$) and $a_{ii} = 1$.
$$a_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(W_a \left[ W h_i \,\|\, W h_j \right]\right)\right)}{\sum_{X_l \in X_{Pa(i)}} \exp\left(\mathrm{LeakyReLU}\left(W_a \left[ W h_i \,\|\, W h_l \right]\right)\right)} \tag{1}$$
where $W_a$ and $W$ are trainable parameters; $h_i$ and $h_j$ denote the input features of nodes $X_i$ and $X_j$ respectively, with node $X_j$ being a cause variable of $X_i$; $W_a \left[ W h_i \,\|\, W h_j \right]$ denotes that the input features $h_i$ and $h_j$ of nodes $X_i$ and $X_j$ are transformed to $M$ dimensions by the trainable parameter $W$, concatenated, and then transformed to 1 dimension by the trainable parameter $W_a$;
S22: generating the node representations using a multi-head causal attention mechanism. The representation of each node is a linear weighting of the features of all its parent nodes (cause variables), and a multi-head attention mechanism is used to extract the features fully, as shown in formula (2), where $h_i'$ denotes the newly generated feature of node $X_i$; $\sigma$ denotes the activation function; $a_{ij}^{k}$ and $W^{k}$ denote, respectively, the attention coefficient between child node (effect variable) $X_i$ and one of its parent nodes (cause variable) $X_j$ in the $k$-th attention head, and the trainable transformation parameter in the $k$-th attention head; $K$ denotes the total number of attention heads; the new representation generated for each node thus has a total of $M \times K$ dimensions.
$$h_i' = \Big\Vert_{k=1}^{K} \, \sigma\Big( \sum_{X_j \in X_{Pa(i)}} a_{ij}^{k} \, W^{k} h_j \Big) \tag{2}$$
S3: adding together the features extracted in step S2 by each causal graph attention neural network layer, and calculating, for each node representation, its independent support score (IOSS) under the different attention heads, as shown in formula (3), where $N$ denotes the total number of nodes, $h_i'$ denotes the representation of node $X_i$, and $Q^{\beta}_{i}$ denotes taking the $\beta$ quantile over the variable $i$:
$$\mathrm{IOSS} = Q^{\beta''}_{s \in \{1,\dots,S\}} \Big( Q^{\beta'}_{i \in \{1,\dots,N\}} \big\| U_s - \tilde{h}_i' \big\|^2 \Big) \tag{3}$$
where $U_s$ denotes the $s$-th of $S$ random samples drawn from the theoretical joint distribution of the node features under independent support; it is an $M \times K$-dimensional vector obtained from the joint distribution of the $\tilde{h}_{i,m}'$, where $\tilde{h}_{i,m}'$ denotes the $m$-th dimension of the representation of node $X_i$ after max-min normalization; $\beta'$ and $\beta''$ denote specific values of $\beta$. $h_i'$ denotes the newly generated feature of node $X_i$, with $M \times K$ dimensions.
A quantile is a numerical point that divides the probability distribution range of a random variable into several equal parts; common examples are the median (i.e., the 2-quantile), quartiles, and percentiles. Here, for each $U_s$, the squared distance $\| U_s - \tilde{h}_i' \|^2$ between it and each max-min-normalized node representation $\tilde{h}_i'$ is calculated first. Since there are $N$ nodes in total, each $U_s$ yields a series of $N$ one-dimensional real values, and the $\beta'$ quantile of these $N$ values is taken, so that each $U_s$ yields one quantile and all the $U_s$ together yield $S$ one-dimensional real values. The $\beta''$ quantile of these $S$ values is then taken as the final result, a one-dimensional real number that is the value of the IOSS and characterizes the causal coupling loss of the dimensions of the node representations.
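The two-stage quantile computation just described can be sketched in pure Python. This is an illustrative reading of the text, not the patented implementation: sampling each $U_s$ uniformly from the unit hypercube (the product of the per-dimension supports after max-min normalization), the linear-interpolation quantile, and the default values of `n_samples`, `beta_in` and `beta_out` are all assumptions.

```python
import math
import random


def quantile(values, beta):
    """Linear-interpolation quantile for beta in [0, 1]."""
    xs = sorted(values)
    pos = beta * (len(xs) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def min_max_normalise(feats):
    """Scale every feature dimension to [0, 1] across the nodes."""
    cols = list(zip(*feats))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero
    return [[(v - l) / s for v, l, s in zip(row, lows, spans)] for row in feats]


def ioss(feats, n_samples=200, beta_in=0.05, beta_out=0.95, rng=None):
    """feats: one (M*K)-dimensional representation per node."""
    rng = rng or random.Random(0)
    norm = min_max_normalise(feats)
    dim = len(norm[0])
    per_sample = []
    for _ in range(n_samples):
        # Assumption: U_s is drawn uniformly from the independent-support box.
        u = [rng.random() for _ in range(dim)]
        sq_dists = [sum((a - b) ** 2 for a, b in zip(u, row)) for row in norm]
        per_sample.append(quantile(sq_dists, beta_in))   # beta' over N nodes
    return quantile(per_sample, beta_out)                # beta'' over S samples


# Features that cover the corners of the box versus perfectly coupled features.
spread = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
coupled = [[0.0, 0.0], [1.0, 1.0]]
score_spread = ioss(spread, beta_in=0.0, beta_out=1.0)
score_coupled = ioss(coupled, beta_in=0.0, beta_out=1.0)
```

With `beta_in=0.0` and `beta_out=1.0` the quantiles reduce to a minimum and a maximum, so the score is the largest gap between any sampled point of the box and its nearest node representation; coupled features leave large regions of the box uncovered and therefore score higher than spread-out features.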
S4: feeding the features extracted for all nodes into a flatten layer for concatenation, and feeding the concatenated features into a fully connected neural network comprising a hidden layer and one output layer; the activation functions in the fully connected neural network apply nonlinear processing to the features, and the output layer outputs the fault detection result of the complex electromechanical system.
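As a rough illustration of step S4, the forward pass of a flatten layer followed by a small fully connected network could look like the sketch below. The ReLU hidden activation, the softmax output, the two output classes (normal and fault), the layer sizes, and the random initialization are assumptions; the source only states that a hidden layer, an output layer, and activation functions are used.

```python
import math
import random


def dense(x, weights, bias):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def detect(node_feats, params):
    """Flatten all node representations, then hidden layer and output layer."""
    x = [v for feat in node_feats for v in feat]               # flatten layer
    hidden = [max(0.0, v) for v in dense(x, params["W1"], params["b1"])]  # ReLU
    return softmax(dense(hidden, params["W2"], params["b2"]))


def init_params(n_in, n_hidden, n_out, rng):
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"W1": mat(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "W2": mat(n_out, n_hidden), "b2": [0.0] * n_out}


# Three nodes with 2-dimensional representations (hypothetical values).
node_feats = [[0.1, -0.3], [0.7, 0.2], [-0.5, 0.9]]
params = init_params(n_in=6, n_hidden=4, n_out=2, rng=random.Random(0))
probs = detect(node_feats, params)   # probabilities for [normal, fault]
```

In the method described here the flattened input would have N × M × K entries, one block of M × K values per node of the causal graph.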
Preferably, the causal path constraint in step S11 means determining from prior knowledge whether a direct causal relationship exists between variables $X_i$ and $X_j$, i.e., constraining whether an edge exists between causal graph nodes $X_i$ and $X_j$; the causal direction constraint means determining from prior knowledge that variable $X_i$ is a cause of variable $X_j$, i.e., constraining causal graph node $X_i$ to be an ancestor node of $X_j$;
preferably, step S13 adopts the PC algorithm as the constraint-based causal discovery algorithm;
preferably, in the model training process of step S4, the loss function comprises two parts: the cross-entropy (CE) loss between the fault detection result output by the model and the true result, and the independent support score (IOSS) loss calculated from the node features extracted by the model, as shown in formula (3). The cross entropy is calculated as shown in formula (4), and the total loss of model training is shown in formula (5). Using the independent support score as a constraint restrains the causal separation property of the features extracted for each node, which alleviates the over-smoothing problem and improves the quality of the extracted node features. Here $y_{ij}$ and $p_{ij}$ denote the true system state and the predicted system state, respectively, and $\alpha$ is the balance coefficient between the two losses.
$$L_{CE} = -\sum_{i} \sum_{j} y_{ij} \log p_{ij} \tag{4}$$
$$L = L_{CE} + \alpha \, \mathrm{IOSS} \tag{5}$$
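The combination of the two loss terms in formula (5) can be illustrated as follows. This sketch assumes that i indexes samples and j indexes classes, that the true states are one-hot encoded, and that the cross entropy is summed rather than averaged over samples; the value `alpha=0.1` is a hypothetical default, and `ioss_value` is taken as an already computed number.

```python
import math


def cross_entropy(y_true, p_pred, eps=1e-12):
    """Summed cross entropy: -sum_i sum_j y_ij * log(p_ij)."""
    return -sum(y * math.log(max(p, eps))
                for y_row, p_row in zip(y_true, p_pred)
                for y, p in zip(y_row, p_row))


def total_loss(y_true, p_pred, ioss_value, alpha=0.1):
    """Formula (5): L = L_CE + alpha * IOSS."""
    return cross_entropy(y_true, p_pred) + alpha * ioss_value


# One sample whose true class is the first, predicted with probability 0.5.
y_true = [[1, 0]]
p_pred = [[0.5, 0.5]]
loss = total_loss(y_true, p_pred, ioss_value=2.0, alpha=0.5)
```

A larger `alpha` puts more weight on the causal-separation constraint relative to classification accuracy, which is why it is treated as a balance coefficient.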
Preferably, the proposed causal graph attention neural network is optimized using the Adam algorithm, and hyperparameters such as the high-dimensional space dimension $M$ in step S21, the number of heads $K$ of the causal attention mechanism in step S22, the quantile $\beta$ in step S3, and the number of hidden layers and number of neurons in step S4 are determined by grid search.
The fault detection process of the invention is further described in detail below using operating-state monitoring data collected from a high-speed rail braking system. FIG. 3 is a simplified structural diagram of the high-speed rail braking system, which includes 39 monitoring variables (covering information such as brake valve state, line voltage, and line current, denoted X1, X2, …, X39). The specific implementation steps of the complex electromechanical system fault detection method based on the causal graph attention neural network disclosed by the invention are shown in FIG. 2 and are as follows:
S1: Combine the known causal relationships with a constraint-based causal discovery method, take the data of the 39 monitoring variables as input, and construct a causal graph of the monitoring variables of the high-speed rail brake system. This specifically comprises the following sub-steps:
S11: Determine the causal path constraints and causal direction constraints of the monitoring variables in the high-speed rail brake system from existing knowledge, as shown in Table 1;
S12: Generate a causal graph skeleton, and add or delete the corresponding edges according to the causal path constraints;
S13: Preprocess the data so that all of it is numerical. Specifically, variables of each non-numerical type are converted into numerical values with encoding methods such as label encoding and dummy variables, and the data are normalized with the max-min method. For example, the train operation mode is a categorical variable whose values denote states rather than numbers; after label encoding, it becomes a numerical code for the corresponding state, such as 0, 1, 2. Then continue searching and constructing the causal graph with a constraint-based causal discovery algorithm, and add the corresponding edges according to the causal direction constraints. The PC algorithm is used as the constraint-based causal discovery algorithm;
S14: Verify whether the causal discovery result satisfies the known relationship constraints. If so, output the resulting causal graph; if not, adjust the parameter threshold of the PC algorithm and return to S12.
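The preprocessing named in S13 (label encoding followed by max-min normalization) can be sketched as follows. This is a minimal illustration in plain Python, not the implementation used in the embodiment; the category names and voltage readings are hypothetical values chosen only to show the transformation.

```python
def label_encode(values):
    """Map each distinct category to an integer code (0, 1, 2, ...) in order of first appearance."""
    codes = {}
    encoded = []
    for v in values:
        if v not in codes:
            codes[v] = len(codes)
        encoded.append(codes[v])
    return encoded, codes


def min_max_normalize(values):
    """Scale numeric values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical operation-mode readings (category names are illustrative only).
modes = ["traction", "coasting", "braking", "coasting"]
mode_codes, mapping = label_encode(modes)

# Hypothetical line-voltage readings.
line_voltage = [24.0, 25.0, 26.0, 28.0]
voltage_scaled = min_max_normalize(line_voltage)
```

Label encoding imposes an arbitrary order on the categories, which is one reason the text also mentions dummy variables as an alternative encoding.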
Preferably, the causal path constraint in step S11 means using prior knowledge to determine whether a direct causal relationship exists between variables $X_i$ and $X_j$, i.e. constraining whether an edge exists between causal graph nodes $X_i$ and $X_j$. The causal direction constraint means using prior knowledge to determine that variable $X_i$ is a cause of variable $X_j$, i.e. constraining causal graph node $X_i$ to be an ancestor node of $X_j$. Adding the known causal relationships suppresses errors produced by the data-driven causal discovery algorithm and improves the reliability of the result.
Preferably, the PC algorithm is adopted as the constraint-based causal discovery algorithm in step S13, and the resulting causal graph of the monitoring variables of the high-speed rail brake system is shown in fig. 4.
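The two kinds of constraint can be illustrated with a small sketch: path constraints add or remove edges of the undirected skeleton (S12), and direction constraints are checked as ancestor relations on the directed result (S14). The function names and the toy graph below are hypothetical and do not reproduce the constraints of Table 1; the PC algorithm itself is not implemented here.

```python
def apply_path_constraints(skeleton, required, forbidden):
    """Return the undirected edge set with required edges added and forbidden edges removed."""
    edges = {frozenset(e) for e in skeleton}
    edges |= {frozenset(e) for e in required}
    edges -= {frozenset(e) for e in forbidden}
    return edges


def is_ancestor(dag, src, dst):
    """Depth-first search for a directed path src -> ... -> dst (dag maps node -> children)."""
    stack, seen = list(dag.get(src, [])), set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(dag.get(node, []))
    return False


def satisfies_direction_constraints(dag, constraints):
    """Every (cause, effect) pair must have cause as an ancestor of effect."""
    return all(is_ancestor(dag, c, e) for c, e in constraints)


# Illustrative toy graph, not the constraints of Table 1.
skeleton = [("X1", "X2"), ("X2", "X3"), ("X1", "X3")]
edges = apply_path_constraints(
    skeleton, required=[("X3", "X4")], forbidden=[("X1", "X3")]
)
dag = {"X1": ["X2"], "X2": ["X3"], "X3": ["X4"]}
ok = satisfies_direction_constraints(dag, [("X1", "X4")])
violated = satisfies_direction_constraints(dag, [("X4", "X1")])
```

In the loop described by S14, a failed check such as `violated` would trigger an adjustment of the PC threshold and a return to S12.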
TABLE 1 Causal path constraints and causal direction constraints established from known causal relationships
[Table 1: shown as an image in the original publication]
S2: Use the proposed causal graph attention neural network to extract and learn the node features of the causal graph obtained in step S1 (fig. 4), which specifically comprises the following sub-steps:
S21: Calculate the attention coefficients based on the causal relationships. The input feature of each node is transformed into a 2-dimensional vector by the trainable parameter $W \in R^{2 \times 1}$. An attention mechanism is applied to each causal node pair (parent-child node pair, i.e. cause-effect variable pair), and the attention coefficient between the pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where $X_j \in X_{Pa(i)}$ denotes a parent node of child node $X_i$, $\|$ denotes the feature concatenation operation, and LeakyReLU is the nonlinear activation function, computed as shown in formula (2). For the node without a parent node, namely $X_{36}$, the attention coefficients are defined as $a_{36,j} = 0$ ($j \neq 36$) and $a_{36,36} = 1$.
[Formula (1): shown as an image in the original publication]
[Formula (2): shown as an image in the original publication]
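One plausible reading of formulas (1) and (2), assuming the standard graph-attention form restricted to the parents of each node, is given below. The symbols $h_i$, $h_j$ for the node input features and $\lambda$ for the negative slope are notation introduced here for illustration and are not taken from the original; $W_a$ is the trainable parameter named in claim 1.

```latex
a_{ij} = \frac{\exp\!\big(\mathrm{LeakyReLU}\big(W_a\,[\,W h_i \,\|\, W h_j\,]\big)\big)}
              {\sum_{X_k \in X_{Pa(i)}} \exp\!\big(\mathrm{LeakyReLU}\big(W_a\,[\,W h_i \,\|\, W h_k\,]\big)\big)},
\qquad X_j \in X_{Pa(i)}

\mathrm{LeakyReLU}(x) =
\begin{cases}
x, & x > 0 \\
\lambda x, & x \le 0
\end{cases}
```

Under this reading the coefficients of each child node sum to 1 over its parents, which is consistent with the special case $a_{36,36} = 1$ defined for the parentless node.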
S22: Generate the node representations with a multi-head causal attention mechanism. The representation of each node is a linear weighting of the features of all its parent nodes (cause variables), and a multi-head attention mechanism is used to extract the features fully, as shown in formula (3), which gives the newly generated feature of node $X_i$. elu is used as the activation function, computed as shown in formula (4). In the k-th attention head, the attention coefficient is taken between child node (effect variable) $X_i$ and one of its parent nodes (cause variable) $X_j$, and $W^k$ is the trainable transformation parameter of that head. K = 8 attention heads are used, so the new representation generated for each node has 2 × 8 dimensions in total.
[Formula (3): shown as an image in the original publication]
[Formula (4): shown as an image in the original publication]
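Steps S21 and S22 together can be sketched as a single multi-head causal attention layer. The code below is a hedged illustration in plain Python: the LeakyReLU slope of 0.2, the ELU scale of 1, the random initialization, and the three-node toy graph are all assumptions made for the example, and per-head outputs are concatenated as implied by the stated 2 × 8 output dimension.

```python
import math
import random


def leaky_relu(x, slope=0.2):
    # slope=0.2 is an assumed value; the text does not state it.
    return x if x > 0 else slope * x


def elu(x):
    # ELU with scale 1 (assumed).
    return x if x > 0 else math.exp(x) - 1.0


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def causal_attention(h, parents, W, w_a):
    """One head: attention coefficients a[i][j] over the parents j of each node i."""
    z = {i: matvec(W, h[i]) for i in h}
    a = {}
    for i in h:
        if not parents.get(i):          # parentless node: a_ii = 1, all others 0
            a[i] = {i: 1.0}
            continue
        scores = {
            j: leaky_relu(sum(w * x for w, x in zip(w_a, z[i] + z[j])))
            for j in parents[i]
        }
        top = max(scores.values())      # subtract the max for numerical stability
        weights = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(weights.values())
        a[i] = {j: w / total for j, w in weights.items()}
    return a, z


def multi_head_layer(h, parents, heads):
    """Concatenate, over all heads, elu(sum_j a_ij * W h_j) for every node."""
    out = {i: [] for i in h}
    for W, w_a in heads:
        a, z = causal_attention(h, parents, W, w_a)
        for i in h:
            agg = [sum(a[i][j] * z[j][d] for j in a[i]) for d in range(len(W))]
            out[i].extend(elu(x) for x in agg)
    return out


random.seed(0)
M, F, K = 2, 1, 8                       # dimensions used in the embodiment
heads = [
    ([[random.uniform(-1, 1) for _ in range(F)] for _ in range(M)],
     [random.uniform(-1, 1) for _ in range(2 * M)])
    for _ in range(K)
]
h = {"X1": [0.3], "X2": [0.7], "X3": [0.1]}             # toy node features
parents = {"X1": [], "X2": ["X1"], "X3": ["X1", "X2"]}  # toy causal graph
new_h = multi_head_layer(h, parents, heads)
coeffs, _ = causal_attention(h, parents, *heads[0])
```

Each entry of `new_h` has M × K = 16 values, matching the 2 × 8 dimensions stated for the embodiment.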
S3: The features extracted by the 3 causal graph attention neural network layers are added together, and for the representation of each node the independent support score (IOSS) under the different attention mechanisms is calculated as shown in formula (5), where N is the number of all nodes and the remaining symbols denote the representation of node $X_i$ and the value at the β quantile for variable i in the calculation:
[Formula (5): shown as an image in the original publication]
In the formula, $U_s$ denotes the s-th of S random samples drawn from the independent-support theoretical joint distribution of the node features. The theoretical joint distribution is obtained by computing the joint distribution of the normalized feature dimensions, each term being the m-th dimension of the feature of node $X_i$ after max-min normalization; β' and β'' denote specific values of β. The newly generated feature of node $X_i$ comprises 2 × 8 dimensions.
S4: The features extracted for all nodes are input into a flatten layer for concatenation; the concatenated result is input into a two-layer fully connected neural network with 8 and 2 neurons respectively; finally, the fault detection result of the high-speed rail brake system is output through a sigmoid activation function, where 0 indicates normal and 1 indicates a fault.
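The detection head of S4 can be sketched as a forward pass through a flatten step and two fully connected layers. This is only an illustration under stated assumptions: the hidden activation (ReLU), the random weights, and the rule that the larger of the two sigmoid outputs decides the label are not specified in the text.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per row of weights."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def detect(node_features, params):
    """Flatten node features, apply the 8- and 2-neuron layers, return (outputs, label)."""
    flat = [v for feats in node_features for v in feats]   # flatten / concatenate
    (w1, b1), (w2, b2) = params
    hidden = [max(0.0, v) for v in dense(flat, w1, b1)]     # ReLU is an assumption
    probs = [sigmoid(v) for v in dense(hidden, w2, b2)]
    label = 0 if probs[0] >= probs[1] else 1                # 0 = normal, 1 = fault
    return probs, label


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
n_nodes, dim = 39, 16                                       # 39 variables, 2 x 8 dims each
features = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
params = (init_layer(8, n_nodes * dim, rng), init_layer(2, 8, rng))
probs, label = detect(features, params)
```

With 39 nodes and 16 dimensions per node, the flattened input to the first layer has 624 values.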
Preferably, in the model training process of step S4, the loss function comprises two parts: the cross-entropy (CE) loss between the fault detection result output by the model and the true result, and the independent support score (IOSS) loss calculated from the node features extracted by the model. The cross entropy is computed as shown in formula (6), and the total training loss as shown in formula (7). Using the independent support score as a constraint improves the causal separability of the features extracted for each node, which alleviates the over-smoothing problem and improves the feature extraction performance for each node. Here n is the number of input data samples, m = 2 is the number of system state categories, $y_{ij}$ and $p_{ij}$ are the true system state and the predicted system state respectively, and α = 1e-4 is the coefficient balancing the two losses.
[Formula (6): shown as an image in the original publication]
$L = L_{CE} + \alpha \cdot \mathrm{IOSS}$ (7)
where $L_{CE}$ is the cross-entropy loss and L is the loss function for model training.
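The combined loss can be sketched as below. The cross entropy is written in its common form, averaged over the n samples and summed over the m = 2 classes; whether formula (6) includes the 1/n factor cannot be confirmed from the text, so that is an assumption. The IOSS term is passed in as a number because its formula is not reproduced here, and the value 5.0 is made up.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean cross entropy; rows are one-hot labels y_ij and predicted probabilities p_ij."""
    n = len(y_true)
    total = 0.0
    for y_row, p_row in zip(y_true, y_pred):
        # eps guards against log(0) for saturated predictions.
        total -= sum(y * math.log(max(p, eps)) for y, p in zip(y_row, p_row))
    return total / n


def total_loss(y_true, y_pred, ioss, alpha=1e-4):
    """L = L_CE + alpha * IOSS, following formula (7)."""
    return cross_entropy(y_true, y_pred) + alpha * ioss


y_true = [[1, 0], [0, 1]]                     # sample 1 normal, sample 2 fault
y_pred = [[0.9, 0.1], [0.2, 0.8]]
ce = cross_entropy(y_true, y_pred)
loss = total_loss(y_true, y_pred, ioss=5.0)   # ioss=5.0 is a made-up value
```

With α = 1e-4 the IOSS term acts as a light regularizer next to the cross entropy.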
Preferably, the causal graph attention neural network provided by the invention is optimized with the Adam algorithm, and hyperparameters such as the dimension M of the high-dimensional space in step S21, the number of heads K of the causal attention mechanism in step S22, the quantile β in step S3, and the number of hidden-layer neurons in step S4 are determined by grid search.
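A grid search over these hyperparameters can be sketched as an exhaustive loop over all combinations. The candidate values below are illustrative, since the source does not list the searched ranges, and `toy_score` is a synthetic stand-in for training the model and returning a validation score.

```python
import itertools


def grid_search(grid, evaluate):
    """Try every combination in grid and return the best-scoring parameters and score."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Candidate values are illustrative; the source does not list the searched ranges.
grid = {"M": [2, 4], "K": [4, 8], "beta": [0.05, 0.1], "hidden": [8, 16]}


def toy_score(p):
    # Stand-in for "train the model and return a validation score"; purely synthetic.
    return -abs(p["M"] - 2) - abs(p["K"] - 8) - abs(p["hidden"] - 8) - p["beta"]


best, score = grid_search(grid, toy_score)
```

The cost grows with the product of the candidate counts (16 combinations here), so each evaluation would in practice be a full training run.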
To further verify the effectiveness and highlight the performance of the method, it is compared with a support vector machine (SVM), an artificial neural network (ANN), a convolutional neural network (CNN), a conventional graph convolutional neural network (GCN), and a conventional graph attention neural network (GAT). Two performance indexes commonly used for fault detection on imbalanced data, the F1 score and the G-mean score, are used as the criteria for comparison; they are calculated as shown in formulas (8) and (9):
$F_1 = \dfrac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$ (8)
$G\text{-}\mathrm{mean} = \sqrt{\mathrm{TPR} \cdot \mathrm{TNR}}$ (9)
where precision = TP/(TP + FP), recall = TPR = TP/(TP + FN), and TNR = TN/(TN + FP); TP, FP, TN, and FN denote, respectively, the number of samples correctly classified as faulty, the number incorrectly classified as faulty, the number correctly classified as normal, and the number incorrectly classified as normal. Both F1 and G-mean lie in the interval [0, 1], and a higher value indicates better performance of the method.
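The two scores follow directly from the confusion counts defined above. The sketch below uses hypothetical counts for an imbalanced test set; it does not reproduce the data behind Table 2 and does not guard against empty classes (zero denominators).

```python
import math


def f1_and_gmean(tp, fp, tn, fn):
    """Compute the F1 score and G-mean from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # recall equals the true positive rate (TPR)
    tnr = tn / (tn + fp)             # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(recall * tnr)
    return f1, g_mean


# Hypothetical counts: 10 faulty and 90 normal samples.
f1, g_mean = f1_and_gmean(tp=8, fp=2, tn=88, fn=2)
```

Because G-mean multiplies the recall on both classes, a detector that labels everything as normal scores 0 on it even when its accuracy is high, which is why it suits imbalanced fault data.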
The comparison results are shown in Table 2 below and indicate that the complex electromechanical system fault detection method provided by the invention has excellent fault detection capability. The constraint-based causal discovery method combined with existing knowledge can effectively extract the causal relationships among the high-dimensional monitoring variables of a complex electromechanical system, so that fault detection modeling can follow the causal effects among the system's components. By combining the nature of causal relationships with the attention mechanism, the causal graph attention neural network adaptively aggregates the features of the parent nodes (cause variables) according to their different degrees of importance to generate the embedded features of the child nodes (effect variables), which improves node feature extraction. In addition, using the independent support score as a constraint on the extracted node representations encourages the features under different attention mechanisms to be causally separable, which greatly improves the performance of the fault detection model.
TABLE 2 Fault detection performance of the method of the invention and the comparison methods
Method | F1 score | G-mean score
The method of the invention | 0.8473 | 0.9634
Conventional graph attention neural network | 0.7974 | 0.9574
Conventional graph convolutional neural network | 0.7951 | 0.8396
Support vector machine | 0.6892 | 0.7451
Artificial neural network | 0.5454 | 0.6849
Convolutional neural network | 0.7470 | 0.7952
Finally, it should be noted that: the above-mentioned embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (5)

1. A complex electromechanical system fault detection method based on a causal graph attention neural network, characterized by comprising the following steps:
S1: Combine known causal relationships with a constraint-based causal discovery method, take the monitoring variable data of the complex electromechanical system as input, and construct a causal graph of the system monitoring variables, which specifically comprises the following sub-steps:
S11: Determine the causal path constraints and causal direction constraints from existing knowledge;
S12: Generate a causal graph skeleton, and add or delete the corresponding edges according to the causal path constraints;
S13: Preprocess the data so that all of it is numerical, then continue searching and constructing the causal graph with a constraint-based causal discovery algorithm, and add the corresponding edges according to the causal direction constraints;
S14: Verify whether the causal discovery result satisfies the known relationship constraints; if so, output the resulting causal graph; if not, adjust the parameter threshold of the constraint-based causal discovery algorithm and return to step S12;
S2: Extract and learn the node representations of the causal graph with a causal graph attention neural network, which specifically comprises the following sub-steps:
S21: Calculate the causal attention coefficients: the input feature of each node is transformed to M dimensions by the trainable parameter $W \in R^{M \times F}$, where M ≥ 2 is a positive integer and F is the dimension of the original feature; a causality-based attention mechanism is applied to the cause-effect node pair under each causal mechanism, and the attention coefficient between the pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where $X_j \in X_{Pa(i)}$ indicates that node $X_j$ is a parent node of node $X_i$, i.e. node $X_j$ is a cause variable of node $X_i$; $\|$ denotes the feature concatenation operation and LeakyReLU is the nonlinear activation function; for a node without parent nodes, the attention coefficients are defined as $a_{ij} = 0$ ($j \neq i$) and $a_{ii} = 1$;
[Formula (1): shown as an image in the original publication]
In the formula, $W_a$ is a trainable parameter; the two feature symbols denote the input features of nodes $X_i$ and $X_j$ respectively, where node $X_j$ is a cause variable of $X_i$; the concatenation term denotes that the input features of nodes $X_i$ and $X_j$ are each transformed to M dimensions by the trainable parameter W, concatenated, and then transformed to 1 dimension by the trainable parameter $W_a$;
S22: Generate the node representations with a multi-head causal attention mechanism: the representation of each node is a linear weighting of the features of all its parent nodes, i.e. its cause variables, and a multi-head attention mechanism is used to extract the features fully, as shown in formula (2), which gives the newly generated feature of node $X_i$; σ denotes the activation function; in the k-th attention head, the attention coefficient is taken between child node $X_i$ and one of its parent nodes $X_j$, and $W^k$ is the trainable transformation parameter of that head; K denotes the total number of attention heads; the new representation generated for each node thus comprises M × K dimensions;
[Formula (2): shown as an image in the original publication]
S3: Add together the node representations extracted by each causal graph attention neural network layer in step S2, and for each node representation calculate the independent support score IOSS under the different attention mechanisms, as shown in formula (3), where N is the number of all nodes and the remaining symbols denote the representation of node $X_i$ and the value at the β quantile for variable i in the calculation:
[Formula (3): shown as an image in the original publication]
In the formula, $U_s$ denotes the s-th of S random samples drawn from the independent-support theoretical joint distribution of the node features; it is an M × K dimensional vector. The theoretical joint distribution is obtained by computing the joint distribution of the normalized feature dimensions, each term being the m-th dimension of the feature of node $X_i$ after max-min normalization; β' and β'' denote specific values of β; the newly generated feature of node $X_i$ comprises M × K dimensions;
S4: Input the features extracted for all nodes into a flatten layer for concatenation, and input the concatenated features into a fully connected neural network comprising hidden layers and one output layer; the activation functions in the fully connected neural network apply nonlinear processing to the features, and the output layer outputs the fault detection result of the complex electromechanical system.
2. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: the causal path constraint in step S11 means using existing knowledge to determine whether a direct causal relationship exists between variables $X_i$ and $X_j$, i.e. constraining whether an edge exists between causal graph nodes $X_i$ and $X_j$; the causal direction constraint means using prior knowledge to determine that variable $X_i$ is a cause of variable $X_j$, i.e. constraining causal graph node $X_i$ to be an ancestor node of $X_j$.
3. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: the constraint-based causal discovery algorithm in step S13 is specifically the PC algorithm.
4. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: in the model training process of step S4, the loss function of the model comprises two parts, namely the cross-entropy (CE) loss between the fault detection result output by the model and the true result, and the independent support score (IOSS) loss calculated from the node features extracted by the model; the cross entropy is calculated as shown in formula (4), and the total loss of model training as shown in formula (5); using the independent support score as a constraint improves the causal separability of the features extracted for each node, thereby alleviating the over-smoothing problem and improving the feature extraction performance for each node; $y_{ij}$ and $p_{ij}$ denote the true system state and the predicted system state respectively, and α is the coefficient balancing the two losses;
[Formula (4): shown as an image in the original publication]
$L = L_{CE} + \alpha \cdot \mathrm{IOSS}$ (5)
where $L_{CE}$ is the cross-entropy loss and L is the loss function for model training.
5. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: the causal graph attention neural network is optimized with the Adam algorithm, and hyperparameters such as the dimension M in step S21, the number of heads K of the causal attention mechanism in step S22, the quantile β in step S3, and the number of hidden layers and of neurons in step S4 are determined by grid search.
CN202210975693.0A 2022-08-15 2022-08-15 Complex electromechanical system fault detection method based on causal graph attention neural network Pending CN115310837A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210975693.0A CN115310837A (en) 2022-08-15 2022-08-15 Complex electromechanical system fault detection method based on causal graph attention neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210975693.0A CN115310837A (en) 2022-08-15 2022-08-15 Complex electromechanical system fault detection method based on causal graph attention neural network

Publications (1)

Publication Number Publication Date
CN115310837A true CN115310837A (en) 2022-11-08

Family

ID=83862085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210975693.0A Pending CN115310837A (en) 2022-08-15 2022-08-15 Complex electromechanical system fault detection method based on causal graph attention neural network

Country Status (1)

Country Link
CN (1) CN115310837A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115793590A (en) * 2023-01-30 2023-03-14 江苏达科数智技术有限公司 Data processing method and platform suitable for system safety operation and maintenance

Similar Documents

Publication Publication Date Title
CN102520341B (en) Analog circuit fault diagnosis method based on Bayes-KFCM (Kernelized Fuzzy C-Means) algorithm
CN112508085B (en) Social network link prediction method based on perceptual neural network
Yin et al. Wasserstein generative adversarial network and convolutional neural network (WG-CNN) for bearing fault diagnosis
CN111353373B (en) Related alignment domain adaptive fault diagnosis method
JPWO2008114863A1 (en) Diagnostic equipment
CN116628597B (en) Heterogeneous graph node classification method based on relationship path attention
CN112147432A (en) BiLSTM module based on attention mechanism, transformer state diagnosis method and system
CN116821776B (en) Heterogeneous graph network node classification method based on graph self-attention mechanism
CN113505655A (en) Bearing fault intelligent diagnosis method for digital twin system
CN113361559A (en) Multi-mode data knowledge information extraction method based on deep width joint neural network
CN115310837A (en) Complex electromechanical system fault detection method based on causal graph attention neural network
CN115879505A (en) Self-adaptive correlation perception unsupervised deep learning anomaly detection method
Zhang et al. An intrusion detection method based on stacked sparse autoencoder and improved gaussian mixture model
Pranavan et al. Contrastive predictive coding for anomaly detection in multi-variate time series data
Yuan et al. Improving fault tolerance in diagnosing power system failures with optimal hierarchical extreme learning machine
CN116935128A (en) Zero sample abnormal image detection method based on learning prompt
CN116743555A (en) Robust multi-mode network operation and maintenance fault detection method, system and product
CN116467930A (en) Transformer-based structured data general modeling method
CN115659135A (en) Anomaly detection method for multi-source heterogeneous industrial sensor data
CN115168864A (en) Intelligent cross contract vulnerability detection method based on feature cross
CN111797732B (en) Video motion identification anti-attack method insensitive to sampling
CN114580934A (en) Early warning method for food detection data risk based on unsupervised anomaly detection
Wang et al. Hierarchical multimodal fusion network with dynamic multi-task learning
CN114022739A (en) Zero sample learning method based on combination of alignment variational self-encoder and triple
CN113159976A (en) Identification method for important users of microblog network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination