CN115310837A

CN115310837A - Complex electromechanical system fault detection method based on causal graph attention neural network

Info

Publication number: CN115310837A
Application number: CN202210975693.0A
Authority: CN
Inventors: 刘杰; 郑舒文; 王冲
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2022-08-15
Filing date: 2022-08-15
Publication date: 2022-11-08

Abstract

The invention provides a fault detection method for complex electromechanical systems based on a causal graph attention neural network, comprising the following steps: S1: combining known causal relationships with constraint-based causal discovery, and taking monitoring data of the complex electromechanical system as input to construct a causal graph; S2: extracting node features in the causal graph by using the causal graph attention neural network; S3: adding the features extracted by each layer of the causal graph attention neural network, and calculating the independent support score of each node representation under different attention mechanisms; S4: concatenating the features of all nodes, inputting them into a fully-connected neural network, and finally outputting the fault detection result of the system. The method fuses known causal relationships and obtains a causal graph by a constraint-based method; it adaptively generates the embedded representation of each effect variable according to the different importance of its cause variables, takes the independent support score as a constraint to extract features with the causal separation property, and finally realizes system fault detection with improved detection performance.

Description

Complex electromechanical system fault detection method based on causal graph attention neural network

Technical Field

The invention relates to the field of fault prediction and health management of a complex electromechanical system, in particular to a fault detection method of the complex electromechanical system based on a causal graph attention neural network.

Background

As representative of modern advanced technology, various complex electromechanical systems are continuously developed. These systems are based on the comprehensive integration of various mechanical, electronic and hydraulic (pneumatic) subsystems, ultimately achieving complex system functions. The structural and functional complexity inside complex electromechanical systems is significantly increased compared to traditional mechanical or electronic systems: the coupling relationship of each module is more complicated, and the boundary between subsystems is more blurred. For these reasons, complex electromechanical systems are also more sensitive to operating conditions. Minor anomalies or faults can cause chain reactions through cascading and propagation, compromising the operation of the entire system. Therefore, how to timely and effectively detect the fault and discover the abnormal operation state is one of the keys for ensuring the healthy operation of the system and improving the safety and the usability of the system.

Current data-driven fault detection methods generally model the correlation between input variables and faults directly, and ignore the causal and spatial structural relationships among the variables. Graph neural networks (GNNs) have been very successful in processing spatially structured data. GNNs can mine information about nodes (feature variables) and their edges (relationships) by using the non-Euclidean features provided by graph structures. GNNs have achieved breakthroughs in image and video classification tasks, which has also stimulated their application in fault detection. However, although GNN-based methods improve fault detection performance to some extent, they have several limitations. First, current GNN-based fault detection methods aggregate all neighboring nodes with the same weight and ignore the different contributions of different nodes. Second, most current GNNs use graphs constructed from domain-specific knowledge, but for complex electromechanical systems with complicated failure mechanisms and numerous monitoring variables, the spatial structure is difficult to obtain. Third, GNN-based fault detection methods mostly assume only correlation among variables, which limits the performance and interpretability of fault detection. Finally, as the number of GNN layers increases, the features of the nodes tend to converge, and over-smoothing occurs.

Causal discovery can mine the causal relationships among things. Applied to fault detection, it can analyze the complex causal mechanisms among monitoring variables and reveal how faults occur and propagate, which helps improve the performance of a fault detection model. However, for a complex electromechanical system, it is generally difficult to fully mine the causal relationships among the variables by expert experience alone, while a causal graph constructed purely by a data-driven method tends to have an unstable graph structure and obvious errors in the result.

Disclosure of Invention

To overcome the technical defects of data-driven fault detection for complex electromechanical systems in the prior art, the invention provides a fault detection method for complex electromechanical systems based on a causal graph attention neural network. The method constructs a causal graph of the monitoring variables of the complex electromechanical system by combining known causal relationships with a constraint-based causal discovery method, which overcomes two problems: a complex system is difficult to analyze by expert experience alone, and constraint-based causal discovery results may contain obvious errors. Then, the proposed causal graph attention neural network uses a multi-head causal attention mechanism to adaptively calculate the weights of parent nodes and generate the embedded representations of child nodes. Further, an independent support score is calculated from the features extracted for each node and used as a constraint term of the loss function, so that node representations with the causal separation property are extracted. Finally, the representations of all nodes are mapped through a flatten layer and a fully-connected neural network, and the fault detection result of the target system is output.
The fault detection method provided by the invention mines the complex relationships among the high-dimensional monitoring variables of a complex electromechanical system from the perspective of causality, and overcomes the shortcomings of causal discovery methods that rely only on expert experience or only on data. With the causality-based multi-head attention mechanism, the weights of parent nodes (cause variables) are calculated adaptively and the embedded representations of child nodes (effect variables) are generated. Independent support scores are calculated for the features extracted from each node to constrain the causal separation property of the node representations. As a result, the fault detection performance for complex electromechanical systems is effectively improved.

Specifically, the invention provides a complex electromechanical system fault detection method based on a causal graph attention neural network, which comprises the following steps:

S1: combining known causal relationships with a constraint-based causal discovery method, taking the monitoring variable data of the complex electromechanical system as input, and constructing a causal graph of the monitoring variables of the system, which specifically comprises the following substeps:

S11: determining causal path constraints and causal direction constraints according to existing knowledge;

S12: generating a causal graph skeleton, and adding or deleting the corresponding edges according to the causal path constraints;

S13: performing data preprocessing to convert all data into numerical types, then using a constraint-based causal discovery algorithm to continue searching and constructing the causal graph, and adding the corresponding edges according to the causal direction constraints;

S14: verifying whether the causal discovery result satisfies the known relationship constraints; if so, outputting the resulting causal graph; if not, adjusting the parameter threshold of the PC algorithm and returning to step S12; the PC algorithm is a classic constraint-based causal discovery algorithm;

S2: extracting and learning the node representations of the causal graph by using the proposed causal graph attention neural network, which specifically comprises the following substeps:

S21: calculating attention coefficients based on the causal relationships. The input feature $h_i \in \mathbb{R}^{F}$ of each node is transformed to a higher dimension by the trainable parameter $W \in \mathbb{R}^{M \times F}$, an attention mechanism is applied to each causal node pair (parent-child node pair, or cause-effect variable pair), and the attention coefficient of the causal pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where $X_j \in X_{Pa(i)}$ denotes a parent node of the child node $X_i$, $\|$ denotes the feature concatenation operation, and LeakyReLU is a nonlinear activation function; for a node $X_i$ without a parent node, the attention coefficients are defined as $a_{ij} = 0$ ($j \neq i$) and $a_{ii} = 1$;

$$a_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(W_a\left[W h_i \,\|\, W h_j\right]\right)\right)}{\sum_{X_l \in X_{Pa(i)}} \exp\left(\mathrm{LeakyReLU}\left(W_a\left[W h_i \,\|\, W h_l\right]\right)\right)} \tag{1}$$

in the formula, $W_a$ is a trainable parameter; $h_i$ and $h_j$ respectively denote the input features of nodes $X_i$ and $X_j$, where the variable $X_j$ is a cause variable of $X_i$; $W_a\left[W h_i \,\|\, W h_j\right]$ denotes that the input features $h_i$ and $h_j$ of nodes $X_i$ and $X_j$ are transformed to $M$ dimensions by the trainable parameter $W$ and concatenated, and then transformed to 1 dimension by the trainable parameter $W_a$;

S22: generating node representations by using a multi-head causal attention mechanism. The representation of each node is a linear weighting of the features of all its parent nodes (cause variables), and a multi-head attention mechanism is adopted to fully extract the features, as shown in formula (2), where $h'_i$ denotes the newly generated feature of node $X_i$ and $h_j$ denotes the input feature of node $X_j$; $\sigma$ denotes the activation function; $a_{ij}^{k}$ and $W^{k}$ respectively denote the attention coefficient between the child node (effect variable) $X_i$ and one of its parent nodes (cause variable) $X_j$ in the $k$-th attention head, and the trainable transformation parameter of the $k$-th attention head; $K$ denotes the total number of attention heads; at this point, the new representation generated for each node has $M \times K$ dimensions;

$$h'_i = \mathop{\Big\Vert}_{k=1}^{K} \sigma\left(\sum_{X_j \in X_{Pa(i)}} a_{ij}^{k} W^{k} h_j\right) \tag{2}$$

S3: adding the features extracted by each causal graph attention neural network layer in step S2, and calculating, for the representation of each node, its independent support score (IOSS) under different attention mechanisms, as shown in formula (3), where $N$ denotes the number of all nodes, $\tilde{h}'_i$ denotes the representation of node $X_i$, and $q^{\beta}_{i}(\cdot)$ denotes the $\beta$ quantile taken over the index $i$:

$$\mathrm{IOSS} = q^{\beta''}_{s}\left(q^{\beta'}_{i}\left(\left\lVert U_s - \tilde{h}'_i \right\rVert^{2}\right)\right), \quad i = 1, \dots, N, \; s = 1, \dots, S \tag{3}$$

in the formula, $U_s$ denotes the $s$-th of $S$ random samples drawn from the theoretical joint distribution of the node features under independent support; it is an $M \times K$-dimensional vector, and the distribution is obtained from $\tilde{h}'^{\,m}_i$, which denotes the $m$-th dimension of the representation of node $X_i$ after max-min normalization; $\beta'$ and $\beta''$ denote specific values of $\beta$; $h'_i$ denotes the newly generated feature of node $X_i$ and has $M \times K$ dimensions;

S4: inputting the features extracted from all nodes into a flatten layer for concatenation, and inputting the concatenated features into a fully-connected neural network comprising a hidden layer and one output layer; the activation functions in the fully-connected neural network apply nonlinear processing to the features, and the output layer outputs the fault detection result of the complex electromechanical system.

Preferably, the causal path constraint in step S11 means determining from prior knowledge whether a direct causal relationship exists between variables $X_i$ and $X_j$, that is, constraining whether an edge exists between causal graph nodes $X_i$ and $X_j$; the causal direction constraint means determining from prior knowledge that variable $X_i$ is a cause of variable $X_j$, that is, constraining causal graph node $X_i$ to be an ancestor node of $X_j$;

preferably, the step S13 adopts a PC algorithm as a constraint-based cause and effect discovery algorithm;

preferably, in the model training process of step S4, the loss function comprises two parts: the cross entropy (CE) loss between the fault detection result output by the model and the real result, and the independent support score (IOSS) loss calculated from the node features extracted by the model. The cross entropy is calculated as shown in formula (4), and the total loss of model training is shown in formula (5). Using the independent support score as a constraint constrains the causal separation property of the features extracted for each node, which alleviates the over-smoothing problem and improves the feature extraction performance for each node. In the formulas, $n$ denotes the number of input data samples, $m$ denotes the number of categories of the system state, $y_{ij}$ and $p_{ij}$ respectively denote the real system state and the predicted system state, and $\alpha$ is the balance coefficient of the two losses.

$$L_{CE} = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{m} y_{ij}\log p_{ij} \tag{4}$$

$$L = L_{CE} + \alpha\,\mathrm{IOSS} \tag{5}$$

Preferably, the attention neural network based on the causal graph provided by the invention is optimized by using an Adam algorithm, and hyper-parameters such as the high-dimensional space dimension M in step S21, the number of heads K of the causal attention mechanism in step S22, the quantile β in step S3, the number of hidden layers in step S4, the number of neurons, and the like are determined by a grid search method.

Compared with the prior art, the invention has the following beneficial effects:

(1) The invention provides a fault detection method for complex electromechanical systems based on a causal graph attention neural network, which combines known causal relationships with a constraint-based causal discovery method to construct a causal graph reflecting the causal relationships among the monitoring variables of the complex electromechanical system; the features of parent nodes (cause variables) are adaptively aggregated by the causal graph attention neural network to generate the embedded representations of child nodes (effect variables); an independent support score is calculated from the features extracted for each node and used as a constraint term of the loss function to constrain the causal separability of the node representations; finally, the representations of all nodes are mapped through a flatten layer and a fully-connected neural network, and the fault detection result of the complex electromechanical system is output; the fault detection performance for complex electromechanical systems can thereby be effectively improved;

(2) Exploiting the property that child nodes in a causal graph are influenced by their parent nodes, the invention provides a causal graph attention neural network in which the weights of the parent nodes are calculated adaptively through an attention mechanism, so that the cause variables are aggregated according to their importance to generate the embedded representations of the effect variables, which improves the extraction of fault features of complex electromechanical systems;

(3) Considering that a graph neural network may suffer from node feature convergence and over-smoothing as the number of layers increases, the invention calculates an independent support score from the features of each node and uses it as a constraint term of the loss function, which encourages the extracted node features to have the causal separation property and helps improve the effectiveness of node feature extraction;

(4) The method constructs the causal graph of the monitoring variables of the complex electromechanical system from known causal relationships and monitoring data and, based on the causal information related to the occurrence and propagation of system faults, extracts features by combining the causal influence mechanism with the different importance of the cause variables, and finally realizes fault detection, thereby improving the fault detection accuracy and performance for complex electromechanical systems and providing considerable economic and social benefits.

Drawings

FIG. 1 is a schematic flow chart illustrating the steps of a method for detecting faults of a complex electromechanical system based on a causal graph attention neural network according to the present invention;

FIG. 2 is a block diagram of exemplary steps of a complex electromechanical system fault detection method based on a causal graph attention neural network proposed by the present invention;

FIG. 3 is a simplified block diagram of a high speed rail braking system according to an embodiment of the present invention;

FIG. 4 is a causal graph constructed in accordance with an embodiment of the present invention, incorporating known causal relationships and constraint-based causal discovery methods.

Detailed Description

Exemplary embodiments, features and aspects of the present invention will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.

Specifically, the invention provides a complex electromechanical system fault detection method based on a causal graph attention neural network, as shown in fig. 1, which comprises the following steps:

S1: combining known causal relationships with a constraint-based causal discovery method, taking the monitoring variable data of the complex electromechanical system as input, and constructing a causal graph of the system monitoring variables, which specifically comprises the following substeps:

S14: verifying whether the causal discovery result satisfies the known relationship constraints; if so, outputting the resulting causal graph; if not, adjusting the parameter threshold of the PC algorithm and returning to step S12;

S21: calculating attention coefficients based on the causal relationships. The input feature $h_i$ of each node is transformed to $M$ dimensions by the trainable parameter $W \in \mathbb{R}^{M \times F}$, where $M \geq 2$ and $F$ denotes the dimension of the original feature; an attention mechanism is applied to each causal node pair (parent-child node pair, or cause-effect variable pair), and the attention coefficient of the causal pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where $X_j \in X_{Pa(i)}$ denotes that node $X_j$ is a parent node of node $X_i$, that is, node $X_j$ is a cause variable of node $X_i$; $\|$ denotes the feature concatenation operation, and LeakyReLU is a nonlinear activation function; for a node without a parent node, the attention coefficients are defined as $a_{ij} = 0$ ($j \neq i$) and $a_{ii} = 1$.

$$a_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(W_a\left[W h_i \,\|\, W h_j\right]\right)\right)}{\sum_{X_l \in X_{Pa(i)}} \exp\left(\mathrm{LeakyReLU}\left(W_a\left[W h_i \,\|\, W h_l\right]\right)\right)} \tag{1}$$

In the formula, $W_a$ and $W$ are trainable parameters; $h_i$ and $h_j$ respectively denote the input features of nodes $X_i$ and $X_j$, where node $X_j$ is a cause variable of $X_i$; $W_a\left[W h_i \,\|\, W h_j\right]$ denotes that the input features $h_i$ and $h_j$ of nodes $X_i$ and $X_j$ are transformed to $M$ dimensions by the trainable parameter $W$ and concatenated, and then transformed to 1 dimension by the trainable parameter $W_a$;

S22: generating node representations by using a multi-head causal attention mechanism. The representation of each node is a linear weighting of the features of all its parent nodes (cause variables), and a multi-head attention mechanism is adopted to fully extract the features, as shown in formula (2), where $h'_i$ denotes the newly generated feature of node $X_i$ and $h_j$ denotes the input feature of node $X_j$; $\sigma$ denotes the activation function; $a_{ij}^{k}$ and $W^{k}$ respectively denote the attention coefficient between the child node (effect variable) $X_i$ and one of its parent nodes (cause variable) $X_j$ in the $k$-th attention head, and the trainable transformation parameter of the $k$-th attention head; $K$ denotes the total number of attention heads; at this point, the new representation generated for each node has $M \times K$ dimensions in total.

$$h'_i = \mathop{\Big\Vert}_{k=1}^{K} \sigma\left(\sum_{X_j \in X_{Pa(i)}} a_{ij}^{k} W^{k} h_j\right) \tag{2}$$

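S3: adding the features extracted by each causal graph attention neural network layer in step S2, and calculating, for the representation of each node, its independent support score (IOSS) under different attention mechanisms, as shown in formula (3), where $N$ denotes the number of all nodes, $\tilde{h}'_i$ denotes the representation of node $X_i$, and $q^{\beta}_{i}(\cdot)$ denotes the $\beta$ quantile taken over the index $i$:

$$\mathrm{IOSS} = q^{\beta''}_{s}\left(q^{\beta'}_{i}\left(\left\lVert U_s - \tilde{h}'_i \right\rVert^{2}\right)\right), \quad i = 1, \dots, N, \; s = 1, \dots, S \tag{3}$$

In the formula, $U_s$ denotes the $s$-th of $S$ random samples drawn from the theoretical joint distribution of the node features under independent support; the distribution is obtained from $\tilde{h}'^{\,m}_i$, which denotes the $m$-th dimension of the representation of node $X_i$ after max-min normalization; $\beta'$ and $\beta''$ denote specific values of $\beta$.

$h'_i$ denotes the newly generated feature of node $X_i$ and has $M \times K$ dimensions.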

A quantile is a numerical point that divides the probability distribution range of a random variable into several equal parts; commonly used quantiles include the median (the 2-quantile), quartiles and percentiles. Here, for each $U_s$, the squared distance between $U_s$ and the representation of each node is first calculated. Since there are $N$ nodes in total, a series of $N$ one-dimensional real values is obtained for each $U_s$, and the $\beta'$ quantile of these $N$ values is taken, so that each $U_s$ yields one quantile and all $U_s$ together yield $S$ one-dimensional real values. The $\beta''$ quantile of these $S$ values is then taken as the final result, which is a one-dimensional real number, namely the value of the IOSS, and characterizes the causal coupling loss of the dimensions of the node representations.
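The two-stage quantile procedure described above can be illustrated with a minimal NumPy sketch. It rests on assumptions that the source does not state: the samples $U_s$ are drawn uniformly from the unit hypercube (the product of the per-dimension ranges after max-min normalization), and the values of `beta1`, `beta2` and `n_samples` are hypothetical defaults. The function name `ioss` is also illustrative.

```python
import numpy as np

def ioss(features, n_samples=256, beta1=0.1, beta2=0.9, seed=0):
    """Sketch of the independent support score for (N, D) node representations.

    Assumption: U_s is sampled uniformly on [0, 1]^D after max-min normalization.
    beta1 / beta2 stand for beta' / beta''; their values here are hypothetical.
    """
    features = np.asarray(features, dtype=float)
    low = features.min(axis=0)
    span = features.max(axis=0) - low
    span = np.where(span > 0, span, 1.0)           # avoid division by zero
    normed = (features - low) / span                # max-min normalization per dimension
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 1.0, size=(n_samples, normed.shape[1]))   # samples U_s
    # Squared distance between every sample U_s and every node representation: (S, N)
    d2 = ((u[:, None, :] - normed[None, :, :]) ** 2).sum(axis=2)
    per_sample = np.quantile(d2, beta1, axis=1)     # beta' quantile over the N nodes
    return float(np.quantile(per_sample, beta2))    # beta'' quantile over the S samples

# Representations that cover the whole square versus ones confined to its diagonal.
axis = np.linspace(0.0, 1.0, 7)
grid = np.array([[a, b] for a in axis for b in axis])
diagonal = np.column_stack([np.linspace(0.0, 1.0, 49)] * 2)
score_grid, score_diagonal = ioss(grid), ioss(diagonal)
```

In this toy comparison the diagonal representations, whose two dimensions are fully coupled, receive a larger score than the grid that fills the product of the marginal ranges, which is the behaviour the constraint term is meant to penalize.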

Preferably, the causal path constraint in step S11 means determining from prior knowledge whether a direct causal relationship exists between variables $X_i$ and $X_j$, that is, constraining whether an edge exists between causal graph nodes $X_i$ and $X_j$; the causal direction constraint means determining from prior knowledge that variable $X_i$ is a cause of variable $X_j$, that is, constraining causal graph node $X_i$ to be an ancestor node of $X_j$;

preferably, in the model training process of step S4, the loss function comprises two parts: the cross entropy (CE) loss between the fault detection result output by the model and the real result, and the independent support score (IOSS) loss calculated from the node features extracted by the model. The cross entropy is calculated as shown in formula (4), and the total loss of model training is shown in formula (5). Using the independent support score as a constraint constrains the causal separation property of the features extracted for each node, which alleviates the over-smoothing problem and improves the feature extraction performance for each node. In the formulas, $n$ denotes the number of input data samples, $m$ denotes the number of categories of the system state, $y_{ij}$ and $p_{ij}$ respectively denote the real system state and the predicted system state, and $\alpha$ is the balance coefficient of the two losses.

$$L_{CE} = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{m} y_{ij}\log p_{ij} \tag{4}$$

$$L = L_{CE} + \alpha\,\mathrm{IOSS} \tag{5}$$

Preferably, the attention neural network based on the causal graph provided by the present invention is optimized by using Adam algorithm, and hyper-parameters such as the high-dimensional spatial dimension M in step S21, the number of heads K of the causal attention mechanism in step S22, the quantile β in step S3, the number of hidden layers in S4, and the number of neurons are determined by a grid search method.

The fault detection process of the invention is further described in detail below with reference to operation state monitoring data collected from a high-speed rail braking system. FIG. 3 is a simplified structure diagram of the high-speed rail braking system, which includes 39 monitoring variables (covering information such as brake valve state, line voltage and line current, denoted X1, X2, …, X39 respectively). As shown in FIG. 2, the complex electromechanical system fault detection method based on the causal graph attention neural network disclosed by the invention is implemented by the following specific steps:

S1: combining known causal relationships with a constraint-based causal discovery method, taking the data of the 39 monitoring variables as input, and constructing a causal graph of the monitoring variables of the high-speed rail braking system, which specifically comprises the following substeps:

S11: determining the causal path constraints and causal direction constraints of the monitoring variables in the high-speed rail braking system according to existing knowledge, as shown in Table 1;

S13: performing data preprocessing to convert all data into numerical types. Specifically, categorical variables are converted into numerical types by encoding methods such as label encoding and dummy variables, and the data are normalized by the max-min method. For example, the train operation mode is a categorical variable whose values indicate states rather than numbers; after label encoding, it is converted into numerical codes representing the corresponding states, such as 0, 1, 2. The constraint-based causal discovery algorithm, for which the PC algorithm is adopted, is then used to continue searching and constructing the causal graph, and the corresponding edges are added according to the causal direction constraints;

S14: verifying whether the causal discovery result satisfies the known relationship constraints; if so, outputting the resulting causal graph; if not, adjusting the parameter threshold of the PC algorithm and returning to step S12.
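The preprocessing in step S13 can be illustrated with a short sketch that label-encodes a categorical variable and then applies max-min normalization. The operation-mode names and the helper names `label_encode` and `min_max` are hypothetical; the source only states that label encoding and max-min normalization are used.

```python
import numpy as np

def label_encode(values):
    """Map each distinct category to an integer code (sorted order, an assumption)."""
    categories = sorted(set(values))
    mapping = {category: code for code, category in enumerate(categories)}
    codes = np.array([mapping[v] for v in values], dtype=float)
    return codes, mapping

def min_max(x):
    """Scale a numeric column to [0, 1]; a constant column maps to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)
    return (x - x.min()) / span

# Hypothetical train operation modes recorded at four time steps.
modes = ["traction", "coasting", "braking", "coasting"]
codes, mapping = label_encode(modes)   # braking -> 0, coasting -> 1, traction -> 2
scaled = min_max(codes)
```

Label encoding imposes an arbitrary order on the categories; dummy (one-hot) variables, which the text also mentions, avoid that at the cost of extra columns.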

Preferably, the causal path constraint in step S11 means determining from prior knowledge whether a direct causal relationship exists between variables $X_i$ and $X_j$, that is, constraining whether an edge exists between causal graph nodes $X_i$ and $X_j$; the causal direction constraint means determining from prior knowledge that variable $X_i$ is a cause of variable $X_j$, that is, constraining causal graph node $X_i$ to be an ancestor node of $X_j$; adding the known causal relationships suppresses errors produced by the data-driven causal discovery algorithm and improves the reliability of the result.
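As an illustration of the verification in step S14, the sketch below checks a discovered directed graph against causal path constraints (an edge must or must not exist between two nodes) and causal direction constraints (one node must be an ancestor of another). The function and argument names are hypothetical, and the edge list is a toy example rather than the graph of FIG. 4.

```python
from collections import defaultdict

def is_ancestor(edges, source, target):
    """Return True if a directed path leads from source to target."""
    children = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        for nxt in children[node]:
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def satisfies_constraints(edges, must_link, must_not_link, ancestor_of):
    """Check path constraints (edge presence/absence) and direction constraints."""
    adjacent = {frozenset(edge) for edge in edges}
    if any(frozenset(pair) not in adjacent for pair in must_link):
        return False
    if any(frozenset(pair) in adjacent for pair in must_not_link):
        return False
    return all(is_ancestor(edges, a, b) for a, b in ancestor_of)

# Toy discovered graph: X1 -> X2 -> X3.
edges = [("X1", "X2"), ("X2", "X3")]
ok = satisfies_constraints(edges, [("X1", "X2")], [("X1", "X3")], [("X1", "X3")])
wrong_direction = satisfies_constraints(edges, [], [], [("X3", "X1")])
```

When the check fails, the method adjusts the parameter threshold of the PC algorithm and returns to step S12, as described above.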

Preferably, the PC algorithm is adopted as the constraint-based cause and effect discovery algorithm in step S13, and finally, a cause and effect relationship graph of the monitoring variables of the high-speed rail brake system is obtained as shown in fig. 4.

TABLE 1 establishment of causal path constraints and causal direction constraints using known causal relationships

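S2: extracting and learning the node representations of the causal graph obtained in step S1 by using the proposed causal graph attention neural network, which specifically comprises the following substeps:

S21: calculating attention coefficients based on the causal relationships. The input feature $h_i$ of each node is transformed into a 2-dimensional vector by the trainable parameter $W \in \mathbb{R}^{2 \times 1}$, an attention mechanism is applied to each causal node pair (parent-child node pair, or cause-effect variable pair), and the attention coefficient of the causal pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where $X_j \in X_{Pa(i)}$ denotes a parent node of the child node $X_i$, $\|$ denotes the feature concatenation operation, and LeakyReLU is a nonlinear activation function calculated as shown in formula (2), in which $\lambda$ is a small positive slope coefficient; for the node without a parent node, namely X36, the attention coefficients are defined as $a_{36,j} = 0$ ($j \neq 36$) and $a_{36,36} = 1$.

$$a_{ij} = \frac{\exp\left(\mathrm{LeakyReLU}\left(W_a\left[W h_i \,\|\, W h_j\right]\right)\right)}{\sum_{X_l \in X_{Pa(i)}} \exp\left(\mathrm{LeakyReLU}\left(W_a\left[W h_i \,\|\, W h_l\right]\right)\right)} \tag{1}$$

$$\mathrm{LeakyReLU}(x) = \begin{cases} x, & x > 0 \\ \lambda x, & x \leq 0 \end{cases} \tag{2}$$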

S22: generating node representations by using a multi-head causal attention mechanism. The representation of each node is a linear weighting of the features of all its parent nodes (cause variables), and a multi-head attention mechanism is adopted to fully extract the features, as shown in formula (3), where $h'_i$ denotes the newly generated feature of node $X_i$ and $h_j$ denotes the input feature of node $X_j$; elu is adopted as the activation function and is calculated as shown in formula (4), in which $\gamma > 0$ is the scale coefficient of elu; $a_{ij}^{k}$ and $W^{k}$ respectively denote the attention coefficient between the child node (effect variable) $X_i$ and one of its parent nodes (cause variable) $X_j$ in the $k$-th attention head, and the trainable transformation parameter of the $k$-th attention head; $K = 8$ attention heads are adopted; the new representation generated for each node has $2 \times 8$ dimensions in total.

$$h'_i = \mathop{\Big\Vert}_{k=1}^{K} \mathrm{elu}\left(\sum_{X_j \in X_{Pa(i)}} a_{ij}^{k} W^{k} h_j\right) \tag{3}$$

$$\mathrm{elu}(x) = \begin{cases} x, & x > 0 \\ \gamma\left(e^{x} - 1\right), & x \leq 0 \end{cases} \tag{4}$$
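Steps S21 and S22 can be sketched in NumPy as follows, using the sizes of this embodiment (input dimension 1, M = 2, K = 8) on a toy four-node causal graph. Several details are assumptions not stated in the source: the normalization is taken to be a softmax over the parents of each node, the LeakyReLU slope is set to 0.2, elu uses unit scale, and all weights are random placeholders. The function names are illustrative.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    # slope=0.2 is an assumed value; the source does not give it
    return np.where(x > 0, x, slope * x)

def elu(x):
    # elu with unit scale (assumption)
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)

def causal_attention_head(h, parents, W, w_a):
    """One attention head: h is (N, F), W is (M, F), w_a is (2M,)."""
    z = h @ W.T                                  # transformed features, (N, M)
    n = h.shape[0]
    att = np.zeros((n, n))
    for i in range(n):
        pa = parents.get(i, [])
        if not pa:                               # node without parents: a_ii = 1
            att[i, i] = 1.0
            continue
        scores = np.array([float(leaky_relu(w_a @ np.concatenate([z[i], z[j]])))
                           for j in pa])
        weights = np.exp(scores - scores.max())  # softmax over the parents only
        att[i, pa] = weights / weights.sum()
    return att, elu(att @ z)                     # weighted sum of parent features

def multi_head_causal_attention(h, parents, Ws, w_as):
    """Concatenate the outputs of all heads: result is (N, M * K)."""
    heads = [causal_attention_head(h, parents, W, w_a)[1] for W, w_a in zip(Ws, w_as)]
    return np.concatenate(heads, axis=1)

rng = np.random.default_rng(0)
N, F, M, K = 4, 1, 2, 8
h = rng.normal(size=(N, F))
parents = {1: [0], 2: [0, 1], 3: [2]}            # toy causal graph; node 0 has no parent
Ws = [rng.normal(size=(M, F)) for _ in range(K)]
w_as = [rng.normal(size=2 * M) for _ in range(K)]
att, _ = causal_attention_head(h, parents, Ws[0], w_as[0])
out = multi_head_causal_attention(h, parents, Ws, w_as)
```

Restricting the softmax to the parent set is what distinguishes this sketch from a conventional graph attention layer, which would attend over all neighbours regardless of causal direction.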

S3: adding the features extracted by the 3 causal graph attention neural network layers, and calculating, for the representation of each node, its independent support score (IOSS) under different attention mechanisms, as shown in formula (5), where $N$ denotes the number of all nodes, $\tilde{h}'_i$ denotes the representation of node $X_i$, and $q^{\beta}_{i}(\cdot)$ denotes the $\beta$ quantile taken over the index $i$:

$$\mathrm{IOSS} = q^{\beta''}_{s}\left(q^{\beta'}_{i}\left(\left\lVert U_s - \tilde{h}'_i \right\rVert^{2}\right)\right), \quad i = 1, \dots, N, \; s = 1, \dots, S \tag{5}$$

In the formula, $U_s$ denotes the $s$-th of $S$ random samples drawn from the theoretical joint distribution of the node features under independent support; the theoretical joint distribution is obtained from $\tilde{h}'^{\,m}_i$, which denotes the $m$-th dimension of the representation of node $X_i$ after max-min normalization; $\beta'$ and $\beta''$ denote specific values of $\beta$.

$h'_i$ denotes the newly generated feature of node $X_i$ and has $2 \times 8$ dimensions.

S4: inputting the features extracted from all nodes into a flatten layer for concatenation, inputting the concatenated result into a two-layer fully-connected neural network whose layers comprise 8 and 2 neurons respectively, and finally outputting the fault detection result of the high-speed rail braking system through a sigmoid activation function, where 0 indicates normal and 1 indicates fault.
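Step S4 can be sketched as a flatten operation followed by two dense layers of 8 and 2 neurons. The hidden activation (ReLU here), the random weights, and the use of an argmax over the two sigmoid outputs to obtain the 0/1 label are assumptions made for illustration; the source specifies only the layer sizes and the sigmoid output.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def detect_fault(node_features, W1, b1, W2, b2):
    """Flatten (N, D) node features and map them to a 0/1 fault label."""
    x = node_features.reshape(-1)             # flatten layer: (N * D,)
    hidden = np.maximum(0.0, W1 @ x + b1)     # 8 neurons; ReLU is an assumption
    scores = sigmoid(W2 @ hidden + b2)        # 2 neurons with sigmoid output
    return int(np.argmax(scores)), scores     # 0 = normal, 1 = fault

rng = np.random.default_rng(0)
features = rng.normal(size=(39, 16))          # 39 nodes, 2 x 8 dimensions each
W1 = rng.normal(scale=0.1, size=(8, 39 * 16))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(2, 8))
b2 = np.zeros(2)
label, scores = detect_fault(features, W1, b1, W2, b2)
```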

Preferably, in the model training process of step S4, the loss function comprises two parts: the cross entropy (CE) loss between the fault detection result output by the model and the real result, and the independent support score (IOSS) loss calculated from the node features extracted by the model. The cross entropy is calculated as shown in formula (6), and the total loss of model training is shown in formula (7). Using the independent support score as a constraint improves the causal separation of the features extracted for each node, which alleviates the over-smoothing problem and improves the feature extraction performance for each node. In the formulas, $n$ denotes the number of input data samples, $m = 2$ denotes the number of categories of the system state, $y_{ij}$ and $p_{ij}$ respectively denote the real system state and the predicted system state, and $\alpha = 1 \times 10^{-4}$ is the balance coefficient of the two losses.

$$L_{CE} = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{m} y_{ij}\log p_{ij} \tag{6}$$

$$L = L_{CE} + \alpha\,\mathrm{IOSS} \tag{7}$$

where $L_{CE}$ is the cross entropy loss and $L$ is the loss function of model training.
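A small numerical sketch of the combined loss follows, assuming one-hot labels and a cross entropy averaged over the samples; the sample values and the IOSS value of 2.0 are made up for illustration, while α = 1e-4 is the value given in this embodiment.

```python
import numpy as np

def cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean cross entropy for one-hot labels y_true and predicted probabilities p_pred."""
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0)
    return float(-(np.asarray(y_true, dtype=float) * np.log(p)).sum(axis=1).mean())

def total_loss(y_true, p_pred, ioss_value, alpha=1e-4):
    """L = L_CE + alpha * IOSS."""
    return cross_entropy(y_true, p_pred) + alpha * ioss_value

y = np.array([[1, 0], [0, 1]])              # one normal sample, one fault sample
p = np.array([[0.5, 0.5], [0.5, 0.5]])      # uninformative predictions
loss_ce = cross_entropy(y, p)               # equals ln 2
loss = total_loss(y, p, ioss_value=2.0)     # hypothetical IOSS value
```

With such a small α the IOSS term acts as a mild regularizer next to the cross entropy.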

Preferably, the attention neural network based on the causal graph provided by the invention is optimized by using an Adam algorithm, and hyper-parameters such as the high-dimensional space dimension M in step S21, the head number K of the causal attention mechanism in step S22, the quantile β in step S3, the number of neurons in the hidden layer in step S4 and the like are determined by a grid search method.
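The grid search over hyper-parameters can be sketched as an exhaustive loop over candidate values. The candidate grid and the `evaluate` callable are hypothetical placeholders: in practice `evaluate` would train the model with the given settings and return a validation score, whereas the toy scoring function below only makes the example runnable.

```python
from itertools import product

def grid_search(grid, evaluate):
    """Try every combination in grid and return the best parameters and score."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical candidate values for M, K, the quantile beta and the hidden neurons.
grid = {"M": [2, 4, 8], "K": [4, 8], "beta": [0.1, 0.5, 0.9], "hidden": [8, 16]}

def toy_score(params):
    # Placeholder for "train and validate"; peaks at M=2, K=8, hidden=8.
    return -abs(params["M"] - 2) - abs(params["K"] - 8) - abs(params["hidden"] - 8)

best_params, best_score = grid_search(grid, toy_score)
```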

To further verify the effectiveness of the method and highlight its performance, the method is compared with a support vector machine (SVM), an artificial neural network (ANN), a convolutional neural network (CNN), a conventional graph convolutional neural network (GCN), and a conventional graph attention neural network (GAT). Two evaluation indexes commonly used for fault detection on imbalanced data, the F1 score and the G-mean score, are selected as the criteria for the performance comparison; they are calculated as shown in formulas (8) and (9):

wherein precision = TP/(TP + FP), recall = TPR = TP/(TP + FN), and TNR = TN/(TN + FP); TP, FP, TN, and FN respectively represent the number of samples correctly classified as faulty, the number of samples incorrectly classified as faulty, the number of samples correctly classified as normal, and the number of samples incorrectly classified as normal; the values of F1 and G-mean both lie in the interval [0, 1], and a higher value indicates better performance of the method.
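The two indexes can be computed directly from the confusion-matrix counts. The sketch below uses the standard definitions, F1 = 2 · precision · recall / (precision + recall) and G-mean = sqrt(TPR · TNR), which formulas (8) and (9) are assumed to follow; the function name, the zero-division handling, and the sample counts are hypothetical.

```python
import math

def f1_and_gmean(tp, fp, tn, fn):
    """Return (F1, G-mean) from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # TPR
    tnr = tn / (tn + fp) if tn + fp else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    gmean = math.sqrt(recall * tnr)
    return f1, gmean

# Imbalanced example: 50 faulty samples among 1000.
f1, gmean = f1_and_gmean(tp=40, fp=10, tn=940, fn=10)
```

Both indexes depend on the recall of the minority (fault) class, so a detector that labels everything as normal scores 0 on both, unlike plain accuracy.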

The comparison results are shown in Table 2 below and indicate that the proposed complex electromechanical system fault detection method achieves the highest F1 and G-mean scores among the compared methods. The constraint-based causal discovery method combined with existing knowledge can effectively extract the causal relationships among the high-dimensional monitoring variables of the complex electromechanical system, so that fault detection modeling can be carried out according to the causal effects among the system components; by combining the nature of causal relationships with the attention mechanism, the causal graph attention neural network adaptively aggregates the features of the parent nodes (cause variables) according to their different degrees of importance to generate the embedded features of the child node (effect variable), which improves node feature extraction; in addition, using the independent support score as a constraint on the extracted node representations encourages the features under different attention mechanisms to have causal separation properties, which improves the performance of the fault detection model.

Table 2. Fault detection evaluation indexes of the method of the invention and the comparison methods

Method	F1 score	G-mean score
The method of the invention	0.8473	0.9634
Conventional graph attention neural network	0.7974	0.9574
Conventional graph convolutional neural network	0.7951	0.8396
Support vector machine	0.6892	0.7451
Artificial neural network	0.5454	0.6849
Convolutional neural network	0.7470	0.7952
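The adaptive aggregation of parent-node features discussed above can be sketched as a multi-head causal attention layer. This is a sketch under stated assumptions rather than the patented implementation: the softmax normalization over each node's parents, the LeakyReLU slope of 0.2, the concatenation of the K heads, the example graph `parents`, and all function names are assumptions; M = 2 and K = 8 are illustrative values giving M × K = 16 dimensions per node.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def causal_attention_head(H, parents, W, w_a):
    """One attention head. H: (N, F), W: (M, F), w_a: (2M,)."""
    N = H.shape[0]
    Z = H @ W.T                      # transform features to M dimensions
    A = np.zeros((N, N))             # A[i, j] = attention of child i on parent j
    for i in range(N):
        pa = parents.get(i, [])
        if not pa:                   # no parents: a_ii = 1, a_ij = 0
            A[i, i] = 1.0
            continue
        # Concatenate child and parent features, project to a scalar.
        logits = np.array([w_a @ np.concatenate([Z[i], Z[j]]) for j in pa])
        e = leaky_relu(logits)
        e = np.exp(e - e.max())      # softmax over the parents (assumed)
        A[i, pa] = e / e.sum()
    return A @ Z, A                  # weighted sum of parent features

def multi_head(H, parents, Ws, w_as):
    """Concatenate K heads into an (N, M*K) representation."""
    outs = [causal_attention_head(H, parents, W, w_a)[0]
            for W, w_a in zip(Ws, w_as)]
    return np.concatenate(outs, axis=1)

rng = np.random.default_rng(0)
N, F, M, K = 4, 3, 2, 8
H = rng.normal(size=(N, F))
parents = {1: [0], 2: [0, 1], 3: [2]}   # hypothetical causal graph
Ws = [rng.normal(size=(M, F)) for _ in range(K)]
w_as = [rng.normal(size=2 * M) for _ in range(K)]
out = multi_head(H, parents, Ws, w_as)
_, A = causal_attention_head(H, parents, Ws[0], w_as[0])
```

Unlike a conventional graph attention layer, attention here is restricted to directed parent-to-child edges, so a node's representation depends only on its cause variables.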

Finally, it should be noted that: the above-mentioned embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A complex electromechanical system fault detection method based on a causal graph attention neural network, characterized by comprising the following steps:

S14: verifying whether the causal discovery result satisfies the known relationship constraints; if so, outputting the resulting causal graph; if not, adjusting the parameter threshold of the constraint-based causal discovery algorithm and returning to step S12;

S2: extracting and learning the node representations of the causal graph by using the causal graph attention neural network, which specifically comprises the following sub-steps:

S21: calculating the causal attention coefficients: the input feature of each node is transformed to M dimensions by a trainable parameter W ∈ R^(M×F), where M ≥ 2 is a positive integer and F represents the dimension of the original feature; a causality-based attention mechanism is applied to the cause-effect node pairs under each causal mechanism, and the attention coefficient between each causal pair is obtained through a nonlinear activation function and normalization, as shown in formula (1), where X_j ∈ X_Pa(i) indicates that node X_j is a parent node of node X_i, i.e. node X_j is a cause variable of node X_i; || represents the feature concatenation operation, and LeakyReLU is the nonlinear activation function; for a node without parent nodes, the attention coefficients are defined as a_ij = 0 (j ≠ i) and a_ii = 1;

In formula (1), W_a is a trainable parameter; the concatenated term indicates that the input features of nodes X_i and X_j are each transformed to M dimensions by the trainable parameter W, concatenated, and then transformed to 1 dimension by the trainable parameter W_a;

S22: generating the node representations using a multi-head causal attention mechanism: the representation of each node is expressed as a linear weighting of the features of all its parent nodes, i.e. its cause variables, and a multi-head attention mechanism is adopted to fully extract the features, as shown in formula (2), where, for the k-th attention head, the coefficient term is the attention coefficient between child node X_i and one of its parent nodes X_j, and W^k is the trainable transformation parameter of that head; K represents the total number of attention heads; at this point, the new representation generated for each node comprises M × K dimensions;

S3: summing the node representations extracted in step S2 by each causal graph attention neural network layer, and calculating, for each node representation, the independent support score IOSS under the different attention mechanisms, as shown in formula (3), wherein N represents the number of all nodes,

represents the representation of node X_i,

represents the value corresponding to the β quantile for variable i in the calculation:

Joint distribution ofIs obtained in which

2. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: the causal path constraint in step S11 refers to determining, from existing knowledge, whether a direct causal relationship exists between variables X_i and X_j, i.e. constraining whether an edge exists between causal graph nodes X_i and X_j; the causal direction constraint refers to determining, from existing knowledge, that variable X_i is a cause of variable X_j, i.e. constraining causal graph node X_i to be an ancestor node of X_j.

3. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: the constraint-based causal discovery algorithm in step S13 is specifically the PC algorithm.

4. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: in the model training process of step S4, the loss function of the model comprises two parts, namely the cross-entropy (CE) loss between the fault detection result output by the model and the real result, and the independent support score (IOSS) loss calculated from the node features extracted by the model; the independent support score is shown in formula (3), the cross entropy is calculated as shown in formula (4), and the total loss of model training is shown in formula (5); using the independent support score as a constraint improves the causal separability of the features extracted for each node, which alleviates the over-smoothing problem and improves the node feature extraction performance; y_ij and p_ij respectively represent the true system state and the predicted system state, and α is the balance coefficient of the two losses;

L = L_CE + α·IOSS (5)

5. The complex electromechanical system fault detection method based on the causal graph attention neural network of claim 1, wherein: the causal graph attention neural network is optimized using the Adam algorithm, and hyper-parameters such as the dimension M in step S21, the number of heads K of the causal attention mechanism in step S22, the quantile β in step S3, and the number of hidden layers and the number of neurons in step S4 are determined by grid search.