CN115659174A

CN115659174A - Multi-sensor fault diagnosis method, medium and equipment based on graph regularization CNN-BiLSTM

Info

Publication number: CN115659174A
Application number: CN202211252600.8A
Authority: CN
Inventors: 李鹏; 秦泰春; 王成城; 施建明; 党炜; 易难
Original assignee: Technology and Engineering Center for Space Utilization of CAS; Beijing Institute of Spacecraft Environment Engineering
Current assignee: Technology and Engineering Center for Space Utilization of CAS; Beijing Institute of Spacecraft Environment Engineering
Priority date: 2022-10-13
Filing date: 2022-10-13
Publication date: 2023-01-31

Abstract

The invention discloses a multi-sensor fault diagnosis method, medium and equipment based on graph regularization CNN-BiLSTM, and relates to the field of computer fault diagnosis systems based on specific computational models. The method comprises the following steps: collecting condition monitoring data from multiple sensors, and preprocessing the collected data to obtain a training set, a validation set and a test set; establishing a CNN-BiLSTM network, adding a graph regularization term to the fully connected layer of the CNN-BiLSTM network closest to the classifier, and thereby completing construction of a GR-CNN-BiLSTM model; training the GR-CNN-BiLSTM model with the training set, evaluating it with the validation set, and acquiring the network parameters at which its performance is optimal; and inputting the test set into the best-performing GR-CNN-BiLSTM model for fault diagnosis to obtain a fault diagnosis result. The invention improves training efficiency and diagnosis accuracy, and overcomes the low training efficiency of existing deep graph regularization fault diagnosis methods.

Description

Multi-sensor fault diagnosis method, medium and equipment based on graph regularization CNN-BiLSTM

Technical Field

The invention relates to the field of computer fault diagnosis systems based on specific computational models, and in particular to a multi-sensor fault diagnosis method, medium and equipment based on graph regularization CNN-BiLSTM.

Background

Fault diagnosis is a key link in Prognostics and Health Management (PHM). It realizes fault detection, isolation and fault type identification through signal detection technology and data analysis, and provides corresponding solutions and operation and maintenance strategies for system maintenance.

Fault diagnosis mainly comprises two parts: feature extraction and classification. Traditional fault diagnosis methods based on time-frequency signal processing rely on manual experience to extract features, and are limited when applied to today's condition monitoring big data, which is huge in volume, multi-source and heterogeneous, rapidly generated, and sparse in value. By virtue of its strong capability for complex recognition tasks and feature extraction, deep learning has been widely researched and applied in equipment fault diagnosis in fields such as mechanical manufacturing, aerospace, and energy and chemical engineering. The convolutional neural network (CNN) in deep learning can effectively avoid complex feature extraction and data reconstruction processes; the bidirectional long short-term memory network (BiLSTM) processes the network input in both the forward and backward directions simultaneously, thereby improving the prediction accuracy of the model. Deep learning based on CNN and BiLSTM network models is therefore used for fault diagnosis, hereinafter referred to as the existing deep-learning-based fault diagnosis method.

The existing deep-learning-based fault diagnosis method rests on the assumption that samples are independent, and ignores the correlation information among samples. In fact, in a particular feature space, samples of the same class are similar and lie closer together, while samples of different classes lie farther apart. Therefore, in order to make full use of the geometric structure information among samples to improve the performance of the deep learning model, knowledge from manifold learning and spectral graph theory is introduced into traditional regularization, forming the existing deep graph regularization fault diagnosis method. This method makes the data retain, in the new projection space, the local geometric structure it had in the original feature space; that is, data points that are similar in the intrinsic geometry of the distribution remain similar after being embedded or projected into the new space. However, the neighbor graph of the existing deep graph regularization fault diagnosis method is constructed in the original high-dimensional space, which leads to a large amount of computation, high susceptibility to interference, and low training efficiency.

Disclosure of Invention

The invention aims to provide a multi-sensor fault diagnosis method, medium and equipment based on graph regularization CNN-BiLSTM, so as to solve the problem of low training efficiency of the existing deep graph regularization fault diagnosis method.

In order to achieve the above object, the multi-sensor fault diagnosis method based on graph regularization CNN-BiLSTM of the present invention includes:

S1. collecting condition monitoring data from a plurality of sensors, and preprocessing the collected data to obtain a training set, a validation set and a test set;

S2. establishing a CNN-BiLSTM network, and adding a graph regularization term to the fully connected layer of the CNN-BiLSTM network that is closest to the classifier, thereby completing the construction of a GR-CNN-BiLSTM model;

S3. training the GR-CNN-BiLSTM model with data in the training set, evaluating the GR-CNN-BiLSTM model with data in the validation set, and acquiring the network parameters at which the performance of the GR-CNN-BiLSTM model is optimal;

S4. inputting the data of the test set into the best-performing GR-CNN-BiLSTM model for fault diagnosis to obtain a fault diagnosis result.

Preferably, in step S1, the collected data is preprocessed by z-score standardization, min-max standardization, or normalization.
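As an illustration of one of the preprocessing options named above, the following minimal Python sketch rescales one sensor channel to the range [0, 1] with min-max scaling. The function name `min_max_scale` and the handling of a constant channel are assumptions made for this example, not details given in the text.

```python
def min_max_scale(column):
    """Rescale one sensor channel to [0, 1] (min-max scaling)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant channel carries no information, map it to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


scaled = min_max_scale([10.0, 15.0, 20.0])
```

In practice the minimum and maximum would be taken from the training set only and reused for the validation and test sets.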

Preferably, step S2, establishing a CNN-BiLSTM network, specifically:

A1. inputting the training set into a CNN for network training, extracting deep feature data through the convolutional layers, and processing it through a max-pooling layer to obtain dimension-reduced deep feature data;

A2. inputting the dimension-reduced deep feature data into a BiLSTM for network training, and concatenating and fusing the output data of the BiLSTM to obtain and learn fused fault feature data;

A3. inputting the fused fault feature data into a fully connected layer module whose output layer is provided with a softmax classifier, classifying and outputting the fault state, updating the network parameters, and returning to A1 to train on the next batch of data; one iteration is complete when all data in the training set has participated in training;

A4. completing the construction of the CNN-BiLSTM network model once the preset number of iterations is reached;

wherein the cross-entropy loss function $L_c$ used by the softmax classifier is:

in formula (1), $r_i$ is the true label of data sample i in the training set; $y_i$ is the predicted label of data sample i in the training set; and N is the total number of data samples in the training set.
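Formula (1) itself is not reproduced in this text. The sketch below assumes the common mean cross-entropy form, $L_c = -\frac{1}{N}\sum_{i=1}^{N} r_i \log y_i$, with one-hot true labels $r_i$ and softmax probability vectors $y_i$. The averaging over N and the clipping constant `eps` are assumptions of this example.

```python
import math


def cross_entropy(r, y, eps=1e-12):
    """Mean cross-entropy over N samples.

    r: list of one-hot true-label vectors; y: list of predicted probability
    vectors. Assumed form: L_c = -(1/N) * sum_i sum_k r[i][k] * log(y[i][k]).
    """
    n = len(r)
    total = 0.0
    for r_i, y_i in zip(r, y):
        # eps guards against log(0) for zero predicted probabilities.
        total += sum(r_ik * math.log(max(y_ik, eps)) for r_ik, y_ik in zip(r_i, y_i))
    return -total / n


r = [[1, 0, 0], [0, 1, 0]]
y = [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1]]
loss = cross_entropy(r, y)
```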

Preferably, a fully connected layer module is arranged in the CNN-BiLSTM network, and the fully connected layer module comprises a first fully connected layer FC1, a second fully connected layer FC2 and an output layer arranged in sequence;

the CNN-BiLSTM maps the feature information obtained in training or testing to the output layer through the first fully connected layer FC1 and the second fully connected layer FC2 in sequence;

each layer in the fully connected layer module is computed as follows:

$$o^{a} = g(W_{a} h^{a} + b_{a}) \tag{2}$$

in formula (2), $o^{a}$ is the output of the a-th fully connected layer; $h^{a}$ is the input of the a-th fully connected layer; $b_{a}$ is the bias of the a-th fully connected layer; $W_{a}$ is the weight matrix of the a-th fully connected layer; and $g(\cdot)$ is the ReLU activation function.
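Formula (2) can be illustrated with a small pure-Python sketch of one fully connected layer. The function names `relu` and `dense` and the toy weights are chosen for this example only.

```python
def relu(v):
    """Element-wise ReLU activation g(z) = max(0, z)."""
    return [max(0.0, x) for x in v]


def dense(W, h, b, activation=relu):
    """Compute o = g(W h + b) for one fully connected layer.

    W: weight matrix as a list of rows, h: input vector, b: bias vector.
    """
    z = [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i
         for row, b_i in zip(W, b)]
    return activation(z)


# Toy layer with 2 inputs and 2 outputs (hypothetical values).
out = dense(W=[[1, -1], [0.5, 0.5]], h=[2, 3], b=[0, -1])
```

Stacking two such calls, with 64 and then 32 output units, would correspond to FC1 followed by FC2 in the embodiment.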

Preferably, in step S2, adding the graph regularization term to the fully connected layer of the CNN-BiLSTM network that is closest to the classifier specifically comprises:

constructing a neighbor graph on the basis of the training set data, specifically: identifying the k nearest neighbors of each data sample i, and connecting data sample i to each of its k nearest neighbors;

based on the neighbor graph, adding a neighbor graph regularization term to the feature space of the fully connected layer closest to the classifier in the CNN-BiLSTM network, wherein the neighbor graph regularization term $L_g$ is:

in formula (3), i denotes any data sample in the training set, j denotes a data sample adjacent to data sample i in the training set, N is the total number of data samples in the training set, $H_{ij}$ is the connection weight of data sample i and data sample j in the neighbor graph, $s_i$ is the feature vector corresponding to data sample i, and $s_j$ is the feature vector corresponding to data sample j; $H_{ij}$ is calculated as follows:

in formula (4), σ denotes the average distance between all neighboring points in the neighbor graph, and the weight between non-neighboring points is 0.
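Formulas (3) and (4) are not reproduced in this text, so the following Python sketch assumes common forms that match the symbol definitions: a Gaussian kernel $H_{ij} = \exp(-\lVert s_i - s_j \rVert^2 / \sigma^2)$ on k-nearest-neighbor edges, and $L_g = \frac{1}{2N}\sum_{i,j} H_{ij}\lVert s_i - s_j \rVert^2$. The exact kernel bandwidth, the 1/(2N) normalization and the function names are assumptions of this example.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_graph(S, k):
    """Undirected kNN graph over feature vectors S with Gaussian-kernel weights."""
    n = len(S)
    edges = set()
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: sq_dist(S[i], S[j]))
        for j in order[:k]:
            edges.add((min(i, j), max(i, j)))  # store each undirected edge once
    # sigma: mean Euclidean distance over all connected (neighboring) pairs
    sigma = sum(math.sqrt(sq_dist(S[i], S[j])) for i, j in edges) / len(edges)
    H = [[0.0] * n for _ in range(n)]  # weight between non-neighbors stays 0
    for i, j in edges:
        H[i][j] = H[j][i] = math.exp(-sq_dist(S[i], S[j]) / sigma ** 2)
    return H


def graph_reg(S, H):
    """Assumed regularizer: (1 / 2N) * sum_ij H_ij * ||s_i - s_j||^2."""
    n = len(S)
    return sum(H[i][j] * sq_dist(S[i], S[j])
               for i in range(n) for j in range(n)) / (2 * n)


# Four FC2-like feature vectors forming two tight pairs (toy data).
S = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
H = knn_graph(S, k=1)
l_g = graph_reg(S, H)
```

Because the graph is built on low-dimensional features rather than raw inputs, each distance computation is cheap, which is the efficiency argument made in the text.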

Preferably, in step S3, training the GR-CNN-BiLSTM model with data in the training set specifically comprises:

based on the training set, obtaining a feature-engineered training sample set X and the corresponding label set y, and inputting them into the GR-CNN-BiLSTM model to obtain the network parameters of the GR-CNN-BiLSTM model;

setting the hyperparameters of the GR-CNN-BiLSTM model, the hyperparameters including: learning rate, maximum number of iterations, introduction time of the graph regularization term, number of data sample batches, graph regularization term coefficient, and number of neighbor points;

initializing the network parameters of the GR-CNN-BiLSTM model;

inputting the b-th batch of data samples of the training sample set X (b ≥ 1) into the GR-CNN-BiLSTM model to obtain prediction labels, then computing the objective function of the pre-training stage or the fine-tuning stage, and updating the network weights and biases with an optimizer according to the result, thereby completing the training of the GR-CNN-BiLSTM model;

wherein, when the iteration number satisfies $e < e_g$, i.e. in the pre-training stage, the objective function L is:

$$L = L_c \tag{5}$$

when $e_g < e < e_{max}$, i.e. in the fine-tuning stage, the objective function L is:

$$L = L_c + \gamma L_g \tag{6}$$

in formula (5) and formula (6), L is the objective function of the GR-CNN-BiLSTM model and represents the difference between the expected output and the actual output; $L_c$ is the cross-entropy loss function; $L_g$ is the neighbor graph regularization term; γ is the weight coefficient of the graph regularization term; e is the iteration number; $e_{max}$ is the maximum number of iterations; and $e_g$ is the iteration at which the graph regularization term is introduced.
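The switch between formulas (5) and (6) can be sketched as a small Python function. The text leaves the case $e = e_g$ unspecified; this example assumes it belongs to the fine-tuning stage, and the function name `objective` is hypothetical.

```python
def objective(l_c, l_g, epoch, e_g, gamma):
    """Two-stage objective.

    Pre-training  (epoch <  e_g): L = L_c
    Fine-tuning   (epoch >= e_g): L = L_c + gamma * L_g
    Treating epoch == e_g as fine-tuning is an assumption of this sketch.
    """
    if epoch < e_g:
        return l_c
    return l_c + gamma * l_g


early = objective(l_c=0.5, l_g=2.0, epoch=3, e_g=10, gamma=0.1)
late = objective(l_c=0.5, l_g=2.0, epoch=10, e_g=10, gamma=0.1)
```

During pre-training the neighbor graph and $L_g$ need not be computed at all, which is where the saving in training time comes from.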

More preferably, the optimizer is the Adam algorithm.

Preferably, when the GR-CNN-BiLSTM model is evaluated using data in the validation set, its performance is considered optimal when the evaluation parameter accuracy and/or macro F1-score is highest.
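For reference, the two evaluation parameters can be computed as in the following Python sketch. Macro F1 is taken here in its usual sense, the unweighted mean of the per-class F1 scores; the function names and the toy labels are chosen for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```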

The invention has the beneficial effects that:

By constructing the graph regularization term on low-dimensional features, the multi-sensor fault diagnosis method based on graph regularization CNN-BiLSTM enables the network to learn geometric relation features with better clustering and higher discriminability, improves training efficiency and diagnosis accuracy, and overcomes the low training efficiency of the existing deep graph regularization fault diagnosis method.

Drawings

FIG. 1 is a schematic flow diagram of the multi-sensor fault diagnosis method based on graph regularization CNN-BiLSTM;

FIG. 2 is a schematic diagram of the CNN-BiLSTM network architecture;

FIG. 3 is a schematic diagram of the fault diagnosis process based on the GR-CNN-BiLSTM model;

FIG. 4 is a graph of GR-CNN-BiLSTM training accuracy and loss versus epoch;

FIG. 5 is a graph of test set classification accuracy and training time for different E₀ values;

FIG. 6 is a graphical illustration of test set classification accuracy for different values of k;

FIG. 7 is a graphical illustration of test set classification accuracy for different gamma values;

FIG. 8 shows the confusion matrices of the test results of different model algorithms, wherein (a) is the confusion matrix of the KNN test results, (b) that of the GB Bayes test results, (c) that of the MLP algorithm test results, (d) that of the CNN algorithm test results, (e) that of the CNN-BiLSTM network structure test results, and (f) that of the GR-CNN-BiLSTM model algorithm test results;

FIG. 9 shows two-dimensional visualizations of the features of the fully connected layer FC2 for different deep learning models, wherein (a) is the visualization for the CNN, (b) for the LSTM, (c) for the CNN-BiLSTM network structure, and (d) for the GR-CNN-BiLSTM model;

FIG. 10 is a schematic diagram of the LSTM cell structure.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.

The background on CNN and BiLSTM is briefly introduced as follows:

CNN-BiLSTM combines the strength of the CNN in extracting spatial feature information with the strength of the BiLSTM in extracting temporal feature information. The CNN can extract deep feature information from multi-sensor data and reduce the data dimensionality while retaining the main feature information. The BiLSTM extracts, from both directions, the information in time-series data that the CNN is likely to miss. It processes the network input in the forward and backward directions simultaneously and captures both past and future information, which strengthens the model's memory of the beginning and end of the original input.

(I) Convolutional neural network CNN

CNN is a feedforward neural network with a deep structure that includes convolution computations, and it can extract low-level local features and combine them into more abstract high-level features. Its advantages are its weight-sharing structure and translation invariance. For a CNN with L layers, the convolutional feature map of the l-th layer can be expressed as:

where the formula gives the output of the e-th feature of the l-th convolutional layer; q is the number of current output features; the weight matrix and the bias are those of the l-th convolutional layer; m is the filter index; and M is the number of convolutional-layer outputs. The activation function is, for example, the rectified linear unit (ReLU), which yields nonlinear features, enhances the feature expression capability of the model, and accelerates the convergence of the CNN; the result of the activation is the output.

The pooling function replaces the output of the network at a location with an overall statistic of the neighboring outputs. A pooling layer is introduced between successive convolutional layers and, through nonlinear down-sampling, reduces the post-convolution feature space and the number of network parameters while preserving spatial invariance. To achieve rapid convergence and good generalization, the graph-regularization-based CNN-BiLSTM multi-sensor fault diagnosis method adopts a max-pooling layer, which selects the maximum value over all neurons in a region:

where the formula gives the output of the max-pooling layer; T and R are the pooling stride and the pooling kernel size, respectively; and μ denotes the pooling window.
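A minimal Python sketch of one-dimensional max pooling with a given kernel size and stride follows. The function name `max_pool1d` is chosen for this example, and windows that would run past the end of the input are dropped, which is an assumption.

```python
def max_pool1d(x, kernel_size, stride):
    """Keep the maximum of each pooling window of length kernel_size."""
    return [max(x[i:i + kernel_size])
            for i in range(0, len(x) - kernel_size + 1, stride)]


pooled = max_pool1d([1, 3, 2, 5, 4, 0], kernel_size=2, stride=2)
```

With kernel size 2 and stride 2, the setting reported later in the experiments, the sequence length is halved.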

(II) Bidirectional long short-term memory neural network BiLSTM

The long short-term memory network (LSTM) is a variant of the recurrent neural network (RNN). By keeping long-term memory in the cell state, it overcomes the vanishing and exploding gradient problems that arise when training on long sequences. The LSTM describes dynamic characteristics through recurrence over the time series and establishes a long-term dependency model. The LSTM cell is mainly controlled by a forget gate, an input gate and an output gate; its structure is shown in FIG. 10. The biggest difference between LSTM and RNN is that the input gate can briefly store the relevant information.

In order to obtain more information before and after each time step in a given sequence, a BiLSTM with independent forward and backward paths is constructed, so that past and future information is captured simultaneously at a given time step, which strengthens the model's memory of the beginning and end of the original input. The forward LSTM can discover the pattern of change, and the backward LSTM can reduce the influence of noise and smooth the prediction result.

At time step t, $x_t$ and $h_t$ are the input state and the hidden state, and $f_t$, $i_t$, $o_t$ and $c_t$ are the forget gate, the input gate, the output gate and the memory cell, respectively. The forget gate determines how much information passes through, the input gate determines whether new information is memorized by the cell, and the final output of the LSTM unit is determined jointly by the output gate and the memory cell. The forward and backward paths work on the same principle; this application takes only the backward path as an example, and the update equations are as follows:

In the formula, σ denotes the sigmoid function, and W, V and b are, respectively, the recurrent weights, input weights and biases of the gates in the cell; they are the model parameters to be updated.

The forget gate and the input gate determine how the cell state $c_{t+1}$ and the candidate cell state are fused into the current cell state $c_t$. The output gate controls the conversion of the current cell state $c_t$ into the hidden state $h_t$.

In the formula, tanh is the hyperbolic tangent function.

From the input sequence $(x_1, \ldots, x_T)$, a forward hidden state sequence can be obtained.

Likewise, by processing the input time series $(x_1, \ldots, x_T)$ in reverse order, a backward hidden state sequence can be generated.

After the forward and backward hidden states are concatenated, the final output sequence $y_t$ of the bidirectional LSTM can be expressed as

U denotes transpose.
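The gate equations are not reproduced in this text, so the following Python sketch uses the standard LSTM update (sigmoid gates, tanh candidate state) with scalar inputs and states to keep it short. Treating W as the recurrent weight and V as the input weight is an assumption, as are all function names and parameter values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell.

    p[gate] = (W, V, b): W weights the previous hidden state, V weights the
    input, b is the bias (this role assignment is an assumption).
    """
    def pre(gate):
        W, V, b = p[gate]
        return W * h_prev + V * x + b

    f = sigmoid(pre("f"))          # forget gate
    i = sigmoid(pre("i"))          # input gate
    o = sigmoid(pre("o"))          # output gate
    c_cand = math.tanh(pre("c"))   # candidate cell state
    c = f * c_prev + i * c_cand    # fuse previous state and candidate
    h = o * math.tanh(c)           # hidden state
    return h, c


def run_lstm(xs, p):
    """Run the cell over a sequence from zero initial states."""
    h, c, hs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        hs.append(h)
    return hs


def bilstm(xs, p_fwd, p_bwd):
    """Pair forward and backward hidden states at each time step."""
    fwd = run_lstm(xs, p_fwd)
    bwd = run_lstm(xs[::-1], p_bwd)[::-1]  # re-align to original time order
    return [(hf, hb) for hf, hb in zip(fwd, bwd)]


params = {g: (0.5, 1.0, 0.0) for g in ("f", "i", "o", "c")}  # hypothetical weights
outputs = bilstm([0.1, -0.2, 0.3], params, params)
```

In the backward path the state carried into step t comes from step t+1, which is why the backward hidden state at the last time step depends only on the last input.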

Examples

In this embodiment, as shown in FIG. 1 and FIG. 3, the multi-sensor fault diagnosis method based on graph regularization CNN-BiLSTM includes:

S1. collecting condition monitoring data from a plurality of sensors, and preprocessing the collected data to obtain a training set and a test set;

S2. establishing a CNN-BiLSTM network, and adding a graph regularization term to the fully connected layer of the CNN-BiLSTM network that is closest to the classifier, thereby completing the construction of the GR-CNN-BiLSTM model;

S3. evaluating the GR-CNN-BiLSTM model with data in the test set, and acquiring the parameters at which the performance of the GR-CNN-BiLSTM model is optimal;

S4. inputting the data of the test set into the GR-CNN-BiLSTM model trained in S3 for fault diagnosis to obtain a fault diagnosis result.

More detailed explanation:

(I) Construction of the CNN-BiLSTM network structure

The main challenges in classifying different fault classes are nonlinearity and the data uncertainty associated with system variation, which manifest as correlations between different features and as time variability of the same feature in the time domain. Therefore, this embodiment constructs a CNN-BiLSTM network structure that nonlinearly maps high-dimensional data to a low-dimensional space; it comprises multiple CNN layers for extracting inter-feature correlations and a BiLSTM for extracting temporal correlations. The CNN-BiLSTM network structure is shown in FIG. 2.

Specifically: first, a multi-scale one-dimensional CNN comprising an input layer, an output layer and a plurality of hidden layers is constructed. The data in the training set is input into the CNN convolutional layers for adaptive extraction of deep feature data, and is then processed by a max-pooling layer to obtain dimension-reduced deep feature data, which reduces the data dimensionality while retaining the main feature information.

Then, the dimension-reduced deep feature data is used as the input of the BiLSTM layer to train the network model and learn feature information; specifically, the output data of the BiLSTM is concatenated and fused to obtain and learn fused fault feature data. The hidden-state features are mapped to the output layer by the fully connected layers FC1 and FC2 in the fully connected layer module of the CNN-BiLSTM network structure, where each layer in the module is computed as:

$$o^{a} = g(W_{a} h^{a} + b_{a}) \tag{6}$$

In formula (6), $o^{a}$ is the output of the a-th fully connected layer; $h^{a}$ is the input of the a-th fully connected layer; $b_{a}$ is the bias of the a-th fully connected layer; $W_{a}$ is the weight matrix of the a-th fully connected layer; and $g(\cdot)$ is the ReLU activation function.

Finally, the fused fault feature data is input into the fully connected layer module, whose output layer is provided with a softmax classifier that classifies and outputs the fault state. The network parameters are updated by the optimizer, and training proceeds to the next batch of data; one iteration is complete when all data in the training set has participated in training, and the construction of the CNN-BiLSTM network model is complete once the preset number of iterations is reached.

The softmax classifier is used to learn and diagnose the fault state classes, and the cross-entropy loss function is selected as its objective function:

In formula (7), $r_i$ is the true label of data sample i in the training set; $y_i$ is the predicted label of data sample i in the training set; and N is the total number of data samples in the training set.
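The softmax classifier turns the output-layer activations into class probabilities, and the most probable class is reported as the fault state. A minimal Python sketch, with function names and toy logits chosen for this example:

```python
import math


def softmax(logits):
    """Convert output-layer activations into class probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Index of the most probable class."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


# Five logits, one per state (normal plus four fault types); toy values.
probs = softmax([2.0, 1.0, 0.1, 0.0, -1.0])
label = predict([2.0, 1.0, 0.1, 0.0, -1.0])
```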

In the CNN-BiLSTM network structure, layer-by-layer forward propagation realizes the nonlinear mapping, and the back-propagation algorithm updates the parameters of the network to achieve model learning. By iteratively reducing the loss between the output and the ground truth, the network parameters of each fully connected layer converge to an acceptable range, which completes the optimization of the CNN-BiLSTM network structure.

(II) Building the GR-CNN-BiLSTM model by adding the graph regularization term

Spectral clustering is an efficient clustering method. Its main idea is to treat all data as points in a space that are connected by edges to form an undirected weighted graph. The edge weights reflect the distance between two points, i.e. a similarity matrix is constructed to represent the relationships between data points, and the graph is then partitioned to obtain the final clustering result.

In this embodiment, a graph regularization constraint is added to the CNN-BiLSTM network structure so that the local manifold structure of the data is preserved on the basis of sample similarity. Unlike traditional methods, which design the graph regularization term on the original high-dimensional data, this embodiment designs the neighbor graph in the feature space and adds the graph regularization term to the second fully connected layer FC2, the one closest to the classifier. Because the features output by FC2 are fed directly into the classifier, they are the most easily separable and also have the lowest dimension, so applying the graph regularization term at FC2 both minimizes the amount of computation and improves the accuracy of graph construction.

In the feature space, the adjacency matrix H represents the neighbor relation between samples. If $s_j$ lies in the k-neighborhood of $s_i$, i.e. $s_j \in kNN(s_i)$, the two samples are connected in the neighbor graph, and an undirected neighbor graph is constructed. The connection weight $H_{ij}$ of data sample i and data sample j in the neighbor graph is defined by a Gaussian kernel function:

In formula (8), σ denotes the average distance between all neighboring points in the neighbor graph, and the weight between non-neighboring points is 0.

The neighbor graph is constructed on the basis of the training set data as follows: the k nearest neighbors of each data sample i are identified, and sample i is connected to each of them. Based on the neighbor graph, a neighbor graph regularization term is added to the feature space of the fully connected layer closest to the classifier in the CNN-BiLSTM network, where the neighbor graph regularization term $L_g$ is:

In formula (9), i denotes any data sample in the training set, j denotes a data sample adjacent to data sample i in the training set, N is the total number of data samples in the training set, $H_{ij}$ is the connection weight of data sample i and data sample j in the neighbor graph, $s_i$ is the feature vector corresponding to data sample i, and $s_j$ is the feature vector corresponding to data sample j.

$L_g$ is minimized iteratively by back-propagation with gradient descent, which reduces the distance between neighboring points, so that the GR-CNN-BiLSTM model learns features with good clustering and high discriminability and achieves efficient fault classification.

(III) Training the GR-CNN-BiLSTM model with a two-stage training method

In this embodiment, in the early stage of training the CNN-BiLSTM network is insufficiently trained and its high-level features are not easily distinguished, which makes it difficult to construct the neighbor graph correctly; in addition, introducing the graph regularization term too early increases the complexity and computation time of the GR-CNN-BiLSTM model. The training algorithm shown in Table 1 therefore trains the GR-CNN-BiLSTM model in two stages, and the objective function L, which characterizes the difference between the expected and actual outputs of the model, differs between the stages.

(1) Pre-training stage. When the iteration number satisfies $e < e_g$, L contains only the loss function $L_c$ of the classifier.

$$L = L_c \tag{10}$$

(2) Fine-tuning stage. When the iteration number satisfies $e_g < e < e_{max}$, the network objective function L contains both the loss function $L_c$ of the classifier and the graph regularization term $L_g$. The classifier loss is used to learn salient low-dimensional features of the data, and the graph regularization term makes the learned features more discriminative.

$$L = L_c + \gamma L_g \tag{11}$$

In formula (10) and formula (11), L is the objective function of the GR-CNN-BiLSTM model and represents the difference between the expected output and the actual output; $L_c$ is the objective function of the classifier, which in this embodiment is the cross-entropy loss function; $L_g$ is the neighbor graph regularization term; γ is the weight coefficient of the graph regularization term; e is the iteration number; $e_{max}$ is the maximum number of iterations; and $e_g$ is the iteration at which the graph regularization term is introduced, which must be determined and tuned manually.

The performance of a deep learning model is considered optimal when any one or more of the evaluation parameters accuracy, precision, recall and macro F1-score are optimal. In this embodiment, when the GR-CNN-BiLSTM model is evaluated with the validation set, its performance is considered optimal when the accuracy and/or macro F1-score is highest.

TABLE 1 Two-stage training algorithm of the GR-CNN-BiLSTM model

The model parameter set of the trained GR-CNN-BiLSTM model comprises the network weights and biases of each layer.

To better illustrate the effectiveness and beneficial effects of the multi-sensor fault diagnosis method based on graph regularization CNN-BiLSTM of this embodiment, the method is verified below through specific experiments and model comparison.

Experimental validation and model comparison

1.1 Description of the ultrasonic motor dataset

Experiments are used to verify the effectiveness of the multi-sensor fault diagnosis method based on graph regularization CNN-BiLSTM.

The dataset comprises five signals of the ultrasonic motor (driving voltage, driving current, driving frequency, isolated-pole feedback voltage and internal temperature) in the normal state and in four fault states: piezoelectric ceramic cracking, friction plate wear, adhesive layer loosening and elastomer tooth breakage. A training set and a test set are constructed from the collected multi-sensor data such that the two sets are completely independent and both contain the different types of fault modes. Each sample in the training set and the test set contains data collected from the five signal channels, and the number of samples in the training set and the test set is 18652. The goal is to train the GR-CNN-BiLSTM model on the training set and then verify its performance on the test set.

TABLE 2 Normal and Fault data set descriptions of 8 Electrical machines

Because of the high sampling frequency and heavy noise, the raw data collected by the multiple sensors are difficult to use directly for fault diagnosis. The raw data are therefore normalized with the z-score to eliminate scale effects and to ensure that different fault features can be compared effectively.

$$x_{i\_norm} = \frac{x_i - \bar{x}}{\sigma_x} \tag{12}$$

In formula (12), $x_i$ is the original data, $\bar{x}$ is the feature mean, $\sigma_x$ denotes the feature standard deviation, and $x_{i\_norm}$ denotes the z-score, i.e. the deviation from the mean expressed in units of the standard deviation.
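The z-score normalization of formula (12) can be sketched per sensor channel as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `zscore_channels`, the toy readings and the use of the population standard deviation are assumptions.

```python
from statistics import fmean, pstdev


def zscore_channels(samples):
    """Normalize each channel (column) to zero mean and unit standard deviation."""
    columns = list(zip(*samples))            # one tuple per sensor channel
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # assumption: population standard deviation
    return [
        # A constant channel (std == 0) is mapped to 0.0 to avoid division by zero.
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in samples
    ]


# Hypothetical readings: rows are time steps, columns are two sensor channels
# with very different scales.
raw = [[10.0, 0.1], [12.0, 0.3], [14.0, 0.5]]
normalized = zscore_channels(raw)
```

After normalization both channels have the same scale, which is the scale effect the text refers to.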

1.2 model parameter setting and sensitivity analysis

The preprocessed 5-dimensional multi-sensor signals are fed into the GR-CNN-BiLSTM model. The network hyper-parameters are set as follows: the batch size of the CNN is 149, the number of convolution kernels is 32, the convolution kernel size is 2, the stride is 2, and the pooling window size is 2; the number of hidden layer units in the BiLSTM is set to 128, and the numbers of units in the fully connected layers FC1 and FC2 are set to 64 and 32, respectively; the number of classifier output classes is 5. During training of the GR-CNN-BiLSTM model, the Adam algorithm is adopted, the initial learning rate is set to 0.001, and the weight decay is set to e^-5. The classification results are encoded with one-hot coding, so that the outputs for the no-fault state, piezoelectric ceramic piece cracking, friction plate abrasion, adhesive layer loosening and elastomer tooth breakage are [1,0,0,0,0], [0,1,0,0,0], [0,0,1,0,0], [0,0,0,1,0] and [0,0,0,0,1], respectively. The effect of the maximum number of iterations (epochs) of the GR-CNN-BiLSTM model on the accuracy and loss of the training set is shown in FIG. 4. When the epoch value reaches 120, the training accuracy and loss tend to be stable, and further training may cause the model to overfit, so the maximum number of epochs is set to 120, and the model parameters are updated through iterative optimization to minimize the loss function.
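The one-hot coding of the five states described above can be sketched as follows. The state names are taken from the text; the helper names `one_hot` and `decode` and the example probability vector are illustrative assumptions.

```python
# Order follows the text: index 0 is the no-fault state.
STATES = [
    "normal",
    "piezoelectric ceramic piece cracking",
    "friction plate abrasion",
    "adhesive layer loosening",
    "elastomer tooth breakage",
]


def one_hot(state):
    """Return the 5-element one-hot vector for a state name."""
    vec = [0] * len(STATES)
    vec[STATES.index(state)] = 1
    return vec


def decode(scores):
    """Map a classifier output vector back to the state with the largest score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return STATES[best]


label = one_hot("friction plate abrasion")          # [0, 0, 1, 0, 0]
predicted = decode([0.05, 0.10, 0.70, 0.10, 0.05])  # hypothetical classifier output
```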

The moment E₀ at which the neighbor graph regularization term is introduced is set to 60, the number of neighbor points k is set to 3, and the graph regularization term coefficient γ is set to 0.01. Next, the sensitivity of the key parameters that influence the network performance, such as E₀, k and γ, is analyzed through stratified five-fold cross validation. To save training time, the number of samples used in the following analysis is one tenth of the original samples, namely 1865.
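A stratified five-fold split, in which every fold keeps roughly the same class proportions, can be sketched as follows. This is only an illustration of the general technique: the function `stratified_folds`, the fixed seed and the toy label list are assumptions, and the actual splitting procedure of the embodiment is not given in the text.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Return n_folds lists of sample indices with each class spread evenly."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the samples of one class round-robin over the folds.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


# Hypothetical label list: 5 classes with 20 samples each.
labels = [c for c in range(5) for _ in range(20)]
folds = stratified_folds(labels)

for held_out in range(5):
    val_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # A model would be trained on train_idx and scored on val_idx here;
    # the five scores are then averaged.
```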

1.2.1 analysis for determining the introduction moment of the graph regularization term

FIG. 5 shows the mean test-set diagnostic accuracy and training time of the graph regularized CNN-BiLSTM for different values of E₀ when γ = 0.01 and k = 3. It can be found that as E₀ increases, the training time gradually shortens, while the classification accuracy first increases and then decreases, reaching its maximum at E₀ = 60. This is because the original samples are severely aliased in the data space and are difficult to distinguish. When the graph regularization term is introduced too early, the constructed neighbor graph is inaccurate, so the model learns wrong geometric relationship features; when it is introduced too late, the parameter optimization effect is poor because few iterations remain before training terminates. To balance accuracy and training efficiency, E₀ is set to 60.
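The two-stage objective behind this trade-off can be sketched as follows: only the cross-entropy loss is used before the introduction moment, and the weighted graph regularization term is added afterwards. The function name `training_objective`, the example loss values and the choice of boundary (the term is active from epoch E₀ onward) are assumptions; the defaults 60 and 0.01 are the values selected in the text.

```python
def training_objective(epoch, cross_entropy, graph_term, e0=60, gamma=0.01):
    """Stage 1 (epoch < e0): cross-entropy only. Stage 2: add gamma * graph term."""
    if epoch < e0:
        # The neighbor graph does not need to be built in this stage,
        # which is why a larger e0 shortens training time.
        return cross_entropy
    return cross_entropy + gamma * graph_term


# Hypothetical loss values for one batch, evaluated before and after E0.
stage1 = training_objective(epoch=10, cross_entropy=0.5, graph_term=2.0)
stage2 = training_objective(epoch=60, cross_entropy=0.5, graph_term=2.0)
```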

1.2.2 analysis for determining the number of neighbor points

If the number k of neighbor points is too large, wrong neighbor points may be selected, which affects the accuracy of the constructed neighbor graph. Since the training batch size is 149, each class contains about 30 samples on average, so k should be less than 30. FIG. 6 shows the mean test-set diagnostic accuracy and training time of GR-CNN-BiLSTM for different values of k when E₀ = 60 and γ = 0.01. It can be found that as k increases, the classification accuracy first increases, reaches a maximum at k = 3, and then gradually decreases, so k is set to 3.
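A k-nearest-neighbor graph over feature vectors, and a regularization term computed from it, can be sketched as follows. Several details are assumptions: the heat-kernel form of the weight, the averaging over the number of samples, and the toy 2-D features standing in for fully connected layer outputs. Only the use of k neighbors, a scale σ equal to the mean distance between neighboring points, and zero weight for non-neighbors are taken from the text.

```python
import math


def knn_graph_weights(features, k=3):
    """Return {(i, j): weight} for the k nearest neighbors j of each sample i."""
    n = len(features)
    neighbors = {}
    for i in range(n):
        dists = sorted(
            (math.dist(features[i], features[j]), j) for j in range(n) if j != i
        )
        neighbors[i] = dists[:k]
    all_d = [d for pairs in neighbors.values() for d, _ in pairs]
    sigma = sum(all_d) / len(all_d)  # mean distance between neighboring points
    # Assumed heat-kernel weight; pairs that are not neighbors are simply
    # absent from the dictionary, i.e. their weight is 0.
    return {
        (i, j): math.exp(-(d ** 2) / (sigma ** 2)) if sigma > 0 else 1.0
        for i, pairs in neighbors.items()
        for d, j in pairs
    }


def graph_regularizer(features, weights):
    """Assumed form: weighted squared distances of neighbor pairs, averaged over N."""
    total = sum(
        w * math.dist(features[i], features[j]) ** 2 for (i, j), w in weights.items()
    )
    return total / len(features)


# Hypothetical features: two well separated clusters of three samples each.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
weights = knn_graph_weights(features, k=2)
penalty = graph_regularizer(features, weights)
```

With k = 2 every neighbor stays inside its own cluster. Raising k to 3 or more in this toy example would force edges between the two clusters, which is the wrong-neighbor effect described above.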

1.2.3 analysis for determining the graph regularization coefficient

If the graph regularization coefficient γ is too large, it may interfere with the learning process of the classifier; if it is too small, it cannot serve the purpose of learning the geometric structure information between samples to improve the model performance. FIG. 7 shows the mean test-set diagnostic accuracy of the GR-CNN-BiLSTM model for different values of γ when E₀ = 60 and k = 3. It can be found that as γ increases, the classification accuracy first increases, reaches a maximum at γ = 0.01, and then gradually decreases, so γ is set to 0.01.

1.3 evaluation of the model

Based on the above data set, the multi-sensor fault diagnosis method based on the GR-CNN-BiLSTM model of this embodiment is compared with fault diagnosis methods based on several common models. In this embodiment, accuracy, precision, recall and f1-score are used as the evaluation indexes of the model effect. Besides the common indexes that characterize the classification accuracy for each fault type, such as average classification accuracy, precision and recall, the multi-class index macro f1-score, which is not affected by data imbalance, is selected to compare the multi-class performance of different models. The macro f1-score is obtained by averaging the f1-scores of the individual classes (see the following formula). It is used to evaluate the performance of the multi-sensor fault diagnosis method based on the GR-CNN-BiLSTM model and to compare the method of this embodiment with common fault diagnosis algorithms.

In formula (13), v denotes the v-th fault type; in the experiment, v = 1, 2, 3, 4, 5 represent the normal state of the ultrasonic motor, the piezoelectric ceramic piece cracking fault, the friction plate abrasion fault, the adhesive layer loosening fault and the elastomer tooth breakage fault, respectively.
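The macro f1-score can be sketched as the unweighted mean of the per-class f1-scores, as follows. This uses the standard definition and assumes it matches formula (13); the function name `macro_f1` and the toy labels are illustrative, and classes are indexed 0 to 4 here instead of 1 to 5.

```python
def macro_f1(y_true, y_pred, n_classes=5):
    """Unweighted mean of per-class f1-scores (standard macro averaging)."""
    scores = []
    for v in range(n_classes):
        tp = sum(t == v and p == v for t, p in zip(y_true, y_pred))
        fp = sum(t != v and p == v for t, p in zip(y_true, y_pred))
        fn = sum(t == v and p != v for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / n_classes


# Hypothetical labels: one sample of class 0 is misclassified as class 1.
y_true = [0, 0, 1, 1, 2, 3, 4]
y_pred = [0, 1, 1, 1, 2, 3, 4]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally to the mean regardless of its sample count, a rare fault type weighs as much as a frequent one.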

Whereas the graph regularization term of the conventional deep graph regularization network is constructed in the original data space, the graph regularization term of the multi-sensor fault diagnosis method based on the GR-CNN-BiLSTM model is constructed in the feature space. Since the dimension (64) of the feature space of the second fully connected layer FC2 is much lower than the dimension of the original training data set (74610 = 14922 × 5), the GR-CNN-BiLSTM model of this embodiment has a markedly higher training efficiency.

1.3.1 comparison with diagnosis results of commonly used failure diagnosis models

In order to verify the diagnosis effect of the GR-CNN-BiLSTM model-based multi-sensor fault diagnosis method of this embodiment, Table 3 shows the comparison results of the GR-CNN-BiLSTM model and other common models. It can be found that the diagnosis accuracies of the k-nearest neighbors (KNN), Gaussian naive Bayes and multi-layer perceptron (MLP) learning algorithms are 0.8779, 0.8259 and 0.8852, respectively, which is inferior to the GR-CNN-BiLSTM model of this embodiment; these three learning algorithms also suffer from problems such as dependence on manual feature extraction, overfitting and parameter selection. Compared with the shallow machine learning methods, the GR-CNN-BiLSTM model of this embodiment achieves a remarkably better fault diagnosis effect owing to its strong feature extraction capability, which confirms the necessity of establishing a deep network structure. In addition, the three deep learning models GR-CNN-BiLSTM, CNN and CNN-BiLSTM are compared under consistent parameter settings: the accuracies of the CNN model and the CNN-BiLSTM model are 0.9158 and 0.9389, respectively, the accuracy of the GR-CNN-BiLSTM model is 0.9491, and the GR-CNN-BiLSTM model also performs best on indexes such as precision, recall and macro f1-score. This is because the CNN can learn the short-term correlation of multiple signals in each step, while the BiLSTM can learn the long-term correlation of sequences in both the forward and backward directions. By learning spatial and temporal features simultaneously, the macro f1-score of CNN-BiLSTM is 3.75% higher than that of CNN; by introducing the graph regularization term, the macro f1-score of GR-CNN-BiLSTM is 1.47% higher than that of CNN-BiLSTM. FIG. 8 shows the confusion matrices of the test results of the different models. It can be found that the GR-CNN-BiLSTM model has the best classification effect and classifies the fault types in the test samples almost completely correctly, apart from some misjudgments for fault 1 and fault 4.
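A confusion matrix of the kind shown in FIG. 8, and the accuracy derived from it, can be computed as in the following sketch. The function names and the toy labels are illustrative assumptions and do not reproduce the reported results.

```python
def confusion_matrix(y_true, y_pred, n_classes=5):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correctly classified samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels: one sample of class 0 is misclassified as class 1.
y_true = [0, 0, 1, 1, 2, 3, 4]
y_pred = [0, 1, 1, 1, 2, 3, 4]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
```

Off-diagonal entries show which pairs of states are confused with each other, which the overall accuracy alone does not reveal.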

TABLE 3 evaluation results based on different fault diagnosis classification models

In order to visually verify the effect of the GR-CNN-BiLSTM based multi-sensor fault diagnosis method, the t-SNE algorithm is adopted to visualize in two dimensions the features of the fully connected layer FC2 of the different deep learning models, as shown in FIG. 9. It can be found that the GR-CNN-BiLSTM model distinguishes the 5 states, namely the normal state and the different fault types, more clearly, whereas the boundaries of part of the data distribution overlap in the other three deep learning models.

1.3.2 comparison of diagnostic results for Multi-sensor data with Single-sensor data

In order to verify the necessity of fusing multi-sensor data in the fault diagnosis method, the influence of using multi-sensor signals and single-sensor signals as input samples on the diagnosis result is analyzed. Each of the 5 signals is used separately for fault diagnosis, and the results are compared with those obtained using all 5 signals combined, with the model parameters of GR-CNN-BiLSTM kept consistent.

TABLE 4 Fault diagnosis result comparison based on Multi-sensor and Single-sensor data

Table 4 shows the comparison results. It can be found that no single signal yields a satisfactory diagnosis result, and that combining the signals of multiple sensors improves the diagnosis effect. When all 5 signals are used, all evaluation indexes are optimal. This result verifies that fault diagnosis based on single-channel sensor data can hardly reflect the health state of the system comprehensively, and that fusing multi-sensor data to expand the feature space can remarkably improve the fault diagnosis effect.

By adopting the technical scheme disclosed by the invention, the following beneficial effects are obtained: in the GR-CNN-BiLSTM based multi-sensor fault diagnosis method, when multi-sensor condition monitoring data are available, the CNN extracts the spatial correlation of the multi-dimensional features and the BiLSTM extracts time-series feature information; a neighbor graph regularization method is adopted to mine the geometric relationship features of the data in the low-dimensional features, and a graph regularized deep learning framework is constructed; a two-stage model training algorithm is further designed, and efficient fault recognition and classification are realized while balancing training time and diagnosis accuracy. The main conclusions obtained by experimental verification and analysis are as follows: (1) the classification performance of the GR-CNN-BiLSTM model on which the method relies is superior to that of the shallow machine learning methods KNN, Bayes and MLP and of the deep learning methods CNN and CNN-BiLSTM; (2) compared with using a single sensor signal as the input sample, the method improves the accuracy of the fault diagnosis model by fusing multi-sensor data; (3) the method overcomes the low training efficiency of the traditional deep graph regularization network: by constructing the graph regularization term on the low-dimensional features, it enables the network to learn geometric relationship features with better clustering and higher discriminability, thereby improving the training efficiency and the diagnosis accuracy.

The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and improvements can be made without departing from the principle of the present invention, and such modifications and improvements should also be considered within the scope of the present invention.

Claims

1. A multi-sensor fault diagnosis method based on graph regularization CNN-BilSTM is characterized by comprising the following steps:

2. The graph regularization CNN-BilSTM based multi-sensor fault diagnosis method as claimed in claim 1, wherein in step S1, the collected data is subjected to z-score normalization, min-max normalization or normalization.

3. The graph regularization CNN-BiLSTM based multi-sensor fault diagnosis method as claimed in claim 1, wherein in step S2, a CNN-BiLSTM network is established, specifically:

A1, inputting the training set into a CNN for network structure training, extracting deep feature data through a convolutional layer, and processing the deep feature data through a max pooling layer to obtain dimension-reduced deep feature data;

in formula (1), $r_i$ is the true label of data sample i in the training set; $y_i$ is the predicted label of data sample i in the training set; and N is the total number of data samples in the training set.

4. The graph regularization CNN-BiLSTM based multi-sensor fault diagnosis method according to claim 3, wherein a fully connected layer module is arranged in the CNN-BiLSTM network, and the fully connected layer module comprises a first fully connected layer FC1, a second fully connected layer FC2 and an output layer which are arranged in sequence;

$$o^a = h(W_a h^a + b_a) \tag{2}$$

5. The graph-regularized CNN-BiLSTM-based multi-sensor fault diagnosis method according to claim 1, wherein in step S2, a graph-regularization term is added to a fully-connected layer of the CNN-BiLSTM network, where the fully-connected layer is a fully-connected layer closest to the classifier, and specifically:

wherein i represents any data sample in the training set, j represents a data sample adjacent to data sample i in the training set, N is the total number of data samples in the training set, $H_{ij}$ represents the connection weight of data sample i and data sample j in the neighbor graph, $s_i$ represents the feature vector corresponding to data sample i, $s_j$ represents the feature vector corresponding to data sample j, and $H_{ij}$ is calculated as follows:

in equation (4), σ represents the average distance between all neighboring points in the neighbor graph, and the weight between non-neighboring points is 0.

6. The graph regularization CNN-BiLSTM based multi-sensor fault diagnosis method as claimed in claim 1, wherein in step S3, the GR-CNN-BiLSTM model is trained using data in a training set, specifically:

initializing network parameters of a GR-CNN-BilSTM model;

$$L = L_c \tag{5}$$

$$L = L_c + \gamma L_g \tag{6}$$

in equations (5) and (6), L represents the objective function of the GR-CNN-BiLSTM model, characterizing the difference between the expected output and the actual output; $L_c$ represents the cross entropy loss function; $L_g$ represents the neighbor graph regularization term; γ represents the weight coefficient of the graph regularization term; e represents the iteration number; $e_{max}$ represents the maximum number of iterations; and $e_g$ represents the moment at which the graph regularization term is introduced.

7. The graph regularization CNN-BilSTM based multi-sensor fault diagnosis method of claim 6, wherein the optimizer employs an Adam algorithm.

8. The graph regularization CNN-BilSTM based multi-sensor fault diagnosis method of claim 1, wherein when the GR-CNN-BilSTM model is evaluated by using data in a validation set, the GR-CNN-BilSTM model performs optimally when the evaluation parameter accuracy and/or macro f1-score is/are the highest.

9. A computer-readable storage medium storing computer instructions for causing a computer to perform the graph regularization CNN-BiLSTM based multi-sensor fault diagnosis method according to any one of claims 1 to 8.

10. An electronic device, comprising: a memory and a processor, the memory and the processor being communicatively connected to each other, the memory storing computer instructions, and the processor executing the computer instructions to perform the graph regularization-based CNN-BiLSTM multi-sensor fault diagnosis method according to any one of claims 1 to 8.