CN112561035A - Fault diagnosis method based on CNN and LSTM depth feature fusion - Google Patents

Fault diagnosis method based on CNN and LSTM depth feature fusion

Info

Publication number
CN112561035A
Authority
CN
China
Prior art keywords
cnn
lstm
network
fusion
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011446332.4A
Other languages
Chinese (zh)
Other versions
CN112561035B (en)
Inventor
周福娜
张志强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maritime University
Priority to CN202011446332.4A
Publication of CN112561035A
Application granted
Publication of CN112561035B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
    • Y04S10/52 Outage or fault management, e.g. fault detection or location

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a fault diagnosis method based on deep feature fusion of CNN and LSTM. Deep learning has been widely applied to fault diagnosis in recent years. However, feature extraction and fault diagnosis with a single deep learning model suffer from insufficient data utilization and incomplete feature extraction, which degrades diagnostic accuracy. To address these problems, a feature fusion mechanism is proposed in which two different neural networks extract features from the one-dimensional sequence data and the two-dimensional oscillogram data respectively: an LSTM extracts the autocorrelation features of the one-dimensional sequence data, and a CNN extracts the cross-correlation features of the two-dimensional oscillogram data. A feature fusion network is then added so that the features of the two networks are fused in a complementary way, in contrast to existing methods that simply splice the features together. As a result, the data is used more fully, feature extraction is more comprehensive, and fault diagnosis is more accurate.

Description

Fault diagnosis method based on CNN and LSTM depth feature fusion
Technical Field
The invention relates to the technical field of deep-learning-based gearbox fault diagnosis, and in particular to a fault diagnosis method based on deep feature fusion of CNN and LSTM, which realizes deep-learning-based fault diagnosis of a gearbox.
Background
With the rapid development of modern industrial technology, large-scale automation systems are becoming increasingly complex, and the coupling between different parts of production equipment is becoming ever tighter. A fault at a single point can bring down the whole system and may even cause a catastrophic event. Accurate and reliable real-time fault diagnosis of mechanical equipment is therefore crucial.
Common fault diagnosis methods fall into three categories: methods based on empirical knowledge, methods based on analytical models, and data-driven methods. In engineering practice, the first two are limited by the completeness of the prior knowledge and the accuracy of the mathematical model; such models scale poorly, which severely limits their use in fault diagnosis. Data-driven methods require neither rich prior knowledge nor an accurate mechanism model: a fault diagnosis model is built from data by means of feature extraction, which is enough to diagnose faults in a complex system. These methods have therefore attracted wide attention in recent years. Deep learning is a data-driven method with strong adaptive capacity. It is a multi-level feature learning method in which nonlinear components transform the features of each layer into more abstract, higher-level features, and its strong feature representation capability has drawn the attention of experts in fault diagnosis. Among the various deep learning models, the convolutional neural network (CNN), originally developed for image recognition, has been successfully applied to feature extraction. The modeling properties of a CNN help it find local structures or configurable relationships in the observations, and CNN-based fault diagnosis methods have been widely studied in recent years. Although CNN has achieved great success in fault diagnosis, it focuses on local features and ignores the relationship between the whole signal and its local parts. For sequence signals, it fails to capture the long-term dependencies within the sequence, and these hidden long-term dependencies are considered very helpful features for fault diagnosis.
The long short-term memory network (LSTM) is an important branch of the recurrent neural network (RNN). It is well suited to problems that are strongly tied to time series, and it can learn the long-term dependencies hidden in time-series data. Because of its excellent ability to extract the autocorrelation features of a sequence, it has attracted wide attention from experts in fault diagnosis and prediction. Although LSTM has been successful in diagnosing faults from time-series data, its inherently sequential nature means that an LSTM model cannot take the local features of the data into account, which leads to incomplete feature extraction, inefficient data utilization, and information loss.
Providing a more effective deep-learning-based feature fusion method for fault diagnosis is therefore a problem to be solved by those skilled in the art.
Disclosure of Invention
Existing fault diagnosis methods cannot make the most efficient use of the available training data, owing to insufficient data utilization and incomplete feature extraction. To address this technical problem, the invention provides an online fault diagnosis method based on the fusion of CNN and LSTM features.
Specifically, the invention achieves the above purpose through the following scheme:
An online fault diagnosis method based on CNN and LSTM feature fusion, characterized by comprising the following steps:
S1, establishing a data set, wherein the data set comprises a training set and a test set, both of which comprise one-dimensional sequence data and the corresponding two-dimensional oscillogram data, and each two-dimensional oscillogram is drawn from the one-dimensional sequence data;
Step S1 includes the following steps:
S1.1, selecting one-dimensional sequence sample data of gearboxes with different fault types, and assigning a label to each fault type;
S1.2, drawing the corresponding oscillogram of the one-dimensional sequence sample data of step S1.1 using Matlab to obtain the two-dimensional oscillogram data;
S1.3, dividing the one-dimensional data and the two-dimensional data of steps S1.1 and S1.2 into a training set and a test set according to a given ratio;
S2, extracting the local cross-correlation features and trend features $F_{CNN}$ of the two-dimensional oscillogram data in the training set through a convolutional neural network (CNN);
Step S2 includes the following steps:
S2.1, building a convolutional neural network $Net_{CNN}$ according to the two-dimensional oscillogram data in the training set, as shown in equation (7):
$$[Net_{CNN}, Tr_{CNN}] = \mathrm{Feedforward}(\theta_{CNN}; M_{CL}, M_{pool}; SIZE_{cl}, SIZE_{pool}; X_{2D}) \tag{7}$$
where Feedforward is the function that generates the neural network; $M_{CL}$ is the number of convolutional layers of the CNN; $M_{pool}$ is the number of pooling layers of the CNN; $SIZE_{cl}$ is the convolution kernel size; $SIZE_{pool}$ is the pooling kernel size; $\theta_{CNN} = \{W_{CNN}, b_{CNN}\}$ are the network parameters, where $W_{CNN}$ is the weight matrix and $b_{CNN}$ is the bias vector; and $X_{2D}$ is the input two-dimensional oscillogram data. The CNN is trained on the two-dimensional oscillogram data;
S2.2, extracting the feature $F_{CNN}$ of the two-dimensional oscillogram using the trained convolutional neural network and its parameters:
$$F_{CNN} = G_{CNN}(Net_{CNN}, Tr_{CNN}, X_{2D}) \tag{8}$$
where $G_{CNN}$ is the nonlinear output function of the CNN and $Tr_{CNN}$ denotes the trained CNN model parameters;
S3, extracting the autocorrelation features $F_{LSTM}$ within the one-dimensional sequence data in the training set through a long short-term memory (LSTM) network;
Step S3 includes the following steps:
S3.1, building a long short-term memory network $Net_{LSTM}$ according to the one-dimensional sequence data in the training set, as shown in equation (9):
$$[Net_{LSTM}, Tr_{LSTM}] = \mathrm{Feedforward}(\theta_{LSTM}; H_{LSTM}; X_{1D}) \tag{9}$$
where $\theta_{LSTM} = \{W_{LSTM}, b_{LSTM}\}$ are the network parameters, $W_{LSTM}$ is the weight matrix, $b_{LSTM}$ is the bias vector, $H_{LSTM}$ is the number of neurons in the hidden layer, and $X_{1D}$ is the input one-dimensional sequence data. The LSTM network is trained on the one-dimensional sequence data;
S3.2, extracting the feature $F_{LSTM}$ of the one-dimensional sequence data using the trained network and its parameters:
$$F_{LSTM} = G_{LSTM}(Net_{LSTM}, Tr_{LSTM}, X_{1D}) \tag{10}$$
where $G_{LSTM}$ is the nonlinear output function of the LSTM network and $Tr_{LSTM}$ denotes the trained LSTM model parameters;
S4, fusing the local cross-correlation feature $F_{CNN}$ of the image extracted by the CNN in step S2 and the autocorrelation feature $F_{LSTM}$ of the sequence extracted by the LSTM in step S3, which are two different types of features, through a multilayer fusion network to obtain the fused feature $F_{fusion}$;
Step S4 includes the following steps:
S4.1, splicing the local cross-correlation feature $F_{CNN}$ of the image extracted by the CNN in step S2 with the autocorrelation feature $F_{LSTM}$ of the sequence extracted by the LSTM in step S3;
S4.2, establishing a feature fusion network $Net_{fusion}$ and training its parameters to obtain the fused feature $F_{fusion}$, as shown in equation (11):
$$F_{fusion} = G_{fusion}(Net_{fusion}, Tr_{fusion}, X_{1D}, X_{2D}) \tag{11}$$
where $G_{fusion}$ is the nonlinear output function of the fusion network and $Tr_{fusion}$ denotes the trained model parameters;
S5, using the fused feature $F_{fusion}$ of step S4 as the input of the Softmax classifier to perform fault diagnosis classification, as shown in equation (12):
$$result = \mathrm{Softmax}(F_{fusion}, \theta) \tag{12}$$
where $result$ denotes the classification accuracy and $\theta$ denotes the Softmax network model parameters;
S6, simultaneously adjusting the parameters of the fault diagnosis networks $Net_{CNN}$, $Net_{LSTM}$, $Net_{fusion}$, and Softmax;
S7, inputting all the data of the test set established in step S1 into the network model to obtain the fault diagnosis classification result for the test set, and evaluating the performance of the network model.
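Steps S4 and S5 can be illustrated with a minimal forward-pass sketch. It is not the patented implementation: the feature dimension, the single tanh fusion layer, and the random weights are all assumptions made for illustration, and the vectors `f_cnn` and `f_lstm` are random stand-ins for the outputs of the two trained branches.

```python
import math
import random

def dense(x, weights, bias, activation=None):
    """Fully connected layer: activation(W x + b)."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, bias)]
    if activation == "tanh":
        out = [math.tanh(v) for v in out]
    return out

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def random_layer(n_out, n_in, rng):
    """Hypothetical untrained layer: small random weights, zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
FEATURE_DIM = 8   # assumed; the text only requires both branches to match
N_CLASSES = 6     # six gearbox health states

f_cnn = [rng.random() for _ in range(FEATURE_DIM)]    # stand-in for F_CNN
f_lstm = [rng.random() for _ in range(FEATURE_DIM)]   # stand-in for F_LSTM

spliced = f_cnn + f_lstm                                 # S4.1: splice
w_f, b_f = random_layer(FEATURE_DIM, len(spliced), rng)
f_fusion = dense(spliced, w_f, b_f, activation="tanh")   # S4.2: fusion layer
w_s, b_s = random_layer(N_CLASSES, FEATURE_DIM, rng)
probs = softmax(dense(f_fusion, w_s, b_s))               # S5: Softmax
predicted_label = probs.index(max(probs)) + 1            # labels run 1..6
```

The point of the fusion layer is that the classifier sees a learned combination of the two feature vectors rather than the raw spliced vector; in the method described above its weights would be trained jointly with both branches (step S6) rather than drawn at random.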
Compared with the prior art, the invention has the following beneficial effects. Two different neural networks, a CNN and an LSTM, extract features from the one-dimensional sequence data and the two-dimensional image data respectively: the LSTM extracts the autocorrelation features of the one-dimensional sequence data, and the CNN extracts the cross-correlation features of the two-dimensional image data. The number of output nodes of the feature output layer of each network is adjusted so that the two kinds of extracted features have the same structure. A feature fusion network is then added to fuse the cross-correlation features extracted by the CNN with the autocorrelation features extracted by the LSTM, so that the features of the two networks complement each other. This addresses the high misclassification rate that arises when the features extracted by a CNN or an LSTM alone are not accurate enough. Using the fused features for fault diagnosis makes fuller use of the data, makes feature extraction more comprehensive, and makes fault diagnosis more accurate. The invention can effectively improve the accuracy of fault diagnosis, helps advance the development and application of fault diagnosis and deep learning, and has practical significance for industrial production.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a structural diagram of the fault diagnosis method based on CNN and LSTM depth feature fusion according to the present invention.
Fig. 2 is a global optimization diagram of the fault diagnosis method based on CNN and LSTM depth feature fusion.
Fig. 3 shows the fault diagnosis accuracy of the LSTM when the sample sequence length in the experiment is 100.
Fig. 4 shows the fault diagnosis accuracy of the CNN when the sample sequence length in the experiment is 100.
Fig. 5 shows the fault diagnosis accuracy of the CNN and LSTM depth feature fusion of the present invention when the sample sequence length in the experiment is 100.
Fig. 6 shows the fault diagnosis accuracy of the LSTM when the sample sequence length in the experiment is 400.
Fig. 7 shows the fault diagnosis accuracy of the CNN when the sample sequence length in the experiment is 400.
Fig. 8 shows the fault diagnosis accuracy of the CNN and LSTM depth feature fusion of the present invention when the sample sequence length in the experiment is 400.
Fig. 9 shows the fault diagnosis accuracy of the LSTM when the sample sequence length in the experiment is 900.
Fig. 10 shows the fault diagnosis accuracy of the CNN when the sample sequence length in the experiment is 900.
Fig. 11 shows the fault diagnosis accuracy of the CNN and LSTM depth feature fusion of the present invention when the sample sequence length in the experiment is 900.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
Fig. 1 shows the structure of the fault diagnosis method based on CNN and LSTM depth feature fusion, which includes the following steps:
S1, establishing a data set:
The original gearbox fault data set contains 15600 sequence samples covering six fault types: pitting, broken tooth, wear, pitting wear, broken-tooth wear, and normal, with the fault labels set to 1, 2, 3, 4, 5, and 6 respectively. The data set is divided into a training set and a test set at a ratio of 10:3, and the corresponding oscillogram is drawn for each sample.
S2, extracting the local cross-correlation features and trend features $F_{CNN}$ of the two-dimensional oscillogram data in the training set through a convolutional neural network (CNN);
Step S2 includes the following steps:
S2.1, building a convolutional neural network $Net_{CNN}$ according to the two-dimensional oscillogram data in the training set, as shown in equation (13):
$$[Net_{CNN}, Tr_{CNN}] = \mathrm{Feedforward}(\theta_{CNN}; M_{CL}, M_{pool}; SIZE_{cl}, SIZE_{pool}; X_{2D}) \tag{13}$$
where Feedforward is the function that generates the neural network; $M_{CL}$ is the number of convolutional layers of the CNN; $M_{pool}$ is the number of pooling layers of the CNN; $SIZE_{cl}$ is the convolution kernel size; $SIZE_{pool}$ is the pooling kernel size; $\theta_{CNN} = \{W_{CNN}, b_{CNN}\}$ are the network parameters, where $W_{CNN}$ is the weight matrix and $b_{CNN}$ is the bias vector; and $X_{2D}$ is the input two-dimensional oscillogram data. The CNN is trained on the two-dimensional oscillogram data;
S2.2, extracting the feature $F_{CNN}$ of the two-dimensional oscillogram using the trained convolutional neural network and its parameters:
$$F_{CNN} = G_{CNN}(Net_{CNN}, Tr_{CNN}, X_{2D}) \tag{14}$$
where $G_{CNN}$ is the nonlinear output function of the CNN and $Tr_{CNN}$ denotes the trained CNN model parameters;
S3, extracting the autocorrelation features $F_{LSTM}$ within the one-dimensional sequence data in the training set through a long short-term memory (LSTM) network;
Step S3 includes the following steps:
S3.1, building a long short-term memory network $Net_{LSTM}$ according to the one-dimensional sequence data in the training set, as shown in equation (15):
$$[Net_{LSTM}, Tr_{LSTM}] = \mathrm{Feedforward}(\theta_{LSTM}; H_{LSTM}; X_{1D}) \tag{15}$$
where $\theta_{LSTM} = \{W_{LSTM}, b_{LSTM}\}$ are the network parameters, $W_{LSTM}$ is the weight matrix, $b_{LSTM}$ is the bias vector, $H_{LSTM}$ is the number of neurons in the hidden layer, and $X_{1D}$ is the input one-dimensional sequence data. The LSTM network is trained on the one-dimensional sequence data;
S3.2, extracting the feature $F_{LSTM}$ of the one-dimensional sequence data using the trained network and its parameters:
$$F_{LSTM} = G_{LSTM}(Net_{LSTM}, Tr_{LSTM}, X_{1D}) \tag{16}$$
where $G_{LSTM}$ is the nonlinear output function of the LSTM network and $Tr_{LSTM}$ denotes the trained LSTM model parameters;
S4, fusing the local cross-correlation feature $F_{CNN}$ of the image extracted by the CNN in step S2 and the autocorrelation feature $F_{LSTM}$ of the sequence extracted by the LSTM in step S3, which are two different types of features, through a multilayer fusion network to obtain the fused feature $F_{fusion}$;
Step S4 includes the following steps:
S4.1, splicing the local cross-correlation feature $F_{CNN}$ of the image extracted by the CNN in step S2 with the autocorrelation feature $F_{LSTM}$ of the sequence extracted by the LSTM in step S3;
S4.2, establishing a feature fusion network $Net_{fusion}$ and training its parameters to obtain the fused feature $F_{fusion}$, as shown in equation (17):
$$F_{fusion} = G_{fusion}(Net_{fusion}, Tr_{fusion}, X_{1D}, X_{2D}) \tag{17}$$
where $G_{fusion}$ is the nonlinear output function of the fusion network and $Tr_{fusion}$ denotes the trained model parameters;
S5, using the fused feature $F_{fusion}$ of step S4 as the input of the Softmax classifier to perform fault diagnosis classification, as shown in equation (18):
$$result = \mathrm{Softmax}(F_{fusion}, \theta) \tag{18}$$
where $result$ denotes the classification accuracy and $\theta$ denotes the Softmax network model parameters;
S6, simultaneously adjusting the parameters of the fault diagnosis networks $Net_{CNN}$, $Net_{LSTM}$, $Net_{fusion}$, and Softmax; the global optimization is shown in Fig. 2. The loss function is given by the following equation:
[Equation image BDA0002824582860000081: loss function $J(\theta)$, not reproduced in the text]
The loss function $J(\theta)$ is minimized on the basis of the global error $Error_{global} = \{Error_{fusion}, Error_{softmax}, Error_{LSTM}, Error_{CNN}\}$. The error of the fusion network, $Error_{fusion}$, comprises the error of the LSTM, $Error_{LSTM}$, and the error of the CNN, $Error_{CNN}$, as expressed by the following formula:
$$Error_{fusion} = Error_{LSTM} + Error_{CNN}$$
S7, inputting all the data of the test set established in step S1 into the network model to obtain the fault diagnosis classification result for the test set, and evaluating the performance of the network model.
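The evaluation in step S7 can be sketched as a per-class accuracy computation. This is an illustrative sketch rather than the authors' evaluation code: the function names are hypothetical, the toy label lists are made up, and the unweighted mean over classes is an assumption (it coincides with overall accuracy only when every class has the same number of test samples).

```python
from collections import Counter

def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of correctly diagnosed samples for each true fault label."""
    total = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / total[label] for label in sorted(total)}

def average_accuracy(per_class):
    """Unweighted mean of the per-class accuracies."""
    return sum(per_class.values()) / len(per_class)

# Toy example with three fault labels and two test samples per label.
true_labels = [1, 1, 2, 2, 3, 3]
predicted_labels = [1, 2, 2, 2, 3, 1]

acc = per_class_accuracy(true_labels, predicted_labels)
avg = average_accuracy(acc)
```

In the toy example, labels 1 and 3 each have one of two samples diagnosed correctly and label 2 has both, so `acc` is `{1: 0.5, 2: 1.0, 3: 0.5}`.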
To verify the effectiveness and generalization performance of the invention, the following experiment was carried out on a QPZZ-I experimental platform:
The QPZZ-I rotating machinery vibration test platform is used to simulate gear faults. It can quickly simulate various states and vibrations of rotating machinery, and gear faults are simulated by installing defective gears. The faults that can be simulated include pitting, wear, broken tooth, the compound fault of pitting with wear, the compound fault of broken tooth with wear, and others. In the test, with a rotating speed of 880 r/min and a loading current of 0.05 A, the Y-direction acceleration data of the bearing on the motor side of the output shaft were recorded. Six health states of the gearbox were selected: pitting, wear, broken tooth, pitting wear, broken-tooth wear, and normal. The feasibility of the invention is examined using the gearbox fault data and compared with two baselines: fault diagnosis using only the one-dimensional sequence data as input to an LSTM network, and fault diagnosis using only the oscillogram data, which contains the trend information of the vibration signal, as input to a CNN.
(1) Data pre-processing
Fig. 1 is a detailed block diagram of the invention. A sliding window is used for data pre-processing, and each window position yields one sample. The window size is set to 100, 400, and 900 in turn, i.e. each sample contains 100, 400, or 900 data points, and the sliding step is set to 20. The oscillogram image size is set to 28 × 28. Each fault type contains 2000 training samples and 600 test samples. The fault label settings are shown in Table 1.
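The sliding-window pre-processing can be sketched as follows. The segmentation uses the window sizes and step stated above; the `rasterize` function is only a hypothetical stand-in for the oscillogram images, which the description says are drawn with Matlab, and the sine signal is synthetic test data rather than gearbox measurements.

```python
import math

def sliding_windows(signal, window, step):
    """Cut a 1-D signal into overlapping samples of length `window`."""
    return [signal[i:i + window]
            for i in range(0, len(signal) - window + 1, step)]

def rasterize(window, size=28):
    """Render a window as a size x size binary image (1 = waveform pixel).

    Assumed stand-in for plotting the waveform and capturing the plot:
    samples are binned into columns, amplitudes are scaled into rows.
    """
    lo, hi = min(window), max(window)
    span = (hi - lo) or 1.0            # avoid division by zero on flat input
    image = [[0] * size for _ in range(size)]
    n = len(window)
    for i, value in enumerate(window):
        col = i * size // n                                   # 0 .. size-1
        row = (size - 1) - int((value - lo) / span * (size - 1))
        image[row][col] = 1            # row 0 is the top of the image
    return image

# Synthetic signal: 1000 points of a sine wave (illustrative only).
signal = [math.sin(0.1 * k) for k in range(1000)]

windows = sliding_windows(signal, window=100, step=20)
image = rasterize(windows[0])
```

With a signal of length N, window size w, and step s, the segmentation yields floor((N - w) / s) + 1 samples, so the 1000-point example gives 46 windows of 100 points each. Each window then exists in two forms, the raw sequence for the LSTM branch and the image for the CNN branch.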
TABLE 1 Fault label settings
Fault type           Label
Pitting              1
Broken tooth         2
Wear                 3
Pitting wear         4
Broken-tooth wear    5
Normal               6
(2) Design of experiments
The experiments use the gearbox fault data to examine the feasibility of the proposed method, CNN-FF-LSTM, against the following comparison: a. fault diagnosis using only the oscillogram image data as input to the CNN; b. fault diagnosis using only the one-dimensional sequence data as input to the LSTM; c. fault diagnosis using the feature fusion method CNN-FF-LSTM. The specific experimental settings are shown in Table 2. Each set of experiments compares the three methods above.
TABLE 2 Experimental design
[Table images BDA0002824582860000091 and BDA0002824582860000101 are not reproduced in the text]
(3) Parameter setting
A convolutional neural network (CNN) is a special type of feed-forward neural network that is well suited to image inputs, especially machine learning problems involving large images. A CNN generally consists of an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer.
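The two layer types that distinguish a CNN can be sketched in plain Python. This is a generic illustration, not the network of the experiments: the 3 × 3 kernel, the checkerboard input, and the 2 × 2 pooling size are assumptions. As in most deep learning libraries, the "convolution" is computed as a cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """2-D cross-correlation without padding ('valid' mode), stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool(feature_map, size):
    """Non-overlapping max pooling with a size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(out_w)]
            for r in range(out_h)]

# Hypothetical 28 x 28 input (checkerboard) and 3 x 3 edge-like kernel.
image = [[(r + c) % 2 for c in range(28)] for r in range(28)]
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]

feature_map = conv2d_valid(image, kernel)   # 26 x 26
pooled = max_pool(feature_map, 2)           # 13 x 13
```

A 28 × 28 input convolved with a 3 × 3 kernel gives a 26 × 26 feature map, and 2 × 2 pooling halves each dimension to 13 × 13, which is how stacked convolution and pooling layers shrink the image into a compact feature vector.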
The long short-term memory network (LSTM) is a special recurrent neural network (RNN). An RNN contains a cyclic structure in which each output depends on the previous output, which enables it to model sequential inputs. The LSTM is an effective RNN variant that handles long sequences well. It has a chain structure similar to that of a standard RNN, but the internal mechanism of its repeating module differs: information flow is controlled mainly by three gates, namely the forget gate, the input gate, and the output gate. The specific network parameters used in the experiments are shown in Table 3.
TABLE 3 Values of model parameters
[Table image BDA0002824582860000102 is not reproduced in the text]
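The role of the three gates can be made concrete with one step of a standard LSTM cell. The sketch uses a scalar input and a single hidden unit to stay readable; the weights in `params` are hypothetical values chosen for illustration and are unrelated to the parameters of Table 3.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard LSTM cell with one input and one hidden unit.

    params maps each gate name to (w_x, w_h, b): input weight,
    recurrent weight, and bias.
    """
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))       # how much old state to keep
    i = sigmoid(pre_activation("input"))        # how much new content to add
    o = sigmoid(pre_activation("output"))       # how much state to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new content

    c = f * c_prev + i * g                      # updated cell state
    h = o * math.tanh(c)                        # updated hidden state
    return h, c

# Hypothetical weights, for illustration only.
params = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.6, -0.2, 0.0),
    "output": (0.3, 0.4, 0.0),
    "candidate": (0.8, 0.2, 0.0),
}

h, c = 0.0, 0.0
for x in [0.1, 0.4, -0.2, 0.3]:   # a short one-dimensional sequence
    h, c = lstm_step(x, h, c, params)
```

Because the cell state `c` is carried forward through the additive update `f * c_prev + i * g`, information from early time steps can persist across many steps, which is the property that lets an LSTM capture long-term dependencies in a vibration sequence.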
(4) Analysis of Experimental results
The results of the experiments are shown in tables 4-6.
TABLE 4 Fault diagnosis accuracy with a sequence length of 100
Fault type           LSTM      CNN       CNN-FF-LSTM
Pitting              83.33%    76.00%    95.00%
Broken tooth         69.17%    61.83%    98.50%
Wear                 92.67%    89.00%    97.50%
Pitting wear         95.67%    94.17%    98.17%
Broken-tooth wear    82.83%    78.33%    71.00%
Normal               78.83%    85.00%    97.83%
Average accuracy     83.75%    80.72%    93.00%
TABLE 5 Fault diagnosis accuracy with a sequence length of 400
Fault type           LSTM      CNN       CNN-FF-LSTM
Pitting              84.50%    85.17%    99.50%
Broken tooth         82.83%    74.50%    99.67%
Wear                 93.00%    93.00%    96.83%
Pitting wear         93.50%    96.33%    98.00%
Broken-tooth wear    93.50%    85.17%    81.00%
Normal               93.83%    89.67%    100.00%
Average accuracy     90.19%    87.31%    95.83%
TABLE 6 Fault diagnosis accuracy with a sequence length of 900
Fault type           LSTM      CNN       CNN-FF-LSTM
Pitting              92.33%    93.00%    97.00%
Broken tooth         91.17%    88.17%    98.17%
Wear                 95.67%    94.00%    99.33%
Pitting wear         97.67%    93.50%    98.50%
Broken-tooth wear    89.50%    92.33%    99.67%
Normal               89.33%    89.83%    100.00%
Average accuracy     92.61%    91.81%    98.78%
As can be seen from Tables 4, 5, and 6, for the gear fault vibration signal the LSTM network is slightly better on average than the CNN at time-series fault diagnosis, but the fusion method of the invention, CNN-FF-LSTM, achieves a much higher average accuracy than either model used alone.
As can be seen from Table 4, the diagnostic accuracy is lowest when the oscillogram image is used as the input of the CNN. After the features extracted by the LSTM are fused in, the average accuracy is 12.28 percentage points higher than that of fault diagnosis with the CNN alone, which is a marked improvement, and 9.25 percentage points higher than that of the LSTM alone. The diagnosis results of each model are shown in Figs. 3, 4, and 5, where stars represent the diagnosed fault types, circles represent the true fault types, and an overlap indicates a correct diagnosis.
As can be seen from Table 5, the average accuracy of every model is higher than in Table 4. The reason is that the sequence length of the training samples is greater: a longer sample sequence contains more fault information and yields better diagnosis results. In Table 5, the average accuracy of the invention is 8.52 percentage points higher than that of the CNN alone and 5.64 percentage points higher than that of the LSTM alone, which verifies the effectiveness of the method. The diagnosis results of each model are shown in Figs. 6, 7, and 8, where stars represent the diagnosed fault types, circles represent the true fault types, and an overlap indicates a correct diagnosis.
Comparing Table 6 with Tables 4 and 5 shows that the average accuracy of every model is higher again in Table 6. The sample sequences in Table 6 are the longest, so each sample contains more complete fault information, which again indicates that the sample sequence length affects the accuracy of fault diagnosis. In Table 6, the average accuracy of the proposed fusion method is 6.97 percentage points higher than that of the CNN alone and 6.17 percentage points higher than that of the LSTM alone, which verifies the effectiveness of the method. The diagnosis results of each model are shown in Figs. 9, 10, and 11, where stars represent the diagnosed fault types, circles represent the true fault types, and an overlap indicates a correct diagnosis.
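The improvements quoted above follow directly from the reported average accuracies. The short script below reproduces that arithmetic from the values in Tables 4 to 6; the variable names are illustrative, and the differences are in percentage points.

```python
# Per-class accuracies (%) from Tables 4-6, keyed by sample sequence length.
PER_CLASS = {
    100: {"LSTM": [83.33, 69.17, 92.67, 95.67, 82.83, 78.83],
          "CNN": [76.00, 61.83, 89.00, 94.17, 78.33, 85.00],
          "CNN-FF-LSTM": [95.00, 98.50, 97.50, 98.17, 71.00, 97.83]},
    400: {"LSTM": [84.50, 82.83, 93.00, 93.50, 93.50, 93.83],
          "CNN": [85.17, 74.50, 93.00, 96.33, 85.17, 89.67],
          "CNN-FF-LSTM": [99.50, 99.67, 96.83, 98.00, 81.00, 100.00]},
    900: {"LSTM": [92.33, 91.17, 95.67, 97.67, 89.50, 89.33],
          "CNN": [93.00, 88.17, 94.00, 93.50, 92.33, 89.83],
          "CNN-FF-LSTM": [97.00, 98.17, 99.33, 98.50, 99.67, 100.00]},
}

# Average accuracies (%) as reported in the last row of each table.
REPORTED_AVG = {
    100: {"LSTM": 83.75, "CNN": 80.72, "CNN-FF-LSTM": 93.00},
    400: {"LSTM": 90.19, "CNN": 87.31, "CNN-FF-LSTM": 95.83},
    900: {"LSTM": 92.61, "CNN": 91.81, "CNN-FF-LSTM": 98.78},
}

# Mean of the six per-class values, for comparison with the reported row.
computed_avg = {
    length: {model: sum(vals) / len(vals) for model, vals in models.items()}
    for length, models in PER_CLASS.items()
}

# Gain of the fusion method over each single model, in percentage points.
gains = {
    length: {model: round(avg["CNN-FF-LSTM"] - avg[model], 2)
             for model in ("CNN", "LSTM")}
    for length, avg in REPORTED_AVG.items()
}

for length in sorted(gains):
    print(length, gains[length])
```

The computed means agree with the reported averages to within rounding, and the gains match the figures quoted in the text for each sequence length. The per-class rows also show that the gain is not uniform: for broken-tooth wear at lengths 100 and 400, the fusion method scores below both single models.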
The invention provides a feature fusion mechanism in which two different neural networks extract features from the one-dimensional sequence data and the two-dimensional image data respectively: an LSTM extracts the features of the one-dimensional sequence data, and a CNN extracts the features of the two-dimensional image data. By adding a feature fusion layer, the local features extracted by the CNN are fused with the long-term dependencies of the sequence extracted by the LSTM, so that the features of the two networks complement each other. As a result, the data is used more fully, feature extraction is more comprehensive, and fault diagnosis is more accurate.
The above description only illustrates the preferred embodiments of the present invention and is not to be construed as limiting the invention. Any modification, equivalent substitution, or improvement made within the spirit and principle of the present invention is intended to fall within its scope of protection.

Claims (1)

1. An online fault diagnosis method based on CNN and LSTM depth feature fusion, characterized by comprising the following steps:
S1, establishing a data set, wherein the data set comprises a training set and a test set, the training set and the test set both comprise one-dimensional sequence data and corresponding two-dimensional oscillogram data, and the two-dimensional oscillograms are drawn from the one-dimensional sequence data;
the step S1 includes the steps of:
S1.1, selecting one-dimensional sequence sample data of gearboxes with different fault types, and setting different fault type labels;
S1.2, drawing the corresponding oscillograms from the one-dimensional sequence sample data of step S1.1 using Matlab to obtain the two-dimensional oscillogram data;
S1.3, dividing the one-dimensional data and the two-dimensional data of steps S1.1 and S1.2 into a training set and a test set according to a certain proportion;
S2, extracting, through a convolutional neural network (CNN), the local cross-correlation features and trend features $F_{CNN}$ of the two-dimensional oscillogram data in the training set;
the step S2 includes the steps of:
S2.1, building a convolutional neural network $Net_{CNN}$ according to the two-dimensional oscillogram data in the training set, as shown in equation (1):
$[Net_{CNN}, Tr_{CNN}] = \mathrm{Feedforward}(\theta_{CNN}; M_{CL}, M_{pool}; SIZE_{cl}, SIZE_{pool}; X_{2D})$ (1)
where Feedforward is a function that generates a neural network; $M_{CL}$ is the number of convolutional layers of the CNN network; $M_{pool}$ is the number of pooling layers of the CNN network; $SIZE_{cl}$ represents the convolution kernel size; $SIZE_{pool}$ represents the pooling kernel size; $\theta_{CNN} = \{W_{CNN}, b_{CNN}\}$ are the network parameters, where $W_{CNN}$ is a weight matrix and $b_{CNN}$ is a bias vector; $X_{2D}$ represents the input two-dimensional oscillogram data; the CNN network is trained on the two-dimensional oscillogram data;
S2.2, extracting the features $F_{CNN}$ of the two-dimensional oscillograms using the trained convolutional neural network and network parameters:
$F_{CNN} = G_{CNN}(Net_{CNN}, Tr_{CNN}, X_{2D})$ (2)
where $G_{CNN}$ is the nonlinear output function of the CNN network, and $Tr_{CNN}$ represents the trained CNN network model parameters;
S3, extracting, through a long short-term memory neural network (LSTM), the inter-sequence autocorrelation features $F_{LSTM}$ of the one-dimensional sequence data in the training set;
the step S3 includes the steps of:
S3.1, building a long short-term memory network $Net_{LSTM}$ according to the one-dimensional sequence data in the training set, as shown in equation (3):
$[Net_{LSTM}, Tr_{LSTM}] = \mathrm{Feedforward}(\theta_{LSTM}; H_{LSTM}; X_{1D})$ (3)
where $\theta_{LSTM} = \{W_{LSTM}, b_{LSTM}\}$ are the network parameters, $W_{LSTM}$ is a weight matrix, $b_{LSTM}$ is a bias vector, $H_{LSTM}$ is the number of neurons in the hidden layer, and $X_{1D}$ represents the input one-dimensional sequence data; the LSTM network is trained on the one-dimensional sequence data;
S3.2, extracting the features $F_{LSTM}$ of the one-dimensional sequence data using the trained network structure and parameters:
$F_{LSTM} = G_{LSTM}(Net_{LSTM}, Tr_{LSTM}, X_{1D})$ (4)
where $G_{LSTM}$ is the nonlinear output function of the LSTM network, and $Tr_{LSTM}$ represents the trained LSTM network model parameters;
S4, fusing the local cross-correlation features $F_{CNN}$ of the image extracted by the CNN in step S2 and the autocorrelation features $F_{LSTM}$ of the sequence extracted by the LSTM in step S3, which are two different types of features, through a multilayer fusion network to obtain the fused features $F_{fusion}$;
the step S4 includes the steps of:
S4.1, splicing the local cross-correlation features $F_{CNN}$ of the image extracted by the CNN in step S2 with the autocorrelation features $F_{LSTM}$ of the sequence extracted by the LSTM in step S3;
S4.2, establishing a feature fusion network $Net_{fusion}$ and training the fusion network parameters to obtain the fused features $F_{fusion}$, as shown in equation (5):
$F_{fusion} = G_{fusion}(Net_{fusion}, Tr_{fusion}, X_{1D}, X_{2D})$ (5)
where $G_{fusion}$ is the nonlinear output function of the network, and $Tr_{fusion}$ represents the trained model parameters;
S5, using the fused features $F_{fusion}$ of step S4 as the input of the Softmax classifier to perform fault diagnosis classification, as shown in equation (6):
$result = \mathrm{Softmax}(F_{fusion}, \theta)$ (6)
where $result$ represents the classification accuracy, and $\theta$ represents the Softmax network model parameters;
S6, simultaneously adjusting the network parameters of $Net_{CNN}$, $Net_{LSTM}$, $Net_{fusion}$ and Softmax;
and S7, inputting all the data in the test set in the step S2 into the network model to obtain the fault diagnosis classification result of the test set, and evaluating the effect of the network model.
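The data set construction of step S1 can be sketched as follows. This is a minimal illustration, not the procedure of the patent: the recordings are synthetic, the window length of 20 points and the 80/20 split are hypothetical values (the claim only says "a certain proportion"), cutting each recording into fixed-length windows is an assumed way of obtaining samples, and the Matlab oscillogram drawing of step S1.2 is omitted. Splitting by sample index keeps each one-dimensional sample and its oscillogram in the same subset.

```python
import random


def segment(signal, length):
    """Cut a signal into non-overlapping samples of the given length."""
    return [signal[i:i + length] for i in range(0, len(signal) - length + 1, length)]


def split_indices(n_samples, train_ratio, seed=0):
    """Shuffle sample indices and split them into training and test indices."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = int(n_samples * train_ratio)
    return sorted(idx[:n_train]), sorted(idx[n_train:])


# Synthetic recordings keyed by a hypothetical fault type label.
recordings = {
    0: [float(i % 7) for i in range(100)],
    1: [float(i % 11) for i in range(100)],
}

samples, labels = [], []
for fault_label, signal in recordings.items():
    for window in segment(signal, 20):      # 20 points per sample is an assumption
        samples.append(window)
        labels.append(fault_label)

# The same indices would also select the matching oscillogram images.
train_idx, test_idx = split_indices(len(samples), train_ratio=0.8)
train_set = [(samples[i], labels[i]) for i in train_idx]
test_set = [(samples[i], labels[i]) for i in test_idx]
```

Changing the window length passed to `segment` gives samples of different sequence lengths, which is the quantity varied between the experiments in the description.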
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011446332.4A CN112561035B (en) 2020-12-08 2020-12-08 Fault diagnosis method based on CNN and LSTM depth feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011446332.4A CN112561035B (en) 2020-12-08 2020-12-08 Fault diagnosis method based on CNN and LSTM depth feature fusion

Publications (2)

Publication Number Publication Date
CN112561035A (en) 2021-03-26
CN112561035B (en) 2023-07-25

Family

ID=75062346

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011446332.4A Active CN112561035B (en) 2020-12-08 2020-12-08 Fault diagnosis method based on CNN and LSTM depth feature fusion

Country Status (1)

Country Link
CN (1) CN112561035B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190339688A1 (en) * 2016-05-09 2019-11-07 Strong Force Iot Portfolio 2016, Llc Methods and systems for data collection, learning, and streaming of machine signals for analytics and maintenance using the industrial internet of things
CN110073301A (en) * 2017-08-02 2019-07-30 强力物联网投资组合2016有限公司 The detection method and system under data collection environment in industrial Internet of Things with large data sets
CN110991295A (en) * 2019-11-26 2020-04-10 电子科技大学 Self-adaptive fault diagnosis method based on one-dimensional convolutional neural network
CN111665819A (en) * 2020-06-08 2020-09-15 杭州电子科技大学 Deep learning multi-model fusion-based complex chemical process fault diagnosis method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
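姜洪开; 邵海东; 李兴球: "Intelligent fault diagnosis method for aircraft based on deep learning" (基于深度学习的飞行器智能故障诊断方法), Journal of Mechanical Engineering (机械工程学报), no. 07 *
陈丹敏; 周福娜; 王清贤: "Fusion fault diagnosis method based on transfer learning of multi-source heterogeneous information" (基于多源异构信息迁移学习的融合故障诊断方法), Journal of Information Engineering University (信息工程大学学报), no. 02 *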

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113408525A (en) * 2021-06-17 2021-09-17 成都崇瑚信息技术有限公司 Multilayer ternary pivot and bidirectional long-short term memory fused text recognition method
CN113723478A (en) * 2021-08-17 2021-11-30 中国铁道科学研究院集团有限公司 Track circuit fault diagnosis method based on priori knowledge
CN116500335A (en) * 2023-06-30 2023-07-28 国网山东省电力公司邹城市供电公司 Smart power grid electricity larceny detection method and system based on one-dimensional features and two-dimensional features
CN116500335B (en) * 2023-06-30 2023-10-13 国网山东省电力公司邹城市供电公司 Smart power grid electricity larceny detection method and system based on one-dimensional features and two-dimensional features

Also Published As

Publication number Publication date
CN112561035B (en) 2023-07-25

Similar Documents

Publication Publication Date Title
CN112561035A (en) Fault diagnosis method based on CNN and LSTM depth feature fusion
CN111651937B (en) Method for diagnosing faults of in-class self-adaptive bearing under variable working conditions
CN112149316B (en) Aero-engine residual life prediction method based on improved CNN model
CN110361176B (en) Intelligent fault diagnosis method based on multitask feature sharing neural network
Wang et al. Convolutional neural network-based hidden Markov models for rolling element bearing fault identification
CN111709448A (en) Mechanical fault diagnosis method based on migration relation network
Li et al. WavCapsNet: An interpretable intelligent compound fault diagnosis method by backward tracking
Si et al. Unsupervised deep transfer learning with moment matching: A new intelligent fault diagnosis approach for bearings
Jamil et al. A deep boosted transfer learning method for wind turbine gearbox fault detection
CN106407649A (en) Onset time automatic picking method of microseismic signal on the basis of time-recursive neural network
CN104748962B (en) Planetary gear box intelligent diagnosis method based on stacking automatic encoding machine
CN112465030A (en) Multi-source heterogeneous information fusion fault diagnosis method based on two-stage transfer learning
Yao et al. Multiscale domain adaption models and their application in fault transfer diagnosis of planetary gearboxes
CN115859077A (en) Multi-feature fusion motor small sample fault diagnosis method under variable working conditions
Ye et al. A selective adversarial adaptation network for remaining useful life prediction of machines under different working conditions
CN114186602A (en) Sparse filtering domain-based mechanical fault diagnosis method for anti-neural network
Ma et al. A collaborative central domain adaptation approach with multi-order graph embedding for bearing fault diagnosis under few-shot samples
CN112763215B (en) Multi-working-condition online fault diagnosis method based on modular federal deep learning
Chai et al. Deep transfer learning based multisource adaptation fault diagnosis network for industrial processes
Wang et al. A graph neural network-based data cleaning method to prevent intelligent fault diagnosis from data contamination
Zhao et al. Hybrid semi-supervised learning for rotating machinery fault diagnosis based on grouped pseudo labeling and consistency regularization
CN117574262A (en) Underwater sound signal classification method, system and medium for small sample problem
CN116822089A (en) Data-driven motor internal disturbance analysis and modeling method
CN113008998B (en) Concealed engineering internal defect judgment method based on PCNN
CN114169552A (en) Variable working condition fault diagnosis method based on meta-learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant