CN111353482B - LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method - Google Patents

Info

Publication number
CN111353482B
Authority
CN
China
Prior art keywords
data
fault
lstm
model
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010445803.3A
Other languages
Chinese (zh)
Other versions
CN111353482A (en)
Inventor
冯海领
焦正杉
孙敬哲
王汉奇
王向敏
赵宜斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Development Zone Jingnuo Hanhai Data Technology Co ltd
Original Assignee
Tianjin Development Zone Jingnuo Hanhai Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Development Zone Jingnuo Hanhai Data Technology Co ltd
Priority to CN202010445803.3A
Publication of CN111353482A
Application granted
Publication of CN111353482B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching

Abstract

The invention discloses an LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method, comprising the following steps: S1, collecting time-series data from the target equipment to be diagnosed and performing empirical mode decomposition on the data; S2, constructing an LSTM-based equipment vibration signal prediction model from normal data; S3, classifying the collected abnormal data and constructing an LSTM-based fault time-series data classification model; S4, taking the mean square error (MSE) of the LSTM-based vibration signal prediction model as the initial fatigue factor threshold; S5, predicting equipment production data with the LSTM-based vibration signal prediction model, calculating the mean square error between the predicted and actual values, and comparing it with the fatigue factor threshold to detect abnormal signals; and S6, classifying the abnormal signals with the fault time-series data classification model to obtain a fault diagnosis result.

Description

LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method
Technical Field
The invention relates to the field of equipment fault diagnosis, in particular to a fatigue factor recessive abnormality detection and fault diagnosis method based on LSTM.
Background
With the explosive, "nuclear fusion"-style growth of technologies such as the Internet of Things, 5G, artificial intelligence and cloud computing, the "re-industrialization" strategies that major industrial countries have built around intelligent manufacturing have gained great momentum. China first proposed the "Intelligent Plus" concept in its 2019 government work report, and identified intelligent manufacturing as an important direction for developing new drivers of national economic growth.
Current fault diagnosis methods fall mainly into two categories: signal processing based methods and machine learning based methods. Common signal processing based methods include spectral kurtosis analysis, sparse decomposition analysis, time-frequency analysis, wavelet transform (WT) and empirical mode decomposition (EMD); machine learning based methods mainly include hidden Markov models (HMMs), Bayesian networks, support vector machines (SVMs) and recurrent neural networks (RNNs). These methods have achieved good research results in the field of fault diagnosis. Signal processing based methods generally require many highly experienced engineers to analyze the fault information contained in the signal, which makes it difficult to obtain a fault diagnosis result programmatically. Machine learning based methods make up for this shortcoming and extend the range of fault diagnosis methods, but they bring new challenges concerning fault diagnosis data. In general, most of the data recorded by digital infrastructure in an industrial setting is normal data. This data contains a huge amount of information bearing on product quality, energy consumption, production cost and other factors, but it contains very little fault information: the more reliable the equipment, the less fault data there is. Extracting fault information is therefore a new challenge in the context of intelligent manufacturing.
To deal with imbalanced data, techniques such as undersampling, oversampling and synthetic minority oversampling have been proposed. These methods have achieved good results in the corresponding literature, but undersampling reduces the data volume and oversampling can increase noise, and both problems may ultimately affect the accuracy of the trained model.
Disclosure of Invention
To solve the problems caused by the imbalance, high noise and temporal nature of equipment data in the field of intelligent manufacturing, the invention provides an LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method. The invention improves on the imbalanced data processing method, the classification model and the anomaly detection method of traditional fault diagnosis. It effectively avoids the problems caused by imbalanced data, detects slight changes in the equipment, and gives timely early warning of abnormal signals, so that problems such as equipment wear and aging can be detected.
To solve the above technical problems, the invention adopts the following technical scheme:
An LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method, comprising the following steps:
S1, collecting time-series sample data from the target equipment to be diagnosed, wherein the samples comprise normal data, abnormal data and the fault types of the abnormal data, and performing empirical mode decomposition on the equipment time-series sample data using the EMD (empirical mode decomposition) method;
S2, extracting the normal data samples from the time-series sample data, and constructing an equipment vibration signal prediction model using an LSTM-based deep neural network;
S3, extracting the abnormal data samples from the time-series sample data, and constructing a fault time-series data classification model using an LSTM-based deep neural network;
S4, acquiring the mean square error (MSE) of the LSTM-based vibration signal prediction model and taking it as the initial fatigue factor threshold;
S5, predicting the equipment production data with the LSTM-based vibration signal prediction model, calculating the mean square error between the predicted and actual values, comparing the mean square error with the fatigue factor threshold to detect abnormal signals, and dynamically adjusting the fatigue factor threshold Δ according to actual conditions;
and S6, classifying the abnormal signals with the fault time-series data classification model to obtain a fault diagnosis result.
Further, the process of constructing the device vibration signal prediction model based on the LSTM deep neural network in step S2 includes the following steps:
S2-1: Feature engineering: first process the collected raw data, using EMD (empirical mode decomposition) and waveform visualization as auxiliary tools, and identify abnormal vibration frequencies by observing the IMF (intrinsic mode function) components;
S2-2: Resampling: resample the raw data and delete the abnormal segments, ensuring that the remaining data consists only of continuous normal data, which avoids the problems caused by imbalanced data;
S2-3: Data normalization: normalize each dimension of the time series separately. Calculate the mean μ and standard deviation σ of each dimension, and normalize the time-series sample data X by the following formula, where X' is the normalized data:
$$X' = \frac{X - \mu}{\sigma} \tag{1}$$
S2-4: Generating a data set: convert the unsupervised data set into a supervised data set by dividing the normalized data into continuous time series $X_r$ of length $l$. The length of $X_r$ is jointly determined by the model input length $l_s$ and output length $l_p$, with $l = l_s + l_p$. The time-series sample data X is thus divided into two parts, $X_s$ and $X_p$, which serve as the input and output samples of the model;
S2-5: Dividing the data set: shuffle the data, then split the input data $X_s$ and the output data $X_p$ in the same proportion to obtain a training data set $\{X_{train}, y_{train}\}$ and a validation data set $\{X_{val}, y_{val}\}$;
S2-6: Constructing a model: set checkpoints so that the model parameters are saved once per epoch, adjust the Epoch and Dropout parameters, observe the training loss and validation loss, and use an early stopping mechanism when overfitting occurs;
S2-7: Prediction using the model: input a segment of time-series sample data X to obtain a prediction vector $\hat{y}$, then calculate the mean square error MSE against the actual value y to evaluate the prediction, where n is the number of sequences:
$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{2}$$
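Steps S2-3, S2-4 and S2-7 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `zscore`, `make_windows` and `mse`, the stride of 1 between consecutive windows, and the synthetic data are all hypothetical choices that the source does not specify.

```python
import numpy as np

def zscore(x):
    """Normalize each dimension (column) of a (T, D) series: X' = (X - mu) / sigma."""
    mu = x.mean(axis=0)
    sigma = x.std(axis=0)
    return (x - mu) / sigma, mu, sigma

def make_windows(x, l_s, l_p):
    """Turn a (T, D) series into supervised samples of total length l = l_s + l_p.

    Returns inputs X_s of shape (N, l_s, D) and targets X_p of shape (N, l_p, D).
    A stride of 1 between windows is an assumption.
    """
    l = l_s + l_p
    n = x.shape[0] - l + 1
    x_s = np.stack([x[i:i + l_s] for i in range(n)])
    x_p = np.stack([x[i + l_s:i + l] for i in range(n)])
    return x_s, x_p

def mse(y_true, y_pred):
    """Mean square error between actual and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))

# Synthetic stand-in for two-channel vibration data (not real measurements).
rng = np.random.default_rng(0)
series = rng.normal(size=(200, 2))

norm, mu, sigma = zscore(series)
x_s, x_p = make_windows(norm, l_s=50, l_p=10)
print(x_s.shape, x_p.shape)
```

Each row of `x_s` is the `l_s`-step history fed to the model and the matching row of `x_p` is the `l_p`-step continuation it should predict, which is what turns the unlabeled series into a supervised data set.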
Further, the process of constructing the fault time-series data classification model based on the LSTM deep neural network in step S3 includes the following steps:
S3-1: First process the fault data samples of the equipment time-series sample data, using data that is abnormal and whose fault type has been determined, and additionally add normal samples in the same quantity as each fault sample;
S3-2: Perform empirical mode decomposition (EMD) on the time-series sample data to obtain an m-dimensional vector of IMF components, $IMF = \{imf_1, imf_2, \ldots, imf_m\}$, where m is determined by the empirical mode decomposition and $imf_i$ is the i-th component;
S3-3: Number the IMFs of the n classes of time-series data to obtain the model input data $X_C$:
Figure GDA0002755583510000042
Determine the model output data $y_C$ from the numbers:
Figure GDA0002755583510000043
S3-4: Normalize the data: calculate the sample mean μ and standard deviation σ; the normalized data is $X'_C$:
$$X'_C = \frac{X_C - \mu}{\sigma} \tag{5}$$
S3-5: Divide the data set: shuffle the data, then split the input data $X'_C$ and the output data $y_C$ in the same proportion to obtain a training data set $\{X_{train}, y_{train}\}$ and a validation data set $\{X_{val}, y_{val}\}$;
S3-6: Train the model: set checkpoints so that the model parameters are saved once per epoch, adjust the Epoch and Dropout parameters, observe the training loss and validation loss, and use an early stopping mechanism when overfitting occurs;
S3-7: Classify using the model: input a time series $X_d$ to obtain a classification vector y'. The fault number is $y'_d = \arg\max(y')$, where argmax returns the index of the maximum value, and the corresponding fault is found from this number.
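The numbering in S3-3 and the argmax decoding in S3-7 can be illustrated as follows. This is a sketch only: the source gives the form of $y_C$ as an image, so the one-hot encoding used here is an assumption, and the class order in `FAULT_NAMES` and the helper names `one_hot` and `decode` are hypothetical.

```python
import numpy as np

# Hypothetical numbering of the classes; the source does not give the order.
FAULT_NAMES = ["normal", "inner ring", "outer ring", "rolling body"]

def one_hot(labels, n_classes):
    """Encode integer class numbers as one-hot rows (an assumed form of y_C)."""
    out = np.zeros((len(labels), n_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def decode(y_prime):
    """Map a classification vector y' to (fault number, fault name) via argmax."""
    y_d = int(np.argmax(y_prime))
    return y_d, FAULT_NAMES[y_d]

labels = np.array([0, 2, 1, 3])          # class numbers of four training samples
y_c = one_hot(labels, len(FAULT_NAMES))  # model output targets

# A made-up classifier output for one input sequence X_d.
print(decode(np.array([0.05, 0.10, 0.80, 0.05])))  # (2, 'outer ring')
```

With a softmax output layer, `y_prime` holds one probability per class, so argmax simply picks the most probable fault number.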
Further, the step S5 of predicting the equipment production data by using the vibration signal prediction model based on LSTM specifically includes:
The value at time t is predicted with the prediction model as $\hat{y}^{(t)}$, and the prediction error is calculated from the actual value $y^{(t)}$:
Figure GDA0002755583510000053
wherein
Figure GDA0002755583510000054
i is the dimension of the true monitored value. Each error $e^{(t)}$ is then appended to the vector of prediction errors:
Figure GDA0002755583510000055
where h is the number of historical errors used to estimate the current error and $l_s$ is the length of the input sequence. This set of errors e is then smoothed, and an error threshold is selected from the error set; mean square error values above the threshold are classified as anomalous:
$$\text{threshold} = \mu(e_s) + z\,\sigma(e_s) \tag{8}$$
where the left-hand side is the vector of candidate error thresholds, $e_s$ is the sequence error vector, z is a constant, μ is the mean and σ is the standard deviation; the threshold is selected by equation (9):
Figure GDA0002755583510000061
wherein:
Figure GDA0002755583510000065
In equation (9), $e_{seq}$ is a continuous sequence of values $e_a \in \mathbf{e}_a$.
The threshold is determined by $z \in Z$, where Z is an ordered set of positive integers giving the number of standard deviations above $\mu(e_s)$; the values in Z depend on the experimental data. Once the argmax in equation (9) is determined, each anomalous sequence $e_{seq} \in E_{seq}$ that produces the mean square error can be scored, and the score s indicates the severity of the anomaly, i.e. the fatigue factor:
Figure GDA0002755583510000062
The highest mean square error in each anomalous error sequence is then normalized by its distance from the selected threshold Δ.
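The smoothing and thresholding described above can be sketched as follows. The sketch covers only the parts stated in the text: smooth the errors, compute a threshold of the form μ(e_s) + zσ(e_s) for one given z, and group the values above it into continuous sequences. The choice of an exponentially weighted moving average for smoothing, the value of `alpha`, the fixed `z`, and the function names are assumptions; the selection among candidate z values by equation (9) is not reproduced here.

```python
import numpy as np

def ewma(e, alpha=0.1):
    """Exponentially weighted moving average (assumed smoothing method)."""
    s = np.empty_like(e, dtype=float)
    s[0] = e[0]
    for t in range(1, len(e)):
        s[t] = alpha * e[t] + (1 - alpha) * s[t - 1]
    return s

def anomalous_sequences(e_s, z):
    """Return the threshold mu + z*sigma and the (start, end) index pairs
    of continuous runs of smoothed errors above it."""
    threshold = e_s.mean() + z * e_s.std()
    above = e_s > threshold
    seqs, start = [], None
    for t, flag in enumerate(above):
        if flag and start is None:
            start = t                      # a run begins
        elif not flag and start is not None:
            seqs.append((start, t - 1))    # the run ended at the previous step
            start = None
    if start is not None:                  # run reaches the end of the data
        seqs.append((start, len(e_s) - 1))
    return threshold, seqs

# Synthetic prediction errors: a flat baseline with one burst of large errors.
errors = np.full(300, 0.1)
errors[120:130] = 5.0

e_s = ewma(errors, alpha=0.3)
threshold, seqs = anomalous_sequences(e_s, z=2.0)
print(threshold, seqs)
```

Smoothing spreads the burst over a few extra steps, so the detected run starts slightly after the burst begins and ends slightly after it stops; this is the price paid for suppressing isolated spikes.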
Further, the step S5 of dynamically adjusting the fatigue factor threshold Δ according to the actual situation specifically includes:
Create a new set $e_{max}$ containing the maxima $\max(e_{seq})$ of all $e_{seq}$, sorted in descending order. Then the maximum non-anomalous mean square error
Figure GDA0002755583510000064
is appended to the end of $e_{max}$. The set is traversed step by step in descending order, and the decrease at step i is:
Figure GDA0002755583510000063
If at some step i the decrease $d^{(i)}$ is greater than the minimum decrease p, all errors satisfying
Figure GDA0002755583510000071
remain anomalous sequences. If the decrease $d^{(i)}$ is less than the minimum decrease p, and the same holds for all subsequent errors $d^{(i)}, d^{(i+1)}, \ldots,$
Figure GDA0002755583510000072
the corresponding sequences are reclassified as normal. After the anomalous sequences are removed, the mean square error of the remaining normal sequences is recalculated and used as the adjusted fatigue factor threshold Δ.
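The pruning strategy above can be sketched as follows. Because the source gives the formula for $d^{(i)}$ only as an image, the sketch assumes the decrease is measured relative to the previous value, $(e_{max}^{(i-1)} - e_{max}^{(i)}) / e_{max}^{(i-1)}$; the default `p=0.13`, the function name `prune_anomalies` and the example numbers are likewise hypothetical.

```python
import numpy as np

def prune_anomalies(seq_maxima, max_normal_error, p=0.13):
    """Return the indices (into seq_maxima) of sequences that stay anomalous.

    seq_maxima: max(e_seq) for every candidate anomalous sequence.
    max_normal_error: largest error not flagged as anomalous.
    p: minimum decrease; the relative-decrease definition is an assumption.
    """
    order = np.argsort(seq_maxima)[::-1]  # candidate indices, largest maximum first
    e_max = np.append(np.asarray(seq_maxima, dtype=float)[order], max_normal_error)
    # d[k] is the relative drop from the k-th to the (k+1)-th largest value.
    d = (e_max[:-1] - e_max[1:]) / e_max[:-1]
    big = np.nonzero(d > p)[0]
    if big.size == 0:
        return []                         # no drop exceeds p: everything is normal
    keep = big[-1] + 1                    # entries before the last large drop stay anomalous
    return sorted(order[:keep].tolist())

# Two clear anomalies (5.0 and 4.0) and two maxima barely above the normal level.
print(prune_anomalies([5.0, 0.52, 4.0, 0.55], max_normal_error=0.5))  # [0, 2]
```

In the example, the drops from 0.55 to 0.52 and from 0.52 to 0.5 are both far below `p`, so those two candidates are reclassified as normal, while the sequences peaking at 5.0 and 4.0 remain anomalous.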
The beneficial effects of the above technical scheme are as follows:
(1) To address the difficulty of collecting enough fault samples in the early stage of a production environment, an anomaly detection method based on an LSTM vibration signal prediction model is provided, which makes full use of the imbalanced nature of the existing data and effectively detects abnormal signals generated by the equipment;
(2) To address the lack of evaluation of equipment aging and wear in current research, a fatigue factor recessive fault detection method is provided on the basis of (1); it finds a threshold in fluctuating data to separate normal data from abnormal data, and adjusts the threshold dynamically during incremental learning so as to detect latent anomalies;
(3) Considering the complex temporal correlations of fault time-series data, a fault time-series data classification model incorporating EMD feature extraction is provided; the fault diagnosis method uses EMD for feature processing and denoising, and uses an LSTM-based deep neural network to extract features along the time dimension, improving fault classification accuracy.
Drawings
FIG. 1 is a flow chart of the LSTM-based vibration signal prediction model;
FIG. 2 shows the fault time-series data classification model incorporating EMD feature extraction;
FIG. 3 is a flow chart of the overall structure of the LSTM-based incremental learning latent fault diagnosis method;
FIG. 4 shows the prediction results of ARIMA, RNN and Stacked LSTM;
FIG. 5 shows the vibration signals;
FIG. 6 shows the EMD decomposition of normal data;
FIG. 7 shows the EMD decomposition of inner ring fault data.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
S1, collecting time-series sample data from the target equipment to be diagnosed, wherein the samples comprise normal data, abnormal data and the fault types of the abnormal data, and performing empirical mode decomposition on the equipment time-series sample data using the EMD (empirical mode decomposition) method;
The experimental verification of the invention uses experimental data containing three kinds of faults: inner ring (Inner Raceway Fault), outer ring (Outer Raceway Fault) and rolling body (Ball Fault).
S2, extracting the normal data samples from the time-series sample data, and constructing an equipment vibration signal prediction model using an LSTM-based deep neural network;
The flow of the LSTM-based vibration signal prediction model proposed by the invention is described below, and the flow chart of the model is shown in FIG. 1.
Step 1: Feature engineering: first process the collected raw data, using EMD (empirical mode decomposition) and waveform visualization as auxiliary tools, and identify abnormal vibration frequencies by observing the IMF (intrinsic mode function) components;
Step 2: Resampling: resample the raw data and delete the abnormal segments, ensuring that the remaining data consists only of continuous normal data, which avoids the problems caused by imbalanced data;
Step 3: Data normalization: normalize each dimension of the time series separately. Calculate the mean μ and standard deviation σ of each dimension, and normalize the time-series sample data X by the following formula. Normalization helps model initialization, prevents differences in numerical scale from distorting the gradient updates, and makes it practical to iterate the model with a fixed learning rate, which speeds up convergence;
$$X' = \frac{X - \mu}{\sigma} \tag{1}$$
Step 4: Generating a data set: convert the unsupervised data set into a supervised data set by dividing the normalized data into continuous time series $X_r$ of length $l$. The length of $X_r$ is jointly determined by the model input length $l_s$ and output length $l_p$, with $l = l_s + l_p$. The time-series sample data X is thus divided into two parts, $X_s$ and $X_p$, which serve as the input and output samples of the model;
Step 5: Dividing the data set: shuffle the data, then split the input data $X_s$ and the output data $X_p$ in the same proportion to obtain a training data set $\{X_{train}, y_{train}\}$ and a validation data set $\{X_{val}, y_{val}\}$;
Step 6: Constructing a model: set checkpoints so that the model parameters are saved once per epoch, adjust the Epoch and Dropout parameters, observe the training loss and validation loss, and use an early stopping mechanism (Early-Stop) when overfitting occurs;
Step 7: Testing with the model: input a segment of time-series sample data X to obtain a prediction vector $\hat{y}$, then calculate the mean square error MSE against the actual value y to evaluate the prediction, where n is the number of sequences:
$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{2}$$
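The data set division in step 5 and the early stopping in step 6 can be sketched without committing to a particular deep learning framework. The helper names `shuffled_split` and `early_stop_epoch`, the 80/20 split, the patience of 3 epochs and the loss values are all assumptions made for illustration; the source does not state them.

```python
import numpy as np

def shuffled_split(x, y, val_fraction=0.2, seed=0):
    """Shuffle inputs and targets with one shared permutation, then split
    both in the same proportion into training and validation sets."""
    idx = np.random.default_rng(seed).permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val, train = idx[:n_val], idx[n_val:]
    return (x[train], y[train]), (x[val], y[val])

def early_stop_epoch(val_losses, patience=3):
    """Return the 0-based epoch whose checkpoint would be restored: the epoch
    with the best validation loss seen before `patience` consecutive epochs
    pass without improvement."""
    best, best_epoch, waited = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break                      # stop training here
    return best_epoch

x = np.arange(10).reshape(10, 1)
y = np.arange(10)
(x_train, y_train), (x_val, y_val) = shuffled_split(x, y)

# Made-up validation losses: improvement stalls after epoch 2.
print(early_stop_epoch([0.9, 0.7, 0.6, 0.62, 0.65, 0.70, 0.5]))  # 2
```

Saving a checkpoint every epoch is what makes the restore possible: training stops at epoch 5, and the parameters saved at epoch 2 are the ones kept, even though a later epoch would have scored lower had training continued.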
S3, extracting the abnormal data samples from the time-series sample data, and constructing a fault time-series data classification model using an LSTM-based deep neural network;
The flow of the fault time-series data classification model incorporating EMD feature extraction provided by the invention is described below, and the model flow chart is shown in FIG. 2.
Step 1: First process the fault samples of the equipment time-series data, using data that is abnormal and whose fault type has been determined, and additionally add normal samples in the same quantity as each fault sample;
Step 2: Perform EMD on the sample time-series data to obtain an m-dimensional vector of IMF components, $IMF = \{imf_1, imf_2, \ldots, imf_m\}$, where m is determined by the empirical mode decomposition and $imf_i$ is the i-th component;
Step 3: Number the IMFs of the n classes of time-series data to obtain the model input data $X_C$:
Figure GDA0002755583510000101
Determine the model output data $y_C$ from the numbers:
Figure GDA0002755583510000102
Step 4: Normalize the data: calculate the sample mean μ and standard deviation σ, and normalize the model input data $X_C$ by equation (5):
$$X'_C = \frac{X_C - \mu}{\sigma} \tag{5}$$
Step 5: Divide the data set: shuffle the data, then split the model input data $X'_C$ and the output data $y_C$ in the same proportion to obtain a training data set $\{X_{train}, y_{train}\}$ and a validation data set $\{X_{val}, y_{val}\}$;
Step 6: Train the model: set checkpoints so that the model parameters are saved once per epoch, adjust the Epoch and Dropout parameters, observe the training loss and validation loss, and use an early stopping mechanism (Early-Stop) when overfitting occurs;
Step 7: Classify using the model: input a time series $X_d$ to obtain a classification vector y'. The fault number is $y'_d = \arg\max(y')$, and the corresponding fault is found from this number.
S4, acquiring the mean square error (MSE) of the LSTM-based vibration signal prediction model and taking it as the initial fatigue factor threshold;
S5, predicting the equipment production data with the LSTM-based vibration signal prediction model, calculating the mean square error between the predicted and actual values, comparing the mean square error with the fatigue factor threshold to detect abnormal signals, and dynamically adjusting the fatigue factor threshold Δ according to actual conditions. The value at time t is predicted with the prediction model as $\hat{y}^{(t)}$, and the prediction error is calculated from the actual value:
Figure GDA0002755583510000112
wherein
Figure GDA0002755583510000113
i is the dimension of the true monitored value. Each error $e^{(t)}$ is then appended to the vector of prediction errors:
Figure GDA0002755583510000115
where h is the number of historical errors used to estimate the current error and $l_s$ is the length of the input sequence. This set of errors e is then smoothed to suppress the abrupt error changes that often occur in LSTM-based prediction: abrupt changes in the data are typically hard for a prediction model to anticipate, so they cause sharp spikes in the error values even when the data is normal. To determine whether these values are normal, the invention sets a threshold on the smoothed prediction errors; mean square error values above the threshold are classified as anomalous.
When calculating the prediction error threshold, a suitable anomaly threshold could be learned with a supervised method using labeled samples, but in general there are not enough labeled samples, especially when the equipment operates normally and little fault data exists. The threshold calculation method provided by the invention requires neither labeled sample data nor statistical hypothesis tests on the errors, and achieves high performance at low resource consumption. An error threshold is selected from the error set:
$$\text{threshold} = \mu(e_s) + z\,\sigma(e_s) \tag{8}$$
where the left-hand side is the vector of candidate error thresholds, $e_s$ is the sequence error vector, z is a constant, μ is the mean and σ is the standard deviation; the threshold is selected by equation (9):
Figure GDA0002755583510000114
wherein:
Figure GDA0002755583510000116
In equation (9), $e_{seq}$ is a continuous sequence of values $e_a \in \mathbf{e}_a$.
The threshold is determined by $z \in Z$, where Z is an ordered set of positive integers giving the number of standard deviations above $\mu(e_s)$. The values in Z depend on the experimental data; in the experiments of the invention the normal range is 2 to 8, and values of z below 2 generally produce too many false alarms. Once the argmax in equation (9) is determined, each anomalous sequence $e_{seq} \in E_{seq}$ that produces the mean square error can be scored, and the score s indicates the severity of the anomaly, i.e. the fatigue factor:
Figure GDA0002755583510000121
In short, a threshold Δ is sought such that, if the anomalous sequences with a score s greater than Δ are discarded, the mean $\mu(e_s)$ and standard deviation $\sigma(e_s)$ of the mean square errors $e_s$ of the remaining sequences are greatly reduced. The method also applies to data with a large number of anomalous sequences ($|E_{seq}|$), where it avoids excessive bias. The highest mean square error in each anomalous error sequence is then normalized according to its distance from the selected threshold Δ.
To reduce the false alarm rate and the amount of computation, the invention provides the following threshold adjustment strategy. Create a new set $e_{max}$ containing the maxima $\max(e_{seq})$ of all $e_{seq}$, sorted in descending order. Then the maximum non-anomalous mean square error
Figure GDA0002755583510000124
is appended to the end of $e_{max}$. The set is traversed step by step in descending order, and the decrease at each step is $d^{(i)}$:
Figure GDA0002755583510000122
If at some step i the decrease $d^{(i)}$ is greater than the minimum decrease p, all errors satisfying
Figure GDA0002755583510000123
remain anomalous sequences. If the decrease $d^{(i)}$ is less than the minimum decrease p, and the same holds for all subsequent errors $d^{(i)}, d^{(i+1)}, \ldots,$
Figure GDA0002755583510000125
the corresponding sequences are reclassified as normal. After the anomalous sequences are removed, the mean square error of the remaining normal sequences is recalculated and used as the adjusted fatigue factor threshold Δ. This approach helps ensure that an anomalous sequence is not merely regular noise in the time-series data stream. Because anomalous sequences are identified by thresholding, only the sequence fragments with potential anomalies need to be examined, which is more efficient than comparing sequence values one by one without a threshold.
And S6, classifying the abnormal signals through a fault time sequence data classification model to obtain a fault diagnosis result.
The general structural flow of the specific fault diagnosis method is shown in fig. 3.
The invention discloses a fatigue factor recessive anomaly detection and fault diagnosis method based on LSTM, which comprises the following steps:
Description of data
The experimental data come from the bearing fault data set of the Case Western Reserve University (CWRU) electrical engineering laboratory and total 1,341,856 data points; the bearing is a 6205-2RS JEM SKF deep groove ball bearing. Single-point faults at 3 fault levels were seeded on the inner ring (Inner Raceway Fault), outer ring (Outer Raceway Fault) and rolling body (Ball Fault) of the bearing by electrical discharge machining, with fault diameters of 0.007, 0.014 and 0.021 inches and fault depths of 0.011, 0.050 and 0.150 inches respectively. The three kinds of faults were seeded at the motor drive end (Drive End) and the fan end (Fan End), and 21 groups of data covering 6 fault types were collected by vibration sensors mounted at the motor drive end, the fan end and the base. The bearing fault data are described in Table 1.
TABLE 1 bearing failure data description (in inches)
Figure GDA0002755583510000131
Figure GDA0002755583510000141
Four types of vibration data are plotted in FIG. 5: from top to bottom, a normal signal, a rolling body fault signal, an inner ring fault signal and an outer ring fault signal. The plots of the sample data show that the amplitude of the fault vibration signals is clearly larger than that of the normal signal, and that sequences of larger amplitude appear periodically in the fault signals.
The original vibration signal is decomposed with EMD and the characteristics of the IMF components at each level are observed. Decomposing the original signal yields 7 components (IMF1 to IMF7) and a residual, where each IMF component represents an intrinsic mode component present in the original signal. FIG. 6 shows the EMD decomposition of normal data and FIG. 7 shows the EMD decomposition of inner ring fault data. IMF1 to IMF7 represent signal components at different frequencies, ordered from high frequency to low frequency, and the right-hand part of each figure shows the instantaneous frequency of each IMF component. The figures show that the normal and abnormal signals differ considerably in their residuals and instantaneous frequency distributions: the instantaneous frequency distribution of the normal signal is smooth, while that of the abnormal signal fluctuates strongly.
To compare the LSTM-based vibration signal prediction model with the ARIMA and RNN methods, the experimental data comprise normal data from the drive end and the fan end. The data are divided into single-step and multi-step prediction data, with multi-step prediction lengths of 30 and 100, giving 6 data sets in total; each data set contains 20000 training samples and 4000 test samples.
The ARIMA model is the Autoregressive Integrated Moving Average model, also written ARIMA(p, d, q). It is the most common statistical model for time series prediction. The ARIMA model parameters p (autoregressive order), d (degree of differencing) and q (moving average order) are shown in Table 2.
TABLE 2 ARIMA model parameters
Figure GDA0002755583510000151
The RNN (Recurrent Neural Network) is the most common and most traditional deep learning model for processing sequence data. The RNN model parameters used for the comparative experiments are shown in Table 3.
TABLE 3 RNN model parameters
Figure GDA0002755583510000152
Figure GDA0002755583510000161
The tests were performed with the model parameters and structures determined above, and the results of the comparative experiments are shown in Table 4.
TABLE 4 comparative experiment test results
Figure GDA0002755583510000162
As the test results in Table 4 show, the mean square errors of the three models are strongly correlated with the prediction length (for the RNN and StackedLSTM models, the mean square error is the ValLoss value in Table 4): the shorter the prediction length, the smaller the mean square error, and single-step prediction is significantly better than multi-step prediction. Fig. 4(a), (b) and (c) show the prediction results of the three models for a prediction step size of 30. The dark grey line represents the true value (TrueValue) and the light grey line represents the predicted value (Prediction); both are acceleration values (left y-axis, Acceleration). The dashed line represents the residual between the actual and predicted values (Residual; right y-axis, Res). ARIMA has a slight advantage over the RNN and StackedLSTM neural networks in the single-step prediction experiments (data sets 1 and 2), but its higher MSE in long-term prediction (data sets 3, 4, 5 and 6) is less satisfactory. ARIMA and the neural networks thus show different characteristics: the neural networks perform better than ARIMA in long-term prediction, while ARIMA has the advantage in short-term prediction. The StackedLSTM-based prediction model outperforms the RNN model and is better suited to long-term prediction. Structurally, ARIMA is a linear model and the neural networks are nonlinear models, so ARIMA performs well over short horizons where the relationship is simple, whereas the StackedLSTM-based fault data prediction model is well suited to scenarios with complex dependencies.
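The mean square error used to compare the models, and the residual plotted against the right-hand axis, can be computed as in the following minimal sketch. The function names `mse` and `residuals`, the sign convention (actual minus predicted) and the toy values are assumptions made for illustration; the values are not results from Table 4.

```python
def mse(y_true, y_pred):
    """Mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("sequences must be non-empty and equal in length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def residuals(y_true, y_pred):
    """Residual at each time step (assumed here to be actual minus predicted)."""
    return [t - p for t, p in zip(y_true, y_pred)]

# Toy acceleration values for illustration only.
actual = [0.10, -0.20, 0.30, 0.00]
predicted = [0.12, -0.25, 0.20, 0.05]

print("MSE:", mse(actual, predicted))
print("Residuals:", residuals(actual, predicted))
```

Comparing models then amounts to evaluating `mse` for each model on the same test windows and prediction length.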

Claims (2)

1. An LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method, characterized by comprising the following steps:
S1, collecting time series sample data of the target equipment to be diagnosed, the samples comprising normal data, abnormal data and the fault types of the abnormal data, and performing empirical mode decomposition (EMD) on the time series sample data of the equipment;
S2, extracting the normal data samples from the time series sample data, and constructing an equipment vibration signal prediction model using an LSTM-based deep neural network; constructing the equipment vibration signal prediction model using the LSTM-based deep neural network comprises the following steps:
S2-1: feature engineering: first processing the collected raw data, using EMD and waveform visualization for auxiliary judgment, and identifying abnormal vibration frequencies by observing the IMF (intrinsic mode function) components;
S2-2: resampling: resampling the original data and deleting the abnormal data segments, ensuring that the remaining data contain only continuous normal data and resolving the data imbalance problem;
S2-3: data normalization: normalizing the data of each dimension in the time series by calculating the mean μ and the standard deviation σ of each dimension, and normalizing the time series sample data X by the following formula, the normalized data being X′:
Figure FDA0002755583500000011
S2-4: generating a data set: converting the unsupervised data set into a supervised data set by dividing the normalized data into continuous time series X_r of length l, where the length of X_r is jointly determined by the model input length l_s and the output length l_p, the total length being l = l_s + l_p, and dividing the time series sample data X into two parts X_s and X_p, which serve as the input and output samples of the model;
S2-5: dividing the data set: randomly shuffling the data and dividing the input data X_s and the output data X_p in the same proportion to obtain a training data set {X_train, y_train} and a validation data set {X_val, y_val};
S2-6: constructing a model: setting checkpoints, saving the model parameters once per epoch, adjusting the Epoch and Dropout parameters, observing the training loss (trainloss) and validation loss (valloss), and using an early stopping mechanism when overfitting occurs;
S2-7: prediction using the model: inputting a segment of time series sample data X to obtain a prediction vector
Figure FDA0002755583500000026
then calculating the mean square error MSE against the actual value y to evaluate the prediction effect, where n is the number of sequence values:
Figure FDA0002755583500000021
S3, extracting the abnormal data samples from the time series sample data, and constructing a fault time series data classification model using an LSTM-based deep neural network;
S4, taking the mean square error (MSE) of the LSTM-based vibration signal prediction model obtained in S2 as the initial fatigue factor threshold;
S5, predicting the production data of the equipment using the LSTM-based vibration signal prediction model, calculating the mean square error between the predicted and actual values, comparing the mean square error with the fatigue factor threshold to detect abnormal signals, and dynamically adjusting the fatigue factor threshold δ according to the actual situation; predicting the production data of the equipment using the LSTM-based vibration signal prediction model specifically comprises:
predicting the value at time t using the prediction model
Figure FDA0002755583500000022
and calculating the prediction error from the actual value y^(t):
Figure FDA0002755583500000023
wherein
Figure FDA0002755583500000024
i is the dimension of the actual monitored value; each error e^(t) is then appended to the vector of prediction errors:
Figure FDA0002755583500000025
where h is the number of historical errors used to evaluate the current error and l_s is the length of the input sequence; this set of errors e is then smoothed, and an error threshold is selected for the set of errors, values of the mean square error above the threshold being classified as anomalous:
δ = μ(e_s) + zσ(e_s) (8)
where δ is the error threshold vector, e_s is the sequence error vector, z is a constant, μ is the mean and σ is the standard deviation; δ is determined by equation (9):
Figure FDA0002755583500000031
wherein:
Figure FDA0002755583500000032
in formula (9), e_seq is a continuous sequence of values e_a ∈ e_a;
δ is determined by z ∈ Z, where Z is an ordered set of positive integers, each representing a number of standard deviations above μ(e_s); the values in Z depend on the experimental data; once argmax(δ) is determined, each abnormal sequence e_seq ∈ E_seq generating the mean square error is scored, and the score s indicates the severity of the anomaly, i.e. the fatigue factor:
Figure FDA0002755583500000033
the highest mean square error in each abnormal error sequence is then normalized according to its distance from the selected threshold δ;
dynamically adjusting the fatigue factor threshold δ according to the actual situation specifically comprises:
creating a new set e_max containing the maximum value max(e_seq) of every e_seq, sorted in descending order, and then appending the maximum non-anomalous mean square error
Figure FDA0002755583500000034
to the end of e_max; the sequence is then traversed in descending order, and the drop at step i is:
Figure FDA0002755583500000041
if at some step i the drop d^(i) is greater than the minimum drop p, then all sequences satisfying
Figure FDA0002755583500000042
Figure FDA0002755583500000043
remain abnormal sequences; if the drop d^(i) is less than the minimum drop p, and likewise the drops of all subsequent errors
Figure FDA0002755583500000044
, the corresponding error sequences are reclassified as normal; after the abnormal sequences are removed, the mean square error of the remaining normal sequences is recalculated and used as the adjusted fatigue factor threshold δ;
S6, classifying the abnormal signals using the fault time series data classification model to obtain the fault diagnosis result.
2. The LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method of claim 1, wherein constructing the fault time series data classification model using the LSTM-based deep neural network in step S3 comprises the following steps:
S3-1: first processing the fault data samples of the time series sample data of the equipment, using data that are abnormal and whose fault type has been determined, and additionally adding normal samples in the same quantity as each type of fault sample;
S3-2: performing empirical mode decomposition (EMD) on the time series sample data to obtain an m-dimensional vector IMF containing the IMF components {imf_1, imf_2, …, imf_m}, where m is determined by the empirical mode decomposition and imf_i is the i-th dimension component;
S3-3: numbering the imf of the n types of time series data to obtain the model input data X_C
Figure FDA0002755583500000045
and determining the model output data y_C from the numbers
Figure FDA0002755583500000051
S3-4: normalizing the data by calculating the sample mean μ and the standard deviation σ, the normalized data being X′_C
Figure FDA0002755583500000052
S3-5: dividing the data set: randomly shuffling the data and dividing the input data X′_C and the output data y_C in the same proportion to obtain a training data set {X_train, y_train} and a validation data set {X_val, y_val};
S3-6: training the model: setting checkpoints, saving the model parameters once per epoch, adjusting the Epoch and Dropout parameters, observing trainloss and valloss, and using an early stopping mechanism when overfitting occurs;
S3-7: classifying using the model: inputting a segment of model input data X_d to obtain a classification vector y′; the fault number is then y′_d = argmax(y′), where argmax returns the index of the maximum value, and the corresponding fault is found from this number.
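The threshold selection of formula (8), the severity score, and the pruning of weak anomalies described in step S5 of claim 1 can be illustrated with the following minimal sketch. Several parts are assumptions, because formula (9) and the score formula survive here only as image references: the objective in `select_threshold` rewards thresholds that most reduce the mean and standard deviation of the errors once the values above them are removed, and penalizes the number of anomalous values and sequences; the score divides the distance between a sequence's peak and the threshold by μ + σ. The function names, the candidate z values and p = 0.13 are hypothetical, and the input is assumed to be an already smoothed error sequence.

```python
from statistics import mean, pstdev

def group_runs(indices):
    """Group sorted indices into runs of consecutive time steps."""
    runs = []
    for i in indices:
        if runs and i == runs[-1][-1] + 1:
            runs[-1].append(i)
        else:
            runs.append([i])
    return runs

def select_threshold(errors, z_values):
    """Choose delta = mu + z * sigma (formula (8)) over candidate z values.

    The objective below is an ASSUMPTION standing in for formula (9).
    """
    mu, sigma = mean(errors), pstdev(errors)
    if mu == 0 or sigma == 0:
        return None
    best_delta, best_score = None, float("-inf")
    for z in z_values:
        delta = mu + z * sigma
        below = [e for e in errors if e <= delta]
        above = [i for i, e in enumerate(errors) if e > delta]
        if not above or not below:
            continue
        gain = (mu - mean(below)) / mu + (sigma - pstdev(below)) / sigma
        penalty = len(above) + len(group_runs(above)) ** 2
        if gain / penalty > best_score:
            best_delta, best_score = delta, gain / penalty
    return best_delta

def prune(seq_maxima, max_normal, p):
    """Return the sequence maxima that remain anomalous after pruning."""
    e_max = sorted(seq_maxima, reverse=True) + [max_normal]
    keep = 0
    for i in range(1, len(e_max)):
        if e_max[i - 1] == 0:
            break
        drop = (e_max[i - 1] - e_max[i]) / e_max[i - 1]
        if drop > p:
            keep = i  # every maximum before step i stays anomalous
    return e_max[:keep]

def detect(errors, z_values, p):
    """Return (delta, [(start, end, score), ...]) for the anomalous runs."""
    delta = select_threshold(errors, z_values)
    if delta is None:
        return None, []
    mu, sigma = mean(errors), pstdev(errors)
    runs = group_runs([i for i, e in enumerate(errors) if e > delta])
    peaks = [max(errors[i] for i in run) for run in runs]
    max_normal = max(e for e in errors if e <= delta)
    kept = prune(peaks, max_normal, p)
    found = []
    for run, peak in zip(runs, peaks):
        if peak in kept:
            # Assumed score: distance to the threshold, scaled by mu + sigma.
            found.append((run[0], run[-1], (peak - delta) / (mu + sigma)))
    return delta, found

# Toy error sequence for illustration only.
toy_errors = [1.0] * 18 + [10.0, 10.0]
print(detect(toy_errors, z_values=[1.0, 1.5, 2.0], p=0.13))
```

In this sketch a larger score marks a more severe anomaly, and `prune` drops sequences whose peaks are not clearly separated from the largest normal error, which is one way to limit false alarms before the detected segments are passed to the classification model.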
CN202010445803.3A 2020-05-25 2020-05-25 LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method Active CN111353482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010445803.3A CN111353482B (en) 2020-05-25 2020-05-25 LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010445803.3A CN111353482B (en) 2020-05-25 2020-05-25 LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method

Publications (2)

Publication Number Publication Date
CN111353482A CN111353482A (en) 2020-06-30
CN111353482B true CN111353482B (en) 2020-12-08

Family

ID=71197733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010445803.3A Active CN111353482B (en) 2020-05-25 2020-05-25 LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method

Country Status (1)

Country Link
CN (1) CN111353482B (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539393B (en) * 2020-07-08 2020-10-09 浙江浙能天然气运行有限公司 Oil-gas pipeline third-party construction early warning method based on EMD decomposition and LSTM
WO2022020642A1 (en) * 2020-07-23 2022-01-27 Pdf Solutions, Inc. Predicting equipment fail mode from process trace
CN112149868A (en) * 2020-08-20 2020-12-29 汉威科技集团股份有限公司 Intelligent diagnosis method for gas use habit and safety analysis
CN111931872B (en) * 2020-09-27 2021-11-16 北京工业大数据创新中心有限公司 Method and device for determining abnormity of trend symptom
CN112804336B (en) * 2020-10-29 2022-11-01 浙江工商大学 Fault detection method, device, system and computer readable storage medium
CN112288021B (en) * 2020-11-02 2022-04-29 广东柯内特环境科技有限公司 Medical wastewater monitoring data quality control method, device and system
CN112101489A (en) * 2020-11-18 2020-12-18 天津开发区精诺瀚海数据科技有限公司 Equipment fault diagnosis method driven by united learning and deep learning fusion
CN112101532B (en) * 2020-11-18 2021-02-12 天津开发区精诺瀚海数据科技有限公司 Self-adaptive multi-model driving equipment fault diagnosis method based on edge cloud cooperation
CN112328588B (en) * 2020-11-27 2022-07-15 哈尔滨工程大学 Industrial fault diagnosis unbalanced time sequence data expansion method
CN112578213A (en) * 2020-12-23 2021-03-30 交控科技股份有限公司 Fault prediction method and device for rail power supply screen
CN112862459A (en) * 2021-03-02 2021-05-28 岭东核电有限公司 Test abnormity monitoring method and device, computer equipment and storage medium
CN113361324B (en) * 2021-04-25 2023-06-30 杭州玖欣物联科技有限公司 Lstm-based motor current anomaly detection method
CN114169379B (en) * 2022-02-07 2022-04-26 石家庄铁道大学 Method for detecting abnormal vibration data during bearing state monitoring
CN114783044B (en) * 2022-04-20 2023-03-24 石家庄铁道大学 Anti-fatigue effect evaluation method for tunnel lighting environment, electronic device and system
CN115905835B (en) * 2022-11-15 2024-02-23 国网四川省电力公司电力科学研究院 Low-voltage alternating current arc fault diagnosis method integrating multidimensional features
CN116933012A (en) * 2023-08-14 2023-10-24 华北电力大学 Intelligent early warning method for typical equipment faults of thermal power generating unit based on TiDE model
CN117009791B (en) * 2023-09-28 2023-12-12 太仓点石航空动力有限公司 Method and system for identifying faults of aeroengine
CN117451489B (en) * 2023-12-26 2024-03-08 集美大学 Device and method for identifying contact fatigue failure characteristic vibration signals

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109555566A (en) * 2018-12-20 2019-04-02 西安交通大学 A kind of turbine rotor method for diagnosing faults based on LSTM
CN109919082A (en) * 2019-03-05 2019-06-21 东南大学 Modal identification method based on LSTM and EMD
CN110702418A (en) * 2019-10-10 2020-01-17 山东超越数控电子股份有限公司 Aircraft engine fault prediction method
CN111053549A (en) * 2019-12-23 2020-04-24 威海北洋电气集团股份有限公司 Intelligent biological signal abnormality detection method and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
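Research on rotating machinery condition prediction based on long short-term memory networks; 赵建鹏 et al.; 《噪声与振动控制》; 20170831; Vol. 37, No. 4; pp. 155-159 *
Anomaly detection method for spacecraft telemetry data using ensemble LSTM; 董静怡; 《仪器仪表学报》; 20190731; Vol. 40, No. 7; pp. 22-29 *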

Also Published As

Publication number Publication date
CN111353482A (en) 2020-06-30

Similar Documents

Publication Publication Date Title
CN111353482B (en) LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method
CN110276416B (en) Rolling bearing fault prediction method
KR101316486B1 (en) Error detection method and system
CN105718876B (en) A kind of appraisal procedure of ball-screw health status
Soualhi et al. Hidden Markov models for the prediction of impending faults
Dou et al. A rule-based intelligent method for fault diagnosis of rotating machinery
KR101948604B1 (en) Method and device for equipment health monitoring based on sensor clustering
Li et al. Bearing fault feature selection method based on weighted multidimensional feature fusion
CN107003663A (en) The monitoring of device with movable part
CN112414694B (en) Equipment multistage abnormal state identification method and device based on multivariate state estimation technology
Wang et al. Weighted K-NN classification method of bearings fault diagnosis with multi-dimensional sensitive features
CN116380445B (en) Equipment state diagnosis method and related device based on vibration waveform
CN111474475A (en) Motor fault diagnosis system and method
US20220004163A1 (en) Apparatus for predicting equipment damage
CN111504647A (en) AR-MSET-based performance degradation evaluation method for rolling bearing
CN111678699B (en) Early fault monitoring and diagnosing method and system for rolling bearing
Khan et al. System design for early fault diagnosis of machines using vibration features
CN115496108A (en) Fault monitoring method and system based on manifold learning and big data analysis
Jiang et al. A SVDD and K-means based early warning method for dual-rotor equipment under time-varying operating conditions
CN113283028A (en) Fault diagnosis method for gear of gear box
CN110749443B (en) Rolling bearing fault diagnosis method and system based on high-order origin moment
CN112733446A (en) Data-driven self-adaptive anomaly detection method
CN110160781B (en) Test set reconstruction and prediction method for rotary machine fault classification
CN113435228A (en) Motor bearing service life prediction and analysis method based on vibration signal modeling
Zhang et al. Gearbox health condition identification by neuro-fuzzy ensemble

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A LSTM based implicit anomaly detection and fault diagnosis method for fatigue factors

Effective date of registration: 20230628

Granted publication date: 20201208

Pledgee: Tianjin SME Credit Financing Guarantee Co.,Ltd.

Pledgor: Tianjin Development Zone Jingnuo Hanhai Data Technology Co.,Ltd.

Registration number: Y2023120000049