CN111353482B

CN111353482B - LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method

Info

Publication number: CN111353482B
Application number: CN202010445803.3A
Authority: CN
Inventors: 冯海领; 焦正杉; 孙敬哲; 王汉奇; 王向敏; 赵宜斌
Original assignee: Tianjin Development Zone Jingnuo Hanhai Data Technology Co ltd
Current assignee: Tianjin Development Zone Jingnuo Hanhai Data Technology Co ltd
Priority date: 2020-05-25
Filing date: 2020-05-25
Publication date: 2020-12-08
Anticipated expiration: 2040-05-25
Also published as: CN111353482A

Abstract

The invention discloses an LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method, which comprises the following steps: S1, collecting time-series data from the target equipment under diagnosis and performing empirical mode decomposition on the data; S2, constructing an LSTM-based equipment vibration signal prediction model from normal data; S3, classifying the collected abnormal data and constructing an LSTM-based fault time-series data classification model; S4, taking the mean square error (MSE) of the LSTM-based vibration signal prediction model as the initial fatigue factor threshold; S5, predicting the equipment production data with the LSTM-based vibration signal prediction model, calculating the mean square error between the predicted and actual values, and comparing it with the fatigue factor threshold to detect abnormal signals; and S6, classifying the abnormal signals with the fault time-series data classification model to obtain a fault diagnosis result.

Description

LSTM-based fatigue factor recessive anomaly detection and fault diagnosis method

Technical Field

The invention relates to the field of equipment fault diagnosis, in particular to a fatigue factor recessive abnormality detection and fault diagnosis method based on LSTM.

Background

With the explosive, "nuclear fusion"-style growth of technologies such as the Internet of Things, 5G, artificial intelligence and cloud computing, the "re-industrialization" strategies that major industrial countries have built around intelligent manufacturing are gaining momentum. China first proposed the "Intelligent+" concept in its 2019 government work report, and identified intelligent manufacturing as an important direction for developing new drivers of national economic growth.

Current fault diagnosis methods fall into two main categories: signal-processing-based methods and machine-learning-based methods. Common signal-processing methods include spectral kurtosis analysis, sparse decomposition analysis, time-frequency analysis, the wavelet transform (WT) and empirical mode decomposition (EMD). Machine-learning-based methods mainly include hidden Markov models (HMMs), Bayesian networks, support vector machines (SVMs) and recurrent neural networks (RNNs), and these methods have achieved good research results in the field of fault diagnosis. Signal-processing-based methods generally require many highly experienced engineers to analyze the fault information contained in the signal, which makes it difficult to obtain a diagnosis result programmatically. Machine-learning-based methods make up for this shortcoming and broaden the range of fault diagnosis methods, but they pose new challenges for fault diagnosis data. In an industrial setting, the data recorded by the digital infrastructure are mostly normal data. These data carry a huge amount of information and bear on product quality, energy consumption, production cost and other factors, but they contain very little fault information. The more reliable the equipment, the less fault data there is, so extracting fault information is a new challenge in the context of large-scale intelligent manufacturing.

To deal with unbalanced data, techniques such as undersampling, oversampling and the synthetic minority oversampling technique (SMOTE) have been proposed. These methods have achieved good results in the corresponding literature, but undersampling reduces the amount of data and oversampling can increase noise, and both problems ultimately affect the accuracy of the trained model.

Disclosure of Invention

In order to solve the problems caused by characteristics of unbalanced equipment data, high noise, time sequence and the like in the field of intelligent manufacturing, the invention provides a fatigue factor recessive abnormality detection and fault diagnosis method based on LSTM. The invention improves the unbalanced data processing method, the classification model and the abnormality detection method in the traditional fault diagnosis, can effectively avoid the problems caused by unbalanced data, can detect the slight change of equipment, and can early warn abnormal signals in time so as to detect the problems of equipment abrasion, aging and the like.

In order to solve the technical problems, the technical scheme adopted by the invention is as follows:

a fatigue factor recessive abnormality detection and fault diagnosis method based on LSTM comprises the following steps:

s1, collecting time sequence sample data of target equipment of a diagnosis object, wherein the sample comprises normal data, abnormal data and fault types of the abnormal data, and performing empirical mode decomposition on the time sequence sample data of the equipment by using an EMD (empirical mode decomposition) method;

s2, extracting normal data samples in the time sequence sample data, and constructing an equipment vibration signal prediction model by using an LSTM-based deep neural network;

s3, extracting abnormal data samples in the time sequence sample data, and constructing a fault time sequence data classification model by using an LSTM-based deep neural network;

s4, acquiring a Mean Square Error (MSE) of a vibration signal prediction model based on the LSTM, and taking the MSE as an initial fatigue factor threshold;

s5, predicting the production data of the equipment by using a vibration signal prediction model based on LSTM, calculating the mean square error of a predicted value and an actual value, comparing the mean square error with a fatigue factor threshold value to detect an abnormal signal, and dynamically adjusting a fatigue factor threshold value delta according to the actual condition;

and S6, classifying the abnormal signals through a fault time sequence data classification model to obtain a fault diagnosis result.

Further, the process of constructing the device vibration signal prediction model based on the LSTM deep neural network in step S2 includes the following steps:

S2-1: Feature engineering: first process the collected raw data, use EMD (empirical mode decomposition) and waveform visualization as an aid to judgment, and identify abnormal vibration frequencies by observing the IMF (intrinsic mode function) components;

S2-2: Resampling: resample the raw data and delete the abnormal segments so that the remaining data contain continuous normal data and only normal data, which avoids the data imbalance problem;

S2-3: Data normalization: normalize each dimension of the time series separately by computing the mean μ and standard deviation σ of that dimension and normalizing the time-series sample data X with the following formula, where X' is the normalized data:

$$X' = \frac{X - \mu}{\sigma} \qquad (1)$$

S2-4: Generating a data set: convert the unsupervised data set into a supervised one by dividing the normalized data into continuous time series X_r of length l. The length of X_r is determined jointly by the model input length l_s and the output length l_p, with l = l_s + l_p. Each X_r is split into two parts, X_s and X_p, which serve as the input and output samples of the model;

S2-5: Dividing the data set: randomly arrange the data and split the input data X_s and output data X_p in the same proportion to obtain a training data set {X_train, y_train} and a validation data set {X_val, y_val};

S2-6: Constructing the model: set checkpoints, save the model parameters once per epoch, tune the epoch and dropout parameters, observe the training loss and validation loss, and use an early stopping mechanism when overfitting occurs;

S2-7: Prediction using the model: input a segment of time-series sample data X to obtain a prediction vector ŷ, then calculate the mean square error (MSE) against the actual value y to evaluate the prediction, where n is the number of sequences:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \qquad (2)$$
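The data preparation in steps S2-3, S2-4 and S2-7 can be illustrated with a minimal sketch in plain Python. It is not the patented implementation: the function names `zscore`, `make_windows` and `mse`, the use of the population standard deviation, and the toy signal are all assumptions made for the example.

```python
from statistics import mean, pstdev

def zscore(series):
    """Normalize one dimension as x' = (x - mu) / sigma (population sigma assumed)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        raise ValueError("a constant series cannot be normalized")
    return [(x - mu) / sigma for x in series]

def make_windows(series, l_s, l_p):
    """Slide a window of length l = l_s + l_p over the series.

    Returns (X_s, X_p): inputs of length l_s and targets of length l_p.
    """
    l = l_s + l_p
    X_s, X_p = [], []
    for start in range(len(series) - l + 1):
        window = series[start:start + l]
        X_s.append(window[:l_s])   # model input part
        X_p.append(window[l_s:])   # model output part
    return X_s, X_p

def mse(y_true, y_pred):
    """Mean square error between two equally long sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("sequences must have the same length")
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

# Toy one-dimensional signal (made-up values, for illustration only).
signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
normalized = zscore(signal)
X_s, X_p = make_windows(normalized, l_s=3, l_p=1)
```

With l_s = 3 and l_p = 1 the six-point signal yields three overlapping windows; in practice l_s and l_p would be chosen for the single-step or multi-step prediction task at hand.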

further, the process of constructing the fault time series data classification model based on the LSTM deep neural network in step S3 includes the following steps:

S3-1: First process the fault data samples of the equipment time-series sample data, using data that are abnormal and whose fault type has been determined, and additionally add normal samples equal in number to the samples of each fault type;

S3-2: Perform empirical mode decomposition (EMD) on the time-series sample data to obtain an m-dimensional vector of IMF components, IMF = {imf₁, imf₂, …, imf_m}, where m is determined by the empirical mode decomposition and imf_i is the i-th component;

S3-3: Number the IMF vectors of the n classes of time-series data to obtain the model input data X_C, and determine the model output data y_C from the numbers;

S3-4: Normalize the data by calculating the sample mean μ and standard deviation σ, where X'_C is the normalized data:

$$X'_C = \frac{X_C - \mu}{\sigma} \qquad (5)$$

S3-5: Divide the data set: randomly arrange the data and split the input data X'_C and output data y_C in the same proportion to obtain a training data set {X_train, y_train} and a validation data set {X_val, y_val};

S3-6: Train the model: set checkpoints, save the model parameters once per epoch, tune the epoch and dropout parameters, observe the training loss and validation loss, and use an early stopping mechanism when overfitting occurs;

S3-7: Classify using the model: input a time series X_d to obtain a classification vector y'; the fault number is y'_d = argmax(y'), where argmax returns the index of the maximum value, and the corresponding fault is looked up by this number.
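The labelling in S3-3 and the decoding in S3-7 can be sketched as follows. This is only an illustration: the class order in `FAULT_NAMES`, the one-hot encoding of y_C, and the example classifier output are assumptions, since the text does not fix a particular numbering or encoding.

```python
# Hypothetical fault numbering; the source does not prescribe a particular order.
FAULT_NAMES = ["normal", "inner ring fault", "outer ring fault", "rolling element fault"]

def one_hot(label, n_classes):
    """Encode a class number as a one-hot output vector (assumed form of y_C)."""
    if not 0 <= label < n_classes:
        raise ValueError("label out of range")
    vec = [0.0] * n_classes
    vec[label] = 1.0
    return vec

def argmax(values):
    """Index of the largest entry, i.e. the fault number y'_d."""
    return max(range(len(values)), key=lambda i: values[i])

def decode_fault(y_prime):
    """Map a classification vector y' to (fault number, fault name)."""
    y_d = argmax(y_prime)
    return y_d, FAULT_NAMES[y_d]

# Output targets for one sample of each class.
y_C = [one_hot(k, len(FAULT_NAMES)) for k in range(len(FAULT_NAMES))]

# Made-up classifier output for a single input sequence X_d.
y_prime = [0.05, 0.10, 0.80, 0.05]
fault_number, fault_name = decode_fault(y_prime)
```

Here the largest entry of y' sits at index 2, so the sequence is assigned the fault registered under number 2 in the hypothetical table.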

Further, the step S5 of predicting the equipment production data by using the vibration signal prediction model based on LSTM specifically includes:

The prediction model is used to predict the value ŷ^(t) at time t, and the prediction error is calculated from the actual value y^(t):

$$e^{(t)} = \left| y^{(t)} - \hat{y}^{(t)} \right| \qquad (6)$$

where y^(t) = x_i^(t+1) and i is the dimension of the true monitored value. Each error e^(t) is then appended to the vector of prediction errors:

$$e = \left[ e^{(t-h)}, \ldots, e^{(t-l_s)}, \ldots, e^{(t-1)}, e^{(t)} \right] \qquad (7)$$

where h is the number of historical errors used to estimate the current error and l_s is the length of the input sequence. This set of errors e is then smoothed to give e_s, and an error threshold is selected from the error set; values whose mean square error lies above the threshold are classified as anomalous:

$$\epsilon = \mu(e_s) + z\,\sigma(e_s) \qquad (8)$$

where ε is the vector of candidate error thresholds, e_s is the sequence error vector, z is a constant, μ is the mean and σ is the standard deviation. The threshold ε is determined by equation (9):

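$$\epsilon = \operatorname{argmax}(\epsilon) = \frac{\Delta\mu(e_s)/\mu(e_s) + \Delta\sigma(e_s)/\sigma(e_s)}{|e_a| + |E_{seq}|^2} \qquad (9)$$

where:

$$\Delta\mu(e_s) = \mu(e_s) - \mu(\{e_s \in e_s \mid e_s < \epsilon\})$$

$$\Delta\sigma(e_s) = \sigma(e_s) - \sigma(\{e_s \in e_s \mid e_s < \epsilon\})$$

$$e_a = \{e_s \in e_s \mid e_s > \epsilon\}$$

and E_seq in formula (9) is the set of continuous sequences of e_a ∈ e_a.

The value of ε is determined by z ∈ Z, where Z is an ordered set of positive integers representing the number of standard deviations above μ(e_s); the value of z depends on experimental data. Once argmax(ε) is determined, each anomalous sequence e_seq ∈ E_seq that produces a mean square error above it is given a score s that indicates the severity of the anomaly, i.e. the fatigue factor:

$$s^{(i)} = \frac{\max\left(e_{seq}^{(i)}\right) - \operatorname{argmax}(\epsilon)}{\mu(e_s) + \sigma(e_s)}$$

That is, the highest mean square error in each anomalous error sequence is normalized according to its distance from the selected threshold Δ.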
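The threshold search described above can be sketched in plain Python, purely as an illustration. The scoring expression used here (relative drop in mean and standard deviation, divided by the number of anomalous points plus the squared number of anomalous sequences) is an assumption modelled on a commonly used nonparametric dynamic-thresholding criterion; the names `group_sequences` and `select_threshold` are hypothetical, and the candidate range z = 2 to 8 follows the range reported in the detailed description.

```python
from statistics import mean, pstdev

def group_sequences(flags):
    """Return (start, end) index pairs of consecutive runs of True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(flags) - 1))
    return runs

def select_threshold(e_s, z_values=range(2, 9)):
    """Pick eps = mu + z * sigma for the z that maximizes the assumed criterion."""
    mu, sigma = mean(e_s), pstdev(e_s)
    if mu <= 0 or sigma == 0:
        return None  # nothing to separate
    best_eps, best_score = None, float("-inf")
    for z in z_values:
        eps = mu + z * sigma
        below = [e for e in e_s if e < eps]
        flags = [e > eps for e in e_s]
        n_anomalous = sum(flags)
        if n_anomalous == 0 or not below:
            continue  # this candidate flags nothing, or everything
        n_sequences = len(group_sequences(flags))
        delta_mu = mu - mean(below)
        delta_sigma = sigma - pstdev(below)
        score = (delta_mu / mu + delta_sigma / sigma) / (n_anomalous + n_sequences ** 2)
        if score > best_score:
            best_eps, best_score = eps, score
    return best_eps

# Made-up smoothed errors: a flat baseline with one short burst.
e_s_demo = [0.1] * 50 + [5.0, 5.2] + [0.1] * 50
eps_demo = select_threshold(e_s_demo)
```

On this toy series the selected threshold falls between the baseline level and the burst, so both burst points are flagged as one anomalous sequence.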

Further, the step S5 of dynamically adjusting the fatigue factor threshold Δ according to the actual situation specifically includes:

Create a new set e_max containing the maximum value max(e_seq) of every e_seq, sorted in descending order. The maximum non-anomalous mean square error

$$\max(\{e_s \in e_s \mid e_s \notin e_a\})$$

is then appended to the end of e_max. The sequence is traversed step by step, and the decrease at step i is:

$$d^{(i)} = \frac{e_{max}^{(i-1)} - e_{max}^{(i)}}{e_{max}^{(i-1)}}, \quad i \in \{1, 2, \ldots, |E_{seq}|\}$$

where e_max is indexed from 0. If at some step i the decrease d^(i) is greater than the minimum decrease p, all e_max^(j) ∈ e_max with j < i, together with their corresponding sequences, remain anomalous. If the decrease d^(i) and the decreases of all subsequent errors d^(i), d^(i+1), …, d^(|E_seq|) are less than the minimum decrease p, the corresponding error sequences are reclassified as normal. After the anomalous sequences are removed, the mean square error of the remaining normal sequences is recalculated and used as the adjusted fatigue factor threshold Δ.
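The adjustment strategy can be illustrated with a short sketch. It assumes that the step decrease is the relative drop between neighbouring entries of the descending list e_max, and that positive error values are used; the function name `prune_anomalies` and the example value p = 0.1 are hypothetical.

```python
def prune_anomalies(seq_maxima, max_normal_error, p):
    """Split the per-sequence maxima into (still anomalous, reclassified as normal).

    seq_maxima       -- max(e_seq) for every candidate anomalous sequence
    max_normal_error -- largest error that was not flagged as anomalous
    p                -- minimum relative decrease required between neighbours
    """
    e_max = sorted(seq_maxima, reverse=True) + [max_normal_error]
    keep = 0
    for i in range(1, len(e_max)):
        if e_max[i - 1] <= 0:
            raise ValueError("errors are assumed to be positive")
        d_i = (e_max[i - 1] - e_max[i]) / e_max[i - 1]
        if d_i > p:
            keep = i  # every maximum ranked before step i stays anomalous
    still_anomalous = e_max[:keep]
    reclassified = e_max[keep:-1]  # drop the appended normal error itself
    return still_anomalous, reclassified

# Made-up maxima of three candidate sequences and the largest normal error.
anomalous, normal_again = prune_anomalies([5.0, 1.05, 4.0], max_normal_error=1.0, p=0.1)
```

In this example the candidate with maximum 1.05 sits only about 5% above the largest normal error, so it is reclassified as normal, while the two larger candidates remain anomalous.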

The beneficial effects of the above technical scheme are as follows:

(1) aiming at the problem that enough fault samples are difficult to collect in the initial stage of a production environment, an LSTM-based vibration signal prediction model anomaly detection method is provided, the existing data non-equilibrium characteristics are fully utilized, and anomaly signals generated by equipment are effectively detected;

(2) aiming at the problem that the equipment aging wear condition is not evaluated in the current research situation, a fatigue factor recessive fault detection method is provided on the basis of (1), a threshold value can be found in fluctuating data to distinguish normal data from abnormal data, and the threshold value is dynamically adjusted in incremental learning so as to achieve the purpose of detecting potential abnormality;

(3) in consideration of the characteristic that fault time sequence data have complex time relevance, a fault time sequence data classification model fusing EMD feature extraction is provided, the fault diagnosis method uses EMD to perform feature processing and denoising, and uses a LSTM structure-based deep neural network to extract features on a time dimension so as to improve the fault classification accuracy.

Drawings

FIG. 1 is a flow chart of a vibration signal prediction model based on LSTM;

FIG. 2 is a fault timing data classification model incorporating EMD feature extraction;

FIG. 3 is a flowchart of the overall structure of the LSTM-based incremental learning latent fault diagnosis method;

FIG. 4 shows the prediction results of ARIMA, RNN and Stacked LSTM;

FIG. 5 shows the vibration signals;

FIG. 6 shows the EMD decomposition of normal data;

FIG. 7 shows the EMD decomposition of inner ring fault data.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.

The experimental verification of the invention uses experimental data containing three types of fault: inner ring (Inner Raceway Fault), outer ring (Outer Raceway Fault) and rolling element (Ball Fault).

The flow of the LSTM-based vibration signal prediction model proposed by the invention is described below, and the flow chart of the model is shown in FIG. 1.

Step 1: Feature engineering: first process the collected raw data, use EMD (empirical mode decomposition) and waveform visualization as an aid to judgment, and identify abnormal vibration frequencies by observing the IMF (intrinsic mode function) components;

Step 2: Resampling: resample the raw data and delete the abnormal segments so that the remaining data contain continuous normal data and only normal data, which avoids the data imbalance problem;

Step 3: Data normalization: normalize each dimension of the time series separately by computing the mean μ and standard deviation σ of that dimension and normalizing the time-series sample data X with the formula below. Normalization helps model initialization, prevents differences in numerical scale from distorting the gradient updates, and makes it practical to iterate the model with a fixed learning rate, which speeds up convergence:

$$X' = \frac{X - \mu}{\sigma} \qquad (1)$$

Step 4: Generating a data set: convert the unsupervised data set into a supervised one by dividing the normalized data into continuous time series X_r of length l. The length of X_r is determined jointly by the model input length l_s and the output length l_p, with l = l_s + l_p. Each X_r is split into two parts, X_s and X_p, which serve as the input and output samples of the model;

Step 5: Dividing the data set: randomly arrange the data and split the input data X_s and output data X_p in the same proportion to obtain a training data set {X_train, y_train} and a validation data set {X_val, y_val};

Step 6: Constructing the model: set checkpoints, save the model parameters once per epoch, tune the epoch and dropout parameters, observe the training loss and validation loss, and use an early stopping mechanism (Early-Stop) when overfitting occurs;

Step 7: Testing with the model: input a segment of time-series sample data X to obtain a prediction vector ŷ, then calculate the mean square error (MSE) against the actual value y to evaluate the prediction, where n is the number of sequences:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \qquad (2)$$
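The per-epoch checkpointing and early stopping of Step 6 can be sketched without any deep learning framework. In this illustration the list of validation losses stands in for real training, and the function name `train_with_early_stopping` and the `patience` parameter are assumptions rather than part of the described method.

```python
def train_with_early_stopping(val_losses, patience=3):
    """Walk through per-epoch validation losses and stop when they stop improving.

    Returns (best_epoch, last_epoch_run). A real training loop would save the
    model parameters at the point marked below and restore the best checkpoint.
    """
    best_loss, best_epoch, wait = float("inf"), -1, 0
    last_epoch = -1
    for epoch, loss in enumerate(val_losses):
        last_epoch = epoch
        # Checkpoint: the model parameters would be saved here once per epoch.
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                break  # validation loss has not improved for `patience` epochs
    return best_epoch, last_epoch

# Made-up validation losses: improvement stalls after the third epoch.
best_epoch, stopped_at = train_with_early_stopping(
    [0.9, 0.7, 0.6, 0.65, 0.66, 0.7, 0.8], patience=3
)
```

A rising validation loss alongside a falling training loss is the overfitting signal mentioned in Step 6; the checkpoint from the best epoch is the one that would be kept.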

The flow of the fault time-series data classification model incorporating EMD feature extraction proposed by the invention is described below, and the flow chart of the model is shown in FIG. 2.

Step 1: First process the fault samples of the equipment time-series data, using data that are abnormal and whose fault type has been determined, and additionally add normal samples equal in number to the samples of each fault type;

Step 2: Perform EMD on the sample time-series data to obtain an m-dimensional vector of IMF components, IMF = {imf₁, imf₂, …, imf_m}, where m is determined by the empirical mode decomposition and imf_i is the i-th component;

Step 3: Number the IMF vectors of the n classes of time-series data to obtain the model input data X_C, and determine the model output data y_C from the numbers;

Step 4: Normalize the data by calculating the sample mean μ and standard deviation σ, and normalize the model input data X_C with equation (5):

$$X'_C = \frac{X_C - \mu}{\sigma} \qquad (5)$$

Step 5: Divide the data set: randomly arrange the data and split the model input data X'_C and output data y_C in the same proportion to obtain a training data set {X_train, y_train} and a validation data set {X_val, y_val};

Step 6: Train the model: set checkpoints, save the model parameters once per epoch, tune the epoch and dropout parameters, observe the training loss and validation loss, and use an early stopping mechanism (Early-Stop) when overfitting occurs;

Step 7: Classify using the model: input a time series X_d to obtain a classification vector y'; the fault number is y'_d = argmax(y'), and the corresponding fault is looked up by this number.
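Step 5 requires that inputs and outputs be shuffled together and split in the same proportion. A minimal sketch of one way to do this is shown below; the function name `shuffle_split`, the validation ratio of 0.2 and the fixed seed are assumptions chosen for the example.

```python
import random

def shuffle_split(X, y, val_ratio=0.2, seed=0):
    """Shuffle X and y with one shared permutation, then split both identically."""
    if len(X) != len(y):
        raise ValueError("X and y must contain the same number of samples")
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # same order is applied to X and y
    n_val = int(len(indices) * val_ratio)
    val_idx, train_idx = indices[:n_val], indices[n_val:]

    def pick(data, ids):
        return [data[i] for i in ids]

    train = (pick(X, train_idx), pick(y, train_idx))
    val = (pick(X, val_idx), pick(y, val_idx))
    return train, val

# Toy data in which each input carries its own label, so pairing is easy to check.
X_demo = [[i] for i in range(10)]
y_demo = list(range(10))
(X_train, y_train), (X_val, y_val) = shuffle_split(X_demo, y_demo, val_ratio=0.2)
```

Using a single index permutation for both arrays keeps every input paired with its own label after shuffling.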

S5: Predict the production data of the equipment using the LSTM-based vibration signal prediction model, calculate the mean square error between the predicted and actual values, compare it with the fatigue factor threshold to detect anomalous signals, and dynamically adjust the fatigue factor threshold Δ according to the actual conditions. The prediction model is used to predict the value ŷ^(t) at time t, and the prediction error is calculated from the actual value y^(t):

$$e^{(t)} = \left| y^{(t)} - \hat{y}^{(t)} \right| \qquad (6)$$

where y^(t) = x_i^(t+1) and i is the dimension of the true monitored value. Each error e^(t) is then appended to the vector of prediction errors:

$$e = \left[ e^{(t-h)}, \ldots, e^{(t-l_s)}, \ldots, e^{(t-1)}, e^{(t)} \right] \qquad (7)$$

where h is the number of historical errors used to estimate the current error and l_s is the length of the input sequence. This set of errors e is then smoothed to suppress the abrupt error changes that often occur in LSTM-based prediction: abrupt changes in the signal are typically difficult for prediction models to predict, and they can cause abrupt changes in the error values even when the data are normal. To determine whether these values are normal, the invention sets a threshold on the smoothed prediction errors; values whose mean square error lies above the threshold are classified as anomalous.
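The text does not name a specific smoothing method. As one possible illustration, the sketch below computes absolute prediction errors and smooths them with an exponentially weighted moving average (EWMA); the choice of EWMA, the smoothing factor `alpha` and the function names are assumptions.

```python
def abs_errors(y_true, y_pred):
    """Per-step prediction error e(t) = |y(t) - y_hat(t)|."""
    if len(y_true) != len(y_pred):
        raise ValueError("sequences must have the same length")
    return [abs(a - b) for a, b in zip(y_true, y_pred)]

def ewma(errors, alpha=0.3):
    """Exponentially weighted moving average; larger alpha means less smoothing."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for e in errors:
        previous = smoothed[-1] if smoothed else e
        smoothed.append(alpha * e + (1 - alpha) * previous)
    return smoothed

# Made-up actual and predicted values with a single spike at the third step.
e = abs_errors([1.0, 1.0, 3.0, 1.0, 1.0], [0.9, 0.9, 1.0, 0.9, 0.9])
e_s = ewma(e, alpha=0.5)
```

The isolated spike of 2.0 is damped to about 1.05 in the smoothed series, which is the effect the passage relies on to avoid flagging one-off jumps in otherwise normal data.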

In calculating the prediction error threshold, it is possible to learn to find a suitable anomaly threshold by using a supervised method with a sample label, but in general there are not enough label samples, especially if the device is operating normally without a lot of fault data. The threshold calculation method provided by the invention does not need to use sample label data or statistical hypothesis test about errors, and can have higher performance under the condition of lower resource consumption. Selecting an error threshold in the error set:

$$\epsilon = \mu(e_s) + z\,\sigma(e_s) \qquad (8)$$

where ε is the vector of candidate error thresholds, e_s is the sequence error vector, z is a constant, μ is the mean and σ is the standard deviation. The threshold ε is determined by equation (9):

$$\epsilon = \operatorname{argmax}(\epsilon) = \frac{\Delta\mu(e_s)/\mu(e_s) + \Delta\sigma(e_s)/\sigma(e_s)}{|e_a| + |E_{seq}|^2} \qquad (9)$$

where:

$$\Delta\mu(e_s) = \mu(e_s) - \mu(\{e_s \in e_s \mid e_s < \epsilon\})$$

$$\Delta\sigma(e_s) = \sigma(e_s) - \sigma(\{e_s \in e_s \mid e_s < \epsilon\})$$

$$e_a = \{e_s \in e_s \mid e_s > \epsilon\}$$

and E_seq in formula (9) is the set of continuous sequences of e_a ∈ e_a.

The value of ε is determined by z ∈ Z, where Z is an ordered set of positive integers representing the number of standard deviations above μ(e_s). The value of z depends on experimental data; in the experiments of the invention the normal range is 2 to 8, and when z is less than 2 there are generally too many false alarms. Once argmax(ε) is determined, each anomalous sequence e_seq ∈ E_seq that produces a mean square error above it is given a score s that indicates the severity of the anomaly, i.e. the fatigue factor:

$$s^{(i)} = \frac{\max\left(e_{seq}^{(i)}\right) - \operatorname{argmax}(\epsilon)}{\mu(e_s) + \sigma(e_s)}$$

In short, a threshold is sought such that, if the anomalous sequences above it are discarded, the mean μ(e_s) and standard deviation σ(e_s) of the remaining errors e_s are greatly reduced. The criterion also penalizes a large number of anomalous sequences (|E_seq|) to avoid excessive bias. The highest mean square error in each anomalous error sequence is then normalized according to its distance from the selected threshold Δ.
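The scoring of anomalous sequences can be sketched as follows. The normalization by μ(e_s) + σ(e_s) is an assumption about the exact form of the score, and the function name `fatigue_scores` and the example values are hypothetical.

```python
from statistics import mean, pstdev

def fatigue_scores(e_s, eps):
    """Score each run of consecutive errors above eps by its peak's distance from eps.

    Assumed form: s = (max(e_seq) - eps) / (mu(e_s) + sigma(e_s)).
    """
    mu, sigma = mean(e_s), pstdev(e_s)
    if mu + sigma <= 0:
        raise ValueError("errors are assumed to be positive")
    scores, current = [], []
    # The trailing sentinel closes a run that reaches the end of the series.
    for e in list(e_s) + [float("-inf")]:
        if e > eps:
            current.append(e)
        elif current:
            scores.append((max(current) - eps) / (mu + sigma))
            current = []
    return scores

# Made-up smoothed errors containing two anomalous runs above eps = 2.0.
scores = fatigue_scores([0.1, 0.1, 3.0, 4.0, 0.1, 2.5, 0.1], eps=2.0)
```

The first run peaks at 4.0 and the second at 2.5, so the first receives the higher score, reflecting a more severe anomaly.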

To reduce the false alarm rate and the amount of computation, the invention provides the following threshold adjustment strategy. Create a new set e_max containing the maximum value max(e_seq) of every e_seq, sorted in descending order. The maximum non-anomalous mean square error

$$\max(\{e_s \in e_s \mid e_s \notin e_a\})$$

is then appended to the end of e_max. The sequence is traversed step by step, and the decrease at step i is:

$$d^{(i)} = \frac{e_{max}^{(i-1)} - e_{max}^{(i)}}{e_{max}^{(i-1)}}, \quad i \in \{1, 2, \ldots, |E_{seq}|\}$$

where e_max is indexed from 0. If at some step i the decrease d^(i) is greater than the minimum decrease p, all e_max^(j) ∈ e_max with j < i, together with their corresponding sequences, remain anomalous. If the decrease d^(i) and the decreases of all subsequent errors d^(i), d^(i+1), …, d^(|E_seq|) are less than the minimum decrease p, the corresponding error sequences are reclassified as normal. After the anomalous sequences are removed, the mean square error of the remaining normal sequences is recalculated and used as the adjusted fatigue factor threshold Δ. This approach helps to ensure that an anomalous sequence is not merely regular noise in the time-series data stream. Because anomalous sequences are identified by thresholding, only segments with potential anomalies need to be examined, which is more efficient than comparing sequence values one by one without a threshold.

The general structural flow of the specific fault diagnosis method is shown in fig. 3.

The invention discloses a fatigue factor recessive anomaly detection and fault diagnosis method based on LSTM, which comprises the following steps:

Description of data

The experimental data come from the bearing fault data of the Case Western Reserve University (CWRU) electrical engineering laboratory and total 1,341,856 data points; the bearing is a 6205-2RS JEM SKF deep groove ball bearing. Single-point faults of 3 fault grades were introduced on the inner ring (Inner Raceway Fault), outer ring (Outer Raceway Fault) and rolling element (Ball Fault) of the bearing by electrical discharge machining, with fault diameters of 0.007, 0.014 and 0.021 inches and fault depths of 0.011, 0.050 and 0.150 inches, respectively. The three kinds of fault were placed at the motor drive end (Drive End) and the fan end (Fan End), and vibration sensors mounted at the motor drive end, the fan end and the base collected 21 groups of data covering 6 fault types. The bearing fault data are described in Table 1.

TABLE 1 bearing failure data description (in inches)

Four types of vibration data are plotted in FIG. 5; from top to bottom they are the normal signal, the rolling element fault signal, the inner ring fault signal and the outer ring fault signal. The amplitude of the fault vibration signals is clearly larger than that of the normal signal, and sequences of larger amplitude appear periodically in the fault signals, as can be observed from the plots of the sample data.

The original vibration signal is decomposed with EMD and the characteristics of the IMF components at each level are observed. The decomposition yields 7 components (IMF1 to IMF7) and a residual, where each IMF component represents an intrinsic mode component present in the original signal. FIG. 6 shows the EMD decomposition of normal data and FIG. 7 shows the EMD decomposition of inner ring fault data. IMF1 to IMF7 represent signal components at different frequencies, arranged from high frequency to low frequency, and the right-hand part of each figure shows the instantaneous frequency of each IMF component. The figures show that the normal and abnormal signals differ considerably in residual and in instantaneous frequency distribution: the instantaneous frequency distribution of the normal signal is smooth, whereas that of the abnormal signal fluctuates strongly.

In order to compare the effect difference between the LSTM-based vibration signal prediction model and ARIMA and RNN methods, the used experimental data comprise normal data of a driving end and a fan end, the data are divided into single-step prediction data and multi-step prediction data, the multi-step prediction data are provided with two prediction lengths of 30 and 100, 6 data sets are provided in total, and each data set comprises 20000 training samples and 4000 test samples.

The ARIMA model is the Autoregressive Integrated Moving Average model, also written ARIMA(p, d, q). It is the most common of the statistical models used for time-series prediction. The ARIMA model parameters p (autoregressive), d (integrated) and q (moving average) are shown in Table 2.

TABLE 2 ARIMA model parameters

The RNN (Recurrent Neural Network) is the most common and most traditional deep learning model for processing sequence data. The RNN model parameters used for the comparative experiments are shown in Table 3.

TABLE 3 RNN model parameters

The test was performed with the above-identified model parameters and structure, and the comparative experiment results are shown in table 4.

TABLE 4 comparative experiment test results

As can be seen from the experimental results in Table 4, the mean square errors of all three models correlate strongly with the prediction length (for the RNN and Stacked LSTM models, the mean square error is the ValLoss value in Table 4): the shorter the prediction length, the smaller the mean square error, and single-step prediction is significantly better than multi-step prediction. FIG. 4(a), (b) and (c) show the prediction results of the three models at a prediction step size of 30. The dark grey lines are the true values (TrueValue) and the light grey lines are the predicted values (Prediction), both plotted as acceleration on the left y-axis (Acceleration); the dashed lines are the residuals between the actual and predicted values on the right y-axis (Residual, Res). In the single-step prediction experiments (datasets 1 and 2), ARIMA has a slight advantage over both the RNN and Stacked LSTM neural networks, but in long-term prediction (datasets 3, 4, 5 and 6) its higher MSE is less satisfactory. ARIMA and the neural networks thus show different characteristics: the neural networks outperform ARIMA in long-term prediction, while ARIMA has the advantage in short-term prediction. The Stacked LSTM prediction model outperforms the RNN model, gives better prediction results, and is better suited to long-term prediction. Structurally, ARIMA is a linear model and the neural network is a nonlinear model, so ARIMA works well over short horizons where the relationships are simple, whereas the Stacked LSTM-based fault data prediction model is well suited to scenarios with complex correlations.

Claims

1. A fatigue factor recessive abnormality detection and fault diagnosis method based on LSTM is characterized by comprising the following steps:

s2, extracting normal data samples in the time sequence sample data, and constructing an equipment vibration signal prediction model by using an LSTM-based deep neural network; the process for constructing the equipment vibration signal prediction model based on the LSTM deep neural network comprises the following steps:

S2-5: dividing the data set: randomly arranging the data and dividing the input data X_s and output data X_p in the same proportion to obtain a training data set {X_train, y_train} and a validation data set {X_val, y_val};

s4, taking the Mean Square Error (MSE) of the vibration signal prediction model based on the LSTM acquired in the S2 as an initial fatigue factor threshold;

s5, predicting the production data of the equipment by using a vibration signal prediction model based on LSTM, calculating the mean square error of a predicted value and an actual value, comparing the mean square error with a fatigue factor threshold value to detect an abnormal signal, and dynamically adjusting a fatigue factor threshold value delta according to the actual condition; the prediction of the equipment production data by using the vibration signal prediction model based on the LSTM is specifically as follows:

predicting the value ŷ^(t) at time t using the prediction model, and calculating the prediction error from the actual value y^(t):

wherein

$$\epsilon = \mu(e_s) + z\,\sigma(e_s) \qquad (8)$$

wherein:

E_seq in formula (9) is the set of continuous sequences of e_a ∈ e_a;

ε is determined by z ∈ Z, where Z is an ordered set of positive integers representing the number of standard deviations above μ(e_s), and the value of z depends on experimental data; once argmax(ε) is determined, each anomalous sequence e_seq ∈ E_seq that produces a mean square error is given a score s that indicates the severity of the anomaly, i.e. the fatigue factor:

then, the highest mean square error in each abnormal error sequence is normalized according to the distance between the highest mean square error and the selected threshold value delta;

the dynamic adjustment of the fatigue factor threshold value delta according to the actual situation specifically comprises:

is still an anomalous sequence; if the decrease d⁽ⁱ⁾ is less than the minimum decrease p, the decreases of all subsequent errors

are reclassified as normal; after the anomalous sequences are removed, the mean square error of the remaining normal sequences is recalculated and used as the adjusted fatigue factor threshold Δ;

2. The LSTM-based fatigue factor latent anomaly detection and fault diagnosis method of claim 1, wherein the process of constructing the fault timing sequence data classification model based on the LSTM deep neural network in step S3 comprises the following steps:

S3-3: numbering the IMF vectors of the n classes of time-series data to obtain the model input data X_C, and determining the model output data y_C from the numbers;

S3-5: dividing the data set: randomly arranging the data and dividing the input data X'_C and output data y_C in the same proportion to obtain a training data set {X_train, y_train} and a validation data set {X_val, y_val};

S3-7: classifying using the model: inputting a segment of model input data X_d to obtain a classification vector y'; the fault number is y'_d = argmax(y'), where argmax returns the index of the maximum value, and the corresponding fault is found from this number.