CN117010263A - Residual life prediction method based on convolutional neural network and long short-term memory network - Google Patents
- Publication number
- Publication number: CN117010263A (application CN202310339141.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F30/27 — Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM]
- G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/217 — Validation; performance evaluation; active pattern learning techniques
- G06F18/23213 — Non-hierarchical clustering with fixed number of clusters, e.g. K-means clustering
- G06F18/24 — Classification techniques
- G06N3/0442 — Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/045 — Combinations of networks
- G06N3/0464 — Convolutional networks [CNN, ConvNet]
- G06N3/048 — Activation functions
- G06N3/0499 — Feedforward networks
- G06N3/084 — Backpropagation, e.g. using gradient descent
- G06N3/09 — Supervised learning
- G06F2119/02 — Reliability analysis or reliability optimisation; failure analysis [FMEA]
- G06F2119/04 — Ageing analysis or optimisation against ageing
Abstract
The invention discloses a residual life prediction method based on a convolutional neural network (CNN) and a long short-term memory (LSTM) network, and relates to the field of remaining useful life (RUL) prediction. The method comprises two main parts: data preprocessing and model training. The preprocessing stage identifies the operating-condition information of the run-to-failure equipment monitoring data, normalizes and standardizes the data by operating-condition category, and applies sliding-window processing to the preprocessed monitoring data and the corresponding RUL to obtain input samples and output labels. In the model training stage, training samples are fed into a CNN-LSTM model for time-series feature extraction and degradation-correlation modeling; the predicted RUL of the input is obtained by forward propagation, the error between the true and predicted values is computed, the model parameters are updated by backpropagating the loss, and the process is repeated until the prediction loss falls within a certain range and stabilizes. Dropout and early stopping are introduced to reduce the negative effect of overfitting on model prediction performance, improving the accuracy of RUL prediction and providing a better solution for modeling the degradation process in RUL prediction.
Description
Technical Field
The invention relates to the field of residual life prediction, in particular to residual life prediction applications based on a convolutional neural network and a long short-term memory network.
Prior Art
Data-driven remaining useful life (Remaining Useful Life, RUL) prediction methods have become the most widely used class of prediction methods owing to their powerful data-processing capability. However, existing research mainly performs deep feature extraction and RUL prediction on the sensor data of monitored equipment, and gives little consideration to prediction under complex operating-condition environments. Given the many noise factors present in real industrial environments, accounting for the influence of operating conditions on RUL prediction results has important application value.
A literature survey shows that various neural-network models perform excellently in modeling the degradation process and the RUL of complex systems, exhibiting powerful nonlinear approximation capability. However, most algorithms only learn the mapping between data and RUL, without considering uncertainty in the modeling process. Sbarufatti et al., in Sequential Monte-Carlo sampling based on a committee of artificial neural networks for posterior state estimation and residual lifetime prediction, combine a feed-forward neural network with a Monte Carlo method to obtain the RUL probability distribution of a part affected by fatigue cracks, enabling real-time detection of the part's damage condition. Yang et al., in Remaining Useful Life Prediction Based on a Double-Convolutional Neural Network Architecture, propose a model combining two convolutional neural networks (Convolutional Neural Network, CNN) for RUL prediction: the first convolutional network identifies the initial failure point of each component, and the second establishes a reliable mapping between intermediate variables and RUL. Deep neural networks can also be used to model health indexes (HI): Chen et al., in An integrated deep learning-based approach for automobile maintenance prediction with GIS data, introduce a Cox proportional hazards model to construct the HI and model it with several proposed long short-term memory (Long Short-Term Memory, LSTM) deep network structures, showing that the LSTM model excels in prediction accuracy. The key to building an RUL prediction model based on deep neural networks is to accurately construct a mapping function from the input monitoring data to the target health-state index.
In addition, a neural network integrates data feature processing and modeling analysis into a single network structure, realizing end-to-end RUL prediction; this guarantees prediction accuracy while greatly simplifying the traditional RUL prediction workflow.
At present, deep-learning-based life prediction methods mainly perform deep feature extraction and RUL prediction on equipment monitoring data, while prediction under complex operating-condition environments remains little studied. Given the variability and randomness of the actual operating environment of equipment, accounting for the influence of operating conditions on life prediction results is both necessary and valuable. A deep-learning-based RUL method is therefore needed to solve RUL prediction under complex operating conditions.
Disclosure of Invention
Aiming at the above problems, the invention provides an RUL prediction method based on a convolutional neural network and a long short-term memory network. First, in the data preprocessing stage, a K-means method identifies the operating condition of the collected monitoring data to reduce the influence of the equipment's environmental conditions on model performance, and a sliding window generates the three-dimensional sample format that the LSTM handles well. Then a CNN extracts deep features from the condition-analyzed data. Finally, an LSTM fits the extracted features to establish a time-series degradation model, which is extrapolated to complete the RUL prediction task.
The overall data flow framework of the invention is shown in fig. 1, and is mainly divided into two parts of data preprocessing and training models.
(1) The preprocessing stage identifies the operating-condition information of the run-to-failure equipment monitoring data, normalizes and standardizes the data by operating-condition category, and applies sliding-window processing to the preprocessed monitoring data and the corresponding RUL to obtain input samples and output labels.
(2) In the model training stage, training samples are fed into the CNN-LSTM model for time-series feature extraction and degradation-correlation modeling. The predicted RUL of the input is obtained by forward propagation, the error between the true and predicted values is computed, the model parameters are updated by backpropagating the loss function, and the process is repeated until the prediction loss falls within a certain range and stabilizes.
Step one: selecting a raw data set
Several sensors are selected to acquire data according to the actual condition of the maintained equipment, yielding the raw data set.
Step two: dataset preprocessing
(1) Data condition information identification
Since the same equipment works differently under different configurations, the operating-condition information of the equipment in its different running states must be identified. Several data points are randomly selected from the raw data set as initial cluster centroids; the K-means algorithm is then run to learn the data distribution of the training set, the operating conditions of the training and test data are classified, and a new label characterizing the condition information is appended to the data.
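The condition-identification step above can be sketched as a plain-NumPy K-means. The patent selects random initial centroids; this sketch uses a deterministic farthest-point initialization so it is reproducible — the function name and that initialization choice are illustrative, not from the patent.

```python
import numpy as np

def kmeans_condition_labels(ops, k, n_iter=100):
    """Cluster operating-setting vectors `ops` (shape [N, d]) into k
    operating conditions and return one integer condition label per row."""
    # deterministic farthest-point init (patent uses random points)
    centroids = [ops[0].astype(float)]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(ops - c, axis=1) for c in centroids], axis=0)
        centroids.append(ops[d.argmax()].astype(float))
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # assign every point to its nearest centroid, then recompute means
        dist = np.linalg.norm(ops[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([ops[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids
```

The returned labels can then be appended to the monitoring data as the extra operating-condition column described above.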
(2) Data normalization and standardization
The running states of equipment working under different operating conditions necessarily differ, so the data must be normalized and standardized using the condition labels obtained in (1).
Equation (1) performs the min-max normalization of the data:

x′_d = (x_d − x_{d,min}) / (x_{d,max} − x_{d,min})  (1)

where x_d is the raw value of the d-th sensor, x_{d,max} and x_{d,min} are its maximum and minimum over the whole training set, and x′_d is the normalized value.

Equation (2) standardizes the data while retaining its statistical distribution information:

x″_d = (x_d − u_d) / σ_d  (2)

where u_d and σ_d denote the mean and standard deviation of the d-th sensor, respectively.
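A minimal sketch of equations (1) and (2) applied per condition label, assuming (as described above) that the statistics are computed within each operating-condition group of the training set; the function names are illustrative.

```python
import numpy as np

def normalize_by_condition(train, labels, eps=1e-8):
    """Min-max normalize each sensor column within each condition label,
    per equation (1); also returns per-condition stats for reuse on test data."""
    out = np.empty_like(train, dtype=float)
    stats = {}
    for c in np.unique(labels):
        m = labels == c
        lo, hi = train[m].min(axis=0), train[m].max(axis=0)
        out[m] = (train[m] - lo) / (hi - lo + eps)   # equation (1)
        stats[c] = (lo, hi)
    return out, stats

def standardize_by_condition(train, labels, eps=1e-8):
    """Z-score each sensor column within each condition label, per equation (2)."""
    out = np.empty_like(train, dtype=float)
    for c in np.unique(labels):
        m = labels == c
        out[m] = (train[m] - train[m].mean(axis=0)) / (train[m].std(axis=0) + eps)
    return out
```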
(3) Time series sample generation
To convert the raw two-dimensional signal into the three-dimensional matrix of the LSTM input format, sliding-window processing is required; the principle of generating multivariate time-series samples with a sliding time window is shown in fig. 2.
Equation (3) describes the sliding-window processing. The input X^i ∈ R^{T×n} denotes all monitoring data collected from the i-th device, where T is the number of time points and n is the number of sensors. N_tw denotes the sliding-window size, and the total information obtained by the window at the k-th time point is

X^i_k = [x^i_{k−N_tw+1}, …, x^i_k] ∈ R^{N_tw×n}  (3)

The window slides from left to right with step 1 until the last time point T.
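The windowing of equation (3) can be sketched as follows; pairing each window with the RUL at its right edge follows the sample/label construction described above, and the function name is illustrative.

```python
import numpy as np

def sliding_windows(X, rul, n_tw):
    """Slice a [T, n] monitoring matrix into overlapping [n_tw, n] windows
    (step 1, left to right), per equation (3); each window is paired with
    the RUL at its rightmost time point as the training label."""
    T = X.shape[0]
    samples = np.stack([X[k - n_tw:k] for k in range(n_tw, T + 1)])
    labels = rul[n_tw - 1:]
    return samples, labels   # samples: [T - n_tw + 1, n_tw, n]
```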
(4) RUL tag generation
Considering the actual operation of the devices, each device has an initial period of normal operation during which the RUL is treated as unchanged. According to the piecewise linear model shown in FIG. 3, a threshold representing the healthy RUL of the device is set: while the true RUL is above the threshold the label is held constant at the threshold, and once it falls below the threshold the device is considered to degrade linearly down to failure.
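The piecewise-linear labeling above reduces to a clipped countdown for one run-to-failure trajectory; a minimal sketch (function name illustrative):

```python
import numpy as np

def piecewise_rul(total_life, threshold):
    """Piecewise-linear RUL labels for one trajectory of length `total_life`:
    held at `threshold` during early healthy operation, then decreasing
    linearly to 0 at failure (Fig. 3)."""
    linear = np.arange(total_life - 1, -1, -1)   # T-1, T-2, ..., 0
    return np.minimum(linear, threshold)
```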
Step three: construction of CNN-LSTM network model
The network structure of the CNN-LSTM model is shown in fig. 4. An input sample is a two-dimensional matrix, with one dimension for the number of features and the other for the time-sequence length; weight-shared convolution kernels slide along the time dimension of the sample matrix with step 1 to extract features, realizing a 1-D convolution. To keep the input and output the same size so that a feature-vector representation is obtained at each time point, the convolution input is zero-padded in the time dimension; each convolution layer is followed by an activation layer, and no pooling downsampling is added, to avoid information loss. Feature vectors that fully describe the degradation information are obtained through feature screening by 3 CNN layers; a 1-layer LSTM network then models the temporal correlation of the degrading system's health index, and a multilayer perceptron (Multilayer Perceptron, MLP) fits the health-state representation vectors extracted by the deep network. To reduce the possibility of overfitting during training, a Dropout method is introduced: as shown in fig. 5, some neurons are randomly deactivated with a certain probability during training so that they do not participate in network training, enhancing the model's generalization capability.
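The architecture above — 3 same-padded 1-D convolution layers without pooling, one LSTM layer, dropout, and an MLP head — can be sketched in PyTorch. Layer widths, kernel size, activation choice, and dropout rate below are illustrative assumptions, not values fixed by the patent (Table 2 gives the selected settings).

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """Sketch of the described CNN-LSTM: 3 same-padded 1-D conv layers
    (no pooling), a 1-layer LSTM, dropout, and an MLP head mapping the
    last hidden state to a scalar RUL."""
    def __init__(self, n_sensors, hidden=64, channels=32, p_drop=0.2):
        super().__init__()
        layers, c_in = [], n_sensors
        for _ in range(3):
            # kernel 3 with padding 1 keeps the sequence length unchanged
            layers += [nn.Conv1d(c_in, channels, kernel_size=3, padding=1),
                       nn.Tanh()]
            c_in = channels
        self.cnn = nn.Sequential(*layers)
        self.lstm = nn.LSTM(channels, hidden, batch_first=True)
        self.head = nn.Sequential(nn.Dropout(p_drop), nn.Linear(hidden, 1))

    def forward(self, x):          # x: [batch, time, n_sensors]
        z = self.cnn(x.transpose(1, 2)).transpose(1, 2)  # conv over time axis
        out, _ = self.lstm(z)
        return self.head(out[:, -1]).squeeze(-1)         # one RUL per sample
```

A forward pass on a batch of sliding-window samples, e.g. `CNNLSTM(n_sensors=14)(torch.randn(4, 30, 14))`, yields one RUL estimate per sample.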
Step four: model training
(1) Parameter update
During parameter updating, a mini-batch gradient descent (MBGD) algorithm is adopted: it exploits the efficient matrix operations of deep learning while avoiding the fluctuation that updating parameters with single-sample gradients may cause. The gradient of each layer is first computed by the chain rule of the backpropagation (BP) algorithm, and the weight parameters w^(l) of layer l are then updated by MBGD:

w^(l) ← w^(l) − η δ^(l)  (4)

where δ^(l) denotes the gradient of layer l and η is the learning rate.
When updating convolution-layer parameters, the convolution kernel must be zero-padded by one ring and rotated by 180 degrees to propagate the gradient error, from which the updated weight matrix W^(l) is obtained. The specific steps are given by equations (6)–(10), of which:

α^(l) = f_l(net^(l))  (6)

net^(l+1) = conv(W^(l+1), α^(l))  (7)

δ^(l) = δ^(l+1) ∗ rot180(W^(l+1)) ⊙ f_l′(net^(l))  (8)

where f_l(·) denotes the activation function after the l-th convolution layer, W^(l+1) is the weight matrix of the convolution layer, α^(l) is the output of layer l, conv(·) denotes the convolution operation, net^(l) is the feature map obtained by the convolution operation, and rot180(·) denotes a 180-degree rotation.
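A minimal single-channel 1-D sketch of equation (8), assuming the 'same'-padded convolutions described in step three; `np.convolve` flips its second argument internally, which realizes the rot180(W) term, and the function name is illustrative.

```python
import numpy as np

def backprop_delta(delta_next, W_next, f_prime_net):
    """Equation (8), single-channel 1-D case: propagate delta^(l+1) back
    through a same-padded convolution by convolving with the 180-degree
    rotated kernel, then gate by the activation derivative f_l'(net^(l))."""
    # np.convolve reverses W_next internally = rot180 for a 1-D kernel;
    # mode='same' matches the zero-padded, length-preserving forward conv
    return np.convolve(delta_next, W_next, mode='same') * f_prime_net
```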
The early-stopping method is adopted to improve training efficiency. As shown in fig. 6, when after several epochs the loss on the training set keeps decreasing while the loss on the validation set begins to rise, the model can be considered to be overfitting; network training is then stopped and the model weight parameters W^(l) are saved, completing the establishment of the prediction model.
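The stopping rule above can be sketched as a patience check on the validation-loss history; the patience value and function name are illustrative assumptions.

```python
def early_stopping(val_losses, patience=5):
    """Return the epoch whose weights should be kept: the best epoch, once
    the validation loss has failed to improve for `patience` consecutive
    epochs (Fig. 6); return -1 while training should continue."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch   # save weights W at this epoch
        elif epoch - best_epoch >= patience:
            return best_epoch                # stop and restore saved weights
    return -1
```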
(2) Model parameter optimization
The Adam adaptive optimization algorithm is adopted to optimize the model parameters: the first moment controls the update direction, and the second moment controls the learning rate. Compared with other optimization algorithms, Adam is insensitive to gradient scale and is suitable for optimizing deep models with sparse parameters or high complexity.
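One Adam update step, written out to show the two moments described above; default hyperparameters follow the common Adam settings, which are an assumption here.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: the first moment m steers the update direction,
    the second moment v adapts the per-parameter step size, so the step
    magnitude is roughly `lr` regardless of the raw gradient scale."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)              # bias correction, t >= 1
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

Note the scale insensitivity: a gradient of 10 and a gradient of 1000 both produce a first step of about `lr`.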
Step five: model evaluation
To observe the optimization of the model's prediction performance more directly, a model loss function is constructed based on the root mean square error (Root Mean Square Error, RMSE):

RMSE = sqrt( (1/n) Σ_{i=1}^{n} (RUL_pred_i − RUL_true_i)² )  (11)

where n is the number of test units, and RUL_pred_i and RUL_true_i denote the predicted and true RUL of the i-th test unit, respectively.
The effectiveness of the model is evaluated using the Score function, as shown in equation (12):

Score = Σ_{i=1}^{n} s_i, with s_i = e^{−d_i/13} − 1 for d_i < 0 and s_i = e^{d_i/10} − 1 for d_i ≥ 0  (12)

where Score is the total prediction scoring function, n is the number of test units, and d_i = RUL_pred_i − RUL_true_i is the prediction error of the i-th unit; the asymmetric penalty punishes late predictions (d_i > 0) more heavily than early ones.
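Both evaluation metrics can be sketched directly from equations (11) and (12); the 13/10 denominators follow the standard asymmetric scoring function used with the C-MAPSS data sets.

```python
import numpy as np

def rmse(rul_pred, rul_true):
    """Root mean square error over the n test units, equation (11)."""
    return float(np.sqrt(np.mean((rul_pred - rul_true) ** 2)))

def score(rul_pred, rul_true):
    """Asymmetric scoring function, equation (12): late predictions
    (d_i > 0, i.e. overestimated RUL) are penalised more heavily."""
    d = rul_pred - rul_true
    s = np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0)
    return float(np.sum(s))
```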
Step six: RUL prediction
The test samples are input into the prediction model, and the RUL prediction results of the test samples at all times are output through forward calculation.
The invention provides a CNN-LSTM-based RUL prediction method: it analyzes the characteristics of data collected under complex operating conditions and, combining the structure and characteristics of the network, proposes a preprocessing method for multi-condition data. A specific CNN-LSTM network model is constructed, and concrete steps for equipment life prediction are given. Because of the overfitting risk in neural-network training, Dropout and early-stopping methods are introduced to reduce the negative influence of overfitting on model prediction performance, improving the accuracy of RUL prediction and providing a better solution for modeling the degradation process in RUL prediction.
Drawings
FIG. 1 is a flow chart of a RUL prediction method based on CNN-LSTM
FIG. 2 is a schematic diagram of a sliding window generation sequence sample
FIG. 3 is a piecewise linearized RUL tag model
FIG. 4 is a diagram of the network structure of CNN-LSTM
FIG. 5 is a dropout schematic diagram
FIG. 6 is a schematic diagram of an early stop method
FIG. 7 is a schematic view of raw degradation trends of 21 sensors of FD002 data set
FIG. 8 is a graph of the result of FD002 engine data clustering
FIG. 9 is a schematic view showing the degradation trend of 21 sensors in the FD002 data set after normalization
FIG. 10 is a diagram showing experimental results of parameters of network structure
FIG. 11 is a graph of experimental results of input sample parameters
FIG. 12 is a FD001 dataset 100 test set engine prediction results
FIG. 13 is a graph of C-MAPSS test subset single engine predictions
Detailed Description of Embodiments
The residual life prediction method based on the convolutional neural network and the long short-term memory network is verified on a concrete case; the process comprises the following steps:
step 1: selecting a raw data set
The effects of the invention are demonstrated and verified on the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) data sets.
The C-MAPSS data set comprises 4 subsets, with details shown in Table 1. FD001 and FD003 contain engine degradation data under a single operating condition, while FD002 and FD004 contain engine degradation data under six different operating conditions; the proposed CNN-LSTM life prediction method is therefore verified on data sets with different operating conditions.
Table 1. C-MAPSS engine data set profile
Each data subset contains 21 sensor channels and 3 operational-setting channels; according to the operational settings, the running condition of an engine can be further divided into 6 categories, each of which affects the corresponding sensor readings. FIG. 7 shows the raw degradation data of the 21 sensors of one engine in the FD002 data set. Because the operating conditions affect engine performance, the raw degradation data of the 21 sensors in FD002 fluctuate markedly across the whole degradation process, so further condition-wise analysis is required to screen the sensors and extract effective degradation information.
Step 2: data preprocessing
(1) Data condition information identification
The K-means algorithm is first used to identify and label the operating conditions of the FD002 and FD004 data sets, so that the monitoring data can be standardized according to the current running state. Fig. 8 shows the clustering result of the FD002 operational settings; the intervals between the settings of different conditions are large, so a good clustering effect is achieved.
(2) Data normalization and normalization process
The data are then normalized; the degradation traces of the 21 sensors after normalization are shown in fig. 9. Individual sensors such as T24, T30 and T50 show a clear trend after normalization, but sensors such as T2, P20 and P15 still yield no effective degradation information.
(3) Time series sample generation
The sensors retained after preliminary screening are processed with the sliding-window technique to generate time-series samples.
(4) RUL tag generation
The threshold for the RUL labels is set to 120 according to the piecewise linear model.
Step 3: constructing and training CNN-LSTM network model
The main parameters affecting prediction performance are: the number of CNN convolution layers, the number of hidden units of the LSTM network, the sequence length of the samples, and the training batch size. These parameters are tuned separately by a controlled single-variable method; in addition, to eliminate the influence of random network-parameter initialization on the experimental results, 5 repeated experiments are performed for each parameter setting and the evaluation results are averaged.
Fig. 10 shows the influence of the number of convolution layers and the LSTM hidden-unit dimension on model prediction performance. As the number of convolution layers increases, the feature-extraction capability of the model improves and the Score and RMSE values gradually decrease; when the number of convolution layers reaches 3, prediction performance is comparatively optimal. Increasing the number of layers further raises model complexity and the risk of overfitting, and prediction performance tends to decline. With the number of convolution layers fixed at 3, the results show a similar pattern as the LSTM hidden-unit dimension increases, and the prediction error reaches its lowest point when the dimension is 64.
Besides the network structure, the size of the training samples is another important parameter affecting the prediction performance of the proposed method, particularly the convergence speed of the model. As seen in fig. 11, as the sample sequence length increases, the network needs more time to analyze the input information, so the model converges more slowly. At the same time, a longer sequence carries more information per sample, so the prediction error gradually decreases and the prediction accuracy improves. In particular, when the sequence length increases from 20 to 25, the prediction accuracy improves markedly, from which it can be inferred that longer sample sequences contain enough historical information to help predict the degradation trend. The training batch size is also an important factor affecting prediction performance: as seen in the right panel of fig. 11, as the batch size increases, the same data are processed faster, the gradient-descent direction stabilizes, and the model converges faster.
In summary, the selected network configuration parameter settings are shown in Table 2.
Table 2 network parameter settings
Step 4: RUL prediction results and analysis
In the proposed CNN-LSTM life prediction method, a degradation model of the equipment is first obtained through supervised network training in the offline stage; then, in the online RUL prediction stage, the processed data are fed into the stored trained model and the RUL is obtained directly. Fig. 12 shows the RUL predictions at the last monitoring point of the 100 engines in the FD001 test set. The predicted RUL essentially coincides with the true RUL labels, and the prediction accuracy of the neural-network-based RUL model is greatly improved compared with the particle-filtering-based method.
One engine is randomly selected from each of the four test sets for RUL prediction. Fig. 13 shows the predicted RUL of engines No. 76, No. 190, No. 99 and No. 126 over time. Throughout the degradation process of the engines under the four different operating conditions, the RUL predicted by the proposed method closely follows the trend of the corresponding true RUL labels, and the error between the predicted and true values is small. From the figures the following can be concluded:
(1) The proposed algorithm gives accurate life estimates for the four pieces of equipment under different working conditions, demonstrating strong model generalization, and the estimated RUL converges toward the true value as input information accumulates. Early in monitoring the equipment can be considered to be in its normal operation phase, so the predicted value fluctuates around a constant. As monitoring time increases the equipment enters the degradation stage and the failure trend becomes more pronounced; the predicted RUL then gradually approaches the true label value and the prediction accuracy of the model improves. The method provides a high-accuracy health-state assessment in the period shortly before failure, which is of important guiding significance for preparing predictive maintenance plans and makes its prediction performance very valuable in practical industrial applications.
(2) The complexity of the operating conditions has a large impact on model prediction performance. The figure gives the RUL predictions for randomly selected engines under different working conditions; comparing the fluctuation of the RUL prediction curves, engines No. 76 and No. 99 (FD001 and FD003, single working condition) fluctuate less than engines No. 190 and No. 126 (FD002 and FD004, complex working conditions). In particular, in the stage close to failure, the predicted RUL under the simple working conditions gradually converges and essentially coincides with the true label value, whereas the predicted RUL under the complex working conditions still fluctuates about the true label. The proposed CNN-LSTM algorithm performs cluster analysis of the working conditions before the data are fed into the deep network model, which helps reduce the influence of working-condition noise on the prediction model and improves the prediction accuracy of the method in practical scenarios.
In addition, the model prediction performance was measured by RMSE and Score; the results are shown in Table 3.
Table 3 RUL prediction performance based on CNN-LSTM
Combining the RUL prediction results with the evaluations given by the two model-performance metrics, the CNN-LSTM-based RUL prediction method accurately predicts the RUL, is algorithmically robust, and delivers stable and effective prediction performance under different working-condition environments.
Claims (1)
1. The residual life prediction method based on the convolutional neural network and the long-term and short-term memory network is characterized by comprising the following steps:
step one: selecting a raw data set
Selecting a plurality of sensors to acquire data according to the actual condition of maintenance equipment, so as to obtain an original data set;
step two: dataset preprocessing
(1) Data condition information identification
Randomly selecting a plurality of data points from the original data set as initial clustering centroids, calling the K-means algorithm to learn the data distribution of the training set, classifying the working conditions of the training-set and test-set data, and adding to the data a new label representing the working-condition information;
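The condition-identification step above can be sketched with a minimal plain-NumPy k-means (the two-cluster toy data and the cluster count are illustrative assumptions, not values fixed by this method):

```python
import numpy as np

def kmeans(data, k, n_iter=100, seed=0):
    """Minimal k-means: returns (centroids, labels) for data of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    # Randomly select k data points as the initial clustering centroids.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every sample to its nearest centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned samples.
        new = np.array([data[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels

# Toy "operating condition" settings: two well-separated regimes.
ops = np.vstack([np.random.default_rng(1).normal(0.0, 0.1, (50, 3)),
                 np.random.default_rng(2).normal(5.0, 0.1, (50, 3))])
_, cond_label = kmeans(ops, k=2)   # cond_label plays the role of the working-condition tag
```

In practice a library implementation (e.g. scikit-learn's KMeans) would normally replace this hand-rolled loop.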
(2) Data normalization and normalization process
The running states of equipment working under different working-condition environments necessarily differ, so the data must be normalized and standardized in combination with the labels obtained in (1);
Equation (1) performs the min-max normalization of the data:

x̂_d = (x_d − x_d-min) / (x_d-max − x_d-min)  (1)

where x_d represents the original value of the d-th sensor, x_d-max and x_d-min represent the maximum and minimum values of that sensor over all training sets, and x̂_d represents the normalized value;
Equation (2) completes the standardization of the data and preserves its statistical distribution information:

x̄_d = (x̂_d − u_d) / σ_d  (2)

where u_d and σ_d represent the mean and standard deviation of the d-th sensor, respectively;
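The two scaling operations of equations (1) and (2) can be sketched in NumPy as follows (a minimal sketch; grouping by the working-condition label from step (1) is omitted for brevity, and the toy matrix is illustrative):

```python
import numpy as np

def min_max_normalize(x, x_min, x_max):
    # Equation (1): scale each sensor channel into [0, 1] using training-set extrema.
    return (x - x_min) / (x_max - x_min)

def standardize(x, mean, std):
    # Equation (2): zero-mean, unit-variance scaling that keeps the distribution shape.
    return (x - mean) / std

train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])  # rows: time, cols: sensors
x_min, x_max = train.min(axis=0), train.max(axis=0)
norm = min_max_normalize(train, x_min, x_max)
stdz = standardize(norm, norm.mean(axis=0), norm.std(axis=0))
```

Note that the extrema, mean and standard deviation are computed on the training set only and then reused unchanged on the test set.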
(3) Time series sample generation
In order to convert the original two-dimensional signal into a three-dimensional matrix in LSTM input data format, sliding window processing is required;
Equation (3) represents the sliding-window processing; the input X^i denotes all the monitoring data collected by the i-th device, X^i ∈ R^(T×n), where n represents the number of sensors and T the number of monitoring time points; N_tw indicates the size of the sliding window, so the total information obtained by the window at the k-th time point is X^i_(k−N_tw+1 : k) ∈ R^(N_tw×n); the window slides from left to right with step 1 until the last time point T;
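The sliding-window processing of equation (3) can be sketched as follows (a minimal NumPy sketch; the helper name and toy dimensions are illustrative):

```python
import numpy as np

def sliding_windows(x, n_tw):
    """Turn a (T, n) monitoring matrix into (T - n_tw + 1, n_tw, n) samples,
    sliding the window from left to right with step 1."""
    T, n = x.shape
    return np.stack([x[k:k + n_tw] for k in range(T - n_tw + 1)])

x = np.arange(24, dtype=float).reshape(8, 3)   # T = 8 time steps, n = 3 sensors
samples = sliding_windows(x, n_tw=5)            # 4 overlapping training samples
```

Each three-dimensional sample batch produced this way matches the LSTM input format (samples, time steps, features).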
(4) Actual RUL tag generation
Each piece of equipment has an initial normal operating period during which the RUL can be considered unchanged; therefore, a threshold representing the healthy RUL of the device is set according to a piecewise linear model: while the device operates normally the RUL is held constant, and once the RUL of the device falls below the threshold the device is considered to start degrading linearly;
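The piecewise linear RUL labeling can be sketched as below; the threshold of 125 cycles is a value commonly used with the C-MAPSS data and is an assumption here, not a value fixed by the method:

```python
import numpy as np

def piecewise_rul(total_life, max_rul=125):
    """Piecewise linear RUL label: held constant at max_rul while the device is
    healthy, then decreasing linearly by 1 per cycle down to 0 at failure."""
    linear = np.arange(total_life - 1, -1, -1)   # true remaining cycles at each step
    return np.minimum(linear, max_rul)           # clip the healthy early phase

labels = piecewise_rul(200)   # a unit that fails after 200 cycles
```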
step three: construction of CNN-LSTM network model
A two-dimensional matrix sample is input, one dimension being the number of features and the other the time-series length; a weight-shared convolution kernel slides over the time dimension of the sample matrix with step 1, realizing a 1-dimensional convolution operation; in addition, to keep the input and output the same size and obtain a feature-vector representation of every time point, the convolution input is zero-padded in the time dimension, and each convolutional layer is followed by an activation layer; feature vectors are obtained through feature screening by 3 layers of CNN, the health index of the degrading system is modeled by a 1-layer LSTM network, and an MLP is selected to fit the extracted health-state representation vectors; to reduce the likelihood of overfitting during training, the Dropout method is introduced: during training some neurons are randomly deactivated with a certain probability so that they do not participate in network training, which enhances the generalization capability of the model;
completing establishment of a CNN-LSTM network model;
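The zero-padded, stride-1, 1-dimensional convolution described in step three can be sketched in NumPy; this is an illustrative sketch (ReLU is assumed as the activation), not the patented implementation:

```python
import numpy as np

def conv1d_same(x, kernel):
    """1-D convolution over the time axis with stride 1 and zero 'same' padding,
    so every time point keeps a feature-vector representation.
    x: (T, n_in) time series; kernel: (k, n_in, n_out) weight-shared filter."""
    k, n_in, n_out = kernel.shape
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))   # fill the time dimension with zeros
    T = x.shape[0]
    out = np.empty((T, n_out))
    for t in range(T):
        # Slide the shared kernel along time; one dot product per output channel.
        out[t] = np.tensordot(xp[t:t + k], kernel, axes=([0, 1], [0, 1]))
    return np.maximum(out, 0.0)            # activation layer (ReLU assumed)

x = np.random.default_rng(0).normal(size=(30, 14))   # 30 time steps, 14 features
w = np.random.default_rng(1).normal(size=(3, 14, 8))
h = conv1d_same(x, w)                                 # same time length, 8 channels
```

In practice the three stacked convolutions, the LSTM layer and the MLP head would be built with a deep-learning framework; the sketch only illustrates why the output keeps one feature vector per time point.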
step four: model training
(1) Parameter update
A mini-batch gradient descent (MBGD) algorithm is adopted in the parameter-updating process; first the gradient of each layer is calculated using the chain rule of the BP algorithm, and then the weight parameter w^(l) of that layer is updated by the MBGD algorithm:

w^(l) ← w^(l) − η · δ^(l)

where δ^(l) represents the gradient of the l-th layer, and η is the learning rate;
When the parameters of a convolutional layer are updated, the convolution kernel is padded with a ring of zeros and rotated by 180 degrees to obtain the gradient error, and the updated weight parameter matrix W^(l) is obtained from that gradient error; the specific steps are shown in equations (6)-(10):
α^(l) = f_l(net^(l))  (6)

net^(l+1) = conv(W^(l+1), α^(l))  (7)

δ^(l) = δ^(l+1) ∗ rot180(W^(l+1)) ⊙ f_l′(net^(l))  (8)
where f_l(·) represents the activation function used after the l-th convolutional layer, W^(l+1) represents the weight matrix of the (l+1)-th convolutional layer, α^(l) represents the output of layer l, conv(·) represents the convolution operation, net^(l) represents the feature map obtained through the convolution operation, and rot180(·) represents rotation by 180 degrees;
The early-stopping method is adopted to improve the training efficiency of the model: when, after multiple epochs of training, the loss on the training set keeps decreasing but the loss on the validation set begins to rise, the model can be considered to be overfitting; at that point network training is stopped and the model weight parameters W^(l) are saved, completing the establishment of the prediction model;
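The early-stopping rule can be sketched as a simple patience loop (the patience value and the loss sequence are illustrative assumptions):

```python
def train_with_early_stopping(epochs, val_losses, patience=5):
    """Stop when the validation loss has not improved for `patience` epochs;
    return the index of the epoch whose weights should be kept."""
    best, best_epoch, wait = float("inf"), 0, 0
    for epoch in range(epochs):
        loss = val_losses[epoch]   # stands in for one epoch of training + validation
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0   # save model weights here
        else:
            wait += 1
            if wait >= patience:
                break              # overfitting suspected: stop network training
    return best_epoch

# Validation loss falls, then rises: early stopping keeps the epoch-4 weights.
losses = [1.0, 0.8, 0.6, 0.5, 0.45, 0.47, 0.5, 0.55, 0.6, 0.7, 0.8]
kept = train_with_early_stopping(len(losses), losses)
```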
(2) Model parameter optimization
The model parameters are optimized using the Adam adaptive optimization algorithm, which controls the update direction of the model with the first moment and the learning rate with the second moment; compared with other optimization algorithms it is insensitive to gradient scale, making it suitable for optimizing deep models with sparse parameters or high complexity;
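A single Adam update can be sketched in NumPy as below (the standard default hyperparameters are assumed); the usage example minimizes the toy objective f(w) = w²:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: the first moment m steers the update direction,
    the second moment v adapts the per-parameter step size."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)            # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)            # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize f(w) = w^2 starting from w = 1; the gradient is 2w.
w, m, v = np.array(1.0), 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.01)
```

Because the step size is normalized by sqrt(v_hat), the update magnitude stays near the learning rate regardless of the raw gradient scale, which is the insensitivity described above.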
step five: model evaluation
To observe the optimization of the model's prediction performance more directly, a model loss function is constructed based on the root mean square error (Root Mean Square Error, RMSE):

RMSE = sqrt( (1/n) · Σ_(i=1)^n (RUL_pred_i − RUL_true_i)² )  (11)

where n is the number of test units, and RUL_pred_i and RUL_true_i represent the predicted RUL and the true RUL of the i-th test unit, respectively;
evaluating the model for effectiveness using a Score function, as shown in (12)
Where Score is the total predictive scoring function, n represents the number of test units, d i Representing the prediction error of the i-th unit;
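The two evaluation metrics can be sketched in NumPy; the time constants 13 and 10 in the Score function are the values commonly used with the C-MAPSS benchmark and are assumed here:

```python
import numpy as np

def rmse(rul_pred, rul_true):
    # Root mean square error over the n test units.
    return float(np.sqrt(np.mean((rul_pred - rul_true) ** 2)))

def score(rul_pred, rul_true):
    # Asymmetric scoring: late predictions (d_i >= 0) are penalized
    # more heavily than early ones (d_i < 0).
    d = rul_pred - rul_true
    return float(np.sum(np.where(d < 0, np.exp(-d / 13.0), np.exp(d / 10.0)) - 1.0))

pred = np.array([48.0, 52.0, 30.0])   # illustrative predictions
true = np.array([50.0, 50.0, 30.0])   # illustrative true RUL labels
```

For the same absolute error, the overestimate (52 vs 50) contributes more to Score than the underestimate (48 vs 50), reflecting that predicting a failure too late is the costlier mistake.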
step six: RUL prediction
Inputting the test samples into the prediction model, and outputting the RUL prediction results of the test samples at all time points through forward computation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310339141.5A CN117010263A (en) | 2023-04-01 | 2023-04-01 | Residual life prediction method based on convolutional neural network and long-term and short-term memory network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117010263A true CN117010263A (en) | 2023-11-07 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117545122A (en) * | 2023-12-07 | 2024-02-09 | 广东省科技基础条件平台中心 | LED lamp array control method, device, storage medium and equipment |
CN118194049A (en) * | 2024-05-17 | 2024-06-14 | 山东省盈鑫彩钢有限公司 | Method for predicting loss data of aluminum-zinc plated steel plate |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||