CN112287596A

CN112287596A - Engine residual life prediction method based on clustering and LSTM

Info

Publication number: CN112287596A
Application number: CN202011000802.4A
Authority: CN
Inventors: 刘君强; 雷凡; 左洪福
Original assignee: Nanjing University of Aeronautics and Astronautics
Current assignee: Nanjing University of Aeronautics and Astronautics
Priority date: 2020-09-22
Filing date: 2020-09-22
Publication date: 2021-01-29

Abstract

The invention belongs to the field of aero-engine health management, and particularly relates to an engine residual life prediction method based on clustering and LSTM. The method combines clustering with LSTM to monitor the health condition of an aircraft engine and predict its remaining service life, so as to ensure the normal operation of flights. The prediction is carried out according to the following steps: 1) standardization; 2) clustering analysis; 3) mutation analysis; 4) prediction of the LSTM model; 5) calculation of weight values; 6) multi-stage prediction. Experimental analysis shows that the ILSTMC method is effective in predicting the residual service life of the aero-engine, and operators can better manage the health of the aero-engine by combining the prediction result of the scheme.

Description

Engine residual life prediction method based on clustering and LSTM

Technical Field

The invention belongs to the field of health management of aero-engines, and particularly relates to a clustering and LSTM-based engine residual life prediction method.

Background

The aircraft engine is a core component of the aircraft. Because of the complex and variable interactions among its internal components, together with long-term operation in a harsh operating environment, health management of the engine is extremely important. Predicting the residual service life of the aircraft engine is of great significance for improving its safety and reliability and for guaranteeing the flight safety of the aircraft.

At present, residual life prediction of the engine mostly relies on single-stage prediction methods. Their predictions are not accurate enough, so they cannot be applied well to health management of the engine.

Disclosure of Invention

Aiming at the problems, the invention provides an engine residual life prediction method based on clustering and LSTM, which utilizes the combination of clustering and LSTM to monitor the health condition of an aircraft engine and predict the residual life of the engine so as to ensure the normal operation of flights.

The technical scheme of the invention is as follows: the prediction is carried out according to the following steps:

1) standardization: inputting engine data, and carrying out sample normalization and averaging;

2) clustering analysis: on the basis of step 1), obtaining the multi-stage performance degradation data of the engine through clustering analysis;

3) mutation analysis: on the basis of step 2), predicting the degradation data of the multiple stages with ILSTMC, wherein ordinary hidden layer nodes are replaced with memory blocks during prediction to ensure that information is stored and to avoid the long-term dependence and vanishing gradient problems;

4) prediction of the LSTM model: using the mutation points for prediction;

5) calculating weight values: calculating the weight value of each gate according to the algorithm formulas of the LSTM;

6) multi-stage prediction: the data are associated sequentially; the clustering first divides the data into a plurality of stages for the LSTM, and prediction is then performed.

The step 1) is specifically as follows: the input multidimensional aero-engine performance monitoring data are first normalized to [0, 1], which avoids positive and negative values offsetting one another, and the normalized results are then averaged. The normalization formula is:

$$x_{\mathrm{normal}} = \frac{x_t - x_{\min}}{x_{\max} - x_{\min}} \quad (1)$$

where x_normal is the normalized data, x_t is the raw data, x_max is the maximum value in the original data sample, and x_min is the minimum value. The normalized and averaged data are defined as the Overall Health Score (OHS) of the engine.
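The normalization and averaging of step 1) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `overall_health_score`, the per-parameter (column-wise) normalization, the guard for constant columns, and the sample readings are all assumptions made for the example.

```python
import numpy as np

def overall_health_score(data):
    """Min-max normalize each monitored parameter to [0, 1], then average per time step.

    data: array of shape (time_steps, parameters). Column-wise normalization
    is an assumption of this sketch.
    """
    data = np.asarray(data, dtype=float)
    x_min = data.min(axis=0)
    x_max = data.max(axis=0)
    # Guard against constant columns, which would otherwise divide by zero.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    normalized = (data - x_min) / span
    # The OHS at each time step is the mean of the normalized parameters.
    return normalized.mean(axis=1)

# Hypothetical readings: 3 time steps, 2 monitored parameters.
readings = np.array([[10.0, 200.0],
                     [15.0, 220.0],
                     [20.0, 260.0]])
ohs = overall_health_score(readings)
```

Each entry of `ohs` lies in [0, 1], so parameters with different units contribute on the same scale.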

The step 2) is specifically as follows: the Sum of Squared Errors (SSE) in the clusters is used to find the optimal number k of clusters, and the number of clusters is determined by observing the inflection point diagram. The formula is as follows:

$$\omega^{(i,j)} = \left[X_1 + X_2 + \dots + X_k\right]^{-1} \quad (4)$$

where X_k denotes the probability of each cluster, ω denotes the membership probability of the cluster to which the sample belongs, m is the fuzzy coefficient, generally taken as 2, and μ is the center point of each cluster.
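The use of the within-cluster SSE to choose k can be illustrated with a small sketch. It assumes plain one-dimensional K-means on OHS values with a deterministic quantile initialization; the function names `kmeans_1d` and `sse_curve` and the sample values are hypothetical, and the fuzzy membership of formula (4) is not modeled here.

```python
import numpy as np

def kmeans_1d(values, k, iters=100):
    """Plain 1-D K-means; returns (centers, labels, sse)."""
    x = np.asarray(values, dtype=float)
    # Deterministic initialization (an assumption): evenly spaced quantiles.
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and sum of squared errors to the assigned centers.
    labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
    sse = float(((x - centers[labels]) ** 2).sum())
    return centers, labels, sse

def sse_curve(values, k_max):
    """SSE for k = 1..k_max; the inflection point of this curve suggests k."""
    return [kmeans_1d(values, k)[2] for k in range(1, k_max + 1)]

# Hypothetical OHS values with three visible degradation levels.
ohs = [0.10, 0.11, 0.12, 0.50, 0.51, 0.52, 0.90, 0.91, 0.92]
curve = sse_curve(ohs, 4)
```

For this sample the SSE drops sharply up to k = 3 and changes little afterwards, which is the inflection point one would read off the diagram.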

The step 3) is specifically as follows: clustering analysis is used to obtain the clusters and the mutation points, and the engine data are divided into a plurality of stages. The occurrence of a mutation point may change the remaining life condition of the engine, so the handling of mutation points becomes the basis of the clustering. Mutation points are detected mainly by calculating the distance between each object and the central objects (cluster centers); the specific calculation is given in formulas (2) to (4).
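One possible reading of the distance-based detection is sketched below: each point is assigned to its nearest cluster center, and a mutation point is taken to be a time index at which the assignment changes. This reading, the function name `mutation_points`, and the sample values are assumptions; the calculation of formulas (2) to (4) is not reproduced.

```python
import numpy as np

def mutation_points(values, centers):
    """Assign each point to its nearest center and report where the label changes."""
    values = np.asarray(values, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Distance of every object to every central object.
    dist = np.abs(values[:, None] - centers[None, :])
    labels = dist.argmin(axis=1)
    # Indices at which the nearest center differs from the previous time step.
    change = np.flatnonzero(labels[1:] != labels[:-1]) + 1
    return labels, change

# Hypothetical OHS series and cluster centers.
labels, change = mutation_points([0.90, 0.88, 0.60, 0.58, 0.20], [0.9, 0.6, 0.2])
```

The indices in `change` split the series into consecutive stages.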

The error at time t-1 can be calculated by formula (11). At a mutation point, the sigmoid activation value σ of the forget gate is approximately 0 and that of the input gate is approximately 1, so the cell state of the previous time is not carried over and stored at the current time, which produces the mutation point.

The step 4) is specifically as follows: the mutation points are used for prediction. The flow of information in the hidden layer cell state is controlled by the values σ of the input, forget and output gates, each of which lies in [0, 1]:

$$C_t = F_t \circ C_{t-1} + I_t \circ \tilde{C}_t$$

where C_t is the long-term state of the cell at the current time, F_t is the value of the forget gate at the current time, C_{t-1} is the cell state at time t-1, I_t is the value of the input gate at the current time, and \tilde{C}_t is the instant state at the current time. The forget gate determines how much of the cell state C_{t-1} of the previous time is retained in C_t at the current time.

The LSTM model removes information from or adds information to the cell state through three carefully designed gates, namely an input gate, a forget gate and an output gate, which protect and update the cell state. The input gate lets information enter the memory cell, the forget gate lets information be forgotten or removed from the memory cell, and the output gate outputs information from the memory cell.

The three gates cooperate closely to update the cell information. First, effective information is extracted through tanh at the input gate; the sigmoid activation function then screens this information and determines the degree to which the cell is updated. I_t is a scalar measuring the cell update, and the output of the output layer is h_t; this hidden state serves as the output of the LSTM.

Input gate:

$$I_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$

Forget gate:

$$F_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$

Output gate:

$$O_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$

where σ is the sigmoid activation function, W_i is the weight of the input gate, h_{t-1} is the hidden state of the cell at time t-1, x_t is the input at time t, and b denotes the bias term; W_f is the weight of the forget gate and W_o is the weight of the output gate. As the number of data points in the prediction process increases, each weight gradient at time t can be calculated through formulas (12) to (15); substituting these into formulas (16) to (19) gives the final summed weight gradients, with which the weights are updated.
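The interplay of the three gates can be illustrated with a single forward step of a standard LSTM cell. This is a sketch under assumptions: the gates act on the concatenation of h_{t-1} and x_t, the dictionary keys "i", "f", "o", "c" and the function name `lstm_step` are hypothetical, and the weights are set by hand rather than trained. The example reproduces the mutation-point situation described in the text, with the forget gate near 0 and the input gate near 1.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One forward step; W[g] has shape (hidden, hidden + input), b[g] has shape (hidden,)."""
    z = np.concatenate([h_prev, x_t])
    i_t = sigmoid(W["i"] @ z + b["i"])        # input gate
    f_t = sigmoid(W["f"] @ z + b["f"])        # forget gate
    o_t = sigmoid(W["o"] @ z + b["o"])        # output gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])    # instant (candidate) state
    c_t = f_t * c_prev + i_t * c_tilde        # long-term cell state
    h_t = o_t * np.tanh(c_t)                  # hidden state, the LSTM output
    return h_t, c_t, (i_t, f_t, o_t)

hidden, n_in = 2, 1
W = {g: np.zeros((hidden, hidden + n_in)) for g in "ifoc"}
b = {g: np.zeros(hidden) for g in "ifoc"}
b["f"][:] = -50.0   # forget gate approximately 0
b["i"][:] = 50.0    # input gate approximately 1
b["c"][:] = 0.5

# A large previous cell state that the forget gate is expected to discard.
h_t, c_t, gates = lstm_step(np.array([0.3]), np.zeros(hidden), np.full(hidden, 10.0), W, b)
```

With these gate values `c_t` is essentially equal to the candidate state, so nothing of the previous cell state survives into the current time.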

The specific calculation steps of step 5) are as follows:

5.1) Calculate backward the value of the error term σ at each time. The error term is backpropagated in two directions: one is backpropagation through time, i.e., starting from the current time t, the error term σ_{t-1} at time t-1 is calculated; the other is propagation of the error term to the layer above.

5.2) Calculate the gradient of each weight from the corresponding error term. The gradients of W_fh, W_ch, W_oh and W_ih are the sums of their gradients at the individual times, so their gradients at time t must be found first, and the final gradients are then obtained.

Calculating the error terms σ_{o,t}, σ_{f,t}, σ_{i,t} and σ_{c,t} at time t further gives the gradients of the weights W_fh, W_ch, W_oh and W_ih at time t, namely:

The final gradients are obtained by adding together the gradients at the individual times:

In the LSTM prediction process, the class 1 data obtained from clustering are predicted directly as the stage 1 data; the class 1 and class 2 data are predicted as the stage 2 data, and so on; the true and predicted values of the overall health score (OHS) of the engine are thereby obtained.

The step 6) is specifically as follows: in multi-stage residual life prediction the data are associated sequentially; the clustering first divides the data into a plurality of stages for the LSTM, and prediction is then performed. To obtain a more accurate prediction, the class 1 data obtained by clustering are predicted directly as the stage 1 data; the data are then iterated for further prediction, with the clustered class 1 and class 2 data predicted as the stage 2 data; the class 1, 2 and 3 data obtained by clustering are predicted as the stage 3 data, after which the models are compared and analyzed.
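The cumulative way in which clustered classes feed the successive stages can be sketched as follows. The function name `cumulative_stage_data`, the ordering of classes by first appearance in time, and the sample series are assumptions made for illustration; the LSTM itself is not shown.

```python
import numpy as np

def cumulative_stage_data(series, labels):
    """Entry s of the result holds the points of the first s + 1 classes, in time order."""
    series = np.asarray(series, dtype=float)
    labels = np.asarray(labels)
    # Order the classes by their first appearance so that class 1 is the earliest in time.
    _, first_idx = np.unique(labels, return_index=True)
    ordered = labels[np.sort(first_idx)]
    stages = []
    for s in range(1, len(ordered) + 1):
        mask = np.isin(labels, ordered[:s])
        stages.append(series[mask])
    return stages

# Hypothetical OHS series with arbitrary cluster ids from a clustering step.
series = [0.90, 0.88, 0.70, 0.65, 0.30, 0.20]
labels = [2, 2, 0, 0, 1, 1]
stages = cumulative_stage_data(series, labels)
```

Here `stages[0]` would feed the stage 1 prediction, `stages[1]` (classes 1 and 2) the stage 2 prediction, and `stages[2]` (classes 1, 2 and 3) the stage 3 prediction.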

The invention researches the prediction of the remaining life of the engine based on clustering and LSTM. The deep learning method can learn complex data characteristics, so that the multi-stage degradation characteristics of the engine are analyzed by combining clustering, the multi-stage remaining life of the engine is predicted, the method can be applied to health management of the engine to a certain extent, and the method has certain practical significance.

The method realizes the fusion of multi-dimensional data, establishes a prediction model based on the result of multi-stage information fusion, introduces real-time monitoring data by using the parameters of the multi-stage prediction model, adopts the algorithm of cluster analysis and LSTM fusion to realize the update and prediction of the parameters of the multi-stage model, finally predicts the performance degradation trend of the aero-engine and obtains the accurate prediction of the residual life of the aero-engine.

The invention has the following beneficial effects: a new prediction model is established according to the performance parameters of the aero-engine, and the model integrates the advantages of the LSTM neural network and K-means clustering. K-means clustering is fast to compute, simple in concept and easy to interpret. During prediction, the LSTM replaces ordinary hidden layer nodes with memory blocks, which ensures that information can be stored across arbitrary delays and that error signals can be propagated back to time points far in the past; the network learns to forget and stays away from saturation, which avoids problems such as long-term dependence and vanishing or exploding gradients. Experimental analysis and verification of the algorithm show that the error of ILSTMC in residual life prediction is lower than that of other traditional models and that the ILSTMC method is effective for predicting the residual life of the aero-engine, so that an operator can better manage the health of the engine by combining the prediction result of the scheme, which is of great significance for guaranteeing the flight safety of the airplane.

Drawings

FIG. 1 is a schematic structural framework of the present invention;

FIG. 2 is an OHS normalization chart for the 83rd engine;

FIG. 3 is a graph of cluster analysis results for the 83rd engine;

FIG. 4(a) is an ILSTMC stage 1 prediction diagram for the 83rd engine;

FIG. 4(b) is an ILSTMC stage 2 prediction diagram for the 83rd engine;

FIG. 4(c) is an ILSTMC stage 3 prediction diagram for the 83rd engine;

FIG. 5 is a comparison graph of the RMSE of ILSTMC and other algorithm predictions for the 83rd engine;

FIG. 6 is a multi-stage life cycle prediction error comparison diagram for the 83rd engine.

Detailed Description

In order to clearly explain the technical features of the present patent, the following detailed description of the present patent is provided in conjunction with the accompanying drawings.

As shown in FIGS. 1-6, the clustering and LSTM-based method for predicting the residual life of the aircraft engine mainly comprises two modules, namely a clustering module and a prediction module.

The clustering module: K-means clustering is used mainly because it is fast to compute, simple in concept and easy to interpret. The input engine data are normalized to between 0 and 1 and then averaged. Next, the Sum of Squared Errors (SSE) within the clusters is used to find the optimal number of clusters, which is determined by observing the inflection point diagram. Finally, clustering analysis is used to obtain the clusters and mutation points, and the engine data are divided into a plurality of stages. The occurrence of a mutation point may change the remaining life condition of the engine, so the handling of mutation points becomes the basis of the clustering. Mutation points are detected mainly by calculating the distance between each object and the central objects.

The prediction module: multi-stage prediction is performed based on the mutation points obtained by clustering. The data are associated sequentially; the clustering first divides the data into a plurality of stages for the LSTM, and prediction is then performed. To obtain a more accurate prediction, the class 1 data obtained by clustering are predicted directly as the stage 1 data; the data are then iterated for further prediction, with the clustered class 1 and class 2 data predicted as the stage 2 data; the class 1, 2 and 3 data obtained by clustering are predicted as the stage 3 data, after which the models are compared and analyzed. Finally, the prediction error of each engine is calculated. For the prediction of the remaining life, the life cycle of the engine is divided, and the prediction errors at the five stages of 60%, 70%, 80%, 90% and 95% of the life cycle are extracted and compared with the errors of each method. The prediction results are substituted into a cost function to obtain the RMSE, which is compared with that of the traditional neural networks LSTM and RNN and of the linear regression method LP.

As shown in fig. 1 to 5, the method for predicting the remaining life of the engine based on clustering and LSTM of the present invention comprises the following specific steps:

1) Standardization: engine data are input, and sample normalization and averaging are carried out. Multi-dimensional aero-engine performance monitoring data are first input and then normalized and averaged in order to improve the convergence speed and calculation accuracy of the model.

The input multidimensional aero-engine performance monitoring data are first normalized to [0, 1], which avoids positive and negative values offsetting one another, and the normalized results are then averaged. The normalization formula is:

$$x_{\mathrm{normal}} = \frac{x_t - x_{\min}}{x_{\max} - x_{\min}} \quad (1)$$

2) Clustering analysis: the Sum of Squared Errors (SSE) within the clusters is used to find the optimal number k of clusters, and the number of clusters is determined by observing the inflection point diagram. The formula is as follows:

$$\omega^{(i,j)} = \left[X_1 + X_2 + \dots + X_k\right]^{-1} \quad (4)$$

3) Mutation analysis: clustering analysis is used to obtain the clusters and the mutation points, and the engine data are divided into a plurality of stages. The occurrence of a mutation point may change the remaining life condition of the engine, so the handling of mutation points becomes the basis of the clustering. Mutation points are detected mainly by calculating the distance between each object and the central objects; the specific calculation is given in formulas (2) to (4).

4) Prediction of the LSTM model: the mutation points are used for prediction. The flow of information in the hidden layer cell state is controlled by the values σ of the input, forget and output gates, each of which lies in [0, 1]:

Input gate:

$$I_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$

Forget gate:

$$F_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$

Output gate:

$$O_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$

5) Calculating a weight value: calculating the weight value of each gate according to an algorithm formula of the LSTM, wherein the specific calculation steps are as follows:

6) Multi-stage prediction: in multi-stage residual life prediction the data are associated sequentially; the clustering first divides the data into a plurality of stages for the LSTM, and prediction is then performed. To obtain a more accurate prediction, the class 1 data obtained by clustering are predicted directly as the stage 1 data; the data are then iterated for further prediction, with the clustered class 1 and class 2 data predicted as the stage 2 data; the class 1, 2 and 3 data obtained by clustering are predicted as the stage 3 data, after which the models are compared and analyzed. The prediction results of the ILSTMC are substituted into a cost function to obtain the RMSE, which is compared with that of the traditional neural networks LSTM and RNN and of the linear regression (LP) method.

The prediction error of each engine is calculated. For the prediction of the remaining life, the life cycle of the engine is divided, and the prediction errors at the five stages of 60%, 70%, 80%, 90% and 95% of the life cycle are extracted and compared with the errors of the respective methods.
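The error comparison can be sketched with a short RMSE helper. How the 60% to 95% life-cycle stages are cut is not spelled out in the text; the sketch assumes each stage covers the series from the start up to that fraction of the life cycle, and the names `rmse` and `rmse_at_life_fractions` and the sample data are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted OHS values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def rmse_at_life_fractions(y_true, y_pred, fractions=(0.60, 0.70, 0.80, 0.90, 0.95)):
    """RMSE over the series from the start up to each life-cycle fraction (an assumption)."""
    n = len(y_true)
    result = {}
    for f in fractions:
        end = max(1, int(round(f * n)))
        result[f] = rmse(y_true[:end], y_pred[:end])
    return result

# Hypothetical degradation trajectory with an error only at the last point.
y_true = np.linspace(1.0, 0.0, 20)
y_pred = y_true.copy()
y_pred[-1] += 0.2
overall = rmse(y_true, y_pred)
by_fraction = rmse_at_life_fractions(y_true, y_pred)
```

Running the same helper on the outputs of each model (ILSTMC, LSTM, RNN, LP) would give comparable error figures per stage.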

Experiment

The experiments were validated using the C-MAPSS dataset from the NASA website. The C-MAPSS dataset has 4 subsets, each with 27-dimensional data: the first 2 dimensions are the engine number and the number of engine operating cycles, 3 dimensions correspond to the operating environment settings, and 22 dimensions correspond to the gas path performance data from the sensors. Each subset has a different operating environment and number of faults.

The gas path parameters reflect the health state of each gas path component or unit body, so the gas path performance parameters are the key data influencing the overall performance condition of the engine and play a leading role in the prediction process. Seven characterization parameters among the gas path performance parameters are used as decision factors for the overall health condition of the engine, and the multidimensional parameters are fused to describe the performance degradation trajectory of the engine.

The sequence length of a sample is assumed to be 5 and the number of input neurons at each time to be 1, and the training set and test set are fed into the neural network in turn for training and testing. The LSTM is optimized with the Adam algorithm, taking 1 sample at a time for batch processing; the learning rate η is set to 0.01, there are 2 hidden layers with 16 and 8 nodes respectively, and 5 epochs are run on the training set and the test set respectively to predict the remaining life of the engine.
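A sample with sequence length 5 and one input per time step can be built from an OHS series with a sliding window, as sketched below. The function name `make_windows`, the one-step-ahead target, and the `hyperparams` dictionary layout are assumptions; the values in the dictionary are the ones stated in the text.

```python
import numpy as np

def make_windows(series, seq_len=5):
    """Turn a 1-D series into (samples, seq_len, 1) inputs and one-step-ahead targets."""
    series = np.asarray(series, dtype=float)
    n = len(series) - seq_len
    if n <= 0:
        raise ValueError("series must be longer than seq_len")
    # Each sample holds seq_len consecutive values; the trailing axis is the single input neuron.
    X = np.stack([series[i:i + seq_len] for i in range(n)])[..., None]
    # The target (an assumption of this sketch) is the value right after each window.
    y = series[seq_len:]
    return X, y

# Settings as stated in the text, collected in a hypothetical config dictionary.
hyperparams = {
    "seq_len": 5,
    "input_size": 1,
    "batch_size": 1,
    "learning_rate": 0.01,
    "hidden_sizes": (16, 8),
    "epochs": 5,
    "optimizer": "Adam",
}

X, y = make_windows(np.arange(8.0), seq_len=hyperparams["seq_len"])
```

The resulting arrays have the layout most sequence-model libraries expect, namely (samples, time steps, features).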

The number of clusters k of each engine is determined by calculation; the value of k also represents the number of stages. Model prediction is then carried out for each engine, the prediction results corresponding to each value of k are compared, and the number of clusters with the minimum error is selected as the number of clusters for that engine.

Taking the 83rd engine as an example, the following comparisons illustrate the superiority of the ILSTMC prediction model based on clustering and LSTM:

As shown in FIG. 4, compared with the RNN, LSTM and linear regression methods, the overall health score (OHS) predicted by the ILSTMC model comes closer to the true value as the time point increases during prediction;

In multi-stage prediction, the RMSE of the last stage is 0.67% lower than that of LSTM, and the RMSE of each stage is on average 2.39% lower than that of LSTM;

As shown in FIG. 5, for full life cycle prediction, the error over the 90%-95% life cycle is 0.58% lower than that of LSTM, and the average error per life stage is 2.64% lower than that of LSTM.

While the invention has been described in terms of its preferred embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

Claims

1. A method for predicting the remaining life of an engine based on clustering and LSTM is characterized by comprising the following steps:

4) prediction of the LSTM model: using the mutation points for prediction;

2. The method for predicting the remaining life of the engine based on the clustering and the LSTM according to claim 1, wherein the step 1) is specifically as follows: the input multidimensional aero-engine performance monitoring data are first normalized to [0, 1], which avoids positive and negative values offsetting one another, and the normalized results are then averaged, the normalization formula being:

$$x_{\mathrm{normal}} = \frac{x_t - x_{\min}}{x_{\max} - x_{\min}} \quad (1)$$

where x_normal is the normalized data, x_t is the raw data, x_max is the maximum value in the original data sample, and x_min is the minimum value. The normalized and averaged data are defined as the Overall Health Score (OHS) of the engine.

3. The method for predicting the remaining life of the engine based on the clustering and the LSTM according to claim 1, wherein the step 2) is specifically as follows: the Sum of Squared Errors (SSE) in the clusters is used to find the optimal number k of clusters, and the number of clusters is determined by observing the inflection point diagram. The formula is as follows:

$$\omega^{(i,j)} = \left[X_1 + X_2 + \dots + X_k\right]^{-1} \quad (4)$$

4. The method for predicting the remaining life of the engine based on the clustering and the LSTM according to claim 1, wherein the step 3) is specifically as follows: clustering analysis is used to obtain the clusters and the mutation points, and the engine data are divided into a plurality of stages. The occurrence of a mutation point may change the remaining life condition of the engine, so the handling of mutation points becomes the basis of the clustering. Mutation points are detected mainly by calculating the distance between each object and the central objects; the specific calculation is given in formulas (2) to (4).

5. The method for predicting the remaining life of the engine based on the clustering and the LSTM according to claim 1, wherein the step 4) is specifically as follows: the mutation points are used for prediction, and the flow of information in the hidden layer cell state is controlled by the values σ of the input, forget and output gates, each of which lies in [0, 1]:

Input gate:

$$I_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$

Forget gate:

$$F_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$

Output gate:

$$O_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$

where σ is the sigmoid activation function, W_i is the weight of the input gate, h_{t-1} is the hidden state of the cell at time t-1, x_t is the input at time t, and b denotes the bias term; W_f is the weight of the forget gate and W_o is the weight of the output gate. As the number of data points in the prediction process increases, each weight gradient at time t can be calculated through formulas (12) to (15); substituting these into formulas (16) to (19) gives the final summed weight gradients, with which the weights are updated.

6. The method for predicting the remaining life of the engine based on the clustering and the LSTM according to claim 1, wherein the specific calculation steps of the step 5) are as follows:

7. The method for predicting the remaining life of the engine based on the clustering and the LSTM according to claim 1, wherein the step 6) is specifically as follows: in multi-stage residual life prediction the data are associated sequentially; the clustering first divides the data into a plurality of stages for the LSTM, and prediction is then performed. To obtain a more accurate prediction, the class 1 data obtained by clustering are predicted directly as the stage 1 data; the data are then iterated for further prediction, with the clustered class 1 and class 2 data predicted as the stage 2 data; the class 1, 2 and 3 data obtained by clustering are predicted as the stage 3 data, after which the models are compared and analyzed.