CN114282443B - Residual service life prediction method based on MLP-LSTM supervised joint model - Google Patents


Publication number
CN114282443B
Authority
CN
China
Legal status
Active
Application number
CN202111623573.6A
Other languages
Chinese (zh)
Other versions
CN114282443A (en)
Inventor
张新民
张雨桐
李乐清
朱哲人
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Application filed by Zhejiang University (ZJU)
Priority to CN202111623573.6A
Publication of CN114282443A
Application granted
Publication of CN114282443B
Legal status: Active


Abstract

The invention discloses a remaining-useful-life prediction method based on an MLP-LSTM supervised joint model. First, a multi-layer perceptron (MLP) fuses multi-dimensional time-series historical information to extract a health index characterizing the machine; the extracted health-index time series is then fed into an LSTM, which computes the machine's current remaining useful life (RUL). The two serially connected neural networks are trained in a supervised manner on a labeled sample data set to update the weights; the prediction result is evaluated on a validation set and the parameters are adaptively adjusted to obtain an optimized model. The trained MLP-LSTM supervised joint model not only improves the LSTM's ability to predict the remaining useful life, but also provides a feature-fusion result of the multi-dimensional sensor data that effectively expresses the machine's current health condition, giving an effective reference index for equipment maintenance and repair.

Description

Residual service life prediction method based on MLP-LSTM supervised joint model
Technical Field
The invention belongs to the field of industrial process control, and particularly relates to a residual service life prediction method based on an MLP-LSTM supervised joint model.
Background
In the industrial field, the working performance and health of important machines and industrial components tend to decline during continuous operation, owing to internal wear or external environmental factors. As health deteriorates, the equipment will at some future time no longer work normally: its efficiency drops rapidly or it stops operating entirely, reaching the end of its service life, which can disturb or even interrupt the industrial process. It is therefore necessary to predict the Remaining Useful Life (RUL) of the system over its whole service life, i.e. the length of time from the current moment until the machine's useful life ends.
In recent years, with the collection and accumulation of large amounts of industrial data, data-driven solutions have received much attention in RUL prediction. A data-driven solution needs no detailed knowledge of the mechanical system's operating mechanism: it identifies the system's condition from sensor data with a data-driven algorithm, and can thus accurately predict the remaining useful life of modern plant equipment whose mechanistic models are complex. Traditional predictions are based mainly on physical degradation models, and building a correct degradation model relies heavily on expert knowledge; such assumptions and requirements are very restrictive in practical industrial applications. Traditional machine learning needs hand-crafted features, demanding substantial domain expertise and a separate feature-extraction step, which hinders wide application. Because deep learning extracts features from data automatically and depends less on prior knowledge of the system, recent research shows that, compared with physical degradation models and traditional machine learning algorithms, deep learning handles industrial big data better and predicts the remaining useful life of mechanical equipment more accurately.
However, current research on predicting RUL with deep learning still has problems. First, data fusion and prediction are usually two separate steps: data fusion is performed first to obtain a health index, and the fused signal is then used for RUL prediction. This conventional pipeline severs the internal relation between the two tasks, so the link between the fused signal and the prediction result cannot be explained. A common alternative for multi-sensor RUL prediction is therefore an end-to-end deep model that outputs the prediction directly. Its advantage is being fully data-driven, with no assumed degradation model, no parametric distribution, and no hand-crafted features; its drawback is that, as a black-box model, it provides no information about the performance-degradation process. The remaining useful life is a linearly decaying time value, whereas the physical condition of a machine does not change linearly but decays exponentially, and maintenance scheduling depends heavily on that exponential behavior so that repairs are completed before the machine enters its rapid-decay period. A method is therefore needed that predicts both the remaining usage time and a health index from the sensor information.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a residual service life prediction method based on an MLP-LSTM supervised joint model, which can obtain the information of a performance degradation process while judging the current residual service life of a machine and simultaneously improve the prediction effect of a single LSTM neural network.
A remaining-useful-life prediction method based on an MLP-LSTM supervised joint model, in which an MLP neural network is added between the input layer and a deep LSTM neural network; the MLP neural network performs data fusion, and the deep LSTM neural network performs remaining-useful-life (RUL) prediction;
the method comprises the following steps:
the method comprises the following steps: collecting equipment data to form a data set, dividing the data set into a training set and a verification set, and preprocessing the data according to different working conditions;
step two: inputting the training set into an MLP neural network, compressing the multi-dimensional sensor characteristics into HI health characteristic indexes by the MLP neural network, and obtaining a plurality of HI time sequences of health indexes;
step three: inputting the health index HI time sequence into a depth LSTM neural network, and calculating by the depth LSTM neural network to obtain an RUL predicted value;
step four: calculating a loss function based on an error between a predicted value and a true value of the RUL, and training an MLP-LSTM supervised joint model by adopting a training set through RMSprop gradient self-adaptation; when the error result obtained after the training set and the verification set are input into the current model is smaller than a certain value or the variation of the error result is smaller than a certain value, the loss function of the model training is converged, the model training is finished, and the MLP-LSTM supervised combined model is stored;
step five: and preprocessing the equipment data to be predicted, and inputting the preprocessed equipment data into a stored MLP-LSTM supervised joint model to obtain HI and RUL values output in real time.
Further, the labeled data set in step one is:

X_o = {(x_{it}, rul_{it}) | i ≤ n, t ≤ T_i}   (1)

where rul_{it} is the remaining-useful-life value at time t:

rul_{it} = T_i − t   (2)

When the device is completely unusable, rul_{it} = 0, and every rul_{it} decreases monotonically along the time sequence;

x_{it} is the sequence of the i-th sensor data from the initial time to time t:

x_{it} = [x_i(1), x_i(2), ..., x_i(t)]   (3)

where x_i is the sequence of the i-th sensor data from the initial time to time T_i:

x_i = [x_i(1), x_i(2), ..., x_i(T_i)]   (4)
the preprocessing comprises normalization processing and sliding time window sampling processing; and when the equipment data are data under different working conditions, carrying out condition normalization, otherwise, carrying out global normalization.
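The condition-normalization step above can be sketched as follows; the z-score scaler, function name, and toy data are illustrative assumptions (the patent does not specify the exact normalizer):

```python
import numpy as np

def normalize_per_condition(data, conditions):
    """Conditional normalization: each sensor channel is scaled using
    only the samples that share the same operating condition.
    A hypothetical minimal sketch of the patent's preprocessing step."""
    out = np.empty_like(data, dtype=float)
    for c in np.unique(conditions):
        mask = conditions == c
        mu = data[mask].mean(axis=0)
        sigma = data[mask].std(axis=0) + 1e-8  # avoid division by zero
        out[mask] = (data[mask] - mu) / sigma
    return out

# Toy multi-sensor data: 6 samples, 2 sensors, 2 operating conditions.
data = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 14.0],
                 [100.0, 5.0], [110.0, 6.0], [120.0, 7.0]])
conditions = np.array([0, 0, 0, 1, 1, 1])
normed = normalize_per_condition(data, conditions)
```

With global normalization the two condition groups would keep very different value ranges; per-condition scaling centers each group on its own statistics, which is what lets the HI show a degradation trend later.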
Further, in the second step, a multi-sensor information multi-dimensional time sequence is input into the MLP, the MLP compresses the multi-dimensional data into one dimension, and finally a set including a health index HI time sequence is output;
the MLP building and pre-training process comprises the following steps:
The multi-sensor multi-dimensional time series is input into the MLP neural network, which compresses the multi-dimensional data into one dimension. During MLP forward propagation, each node is computed from all nodes of the previous layer: each previous-layer node is given a weight W, a bias b is added, and the value of the next-layer node is obtained through an activation function.

The value of node j in layer l+1 is:

a_j^{l+1} = f(Σ_i w_{ji}^l a_i^l + b_j^{l+1})   (5)

The output of the last MLP layer is the set H of HI time series:

H = {h_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}   (6)

where H is the set of the health indices h_i(t_j) at each time point, h_i(t_j) = f(x_i(t_j)), and f(x_i) is the function realized by the MLP neural network; T_i is the length of the time series; h denotes the health index HI; x_i(t_j) denotes the vector of sensor readings l at time t_j, x_i(t_j) = [l_{i,1}(t_j), l_{i,2}(t_j), ..., l_{i,p}(t_j)] ∈ R^{1×p}; x denotes a raw sample and p the number of sensors; x_i(t_j) belongs to X, X = {x_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}.
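Eqs. (5)-(6) can be sketched numerically; the layer sizes, tanh/sigmoid activations, and random weights below are illustrative assumptions, not values from the patent:

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass of a small MLP fusing a p-dimensional sensor
    reading into a scalar health index HI, per Eq. (5):
    a^{l+1} = f(W a^l + b)."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(W @ a + b)                     # hidden layers
    W, b = weights[-1], biases[-1]
    return 1.0 / (1.0 + np.exp(-(W @ a + b)))      # scalar HI in (0, 1)

rng = np.random.default_rng(0)
p = 14                                  # assumed number of sensors
sizes = [p, 8, 4, 1]                    # compress p dims down to 1-D HI
weights = [rng.normal(scale=0.5, size=(o, i)) for i, o in zip(sizes, sizes[1:])]
biases = [np.zeros(o) for o in sizes[1:]]

# One HI value per time step gives the HI time series h_i(t_j) of Eq. (6).
series = rng.normal(size=(30, p))       # 30 time steps of sensor data
hi = np.array([mlp_forward(x, weights, biases)[0] for x in series])
```

Applying the same forward pass at every time step turns the (T, p) sensor matrix into the one-dimensional HI series that the LSTM consumes next.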
Further, the third step is specifically divided into the following sub-steps:
the depth LSTM neural network is formed by stacking a plurality of layers of LSTMs, and the vector dimension of each layer of LSTM is variable; the HI health index is decoded into a multidimensional sensor time sequence through a first layer LSTM, the output of the upper layer of a depth LSTM network is used as the input of the next layer, and the updating formula of the l layer is as follows:
Figure BDA0003439015030000032
Figure BDA0003439015030000033
Figure BDA0003439015030000034
Figure BDA0003439015030000035
Figure BDA0003439015030000036
wherein l represents the layer number of the deep LSTM neural network, t represents the unit number of the LSTM at a certain moment,
Figure BDA0003439015030000041
an input unit indicating the time of the l-th layer t,
Figure BDA0003439015030000042
a forgetting unit indicating the time t of the l-th layer,
Figure BDA0003439015030000043
an output unit indicating the time t of the l-th layer,
Figure BDA0003439015030000044
a status cell indicating the time t of the l-th layer,
Figure BDA0003439015030000045
indicating a hidden unit at the moment of the ith layer t,. Sigma.indicating a sigmoid activation function,. Alpha.indicating an element multiplication calculation,. Tanh indicating a tanh activation function,
Figure BDA0003439015030000046
representing the hidden unit weight at layer l-1 time instant t,
Figure BDA0003439015030000047
representing the hidden unit weight at time t-1 of the l-th layer,
Figure BDA0003439015030000048
indicating a deviation;
and outputting the multidimensional characteristic vector by the last unit of the LSTM neural network of the last layer, and calculating by a linear layer to obtain the RUL predicted value.
Further, the fourth step is specifically divided into the following sub-steps:
(1) The input layer of the deep LSTM neural network is the l-th layer of the MLP-LSTM supervised joint model and contains n neurons; the output layer of the MLP neural network is the (l−1)-th layer of the joint model and has only one neuron. Neuron errors δ^l and δ^{l−1} of layers l and l−1 of the joint model are designed to realize synchronous training of the MLP and deep LSTM neural networks;
δ^l = (w^{l+1})^T δ^{l+1}   (12)

δ^{l−1} = (1/B) Σ_{batch} (w^l)^T δ^l ⊙ f′(z^{l−1})   (13)

where w and B are respectively the weight parameters of the neural network and the batch size;
(2) In the supervised joint training, a squared-error loss function with L2 regularization is used for gradient-adaptive parameter training; a score function evaluates the prediction accuracy of the MLP-LSTM supervised joint model and is added to the global loss function with a certain weight as a penalty, optimizing the squared-error loss to obtain an MLP-LSTM supervised joint model biased toward early prediction:
The squared-error loss function is computed as:

Loss_MSE(Θ) = (1/B) Σ_{i=1}^{B} (ŷ_i − y_i)² + λ Σ_w ‖w‖²   (14)

where Θ, w, B, λ, ŷ_i and y_i denote respectively the parameter set learned by the MLP-LSTM supervised joint model, the set of weight parameters in the model, the batch size, the regularization parameter, and the predicted and true RUL of the i-th sample;
The scoring function Score is computed as:

Loss_score = Σ_{i=1}^{B} s_i,   s_i = e^{−d_i/13} − 1 if d_i < 0,   s_i = e^{d_i/10} − 1 if d_i ≥ 0   (15)

d = RUL_pred − RUL_true   (16)
The global loss function Loss_total is computed as:

Loss_total = α·Loss_score + (1 − α)·Loss_MSE   (17)

where α is the relative weight of the two loss terms;
(3) The MLP-LSTM supervised joint model is trained on the labeled training set with RMSprop adaptive gradients, computed as:

r ← ρr + (1 − ρ) g ⊙ g,   θ ← θ − η g / (δ + √r)   (18)

where r is the accumulation variable of the historical gradient, ρ the shrinkage coefficient controlling how much historical information is kept, η the learning rate, δ a small constant, and g the gradient of Loss_total.
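Under the assumption that Eq. (15) uses the usual C-MAPSS time constants 13 and 10, the loss terms of Eqs. (14)-(18) can be sketched as follows; the function names and the value of α are illustrative:

```python
import numpy as np

def score_loss(rul_pred, rul_true):
    """Asymmetric score of Eqs. (15)-(16): late predictions (d > 0)
    are penalized more heavily than early ones (d < 0)."""
    d = rul_pred - rul_true
    return float(np.where(d < 0, np.exp(-d / 13.0) - 1.0,
                                  np.exp(d / 10.0) - 1.0).sum())

def mse_loss(rul_pred, rul_true, weights=(), lam=1e-4):
    """Squared-error loss with L2 regularization, Eq. (14)."""
    mse = float(np.mean((rul_pred - rul_true) ** 2))
    return mse + lam * sum(float(np.sum(w ** 2)) for w in weights)

def total_loss(rul_pred, rul_true, alpha=0.1):
    """Global loss of Eq. (17); alpha is a tuning weight
    (the patent does not fix its value)."""
    return alpha * score_loss(rul_pred, rul_true) + \
           (1 - alpha) * mse_loss(rul_pred, rul_true)

def rmsprop_step(theta, g, r, eta=1e-3, rho=0.9, delta=1e-8):
    """One RMSprop update, Eq. (18)."""
    r = rho * r + (1 - rho) * g * g      # accumulate squared gradient
    theta = theta - eta * g / (delta + np.sqrt(r))
    return theta, r

# Being 2 cycles late costs more than being 2 cycles early:
early = score_loss(np.array([48.0]), np.array([50.0]))
late = score_loss(np.array([52.0]), np.array([50.0]))
pred, true = np.array([48.0, 52.0]), np.array([50.0, 50.0])
```

The asymmetry of the score term is what biases the trained model toward early prediction: underestimating the RUL is cheaper than overestimating it.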
The invention has the following beneficial effects:
the invention provides a general RUL semi-supervised joint prediction framework method aiming at analyzing the conventional RUL prediction method in the running degradation process of machine equipment and combining a deep learning theory and the problems existing in the previous research, thereby realizing the synchronous RUL prediction of health index data fusion and multi-sensor data. The training model provides a continuous visualization process of system degradation, but also ensures efficient prediction of the generated fusion signal for RUL and rapid convergence of the predictive model training process. Furthermore, a loss function of the RUL life prediction model is modified, so that the trained model is more biased to early prediction, the prediction result can ensure the maintenance to be advanced, and the prediction model is safer.
Drawings
FIG. 1 is a schematic diagram of an MLP-LSTM neural network model;
FIG. 2 is a flow diagram of a joint model framework training implementation;
FIG. 3 is a diagram illustrating results generated after different normalization strategies;
FIG. 4 is a graphical (raw) plot of the data fusion output health indicator HI over time, wherein the upper graph in FIG. 4 represents the HI output for all turbine data in the test set, and the lower graph represents the HI output for selected test set partial sample data.
FIG. 5 is a graphical representation of a time-varying (filtered) plot of data fusion output health indicator HI, where the upper graph in FIG. 5 represents the HI output for all turbine data in the test set and the lower graph represents the HI output for selected test set partial sample data.
FIG. 6 is a schematic diagram of a fitting curve of the MLP-LSTM neural network model.
FIG. 7 is a schematic diagram of an on-line prediction RUL fitting curve.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and preferred embodiments, and the objects and effects of the present invention will become more apparent, it being understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit the present invention.
In the remaining-useful-life prediction method based on the MLP-LSTM supervised joint model, the joint model is realized by adding an MLP neural network between the input layer and a deep LSTM neural network; the MLP neural network performs data fusion, and the deep LSTM neural network performs remaining-life prediction, i.e. RUL prediction. The data slice sequence is input into the MLP neural network of the joint model to compute a one-dimensional HI time series. The last layer of the MLP neural network is connected to the deep LSTM network model: the HI sequence fused by the MLP is fed into the deep LSTM network sequentially, in segments of length time_step. Once the complete MLP-LSTM joint prediction model is assembled, gradient-descent iteration proceeds with an RMSprop optimizer. The specific structure is shown in Fig. 1. In addition, because the MLP neural network extracts deep data features, overfitting easily occurs during training; the network is therefore optimized with a batch-normalization layer, and a regularization term is added to the MLP neural network. Feeding in the test-set data, the MLP network outputs the evolution of the health index HI, and the final network layer of the whole model outputs the RUL prediction. In the concrete engineering implementation, the network layers of the two models are connected within the same model and jointly trained iteratively with an RMSprop optimizer. The whole training process proceeds as shown in Fig. 2.
The method comprises the following steps:
the method comprises the following steps: collecting equipment data to form a data set, dividing the data set into a training set and a verification set, and preprocessing the data according to different working conditions; the data of the labeled training set comprises a time stamp of time sequence data, a numerical value of each characteristic variable at each moment, and an RUL data label or equipment life end time for calculating the RUL label; the content of the labeled verification set is the same as that of the labeled training set, and the number of the labeled verification sets is 10-30% of that of the labeled training set.
The tagged data set in the step one is as follows:
X_o = {(x_{it}, rul_{it}) | i ≤ n, t ≤ T_i}   (1)

where rul_{it} is the remaining-useful-life value at time t:

rul_{it} = T_i − t   (2)

When the device is completely out of service, rul_{it} = 0, and every rul_{it} decreases monotonically along the time sequence;

x_{it} is the sequence of the i-th sensor data from the initial time to time t:

x_{it} = [x_i(1), x_i(2), ..., x_i(t)]   (3)

where x_i is the sequence of the i-th sensor data from the initial time to time T_i:

x_i = [x_i(1), x_i(2), ..., x_i(T_i)]   (4)
the preprocessing comprises normalization processing and sliding time window sampling processing; and when the equipment data are data under different working conditions, carrying out condition normalization, otherwise, carrying out global normalization.
The LSTM recurrent neural network has the standard input shape (batch_size, time_steps, feature_nums), where batch_size is the number of samples processed in one batch during training, time_steps is the number of time steps of the time-series data in each sample, and feature_nums is the feature dimension of the multi-sensor data. To shape the data set into this standard pattern, samples are drawn with a sliding time window.
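The sliding-time-window sampling that produces the (batch_size, time_steps, feature_nums) shape can be sketched as follows; the toy series and window length are illustrative:

```python
import numpy as np

def sliding_windows(series, time_steps):
    """Cut one unit's multi-sensor series of shape (T, feature_nums)
    into overlapping windows of length time_steps, giving the standard
    LSTM input shape (num_windows, time_steps, feature_nums)."""
    T = series.shape[0]
    return np.stack([series[s:s + time_steps]
                     for s in range(T - time_steps + 1)])

# Toy unit with T = 10 time steps and 3 sensor features.
series = np.arange(10 * 3, dtype=float).reshape(10, 3)
windows = sliding_windows(series, time_steps=4)
# windows.shape == (7, 4, 3): 7 samples, 4 time steps, 3 features
```

Stacking the windows of all units (and batching them) yields the batch_size axis of the standard input shape.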
Step two: input the training set into the MLP neural network, which compresses the multi-dimensional sensor features into a health index (HI), yielding a set of HI time series;
in the second step, a multi-sensor information multi-dimensional time sequence is input into the MLP, the MLP compresses the multi-dimensional data into one dimension, and finally a set comprising a health index HI time sequence is output;
the MLP building and pre-training process comprises the following steps:
The multi-sensor multi-dimensional time series is input into the MLP neural network, which compresses the multi-dimensional data into one dimension. During MLP forward propagation, each node is computed from all nodes of the previous layer: each previous-layer node is given a weight W, a bias b is added, and the value of the next-layer node is obtained through an activation function.

The value of node j in layer l+1 is:

a_j^{l+1} = f(Σ_i w_{ji}^l a_i^l + b_j^{l+1})   (5)

The output of the last MLP layer is the set H of HI time series:

H = {h_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}   (6)

where H is the set of the health indices h_i(t_j) at each time point, h_i(t_j) = f(x_i(t_j)), and f(x_i) is the function realized by the MLP neural network; T_i is the length of the time series; h denotes the health index HI; x_i(t_j) denotes the vector of sensor readings l at time t_j, x_i(t_j) = [l_{i,1}(t_j), l_{i,2}(t_j), ..., l_{i,p}(t_j)] ∈ R^{1×p}; x denotes a raw sample and p the number of sensors; x_i(t_j) belongs to X, X = {x_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}.
Step three: input the health index HI time series into the deep LSTM neural network, which computes the RUL prediction;
the third step is specifically divided into the following substeps:
the depth LSTM network is formed by stacking a plurality of layers of LSTMs, and the vector dimension of each layer of LSTM is variable; the HI health index is decoded into a multidimensional sensor time sequence through a first layer LSTM, the output of the upper layer of a depth LSTM network is used as the input of the next layer, and the updating formula of the l layer is as follows:
Figure BDA0003439015030000081
Figure BDA0003439015030000082
Figure BDA0003439015030000083
Figure BDA0003439015030000084
Figure BDA0003439015030000085
where l represents the number of layers of the deep LSTM neural network, t is the tableThe number of cells at a time of the LSTM is shown,
Figure BDA0003439015030000086
an input unit indicating the time t of the l-th layer,
Figure BDA0003439015030000087
a forgetting unit indicating the time t of the l-th layer,
Figure BDA0003439015030000088
an output unit representing the time t of the l-th layer,
Figure BDA0003439015030000089
a status cell indicating the time t of the l-th layer,
Figure BDA00034390150300000810
indicating a hidden unit at the moment of the ith layer t,. Sigma.indicating a sigmoid activation function,. Alpha.indicating an element multiplication calculation,. Tanh indicating a tanh activation function,
Figure BDA00034390150300000811
representing the hidden unit weight at the time t of layer l-1,
Figure BDA00034390150300000812
representing the hidden unit weight at time t-1 of the l-th layer,
Figure BDA00034390150300000813
indicating a deviation;
and outputting the multidimensional characteristic vector by the last unit of the LSTM neural network of the last layer, and calculating by a linear layer to obtain the RUL predicted value.
Step four: compute the loss function from the error between the predicted and true RUL, and train the MLP-LSTM supervised joint model on the training set with RMSprop adaptive gradients; when the error obtained by feeding the training and validation sets into the current model, or its variation, falls below a set threshold, the training loss has converged, training ends, and the MLP-LSTM supervised joint model is saved;
the fourth step is specifically divided into the following substeps:
(1) The input layer of the deep LSTM neural network is the l-th layer of the MLP-LSTM supervised joint model and contains n neurons; the output layer of the MLP neural network is the (l−1)-th layer of the joint model and has only one neuron. Neuron errors δ^l and δ^{l−1} of layers l and l−1 of the joint model are designed to realize synchronous training of the MLP and deep LSTM neural networks:

δ^l = (w^{l+1})^T δ^{l+1}   (12)

δ^{l−1} = (1/B) Σ_{batch} (w^l)^T δ^l ⊙ f′(z^{l−1})   (13)
wherein w and B are weight parameters and batch sizes of the neural network, respectively;
(2) In the supervised joint training, a squared-error loss function with L2 regularization is used for gradient-adaptive parameter training; a score function evaluates the prediction accuracy of the MLP-LSTM supervised joint model and is added to the global loss function with a certain weight as a penalty, optimizing the squared-error loss to obtain an MLP-LSTM supervised joint model biased toward early prediction:
The squared-error loss function is computed as:

Loss_MSE(Θ) = (1/B) Σ_{i=1}^{B} (ŷ_i − y_i)² + λ Σ_w ‖w‖²   (14)

where Θ, w, B, λ, ŷ_i and y_i denote respectively the parameter set learned by the MLP-LSTM supervised joint model, the set of weight parameters in the model, the batch size, the regularization parameter, and the predicted and true RUL of the i-th sample;
The scoring function Score is computed as:

Loss_score = Σ_{i=1}^{B} s_i,   s_i = e^{−d_i/13} − 1 if d_i < 0,   s_i = e^{d_i/10} − 1 if d_i ≥ 0   (15)

d = RUL_pred − RUL_true   (16)
The global loss function Loss_total is computed as:

Loss_total = α·Loss_score + (1 − α)·Loss_MSE   (17)

where α is the relative weight of the two loss terms;
(3) The MLP-LSTM supervised joint model is trained on the labeled training set with RMSprop adaptive gradients, computed as:

r ← ρr + (1 − ρ) g ⊙ g,   θ ← θ − η g / (δ + √r)   (18)

where r is the accumulation variable of the historical gradient, ρ the shrinkage coefficient controlling how much historical information is kept, η the learning rate, δ a small constant, and g the gradient of Loss_total.
Step five: preprocess the equipment data to be predicted and feed it into the saved MLP-LSTM supervised joint model to obtain HI and RUL values output in real time.
The usefulness of the present invention is illustrated below with a specific industrial example. The example uses the open-source turbofan-engine degradation simulation data set C-MAPSS provided by NASA (National Aeronautics and Space Administration). The data comprise four sub-data-sets, FD001-FD004, with different operating conditions and failure modes; each contains three files, train_FD00X, test_FD00X and RUL_FD00X, which are respectively the training set, the test set, and the ground-truth RUL labels of the test set. Details are shown in the following table:
table 1: details of C-MAPSS dataset
(Table 1 appears only as an image in the source document.)
The data set FD002 is the main research object: compared with FD001 or FD003, its multi-sensor data cover 6 operating conditions, the external environment is more complex, there are more data, and RUL prediction is theoretically harder. The specific meaning of each sensor dimension is shown in the following table:
table 2: multi-sensor data specific representation of a turbomachine
(Table 2 appears only as an image in the source document.)
After the data are obtained, the data set is split to obtain an unlabeled training set, a labeled training set and a labeled validation set; the raw data are condition-normalized according to the 6 operating conditions of the data set, and then sliding-window processing yields the data slice sequences. The turbine's operating conditions strongly influence the sensor values: readings of the same sensor in different states lie in completely different value ranges. Global normalization ignores the operating conditions and normalizes all values of each sensor together, whereas condition normalization normalizes each sensor's data within the same operating condition. Fig. 3 shows the effect of processing sensors 4 and 7 of one turbine unit under the different normalization strategies. With global normalization, although RUL prediction accuracy is unaffected, the HI output of the data-fusion model becomes a globally normalized variable and hardly shows a degradation trend. A condition-normalization strategy is therefore used in preprocessing so that the health index HI exhibits the degradation trend.
The training and test data sets each contain 100 turbine units. A sliding window of size num_steps generates the input sequences within each unit's subset. The model structure itself is mainly influenced by two hyper-parameters, batch_size and num_steps; these are set to different values and different LSTM models are trained in order to compare the effects of the hyper-parameters. Appropriate parameters are then selected according to the Score each model obtains on the test set.
The Dropout and batch-normalization regularization parameters are adjusted according to the evolution and stabilization of train_loss and val_loss: too little regularization causes overfitting, too much degrades model accuracy. To prevent overfitting and reduce training time, an early-stopping strategy is used: a threshold on the change of the loss is set, and training stops when the change does not exceed the threshold for n consecutive epochs. The threshold parameter can be chosen according to the accuracy required of the model.
With the joint training neural network built, the output of the multi-sensor data fusion model, i.e. the time series of the health index HI, can be read from the intermediate network layer. The health-index degradation curve of each turbine is shown in Fig. 4. The HI time series are further smoothed with a Savitzky-Golay filter, a method that performs smoothing in the time domain by combining convolution with local polynomial regression; its characteristic is that it removes noise while keeping the shape and width of the signal unchanged. The filtered HI time series curves are shown in Fig. 5.
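In practice `scipy.signal.savgol_filter(x, window_length, polyorder)` is the usual library route; the minimal NumPy version below (window size and edge handling are illustrative assumptions) shows the local-polynomial idea behind the filter:

```python
import numpy as np

def savgol_smooth(y, window=11, order=2):
    """Minimal Savitzky-Golay smoothing: fit an `order`-degree polynomial by
    least squares inside each sliding window and take its centre value.
    Edges are handled by reflect-padding the series."""
    half = window // 2
    ypad = np.concatenate([y[half:0:-1], y, y[-2:-half - 2:-1]])
    x = np.arange(-half, half + 1)
    out = np.empty_like(y, dtype=float)
    for i in range(len(y)):
        coeffs = np.polyfit(x, ypad[i:i + window], order)
        out[i] = np.polyval(coeffs, 0)   # polynomial value at the window centre
    return out
```

Because each output is the centre of a locally fitted polynomial, signals that are locally polynomial of degree ≤ order pass through unchanged, which is why peak shape and width are preserved.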
RUL prediction on the test set is carried out with the neural network model on the FD002 sub-data set of the C-MAPSS data set; Fig. 6 shows the RUL prediction results of the deep learning model on the multi-sensor time series of the 100 test-set turbines. The MLP-LSTM joint model is found to predict the RUL of each turbine well from its current multi-dimensional sensor time series. Further, by taking each time-series slice as the state of one turbine unit, i.e. feeding the turbine's historical information at each moment as input, online real-time RUL prediction is realized. The prediction results for 4 of these turbines are shown in Fig. 7. Table 3 compares the prediction results with other methods:
table 3: comparison result of each RUL prediction algorithm model under RMSE and SCORE indexes
Methods RMSE Score
MLP 80.03 7.80×10 6
SVR 42.00 1.74×10 4
RVR 31.30 5.90×10 5
CNN 30.29 1.36×10 4
Depth LSTM 14.93 465
MLP-LSTM 12.74 335
As can be seen from Table 3, the LSTM neural network achieves better RUL prediction than conventional machine learning algorithms and other deep learning algorithms. The MLP-LSTM joint training model proposed in this work additionally provides a data-fusion health index HI. Owing to its more complex network structure, deeper model features can be extracted, so the prediction error is smaller and the predicted values are more accurate, while its error on the Score evaluation is of a lower order of magnitude.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and although the invention has been described in detail with reference to the foregoing examples, it will be apparent to those skilled in the art that various changes in the form and details of the embodiments may be made and equivalents may be substituted for elements thereof. All modifications, equivalents and the like which come within the spirit and principle of the invention are intended to be included within the scope of the invention.

Claims (4)

1. A remaining useful life prediction method based on an MLP-LSTM supervised joint model, characterized in that, in the MLP-LSTM supervised joint model, an MLP neural network is inserted between the input layer and a deep LSTM neural network; the MLP neural network is used for data fusion, and the deep LSTM neural network is used for predicting the remaining useful life, i.e. the RUL;
the method comprises the following steps:
the method comprises the following steps: collecting equipment data to form a data set, dividing the data set into a training set and a verification set, and preprocessing the data according to different working conditions;
step two: inputting the training set into an MLP neural network, compressing the multi-dimensional sensor characteristics into HI health characteristic indexes by the MLP neural network, and obtaining a plurality of HI time sequences of health indexes;
step three: inputting the health index HI time sequence into a depth LSTM neural network, and calculating by the depth LSTM neural network to obtain an RUL predicted value;
step four: based on an error calculation loss function between a predicted value and a true value of the RUL, training an MLP-LSTM supervised joint model by adopting a training set through RMSprop gradient self-adaptation; when the error result obtained after the training set and the verification set are input into the current model is smaller than a certain value or the variation of the error result is smaller than a certain value, the loss function of the model training is converged, the model training is finished, and the MLP-LSTM supervised combined model is stored;
the fourth step is specifically divided into the following substeps:
(1) The input layer of the deep LSTM neural network is the l-th layer of the MLP-LSTM supervised joint model and contains n neurons, while the output layer of the MLP neural network is the (l-1)-th layer of the MLP-LSTM supervised joint model and has only one neuron; the neuron errors δ^l and δ^{l-1} of the l-th and (l-1)-th layers of the MLP-LSTM supervised joint model are designed so that the MLP neural network and the deep LSTM neural network are trained synchronously:

δ^l = (w^{l+1})^T δ^{l+1}

δ^{l-1} = (w^l)^T δ^l

wherein w and B are the weight parameters of the neural network and the batch size, respectively;
(2) In the supervised joint training, a squared-error loss function constrained by L2 regularization is applied for gradient-adaptive training of the parameters. A score function is adopted to evaluate the prediction accuracy of the MLP-LSTM supervised joint model and is added to the global loss function, with a certain weight, as a penalty that optimizes the squared-error loss, yielding an MLP-LSTM supervised joint model biased towards early prediction:

wherein the squared-error loss function is computed as:

Loss_MSE(Θ) = (1/B) Σ_{i=1}^{B} (ŷ_i − y_i)^2 + λ‖w‖^2

wherein Θ, w, B, λ, ŷ_i and y_i respectively denote the parameter set learned by the MLP-LSTM supervised joint model, the set of weight parameters of the MLP-LSTM supervised joint model, the batch size, the regularization parameter, and the predicted and the true RUL of the i-th sample;
the calculation formula of the scoring function Score is as follows:
Figure FDA0003932298100000022
d=RUL pred -RUL true
The global loss function Loss_total is computed as:

Loss_total = α·Loss_score + (1 − α)·Loss_MSE

wherein α is the weight balancing the two loss terms;
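The combined loss can be sketched as below. Note the 13/10 time constants in the asymmetric score follow the common C-MAPSS convention and are an assumption here, since the patent's own formula image is not reproduced in this text:

```python
import numpy as np

def score_loss(rul_pred, rul_true):
    """Asymmetric score: late predictions (d >= 0) are penalised more heavily
    than early ones (d < 0), biasing the model towards early prediction."""
    d = rul_pred - rul_true
    return np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0).sum()

def mse_loss(rul_pred, rul_true):
    return np.mean((rul_pred - rul_true) ** 2)

def total_loss(rul_pred, rul_true, alpha=0.1):
    """Loss_total = alpha * Loss_score + (1 - alpha) * Loss_MSE."""
    return alpha * score_loss(rul_pred, rul_true) + (1 - alpha) * mse_loss(rul_pred, rul_true)
```

The exponential branches make an over-estimated RUL (a "late" warning) cost more than an equally wrong under-estimate, which is the stated reason for adding the score as a penalty.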
(3) The MLP-LSTM supervised joint model is trained with the labeled training set through RMSprop gradient adaptation, computed as:

r ← ρ·r + (1 − ρ)·g ⊙ g

θ ← θ − η/(δ + √r) ⊙ g

where r is the accumulation variable of the historical gradient, ρ is the decay coefficient controlling how much historical information is kept, η is the learning rate, δ is a small constant, and g is the gradient of Loss_total;
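A minimal scalar sketch of one RMSprop update (the function name and default hyper-parameters are illustrative, not from the patent):

```python
import numpy as np

def rmsprop_step(theta, g, r, eta=1e-3, rho=0.9, delta=1e-6):
    """One RMSprop update: accumulate the squared gradient into r, then scale
    the step by the root of that running average.

    r_new     = rho * r + (1 - rho) * g * g
    theta_new = theta - eta * g / (delta + sqrt(r_new))
    """
    r = rho * r + (1.0 - rho) * g * g
    theta = theta - eta * g / (delta + np.sqrt(r))
    return theta, r
```

Dividing by √r normalizes the step per parameter, so dimensions with persistently large gradients take smaller effective steps.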
step five: and preprocessing the equipment data to be predicted, and inputting the preprocessed equipment data into a stored MLP-LSTM supervised joint model to obtain HI and RUL values output in real time.
2. The MLP-LSTM supervised joint model-based remaining service life prediction method as recited in claim 1, wherein the labeled data set in the first step is:
X_o = {(x_{it}, rul_{it}) | i ≤ n, t ≤ T_i} (1)

wherein rul_{it} is the value of the remaining useful life at time t,

rul_{it} = T_i − t (2)

when the device is completely unusable, rul_{it} = 0, and every rul_{it} decreases monotonically along the time sequence;

x_{it} is the i-th sensor data sequence from the initial time to time t,

x_{it} = [x_i(1), x_i(2), ..., x_i(t)] (3)

wherein x_i is the i-th sensor data sequence from the initial time to time T_i,

x_i = [x_i(1), x_i(2), ..., x_i(T_i)] (4)
the preprocessing comprises normalization and sliding-time-window sampling; when the equipment data come from different working conditions, conditional normalization is carried out, otherwise global normalization is carried out.
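Per equations (1)-(2), the RUL labels of a unit that fails at time T_i count down from T_i − 1 to 0; a small sketch (the helper name is illustrative):

```python
import numpy as np

def make_rul_labels(unit_lengths):
    """For each unit with failure time T, build labels rul = T - t for
    t = 1..T, i.e. a countdown that reaches 0 at the failure point."""
    return [np.arange(T - 1, -1, -1) for T in unit_lengths]
```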
3. The MLP-LSTM supervised joint model-based remaining useful life prediction method as recited in claim 1, wherein in the second step the multi-dimensional multi-sensor time series is input into the MLP, the MLP compresses the multi-dimensional data into one dimension, and finally a set comprising the health index HI time series is output;
the MLP building and pre-training process comprises the following steps:
the multi-dimensional multi-sensor time series is input into the MLP neural network, which compresses the multi-dimensional data into one dimension; in the forward propagation of the MLP neural network, each node is computed from all nodes of the previous layer: each previous-layer node is given a weight W, a bias b is added, and the value of a node in the next layer is finally obtained through an activation function:

wherein the value of node j in layer l+1 is

h_j^{l+1} = f(Σ_i W_{ij}^l h_i^l + b_j^{l+1}) (5)

The output of the last layer of the MLP neural network is the set H of HI time sequences

H = {h_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T} (6)

wherein H is the set composed of the health index h_i(t_j) at each time point, h_i(t_j) = f(x_i(t_j)), f(x_i) being the function realized by the MLP neural network; T is the length of the time sequence; h denotes the health index HI; x_i(t_j) denotes the set of sensor readings l at time t_j, x_i(t_j) = [l_{i,1}(t_j), l_{i,2}(t_j), ..., l_{i,p}(t_j)] ∈ R^{1×p}; x denotes a raw sample and p the number of sensors; the set of all x_i(t_j) is X, X = {x_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T}.
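A minimal NumPy sketch of the MLP forward pass of equation (5) compressing a p-dimensional sensor vector to a scalar HI (the layer sizes, seed, and sigmoid activation are illustrative assumptions; the patent does not fix them):

```python
import numpy as np

def mlp_hi(x, weights, biases):
    """Forward pass: each layer computes f(W @ h + b); the activation f is
    taken to be a sigmoid here (an assumption). The final layer has one
    neuron, so the output is a scalar health index HI."""
    h = x
    for W, b in zip(weights, biases):
        h = 1.0 / (1.0 + np.exp(-(W @ h + b)))   # sigmoid activation
    return h

# hypothetical shapes: p = 14 sensors -> 8 hidden units -> 1 HI value
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 14)), rng.normal(size=(1, 8))]
biases = [np.zeros(8), np.zeros(1)]
hi = mlp_hi(rng.normal(size=14), weights, biases)
```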
4. The MLP-LSTM supervised joint model-based remaining service life prediction method as recited in claim 1, wherein the step III is specifically divided into the following sub-steps:
the depth LSTM neural network is formed by stacking a plurality of layers of LSTMs, and the vector dimension of each layer of LSTM is variable; the HI health index is decoded into a multidimensional sensor time sequence through a first layer LSTM, the output of an upper layer of a deep LSTM network is used as the input of a next layer, and an updating formula of a l layer is as follows:
Figure FDA0003932298100000032
Figure FDA0003932298100000033
Figure FDA0003932298100000034
Figure FDA0003932298100000035
Figure FDA0003932298100000036
wherein l represents the number of layers of the deep LSTM neural network, t represents the number of units at a certain time of the LSTM,
Figure FDA0003932298100000041
an input unit indicating the time t of the l-th layer,
Figure FDA0003932298100000042
representing a forgetting unit at the moment of the ith layer t,
Figure FDA0003932298100000043
an output unit indicating the time t of the l-th layer,
Figure FDA0003932298100000044
a status cell indicating the time t of the l-th layer,
Figure FDA0003932298100000045
indicating a hidden unit at the moment of the ith layer t,. Sigma.indicating a sigmoid activation function,. Alpha.indicating an element multiplication calculation,. Tanh indicating a tanh activation function,
Figure FDA0003932298100000046
representing the hidden unit weight at the time t of layer l-1,
Figure FDA0003932298100000047
representing the hidden unit weight at time t-1 of the l-th layer,
Figure FDA0003932298100000048
represents a deviation;
and outputting the multi-dimensional characteristic vector by the last unit of the LSTM neural network of the last layer, and obtaining the RUL predicted value through linear layer calculation.
CN202111623573.6A 2021-12-28 2021-12-28 Residual service life prediction method based on MLP-LSTM supervised joint model Active CN114282443B (en)

Publications (2)

Publication Number Publication Date
CN114282443A CN114282443A (en) 2022-04-05
CN114282443B true CN114282443B (en) 2023-03-17





