CN114282443A - Residual service life prediction method based on MLP-LSTM supervised joint model - Google Patents

Residual service life prediction method based on MLP-LSTM supervised joint model

Info

Publication number
CN114282443A
Authority
CN
China
Prior art keywords
mlp
lstm
neural network
layer
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111623573.6A
Other languages
Chinese (zh)
Other versions
CN114282443B (en)
Inventor
张新民
张雨桐
李乐清
朱哲人
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202111623573.6A priority Critical patent/CN114282443B/en
Publication of CN114282443A publication Critical patent/CN114282443A/en
Application granted granted Critical
Publication of CN114282443B publication Critical patent/CN114282443B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a residual service life prediction method based on an MLP-LSTM supervised joint model. First, a multi-layer perceptron (MLP) fuses the multi-dimensional time-series historical information to extract health-index features of a machine; the extracted health-index time series is then fed into an LSTM, which calculates the machine's current remaining useful life (RUL). Further, the two serially connected neural networks are trained in a supervised manner on a labeled sample data set to update the weights, the prediction results are evaluated on a verification set, and the parameters are adaptively adjusted to obtain an optimized model. The trained MLP-LSTM supervised joint model not only effectively improves the ability of the LSTM to predict the remaining useful life, but also provides a feature-fusion result of the multi-dimensional sensor data that effectively expresses the current health condition of the machine and offers an effective reference index for equipment maintenance.

Description

Residual service life prediction method based on MLP-LSTM supervised joint model
Technical Field
The invention belongs to the field of industrial process control, and particularly relates to a residual service life prediction method based on an MLP-LSTM supervised joint model.
Background
In the industrial field, the working performance and health of important machine equipment and industrial components tend to decline during continuous operation, owing to internal operating factors or external environmental factors. As the health condition keeps deteriorating, at some future time the equipment can no longer work normally: its efficiency drops rapidly or it stops operating altogether, reaching the end of its useful life, which can disturb or even interrupt the industrial process. It is therefore necessary to predict the Remaining Useful Life (RUL) of the system over its entire service life, i.e., the length of time from the current moment until the end of the useful life of the machine equipment.
In recent years, with the collection and accumulation of large amounts of industrial data, data-driven solutions have received much attention in RUL prediction. A data-driven solution does not require detailed knowledge of the operating mechanism of the mechanical system; it only needs to identify the condition of the system from sensor data using a data-driven algorithm, so the remaining useful life of modern plant equipment with complex mechanistic models can be predicted accurately. Traditional predictions are mainly based on physical degradation models, and the correct establishment of a degradation model relies heavily on expert knowledge; these assumptions and requirements severely limit practical industrial applications. Traditional machine learning methods need manually designed features, which demand substantial domain expertise from practitioners and a separate feature-extraction process, making wide application of such models difficult. Because deep learning can automatically extract features from data and depends less on prior knowledge of the system, recent research shows that, compared with physical degradation models and traditional machine learning algorithms, deep learning can better handle industrial big data and predict the remaining useful life of mechanical equipment more accurately.
However, current research on predicting RUL with deep learning methods still has some problems. First, data fusion and prediction are usually divided into two separate steps: data fusion is performed first to obtain a health index, and the fused signal is then used for RUL prediction; this conventional procedure lacks an intrinsic connection between the two tasks and cannot explain the relationship between the fused signal and the prediction result. A common alternative for deep-learning RUL prediction on multi-sensor data is therefore to produce an end-to-end prediction output directly. The advantage of this approach is that it is completely data-driven, without assuming a degradation model or parametric distribution and without manual feature extraction. However, it is a black-box approach and cannot provide any information about the performance-degradation process. The remaining useful life is a time value that decays linearly, whereas the physical condition of the machine does not change linearly but decays exponentially, and machine maintenance depends to a great extent on this exponential behaviour so that maintenance can be completed before the machine enters the period of rapid decay. There is therefore a need to predict the remaining useful life and build a health indicator from sensor information simultaneously.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a residual service life prediction method based on an MLP-LSTM supervised joint model, which can obtain information on the performance-degradation process while estimating the current remaining useful life of a machine, and at the same time improves the prediction performance of a single LSTM neural network.
A residual service life prediction method based on an MLP-LSTM supervised joint model is characterized in that an MLP neural network is added between an input layer and a deep LSTM neural network, and the MLP neural network is used for data fusion; the deep LSTM neural network is used for predicting the residual service life, namely RUL;
the method comprises the following steps:
step one: collecting equipment data to form a data set, dividing the data set into a training set and a verification set, and preprocessing the data according to different working conditions;
step two: inputting the training set into an MLP neural network, compressing the multi-dimensional sensor characteristics into HI health characteristic indexes by the MLP neural network, and obtaining a plurality of HI time sequences of health indexes;
step three: inputting the health index HI time sequence into a depth LSTM neural network, and calculating by the depth LSTM neural network to obtain an RUL predicted value;
step four: calculating a loss function based on an error between a predicted value and a true value of the RUL, and training an MLP-LSTM supervised joint model by adopting a training set through RMSprop gradient self-adaptation; when the error result obtained after the training set and the verification set are input into the current model is smaller than a certain value or the variation of the error result is smaller than a certain value, the loss function of the model training is converged, the model training is finished, and the MLP-LSTM supervised combined model is stored;
step five: and preprocessing the equipment data to be predicted, and inputting the preprocessed equipment data into a stored MLP-LSTM supervised joint model to obtain HI and RUL values output in real time.
Further, the tagged data set in the first step is:
X_o = {(x_{it}, rul_{it}) | i ≤ n, t ≤ T_i}    (1)

where rul_{it} is the value of the remaining service life at time t,

rul_{it} = T_i - t    (2)

When the device is completely out of use, rul_{it} is 0, and all rul_{it} increase in reverse time order; x_{it} is the i-th sensor data sequence from the initial time to time t,

x_{it} = [x_i(1), x_i(2), ..., x_i(t)]    (3)

where x_i is the i-th sensor data sequence from the initial time to time T_i,

x_i = [x_i(1), x_i(2), ..., x_i(T_i)]    (4)

The preprocessing comprises normalization and sliding-time-window sampling; when the equipment data come from different working conditions, conditional normalization is performed, otherwise global normalization is performed.
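To make the label construction in equations (1)-(2) concrete, the following Python sketch (illustrative only and not part of the patent text; the array names and shapes are assumptions) attaches the label rul_{it} = T_i - t to each run-to-failure sequence with NumPy:

import numpy as np

def build_rul_labels(sensor_runs):
    """sensor_runs: list of arrays, each of shape (T_i, p) for one run-to-failure unit.
    Returns (x_run, rul_run) pairs with rul_it = T_i - t as in equation (2)."""
    labeled = []
    for x_run in sensor_runs:
        T_i = x_run.shape[0]
        # RUL reaches 0 at the last cycle and increases in reverse time order
        rul_run = np.arange(T_i - 1, -1, -1, dtype=np.float32)
        labeled.append((x_run, rul_run))
    return labeled

# toy example: two units with 5 and 3 cycles of p = 4 sensors
runs = [np.random.rand(5, 4), np.random.rand(3, 4)]
for x_run, rul_run in build_rul_labels(runs):
    print(x_run.shape, rul_run)   # e.g. (5, 4) [4. 3. 2. 1. 0.]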
Further, in the second step, a multi-sensor information multi-dimensional time sequence is input into the MLP, the MLP compresses the multi-dimensional data into one dimension, and finally a set including a health index HI time sequence is output;
the MLP building and pre-training process is as follows:
inputting a multi-sensor information multi-dimensional time sequence into an MLP neural network, and compressing multi-dimensional data into one dimension by the MLP neural network; in the MLP neural network forward propagation process, each node is obtained by calculating all nodes of the previous layer, a weight W is given to each node of the previous layer, a bias b is added, and finally the value of a certain node of the next layer is obtained through an activation function:
wherein the value of the L +1 layer node j is
a_j^{l+1} = φ( Σ_i W_{ji}^{l} a_i^{l} + b_j^{l+1} )    (5)

where φ(·) is the activation function. The output of the last layer of the MLP neural network is the set H of HI time series:

H = {h_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}    (6)

where H is the set composed of the health indices h_i(t_j) at each time point, h_i(t_j) = f(x_i(t_j)), and f(·) is the mapping implemented by the MLP neural network; T_i is the length of the time series; h denotes the health index HI; x_i(t_j) denotes the set of sensor readings l at time t_j, x_i(t_j) = [l_{i,1}(t_j), l_{i,2}(t_j), ..., l_{i,p}(t_j)] ∈ R^{1×p}; x denotes a raw sample and p the number of sensors; the set of all x_i(t_j) is denoted X, X = {x_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}.
Further, the third step is specifically divided into the following sub-steps:
the depth LSTM neural network is formed by stacking a plurality of layers of LSTMs, and the vector dimension of each layer of LSTM is variable; the HI health index is decoded into a multidimensional sensor time sequence through a first layer LSTM, the output of the upper layer of a depth LSTM network is used as the input of the next layer, and the updating formula of the l layer is as follows:
i_t^l = σ( W_i^l · [h_t^{l-1}, h_{t-1}^l] + b_i^l )    (7)

f_t^l = σ( W_f^l · [h_t^{l-1}, h_{t-1}^l] + b_f^l )    (8)

o_t^l = σ( W_o^l · [h_t^{l-1}, h_{t-1}^l] + b_o^l )    (9)

c_t^l = f_t^l ⊙ c_{t-1}^l + i_t^l ⊙ tanh( W_c^l · [h_t^{l-1}, h_{t-1}^l] + b_c^l )    (10)

h_t^l = o_t^l ⊙ tanh( c_t^l )    (11)

where l denotes the layer index of the deep LSTM neural network and t denotes the time step of the LSTM unit; i_t^l denotes the input unit of the l-th layer at time t, f_t^l the forget unit of the l-th layer at time t, o_t^l the output unit of the l-th layer at time t, c_t^l the state cell of the l-th layer at time t, and h_t^l the hidden unit of the l-th layer at time t; σ denotes the sigmoid activation function, ⊙ denotes element-wise multiplication, and tanh denotes the tanh activation function; h_t^{l-1} is the hidden unit of layer l-1 at time t and h_{t-1}^l is the hidden unit of the l-th layer at time t-1, to which the gate weights W^l are applied, and b^l denotes the bias.

The last unit of the last LSTM layer outputs a multi-dimensional feature vector, from which the RUL predicted value is obtained through a linear-layer calculation.
Further, the step four is specifically divided into the following sub-steps:
(1) The input layer of the deep LSTM neural network is the l-th layer of the MLP-LSTM supervised joint model and contains n neurons, while the output layer of the MLP neural network is the (l-1)-th layer of the joint model and has only one neuron; the neuron errors δ^l and δ^{l-1} of the l-th and (l-1)-th layers of the MLP-LSTM supervised joint model are designed to realize synchronous training of the MLP neural network and the deep LSTM neural network:

δ^l = (w^{l+1})^T δ^{l+1}    (12)

and δ^{l-1} is obtained from δ^l and the layer weights through the corresponding back-propagation relation (13), where w and B are the weight parameters and the batch size of the neural network, respectively;

(2) In the supervised joint training, a squared-error loss function with an L2 regularization constraint is used for gradient-adaptive training of the parameters; a score function is adopted to evaluate the prediction accuracy of the MLP-LSTM supervised joint model and is added, with a certain weight, to the global loss function as a penalty, so that the squared-error loss function is optimized and an MLP-LSTM supervised joint model biased toward early prediction is obtained.

The squared-error loss function is calculated as:

Loss_MSE(Θ) = (1/B) Σ_{i=1}^{B} (ŷ_i - y_i)^2 + λ·Σ ||w||_2^2    (14)

where Θ, w, B, λ, ŷ_i and y_i denote, respectively, the set of parameters learned in the MLP-LSTM supervised joint model, the set of weight parameters in the MLP-LSTM supervised joint model, the batch size, the regularization parameter, the predicted RUL, and the true RUL of the i-th sample;

The scoring function Score is calculated as:

Score = Σ_{i=1}^{B} s_i,  with s_i = exp(-d_i/13) - 1 for d_i < 0 and s_i = exp(d_i/10) - 1 for d_i ≥ 0    (15)

d = RUL_pred - RUL_true    (16)

The global loss function Loss_total is calculated as:

Loss_total = α·Loss_score + (1 - α)·Loss_MSE    (17)

where α is the weight balancing the score penalty and the squared-error loss;

(3) The MLP-LSTM supervised joint model is trained with the labeled training set through RMSprop gradient adaptation, whose update is:

r ← ρ·r + (1 - ρ)·g ⊙ g,   θ ← θ - (η / (δ + √r)) ⊙ g    (18)

where r is the accumulated variable of the historical gradient, ρ is the contraction coefficient controlling how much historical information is retained, η is the learning rate, δ is a small constant, and g is the gradient of Loss_total.
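As an illustration of the loss defined in equations (14)-(17), the following sketch evaluates a weighted combination of the squared-error term and an asymmetric score-style penalty in plain NumPy; the exponential time constants 13 and 10 follow the scoring function commonly used with the C-MAPSS benchmark and are an assumption here, as are the variable names, the weight alpha and the regularization strength:

import numpy as np

def score_penalty(rul_pred, rul_true):
    # asymmetric penalty: late predictions (d >= 0) are punished more heavily than early ones
    d = rul_pred - rul_true                      # equation (16)
    return np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0).sum()

def total_loss(rul_pred, rul_true, weights, alpha=0.1, lam=1e-4):
    # squared error with L2 regularization (14) combined with the score penalty (17)
    mse = np.mean((rul_pred - rul_true) ** 2) + lam * sum(np.sum(w ** 2) for w in weights)
    return alpha * score_penalty(rul_pred, rul_true) + (1.0 - alpha) * mse

pred = np.array([48.0, 60.0, 100.0])
true = np.array([50.0, 55.0, 110.0])
print(total_loss(pred, true, weights=[np.ones((3, 3))]))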
The invention has the following beneficial effects:
the invention provides a general RUL semi-supervised joint prediction framework method aiming at analyzing the conventional RUL prediction method in the running degradation process of machine equipment and combining a deep learning theory and the problems existing in the previous research, thereby realizing the synchronous RUL prediction of health index data fusion and multi-sensor data. The training model provides a continuous visualization process of system degradation, but also ensures efficient prediction of the generated fusion signal for RUL and rapid convergence of the predictive model training process. Furthermore, a loss function of the RUL life prediction model is modified, so that the trained model is more biased to early prediction, the prediction result can ensure the maintenance to be advanced, and the prediction model is safer.
Drawings
FIG. 1 is a schematic diagram of an MLP-LSTM neural network model;
FIG. 2 is a flow diagram of a joint model framework training implementation;
FIG. 3 is a diagram illustrating results generated after different normalization strategies;
FIG. 4 is a plot of the data-fusion output health indicator HI over time (raw), wherein the upper graph in FIG. 4 shows the HI output for all turbine data in the test set and the lower graph shows the HI output for selected partial sample data of the test set.
FIG. 5 is a plot of the data-fusion output health indicator HI over time (filtered), wherein the upper graph in FIG. 5 shows the HI output for all turbine data in the test set and the lower graph shows the HI output for selected partial sample data of the test set.
FIG. 6 is a schematic diagram of a fitting curve of the MLP-LSTM neural network model.
FIG. 7 is a schematic diagram of an on-line prediction RUL fitting curve.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and preferred embodiments, so that its objects and effects become more apparent. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
In the method for predicting the remaining useful life based on the MLP-LSTM supervised joint model, the joint model is realized by adding an MLP neural network between the input layer and a deep LSTM neural network; the MLP neural network is used for data fusion, and the deep LSTM neural network is used for remaining-life (RUL) prediction. The data-slice sequences are fed into the MLP neural network of the MLP-LSTM supervised joint model to compute a one-dimensional HI time series. The last layer of the MLP neural network is connected to the deep LSTM network model, i.e., the HI sequence fused by the MLP neural network is input into the deep LSTM network in order, according to the time_step length, for calculation. After the complete MLP-LSTM joint prediction model is obtained, gradient-descent iteration is performed with an RMSprop optimizer. The concrete structure is shown in FIG. 1. In addition, because the MLP neural network extracts deep data features, over-fitting can easily occur during training; the network is therefore optimized with batch-normalization layers, and a regularization term is added to the MLP neural network to address this problem. When the test-set data are input, the MLP network outputs the evolution of the health index HI, and the final network layer of the whole model outputs the RUL prediction. In the concrete engineering implementation, the network layers of the two models are connected within the same model and jointly trained iteratively with an RMSprop optimizer. The entire training process is shown in FIG. 2.
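For concreteness, a minimal sketch of the joint architecture described above is given below, written with TensorFlow/Keras; the framework choice, the number and width of the layers, the activation functions and the learning rate are illustrative assumptions rather than values disclosed in this patent. A TimeDistributed MLP compresses the p sensor channels at every time step into a one-dimensional HI, and the stacked LSTM maps the resulting HI sequence to a single RUL output:

import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

def build_mlp_lstm(time_steps, feature_nums, l2=1e-4):
    inputs = layers.Input(shape=(time_steps, feature_nums))
    # MLP applied at every time step: fuses the sensor channels into one HI value
    x = layers.TimeDistributed(layers.Dense(32, activation="relu",
                                            kernel_regularizer=regularizers.l2(l2)))(inputs)
    x = layers.TimeDistributed(layers.BatchNormalization())(x)
    hi = layers.TimeDistributed(layers.Dense(1, activation="sigmoid"), name="health_index")(x)
    # stacked (deep) LSTM on the HI time series, last unit feeds a linear RUL head
    x = layers.LSTM(64, return_sequences=True)(hi)
    x = layers.LSTM(32)(x)
    rul = layers.Dense(1, activation="linear", name="rul")(x)
    model = models.Model(inputs, [hi, rul])
    model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-3),
                  loss={"rul": "mse"})
    return model

model = build_mlp_lstm(time_steps=30, feature_nums=21)
model.summary()

Only the RUL head carries a loss term in this sketch, so the HI output emerges from the joint optimization and can be read out at prediction time together with the RUL, which matches the requirement that HI and RUL be produced simultaneously.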
The method comprises the following steps:
the method comprises the following steps: collecting equipment data to form a data set, dividing the data set into a training set and a verification set, and preprocessing the data according to different working conditions; wherein the data of the labeled training set comprises a time stamp of time sequence data, a value of each characteristic variable at each moment, and an RUL data label or equipment life end time for calculating the RUL label; the content of the labeled verification set is the same as that of the labeled training set, and the number of the labeled verification set is 10% -30% of that of the labeled training set.
The tagged data set in the step one is as follows:
X_o = {(x_{it}, rul_{it}) | i ≤ n, t ≤ T_i}    (1)

where rul_{it} is the value of the remaining service life at time t,

rul_{it} = T_i - t    (2)

When the device is completely out of use, rul_{it} is 0, and all rul_{it} increase in reverse time order; x_{it} is the i-th sensor data sequence from the initial time to time t,

x_{it} = [x_i(1), x_i(2), ..., x_i(t)]    (3)

where x_i is the i-th sensor data sequence from the initial time to time T_i,

x_i = [x_i(1), x_i(2), ..., x_i(T_i)]    (4)

The preprocessing comprises normalization and sliding-time-window sampling; when the equipment data come from different working conditions, conditional normalization is performed, otherwise global normalization is performed.
The LSTM recurrent neural network has a standard input form (batch_size, time_steps, feature_nums), where batch_size is the number of samples processed in one batch during training of the neural network model, time_steps is the time-step length of the time-series data in each sample, and feature_nums is the number of feature dimensions of the multi-sensor data. To process the data set into this standard form, samples are drawn with a sliding time window.
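A minimal sliding-time-window sampler that reshapes one unit's normalized sensor matrix into the (batch_size, time_steps, feature_nums) form expected by the LSTM might look as follows; the stride of 1 and the choice of labelling each window with the RUL at its last cycle are assumptions made for illustration:

import numpy as np

def sliding_windows(x_run, rul_run, time_steps):
    # x_run: (T_i, feature_nums) sensor matrix of one unit; rul_run: (T_i,) RUL labels
    windows, labels = [], []
    for end in range(time_steps, x_run.shape[0] + 1):
        windows.append(x_run[end - time_steps:end])
        labels.append(rul_run[end - 1])          # label = RUL at the last cycle of the window
    return np.stack(windows), np.array(labels)

X, y = sliding_windows(np.random.rand(120, 21), np.arange(119, -1, -1), time_steps=30)
print(X.shape, y.shape)   # (91, 30, 21) (91,)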
Step two: inputting the training set into an MLP neural network, compressing the multi-dimensional sensor characteristics into HI health characteristic indexes by the MLP neural network, and obtaining a plurality of HI time sequences of health indexes;
in the second step, a multi-sensor information multi-dimensional time sequence is input into the MLP, the MLP compresses the multi-dimensional data into one dimension, and finally a set comprising a health index HI time sequence is output;
the MLP building and pre-training process is as follows:
inputting a multi-sensor information multi-dimensional time sequence into an MLP neural network, and compressing multi-dimensional data into one dimension by the MLP neural network; in the MLP neural network forward propagation process, each node is obtained by calculating all nodes of the previous layer, a weight W is given to each node of the previous layer, a bias b is added, and finally the value of a certain node of the next layer is obtained through an activation function:
wherein the value of the L +1 layer node j is
a_j^{l+1} = φ( Σ_i W_{ji}^{l} a_i^{l} + b_j^{l+1} )    (5)

where φ(·) is the activation function. The output of the last layer of the MLP neural network is the set H of HI time series:

H = {h_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}    (6)

where H is the set composed of the health indices h_i(t_j) at each time point, h_i(t_j) = f(x_i(t_j)), and f(·) is the mapping implemented by the MLP neural network; T_i is the length of the time series; h denotes the health index HI; x_i(t_j) denotes the set of sensor readings l at time t_j, x_i(t_j) = [l_{i,1}(t_j), l_{i,2}(t_j), ..., l_{i,p}(t_j)] ∈ R^{1×p}; x denotes a raw sample and p the number of sensors; the set of all x_i(t_j) is denoted X, X = {x_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}.
Step three: inputting the health index HI time sequence into a depth LSTM neural network, and calculating by the depth LSTM neural network to obtain an RUL predicted value;
the third step is specifically divided into the following substeps:
the depth LSTM network is formed by stacking a plurality of layers of LSTMs, and the vector dimension of each layer of LSTM is variable; the HI health index is decoded into a multidimensional sensor time sequence through a first layer LSTM, the output of the upper layer of a depth LSTM network is used as the input of the next layer, and the updating formula of the l layer is as follows:
i_t^l = σ( W_i^l · [h_t^{l-1}, h_{t-1}^l] + b_i^l )    (7)

f_t^l = σ( W_f^l · [h_t^{l-1}, h_{t-1}^l] + b_f^l )    (8)

o_t^l = σ( W_o^l · [h_t^{l-1}, h_{t-1}^l] + b_o^l )    (9)

c_t^l = f_t^l ⊙ c_{t-1}^l + i_t^l ⊙ tanh( W_c^l · [h_t^{l-1}, h_{t-1}^l] + b_c^l )    (10)

h_t^l = o_t^l ⊙ tanh( c_t^l )    (11)

where l denotes the layer index of the deep LSTM neural network and t denotes the time step of the LSTM unit; i_t^l denotes the input unit of the l-th layer at time t, f_t^l the forget unit of the l-th layer at time t, o_t^l the output unit of the l-th layer at time t, c_t^l the state cell of the l-th layer at time t, and h_t^l the hidden unit of the l-th layer at time t; σ denotes the sigmoid activation function, ⊙ denotes element-wise multiplication, and tanh denotes the tanh activation function; h_t^{l-1} is the hidden unit of layer l-1 at time t and h_{t-1}^l is the hidden unit of the l-th layer at time t-1, to which the gate weights W^l are applied, and b^l denotes the bias.

The last unit of the last LSTM layer outputs a multi-dimensional feature vector, from which the RUL predicted value is obtained through a linear-layer calculation.
Step four: calculating a loss function based on an error between a predicted value and a true value of the RUL, and training an MLP-LSTM supervised joint model by adopting a training set through RMSprop gradient self-adaptation; when the error result obtained after the training set and the verification set are input into the current model is smaller than a certain value or the variation of the error result is smaller than a certain value, the loss function of the model training is converged, the model training is finished, and the MLP-LSTM supervised combined model is stored;
the fourth step is specifically divided into the following substeps:
(1) The input layer of the deep LSTM neural network is the l-th layer of the MLP-LSTM supervised joint model and contains n neurons, while the output layer of the MLP neural network is the (l-1)-th layer of the joint model and has only one neuron; the neuron errors δ^l and δ^{l-1} of the l-th and (l-1)-th layers of the MLP-LSTM supervised joint model are designed to realize synchronous training of the MLP neural network and the deep LSTM neural network:

δ^l = (w^{l+1})^T δ^{l+1}    (12)

and δ^{l-1} is obtained from δ^l and the layer weights through the corresponding back-propagation relation (13), where w and B are the weight parameters and the batch size of the neural network, respectively;

(2) In the supervised joint training, a squared-error loss function with an L2 regularization constraint is used for gradient-adaptive training of the parameters; a score function is adopted to evaluate the prediction accuracy of the MLP-LSTM supervised joint model and is added, with a certain weight, to the global loss function as a penalty, so that the squared-error loss function is optimized and an MLP-LSTM supervised joint model biased toward early prediction is obtained.

The squared-error loss function is calculated as:

Loss_MSE(Θ) = (1/B) Σ_{i=1}^{B} (ŷ_i - y_i)^2 + λ·Σ ||w||_2^2    (14)

where Θ, w, B, λ, ŷ_i and y_i denote, respectively, the set of parameters learned in the MLP-LSTM supervised joint model, the set of weight parameters in the MLP-LSTM supervised joint model, the batch size, the regularization parameter, the predicted RUL, and the true RUL of the i-th sample;

The scoring function Score is calculated as:

Score = Σ_{i=1}^{B} s_i,  with s_i = exp(-d_i/13) - 1 for d_i < 0 and s_i = exp(d_i/10) - 1 for d_i ≥ 0    (15)

d = RUL_pred - RUL_true    (16)

The global loss function Loss_total is calculated as:

Loss_total = α·Loss_score + (1 - α)·Loss_MSE    (17)

where α is the weight balancing the score penalty and the squared-error loss;

(3) The MLP-LSTM supervised joint model is trained with the labeled training set through RMSprop gradient adaptation, whose update is:

r ← ρ·r + (1 - ρ)·g ⊙ g,   θ ← θ - (η / (δ + √r)) ⊙ g    (18)

where r is the accumulated variable of the historical gradient, ρ is the contraction coefficient controlling how much historical information is retained, η is the learning rate, δ is a small constant, and g is the gradient of Loss_total.
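The RMSprop update of equation (18) can be written out directly; the sketch below is a plain NumPy version with illustrative hyper-parameter values for ρ, η and δ, which the patent does not specify:

import numpy as np

def rmsprop_update(theta, grad, r, rho=0.9, eta=1e-3, delta=1e-8):
    # r accumulates the squared history gradient g, as in equation (18)
    r = rho * r + (1.0 - rho) * grad * grad
    theta = theta - eta * grad / (delta + np.sqrt(r))
    return theta, r

theta, r = np.array([1.0, -2.0]), np.zeros(2)
for _ in range(3):
    grad = 2.0 * theta                           # gradient of a toy quadratic loss
    theta, r = rmsprop_update(theta, grad, r)
print(theta)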
Step five: and preprocessing the equipment data to be predicted, and inputting the preprocessed equipment data into a stored MLP-LSTM supervised joint model to obtain HI and RUL values output in real time.
The usefulness of the present invention is illustrated below with a specific industrial example. The invention uses the open-source turbofan engine degradation simulation data set C-MAPSS provided by NASA as the example. The data comprise four sub-data sets, FD001-FD004, with different operating conditions and failure modes, and each sub-data set contains three files, train_FD00X, test_FD00X and RUL_FD00X, which are, respectively, the training set, the test set and the RUL ground-truth labels of the test set. The details are shown in the following table:
table 1: details of C-MAPSS dataset
[Table 1 is provided as an image in the original publication.]
The data set FD002 is mainly used as the research object. Compared with FD001 or FD003, its multi-sensor data cover 6 working conditions, the external environment is more complex, more data are available, and the RUL is theoretically harder to predict. The specific meanings of the sensor dimensions are shown in the following table:
table 2: multi-sensor data specific representation of a turbomachine
[Table 2 is provided as an image in the original publication.]
After the data are obtained, the data set is divided to obtain an unlabeled training set, a labeled training set and a labeled verification set; the raw data are condition-normalized according to the 6 working conditions of the data set, and sliding-window processing is then performed to obtain the data-slice sequences. The operating conditions of the turbine have a great influence on the sensor values, and the readings of a sensor in different states lie in completely different value ranges. Global normalization ignores the operating conditions and normalizes all values of each sensor simultaneously, whereas conditional normalization normalizes the data of each sensor separately under each working condition. FIG. 3 shows the readings of sensors 4 and 7 of one turbine unit after processing under the different normalization strategies. If global normalization is used, the prediction accuracy of the RUL is not affected, but the output HI of the data-fusion model becomes a globally normalized variable and it is difficult to present a degradation trend. A conditional-normalization strategy is therefore used in the data preprocessing to obtain a health index HI that shows the degradation trend.
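A conditional-normalization step of the kind described above can be sketched with pandas by standardizing each sensor column within each operating-condition group; the column names used here (a discretized condition id and sensor columns s1, s2, ...) are assumptions made for illustration:

import pandas as pd

def conditional_normalize(df, sensor_cols, condition_col="condition"):
    # z-score each sensor column separately within each operating-condition group
    out = df.copy()
    grouped = out.groupby(condition_col)[sensor_cols]
    out[sensor_cols] = (out[sensor_cols] - grouped.transform("mean")) / (grouped.transform("std") + 1e-8)
    return out

# toy frame: 6 cycles, 2 operating conditions, 2 sensors
df = pd.DataFrame({"condition": [0, 0, 0, 1, 1, 1],
                   "s1": [518.0, 519.0, 520.0, 642.0, 641.5, 643.0],
                   "s2": [641.8, 642.1, 642.3, 1588.0, 1589.5, 1590.2]})
print(conditional_normalize(df, ["s1", "s2"]))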
The training data set and the test data set each contain 100 turbine units. A sliding window of size num_steps is used within the subset of each unit to generate the input sequences. The model structure itself is mainly influenced by two hyper-parameters, batch_size and num_steps; these are set to different values and different LSTM models are trained in order to compare the effects of the hyper-parameters. The appropriate parameters are then selected according to the Score obtained by each model on the test set.
The Dropout rate and the regularization parameters of the BN network are adjusted according to the changes and stability of train_loss and val_loss: too little regularization can cause over-fitting, while too much can degrade model accuracy. To prevent over-fitting and reduce training time, an early-stopping strategy is used: a threshold on the change of the loss decrease is set, and training is stopped when the change does not exceed the threshold for n consecutive epochs. An appropriate threshold parameter can be set to implement the early-stopping strategy according to the accuracy required of the model results.
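The early-stopping strategy described above maps directly onto the standard Keras EarlyStopping callback, with min_delta playing the role of the loss-change threshold and patience the role of n consecutive epochs; the concrete values below are illustrative assumptions:

import tensorflow as tf

# stop when val_loss has not improved by more than min_delta for `patience` consecutive epochs
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss",
                                              min_delta=1e-3,
                                              patience=10,
                                              restore_best_weights=True)

# hypothetical usage with the previously sketched joint model and windowed data:
# model.fit(X_train, {"rul": y_train}, validation_data=(X_val, {"rul": y_val}),
#           epochs=200, batch_size=256, callbacks=[early_stop])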
Based on the constructed joint training neural network, the output of the multi-sensor data-fusion model, i.e., the time series of the health index HI, can be obtained at the intermediate network layer. The health-indicator decay curve of each turbine is shown in FIG. 4. The HI time series are further filtered with a Savitzky-Golay filter, a method that performs smoothing in the time domain by combining convolution with local polynomial regression; its characteristic is that it filters noise while keeping the shape and width of the signal unchanged. The filtered HI time-series curves are shown in FIG. 5.
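Filtering the HI series with a Savitzky-Golay filter is a one-line call in SciPy; the window length and polynomial order below are illustrative assumptions:

import numpy as np
from scipy.signal import savgol_filter

# toy HI series: exponential-style decay plus noise
t = np.arange(200)
hi_raw = np.exp(-t / 80.0) + 0.05 * np.random.randn(200)

# window_length must be odd and larger than polyorder
hi_smooth = savgol_filter(hi_raw, window_length=21, polyorder=3)
print(hi_raw[:3], hi_smooth[:3])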
Based on the FD002 sub-data set of the C-MAPSS data set, RUL prediction is performed on the test set with the above neural network model; the RUL prediction results of the deep-learning model on the multi-sensor time series of the 100 turbines in the test set are shown in FIG. 6. The MLP-LSTM joint model is found to predict the RUL of each turbine well from the current multi-dimensional sensor time series. Taking each time-series slice as the state of one turbine unit, i.e., taking the historical information of the turbine at each moment as input, online real-time RUL prediction is realized. The prediction results for 4 of these turbines are shown in FIG. 7. The prediction results compared with other methods are shown in Table 3:
table 3: comparison result of each RUL prediction algorithm model under RMSE and SCORE indexes
Methods      RMSE     Score
MLP          80.03    7.80×10^6
SVR          42.00    1.74×10^4
RVR          31.30    5.90×10^5
CNN          30.29    1.36×10^4
Deep LSTM    14.93    465
MLP-LSTM     12.74    335
As can be seen from Table 3, the LSTM neural network shows a better prediction effect in this RUL prediction application than traditional machine learning algorithms and other deep-learning algorithms. The MLP-LSTM joint neural-network training model proposed here can additionally provide the data-fusion health index HI. Owing to the more complex neural-network structure, deep feature extraction is realized, the prediction error is smaller and the predicted values are more accurate, while the error on the Score evaluation metric is an order of magnitude lower.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention and is not intended to limit the invention. Although the invention has been described in detail with reference to the foregoing examples, those skilled in the art may still modify the described embodiments or substitute equivalents for some of their features. All modifications, equivalents and the like that come within the spirit and principle of the invention are intended to be included within its scope.

Claims (5)

1. A residual service life prediction method based on an MLP-LSTM supervised joint model, characterized in that the MLP-LSTM supervised joint model is formed by adding an MLP neural network between an input layer and a deep LSTM neural network; the MLP neural network is used for data fusion, and the deep LSTM neural network is used for predicting the residual service life, namely the RUL;
the method comprises the following steps:
step one: collecting equipment data to form a data set, dividing the data set into a training set and a verification set, and preprocessing the data according to different working conditions;
step two: inputting the training set into an MLP neural network, compressing the multi-dimensional sensor characteristics into HI health characteristic indexes by the MLP neural network, and obtaining a plurality of HI time sequences of health indexes;
step three: inputting the health index HI time sequence into a depth LSTM neural network, and calculating by the depth LSTM neural network to obtain an RUL predicted value;
step four: calculating a loss function based on an error between a predicted value and a true value of the RUL, and training an MLP-LSTM supervised joint model by adopting a training set through RMSprop gradient self-adaptation; when the error result obtained after the training set and the verification set are input into the current model is smaller than a certain value or the variation of the error result is smaller than a certain value, the loss function of the model training is converged, the model training is finished, and the MLP-LSTM supervised combined model is stored;
step five: and preprocessing the equipment data to be predicted, and inputting the preprocessed equipment data into a stored MLP-LSTM supervised joint model to obtain HI and RUL values output in real time.
2. The MLP-LSTM supervised joint model-based remaining service life prediction method as recited in claim 1, wherein the labeled data set in the first step is:
X_o = {(x_{it}, rul_{it}) | i ≤ n, t ≤ T_i}    (1)

where rul_{it} is the value of the remaining service life at time t,

rul_{it} = T_i - t    (2)

When the device is completely out of use, rul_{it} is 0, and all rul_{it} increase in reverse time order; x_{it} is the i-th sensor data sequence from the initial time to time t,

x_{it} = [x_i(1), x_i(2), ..., x_i(t)]    (3)

where x_i is the i-th sensor data sequence from the initial time to time T_i,

x_i = [x_i(1), x_i(2), ..., x_i(T_i)]    (4)

The preprocessing comprises normalization and sliding-time-window sampling; when the equipment data come from different working conditions, conditional normalization is performed, otherwise global normalization is performed.
3. The MLP-LSTM supervised joint model-based residual service life prediction method as recited in claim 1, wherein in the second step, a multi-sensor information multi-dimensional time series is input into the MLP, the MLP compresses the multi-dimensional data into one dimension, and finally, a set comprising a health index HI time series is output;
the MLP building and pre-training process is as follows:
inputting a multi-sensor information multi-dimensional time sequence into an MLP neural network, and compressing multi-dimensional data into one dimension by the MLP neural network; in the MLP neural network forward propagation process, each node is obtained by calculating all nodes of the previous layer, a weight W is given to each node of the previous layer, a bias b is added, and finally the value of a certain node of the next layer is obtained through an activation function:
wherein the value of the L +1 layer node j is
a_j^{l+1} = φ( Σ_i W_{ji}^{l} a_i^{l} + b_j^{l+1} )    (5)

where φ(·) is the activation function. The output of the last layer of the MLP neural network is the set H of HI time series:

H = {h_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}    (6)

where H is the set composed of the health indices h_i(t_j) at each time point, h_i(t_j) = f(x_i(t_j)), and f(·) is the mapping implemented by the MLP neural network; T_i is the length of the time series; h denotes the health index HI; x_i(t_j) denotes the set of sensor readings l at time t_j, x_i(t_j) = [l_{i,1}(t_j), l_{i,2}(t_j), ..., l_{i,p}(t_j)] ∈ R^{1×p}; x denotes a raw sample and p the number of sensors; the set of all x_i(t_j) is denoted X, X = {x_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}.
4. The MLP-LSTM supervised joint model-based remaining service life prediction method as recited in claim 1, wherein the step III is specifically divided into the following sub-steps:
the depth LSTM neural network is formed by stacking a plurality of layers of LSTMs, and the vector dimension of each layer of LSTM is variable; the HI health index is decoded into a multidimensional sensor time sequence through a first layer LSTM, the output of the upper layer of a depth LSTM network is used as the input of the next layer, and the updating formula of the l layer is as follows:
i_t^l = σ( W_i^l · [h_t^{l-1}, h_{t-1}^l] + b_i^l )    (7)

f_t^l = σ( W_f^l · [h_t^{l-1}, h_{t-1}^l] + b_f^l )    (8)

o_t^l = σ( W_o^l · [h_t^{l-1}, h_{t-1}^l] + b_o^l )    (9)

c_t^l = f_t^l ⊙ c_{t-1}^l + i_t^l ⊙ tanh( W_c^l · [h_t^{l-1}, h_{t-1}^l] + b_c^l )    (10)

h_t^l = o_t^l ⊙ tanh( c_t^l )    (11)

where l denotes the layer index of the deep LSTM neural network and t denotes the time step of the LSTM unit; i_t^l denotes the input unit of the l-th layer at time t, f_t^l the forget unit of the l-th layer at time t, o_t^l the output unit of the l-th layer at time t, c_t^l the state cell of the l-th layer at time t, and h_t^l the hidden unit of the l-th layer at time t; σ denotes the sigmoid activation function, ⊙ denotes element-wise multiplication, and tanh denotes the tanh activation function; h_t^{l-1} is the hidden unit of layer l-1 at time t and h_{t-1}^l is the hidden unit of the l-th layer at time t-1, to which the gate weights W^l are applied, and b^l denotes the bias.

The last unit of the last LSTM layer outputs a multi-dimensional feature vector, from which the RUL predicted value is obtained through a linear-layer calculation.
5. The MLP-LSTM supervised joint model-based remaining service life prediction method as recited in claim 1, wherein the step four is specifically divided into the following sub-steps:
(1) The input layer of the deep LSTM neural network is the l-th layer of the MLP-LSTM supervised joint model and contains n neurons, while the output layer of the MLP neural network is the (l-1)-th layer of the joint model and has only one neuron; the neuron errors δ^l and δ^{l-1} of the l-th and (l-1)-th layers of the MLP-LSTM supervised joint model are designed to realize synchronous training of the MLP neural network and the deep LSTM neural network:

δ^l = (w^{l+1})^T δ^{l+1}    (12)

and δ^{l-1} is obtained from δ^l and the layer weights through the corresponding back-propagation relation (13), where w and B are the weight parameters and the batch size of the neural network, respectively;

(2) In the supervised joint training, a squared-error loss function with an L2 regularization constraint is used for gradient-adaptive training of the parameters; a score function is adopted to evaluate the prediction accuracy of the MLP-LSTM supervised joint model and is added, with a certain weight, to the global loss function as a penalty, so that the squared-error loss function is optimized and an MLP-LSTM supervised joint model biased toward early prediction is obtained.

The squared-error loss function is calculated as:

Loss_MSE(Θ) = (1/B) Σ_{i=1}^{B} (ŷ_i - y_i)^2 + λ·Σ ||w||_2^2    (14)

where Θ, w, B, λ, ŷ_i and y_i denote, respectively, the set of parameters learned in the MLP-LSTM supervised joint model, the set of weight parameters in the MLP-LSTM supervised joint model, the batch size, the regularization parameter, the predicted RUL, and the true RUL of the i-th sample;

The scoring function Score is calculated as:

Score = Σ_{i=1}^{B} s_i,  with s_i = exp(-d_i/13) - 1 for d_i < 0 and s_i = exp(d_i/10) - 1 for d_i ≥ 0    (15)

d = RUL_pred - RUL_true    (16)

The global loss function Loss_total is calculated as:

Loss_total = α·Loss_score + (1 - α)·Loss_MSE    (17)

where α is the weight balancing the score penalty and the squared-error loss;

(3) The MLP-LSTM supervised joint model is trained with the labeled training set through RMSprop gradient adaptation, whose update is:

r ← ρ·r + (1 - ρ)·g ⊙ g,   θ ← θ - (η / (δ + √r)) ⊙ g    (18)

where r is the accumulated variable of the historical gradient, ρ is the contraction coefficient controlling how much historical information is retained, η is the learning rate, δ is a small constant, and g is the gradient of Loss_total.
CN202111623573.6A 2021-12-28 2021-12-28 Residual service life prediction method based on MLP-LSTM supervised joint model Active CN114282443B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111623573.6A CN114282443B (en) 2021-12-28 2021-12-28 Residual service life prediction method based on MLP-LSTM supervised joint model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111623573.6A CN114282443B (en) 2021-12-28 2021-12-28 Residual service life prediction method based on MLP-LSTM supervised joint model

Publications (2)

Publication Number Publication Date
CN114282443A true CN114282443A (en) 2022-04-05
CN114282443B CN114282443B (en) 2023-03-17

Family

ID=80876961

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111623573.6A Active CN114282443B (en) 2021-12-28 2021-12-28 Residual service life prediction method based on MLP-LSTM supervised joint model

Country Status (1)

Country Link
CN (1) CN114282443B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115987295A (en) * 2023-03-20 2023-04-18 河北省农林科学院 Crop monitoring data efficient processing method based on Internet of things
CN116697039A (en) * 2023-08-07 2023-09-05 德电北斗电动汽车有限公司 Self-adaptive control method and system for single-stage high-speed transmission
WO2024050782A1 (en) * 2022-09-08 2024-03-14 Siemens Aktiengesellschaft Method and apparatus for remaining useful life estimation and computer-readable storage medium
CN117874639A (en) * 2024-03-12 2024-04-12 山东能源数智云科技有限公司 Mechanical equipment service life prediction method and device based on artificial intelligence

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150262060A1 (en) * 2014-03-11 2015-09-17 SparkCognition, Inc. System and Method for Calculating Remaining Useful Time of Objects
CN109472110A (en) * 2018-11-29 2019-03-15 南京航空航天大学 A kind of aero-engine remaining life prediction technique based on LSTM network and ARIMA model
CN112580263A (en) * 2020-12-24 2021-03-30 湖南工业大学 Turbofan engine residual service life prediction method based on space-time feature fusion
CN113486578A (en) * 2021-06-28 2021-10-08 北京科技大学 Method for predicting residual life of equipment in industrial process
CN113743016A (en) * 2021-09-09 2021-12-03 湖南工业大学 Turbofan engine residual service life prediction method based on improved stacked sparse self-encoder and attention echo state network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150262060A1 (en) * 2014-03-11 2015-09-17 SparkCognition, Inc. System and Method for Calculating Remaining Useful Time of Objects
CN109472110A (en) * 2018-11-29 2019-03-15 南京航空航天大学 A kind of aero-engine remaining life prediction technique based on LSTM network and ARIMA model
CN112580263A (en) * 2020-12-24 2021-03-30 湖南工业大学 Turbofan engine residual service life prediction method based on space-time feature fusion
CN113486578A (en) * 2021-06-28 2021-10-08 北京科技大学 Method for predicting residual life of equipment in industrial process
CN113743016A (en) * 2021-09-09 2021-12-03 湖南工业大学 Turbofan engine residual service life prediction method based on improved stacked sparse self-encoder and attention echo state network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANDRÉ LISTOU ELLEFSEN et al.: "Remaining useful life predictions for turbofan engine degradation using semi-supervised deep architecture", Reliability Engineering and System Safety 183 *
ZHANG XINMIN et al.: "Dynamic Variational Bayesian Student's T Mixture Regression With Hidden Variables Propagation for Industrial Inferential Sensor Development", IEEE Transactions on Industrial Informatics *
LI JINGFENG et al.: "Remaining life prediction of aero-engine based on LSTM-DBN", Systems Engineering and Electronics *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024050782A1 (en) * 2022-09-08 2024-03-14 Siemens Aktiengesellschaft Method and apparatus for remaining useful life estimation and computer-readable storage medium
CN115987295A (en) * 2023-03-20 2023-04-18 河北省农林科学院 Crop monitoring data efficient processing method based on Internet of things
CN115987295B (en) * 2023-03-20 2023-05-12 河北省农林科学院 Crop monitoring data efficient processing method based on Internet of things
CN116697039A (en) * 2023-08-07 2023-09-05 德电北斗电动汽车有限公司 Self-adaptive control method and system for single-stage high-speed transmission
CN116697039B (en) * 2023-08-07 2023-09-29 德电北斗电动汽车有限公司 Self-adaptive control method and system for single-stage high-speed transmission
CN117874639A (en) * 2024-03-12 2024-04-12 山东能源数智云科技有限公司 Mechanical equipment service life prediction method and device based on artificial intelligence

Also Published As

Publication number Publication date
CN114282443B (en) 2023-03-17

Similar Documents

Publication Publication Date Title
CN114282443B (en) Residual service life prediction method based on MLP-LSTM supervised joint model
CN116757534B (en) Intelligent refrigerator reliability analysis method based on neural training network
CN108445752B (en) Random weight neural network integrated modeling method for self-adaptively selecting depth features
CN114218872B (en) DBN-LSTM semi-supervised joint model-based residual service life prediction method
CN112990556A (en) User power consumption prediction method based on Prophet-LSTM model
CN110689171A (en) Turbine health state prediction method based on E-LSTM
CN114048600A (en) Digital twin-driven multi-model fusion industrial system anomaly detection method
CN114015825B (en) Method for monitoring abnormal state of blast furnace heat load based on attention mechanism
CN113325721B (en) Model-free adaptive control method and system for industrial system
CN112668775A (en) Air quality prediction method based on time sequence convolution network algorithm
Liu et al. Complex engineered system health indexes extraction using low frequency raw time-series data based on deep learning methods
CN110757510A (en) Method and system for predicting remaining life of robot
Liu et al. Model fusion and multiscale feature learning for fault diagnosis of industrial processes
CN114118225A (en) Method, system, electronic device and storage medium for predicting remaining life of generator
CN114500004A (en) Anomaly detection method based on conditional diffusion probability generation model
CN117273440A (en) Engineering construction Internet of things monitoring and managing system and method based on deep learning
CN113984389A (en) Rolling bearing fault diagnosis method based on multi-receptive-field and improved capsule map neural network
CN113988210A (en) Method and device for restoring distorted data of structure monitoring sensor network and storage medium
CN116842323A (en) Abnormal detection method for operation data of water supply pipeline
CN116244596A (en) Industrial time sequence data anomaly detection method based on TCN and attention mechanism
CN113821974B (en) Engine residual life prediction method based on multiple fault modes
CN114648095A (en) Air quality concentration inversion method based on deep learning
Zhang et al. Remaining useful life predictions for turbofan engine using semi-supervised DBN-LSTM joint training model
Wang A new variable selection method for soft sensor based on deep learning
CN115309736B (en) Time sequence data anomaly detection method based on self-supervision learning multi-head attention network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant