CN114282443B - Residual service life prediction method based on MLP-LSTM supervised joint model - Google Patents


Publication number
CN114282443B
Authority
CN
China
Legal status
Active
Application number
CN202111623573.6A
Other languages
Chinese (zh)
Other versions
CN114282443A (en)
Inventor
张新民
张雨桐
李乐清
朱哲人
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Application filed by Zhejiang University (ZJU)
Priority to CN202111623573.6A
Publication of CN114282443A
Application granted
Publication of CN114282443B
Legal status: Active


Abstract

The invention discloses a remaining-useful-life prediction method based on an MLP-LSTM supervised joint model. First, a multi-layer perceptron (MLP) fuses multi-dimensional time-series historical information to extract a health index characterizing the machine; the extracted health-index time series is then fed into an LSTM, which computes the machine's current remaining useful life (RUL). The two serially connected neural networks are trained in a supervised manner on a labeled sample data set to update the weights; the prediction result is evaluated on a validation set and the parameters are adaptively adjusted to obtain an optimized model. The trained MLP-LSTM supervised joint model not only improves the LSTM's ability to predict the remaining useful life, but also provides a feature-fusion result of the multi-dimensional sensor data that effectively expresses the machine's current health condition, giving an effective reference index for equipment maintenance and repair.

Description

Residual service life prediction method based on MLP-LSTM supervised joint model
Technical Field
The invention belongs to the field of industrial process control, and particularly relates to a residual service life prediction method based on an MLP-LSTM supervised joint model.
Background
In the industrial field, the working performance and health of important machines and industrial components tend to decline during continuous operation, owing to internal wear or external environmental factors. As health deteriorates, the equipment will at some future time no longer work normally: its efficiency drops rapidly or it stops operating entirely, reaching the end of its service life, which can disturb or even interrupt the industrial process. It is therefore necessary to predict the Remaining Useful Life (RUL) of the system over its whole service life, i.e. the length of time from the current moment until the machine's useful life ends.
In recent years, with the collection and accumulation of large amounts of industrial data, data-driven solutions have received much attention in RUL prediction. A data-driven solution needs no detailed knowledge of the mechanical system's operating mechanism: it identifies the system's condition from sensor data with a data-driven algorithm, and can thus accurately predict the remaining useful life of modern plant equipment whose mechanistic models are complex. Traditional predictions are based mainly on physical degradation models, and building a correct degradation model relies heavily on expert knowledge; such assumptions and requirements are very restrictive in practical industrial applications. Traditional machine learning needs hand-crafted features, demanding substantial domain expertise and a separate feature-extraction step, which hinders wide application. Because deep learning extracts features from data automatically and depends less on prior knowledge of the system, recent research shows that, compared with physical degradation models and traditional machine learning algorithms, deep learning handles industrial big data better and predicts the remaining useful life of mechanical equipment more accurately.
However, current research on predicting RUL with deep learning still has problems. First, data fusion and prediction are usually two separate steps: data fusion is performed first to obtain a health index, and the fused signal is then used for RUL prediction. This conventional pipeline severs the internal relation between the two tasks, so the link between the fused signal and the prediction result cannot be explained. A common alternative for multi-sensor RUL prediction is therefore an end-to-end deep model that outputs the prediction directly. Its advantage is being fully data-driven, with no assumed degradation model, no parametric distribution, and no hand-crafted features; its drawback is that, as a black-box model, it provides no information about the performance-degradation process. The remaining useful life is a linearly decaying time value, whereas the physical condition of a machine does not change linearly but decays exponentially, and maintenance scheduling depends heavily on that exponential behavior so that repairs are completed before the machine enters its rapid-decay period. A method is therefore needed that predicts both the remaining usage time and a health index from the sensor information.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a residual service life prediction method based on an MLP-LSTM supervised joint model, which can obtain the information of a performance degradation process while judging the current residual service life of a machine and simultaneously improve the prediction effect of a single LSTM neural network.
A remaining-useful-life prediction method based on an MLP-LSTM supervised joint model, in which an MLP neural network is added between the input layer and a deep LSTM neural network; the MLP neural network performs data fusion, and the deep LSTM neural network performs remaining-useful-life (RUL) prediction;
the method comprises the following steps:
the method comprises the following steps: collecting equipment data to form a data set, dividing the data set into a training set and a verification set, and preprocessing the data according to different working conditions;
step two: inputting the training set into an MLP neural network, compressing the multi-dimensional sensor characteristics into HI health characteristic indexes by the MLP neural network, and obtaining a plurality of HI time sequences of health indexes;
step three: inputting the health index HI time sequence into a depth LSTM neural network, and calculating by the depth LSTM neural network to obtain an RUL predicted value;
step four: calculating a loss function based on an error between a predicted value and a true value of the RUL, and training an MLP-LSTM supervised joint model by adopting a training set through RMSprop gradient self-adaptation; when the error result obtained after the training set and the verification set are input into the current model is smaller than a certain value or the variation of the error result is smaller than a certain value, the loss function of the model training is converged, the model training is finished, and the MLP-LSTM supervised combined model is stored;
step five: and preprocessing the equipment data to be predicted, and inputting the preprocessed equipment data into a stored MLP-LSTM supervised joint model to obtain HI and RUL values output in real time.
Further, the labeled data set in step one is:

X_o = {(x_{it}, rul_{it}) | i ≤ n, t ≤ T_i}   (1)

where rul_{it} is the remaining-useful-life value at time t:

rul_{it} = T_i − t   (2)

When the device is completely unusable, rul_{it} = 0, and every rul_{it} decreases monotonically along the time sequence;

x_{it} is the sequence of the i-th sensor data from the initial time to time t:

x_{it} = [x_i(1), x_i(2), ..., x_i(t)]   (3)

where x_i is the sequence of the i-th sensor data from the initial time to time T_i:

x_i = [x_i(1), x_i(2), ..., x_i(T_i)]   (4)
the preprocessing comprises normalization processing and sliding time window sampling processing; and when the equipment data are data under different working conditions, carrying out condition normalization, otherwise, carrying out global normalization.
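The condition-normalization step above can be sketched as follows; the z-score scaler, function name, and toy data are illustrative assumptions (the patent does not specify the exact normalizer):

```python
import numpy as np

def normalize_per_condition(data, conditions):
    """Conditional normalization: each sensor channel is scaled using
    only the samples that share the same operating condition.
    A hypothetical minimal sketch of the patent's preprocessing step."""
    out = np.empty_like(data, dtype=float)
    for c in np.unique(conditions):
        mask = conditions == c
        mu = data[mask].mean(axis=0)
        sigma = data[mask].std(axis=0) + 1e-8  # avoid division by zero
        out[mask] = (data[mask] - mu) / sigma
    return out

# Toy multi-sensor data: 6 samples, 2 sensors, 2 operating conditions.
data = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 14.0],
                 [100.0, 5.0], [110.0, 6.0], [120.0, 7.0]])
conditions = np.array([0, 0, 0, 1, 1, 1])
normed = normalize_per_condition(data, conditions)
```

With global normalization the two condition groups would keep very different value ranges; per-condition scaling centers each group on its own statistics, which is what lets the HI show a degradation trend later.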
Further, in the second step, a multi-sensor information multi-dimensional time sequence is input into the MLP, the MLP compresses the multi-dimensional data into one dimension, and finally a set including a health index HI time sequence is output;
the MLP building and pre-training process comprises the following steps:
The multi-sensor multi-dimensional time series is input into the MLP neural network, which compresses the multi-dimensional data into one dimension. During MLP forward propagation, each node is computed from all nodes of the previous layer: each previous-layer node is given a weight W, a bias b is added, and the value of the next-layer node is obtained through an activation function.

The value of node j in layer l+1 is:

a_j^{l+1} = f(Σ_i w_{ji}^l a_i^l + b_j^{l+1})   (5)

The output of the last MLP layer is the set H of HI time series:

H = {h_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}   (6)

where H is the set of the health indices h_i(t_j) at each time point, h_i(t_j) = f(x_i(t_j)), and f(x_i) is the function realized by the MLP neural network; T_i is the length of the time series; h denotes the health index HI; x_i(t_j) denotes the vector of sensor readings l at time t_j, x_i(t_j) = [l_{i,1}(t_j), l_{i,2}(t_j), ..., l_{i,p}(t_j)] ∈ R^{1×p}; x denotes a raw sample and p the number of sensors; x_i(t_j) belongs to X, X = {x_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}.
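Eqs. (5)-(6) can be sketched numerically; the layer sizes, tanh/sigmoid activations, and random weights below are illustrative assumptions, not values from the patent:

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass of a small MLP fusing a p-dimensional sensor
    reading into a scalar health index HI, per Eq. (5):
    a^{l+1} = f(W a^l + b)."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(W @ a + b)                     # hidden layers
    W, b = weights[-1], biases[-1]
    return 1.0 / (1.0 + np.exp(-(W @ a + b)))      # scalar HI in (0, 1)

rng = np.random.default_rng(0)
p = 14                                  # assumed number of sensors
sizes = [p, 8, 4, 1]                    # compress p dims down to 1-D HI
weights = [rng.normal(scale=0.5, size=(o, i)) for i, o in zip(sizes, sizes[1:])]
biases = [np.zeros(o) for o in sizes[1:]]

# One HI value per time step gives the HI time series h_i(t_j) of Eq. (6).
series = rng.normal(size=(30, p))       # 30 time steps of sensor data
hi = np.array([mlp_forward(x, weights, biases)[0] for x in series])
```

Applying the same forward pass at every time step turns the (T, p) sensor matrix into the one-dimensional HI series that the LSTM consumes next.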
Further, the third step is specifically divided into the following sub-steps:
the depth LSTM neural network is formed by stacking a plurality of layers of LSTMs, and the vector dimension of each layer of LSTM is variable; the HI health index is decoded into a multidimensional sensor time sequence through a first layer LSTM, the output of the upper layer of a depth LSTM network is used as the input of the next layer, and the updating formula of the l layer is as follows:
Figure BDA0003439015030000032
Figure BDA0003439015030000033
Figure BDA0003439015030000034
Figure BDA0003439015030000035
Figure BDA0003439015030000036
wherein l represents the layer number of the deep LSTM neural network, t represents the unit number of the LSTM at a certain moment,
Figure BDA0003439015030000041
an input unit indicating the time of the l-th layer t,
Figure BDA0003439015030000042
a forgetting unit indicating the time t of the l-th layer,
Figure BDA0003439015030000043
an output unit indicating the time t of the l-th layer,
Figure BDA0003439015030000044
a status cell indicating the time t of the l-th layer,
Figure BDA0003439015030000045
indicating a hidden unit at the moment of the ith layer t,. Sigma.indicating a sigmoid activation function,. Alpha.indicating an element multiplication calculation,. Tanh indicating a tanh activation function,
Figure BDA0003439015030000046
representing the hidden unit weight at layer l-1 time instant t,
Figure BDA0003439015030000047
representing the hidden unit weight at time t-1 of the l-th layer,
Figure BDA0003439015030000048
indicating a deviation;
and outputting the multidimensional characteristic vector by the last unit of the LSTM neural network of the last layer, and calculating by a linear layer to obtain the RUL predicted value.
Further, the fourth step is specifically divided into the following sub-steps:
(1) The input layer of the deep LSTM neural network is the l-th layer of the MLP-LSTM supervised joint model and contains n neurons; the output layer of the MLP neural network is the (l−1)-th layer of the joint model and has only one neuron. Neuron errors δ^l and δ^{l−1} of layers l and l−1 of the joint model are designed to realize synchronous training of the MLP and deep LSTM neural networks;
δ^l = (w^{l+1})^T δ^{l+1}   (12)

δ^{l−1} = (1/B) Σ_{batch} (w^l)^T δ^l ⊙ f′(z^{l−1})   (13)

where w and B are respectively the weight parameters of the neural network and the batch size;
(2) In the supervised joint training, a squared-error loss function with L2 regularization is used for gradient-adaptive parameter training; a score function evaluates the prediction accuracy of the MLP-LSTM supervised joint model and is added to the global loss function with a certain weight as a penalty, optimizing the squared-error loss to obtain an MLP-LSTM supervised joint model biased toward early prediction:
The squared-error loss function is computed as:

Loss_MSE(Θ) = (1/B) Σ_{i=1}^{B} (ŷ_i − y_i)² + λ Σ_w ‖w‖²   (14)

where Θ, w, B, λ, ŷ_i and y_i denote respectively the parameter set learned by the MLP-LSTM supervised joint model, the set of weight parameters in the model, the batch size, the regularization parameter, and the predicted and true RUL of the i-th sample;
The scoring function Score is computed as:

Loss_score = Σ_{i=1}^{B} s_i,   s_i = e^{−d_i/13} − 1 if d_i < 0,   s_i = e^{d_i/10} − 1 if d_i ≥ 0   (15)

d = RUL_pred − RUL_true   (16)
The global loss function Loss_total is computed as:

Loss_total = α·Loss_score + (1 − α)·Loss_MSE   (17)

where α is the relative weight of the two loss terms;
(3) The MLP-LSTM supervised joint model is trained on the labeled training set with RMSprop adaptive gradients, computed as:

r ← ρr + (1 − ρ) g ⊙ g,   θ ← θ − η g / (δ + √r)   (18)

where r is the accumulation variable of the historical gradient, ρ the shrinkage coefficient controlling how much historical information is kept, η the learning rate, δ a small constant, and g the gradient of Loss_total.
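Under the assumption that Eq. (15) uses the usual C-MAPSS time constants 13 and 10, the loss terms of Eqs. (14)-(18) can be sketched as follows; the function names and the value of α are illustrative:

```python
import numpy as np

def score_loss(rul_pred, rul_true):
    """Asymmetric score of Eqs. (15)-(16): late predictions (d > 0)
    are penalized more heavily than early ones (d < 0)."""
    d = rul_pred - rul_true
    return float(np.where(d < 0, np.exp(-d / 13.0) - 1.0,
                                  np.exp(d / 10.0) - 1.0).sum())

def mse_loss(rul_pred, rul_true, weights=(), lam=1e-4):
    """Squared-error loss with L2 regularization, Eq. (14)."""
    mse = float(np.mean((rul_pred - rul_true) ** 2))
    return mse + lam * sum(float(np.sum(w ** 2)) for w in weights)

def total_loss(rul_pred, rul_true, alpha=0.1):
    """Global loss of Eq. (17); alpha is a tuning weight
    (the patent does not fix its value)."""
    return alpha * score_loss(rul_pred, rul_true) + \
           (1 - alpha) * mse_loss(rul_pred, rul_true)

def rmsprop_step(theta, g, r, eta=1e-3, rho=0.9, delta=1e-8):
    """One RMSprop update, Eq. (18)."""
    r = rho * r + (1 - rho) * g * g      # accumulate squared gradient
    theta = theta - eta * g / (delta + np.sqrt(r))
    return theta, r

# Being 2 cycles late costs more than being 2 cycles early:
early = score_loss(np.array([48.0]), np.array([50.0]))
late = score_loss(np.array([52.0]), np.array([50.0]))
pred, true = np.array([48.0, 52.0]), np.array([50.0, 50.0])
```

The asymmetry of the score term is what biases the trained model toward early prediction: underestimating the RUL is cheaper than overestimating it.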
The invention has the following beneficial effects:
the invention provides a general RUL semi-supervised joint prediction framework method aiming at analyzing the conventional RUL prediction method in the running degradation process of machine equipment and combining a deep learning theory and the problems existing in the previous research, thereby realizing the synchronous RUL prediction of health index data fusion and multi-sensor data. The training model provides a continuous visualization process of system degradation, but also ensures efficient prediction of the generated fusion signal for RUL and rapid convergence of the predictive model training process. Furthermore, a loss function of the RUL life prediction model is modified, so that the trained model is more biased to early prediction, the prediction result can ensure the maintenance to be advanced, and the prediction model is safer.
Drawings
FIG. 1 is a schematic diagram of an MLP-LSTM neural network model;
FIG. 2 is a flow diagram of a joint model framework training implementation;
FIG. 3 is a diagram illustrating results generated after different normalization strategies;
FIG. 4 is a graphical (raw) plot of the data fusion output health indicator HI over time, wherein the upper graph in FIG. 4 represents the HI output for all turbine data in the test set, and the lower graph represents the HI output for selected test set partial sample data.
FIG. 5 is a graphical representation of a time-varying (filtered) plot of data fusion output health indicator HI, where the upper graph in FIG. 5 represents the HI output for all turbine data in the test set and the lower graph represents the HI output for selected test set partial sample data.
FIG. 6 is a schematic diagram of a fitting curve of the MLP-LSTM neural network model.
FIG. 7 is a schematic diagram of an on-line prediction RUL fitting curve.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and preferred embodiments, and the objects and effects of the present invention will become more apparent, it being understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit the present invention.
In the remaining-useful-life prediction method based on the MLP-LSTM supervised joint model, the joint model is realized by adding an MLP neural network between the input layer and a deep LSTM neural network; the MLP neural network performs data fusion, and the deep LSTM neural network performs remaining-life prediction, i.e. RUL prediction. The data slice sequence is input into the MLP neural network of the joint model to compute a one-dimensional HI time series. The last layer of the MLP neural network is connected to the deep LSTM network model: the HI sequence fused by the MLP is fed into the deep LSTM network sequentially, in segments of length time_step. Once the complete MLP-LSTM joint prediction model is assembled, gradient-descent iteration proceeds with an RMSprop optimizer. The specific structure is shown in Fig. 1. In addition, because the MLP neural network extracts deep data features, overfitting easily occurs during training; the network is therefore optimized with a batch-normalization layer, and a regularization term is added to the MLP neural network. Feeding in the test-set data, the MLP network outputs the evolution of the health index HI, and the final network layer of the whole model outputs the RUL prediction. In the concrete engineering implementation, the network layers of the two models are connected within the same model and jointly trained iteratively with an RMSprop optimizer. The whole training process proceeds as shown in Fig. 2.
The method comprises the following steps:
the method comprises the following steps: collecting equipment data to form a data set, dividing the data set into a training set and a verification set, and preprocessing the data according to different working conditions; the data of the labeled training set comprises a time stamp of time sequence data, a numerical value of each characteristic variable at each moment, and an RUL data label or equipment life end time for calculating the RUL label; the content of the labeled verification set is the same as that of the labeled training set, and the number of the labeled verification sets is 10-30% of that of the labeled training set.
The tagged data set in the step one is as follows:
X_o = {(x_{it}, rul_{it}) | i ≤ n, t ≤ T_i}   (1)

where rul_{it} is the remaining-useful-life value at time t:

rul_{it} = T_i − t   (2)

When the device is completely out of service, rul_{it} = 0, and every rul_{it} decreases monotonically along the time sequence;

x_{it} is the sequence of the i-th sensor data from the initial time to time t:

x_{it} = [x_i(1), x_i(2), ..., x_i(t)]   (3)

where x_i is the sequence of the i-th sensor data from the initial time to time T_i:

x_i = [x_i(1), x_i(2), ..., x_i(T_i)]   (4)
the preprocessing comprises normalization processing and sliding time window sampling processing; and when the equipment data are data under different working conditions, carrying out condition normalization, otherwise, carrying out global normalization.
The LSTM recurrent neural network has the standard input shape (batch_size, time_steps, feature_nums), where batch_size is the number of samples processed in one batch during training, time_steps is the number of time steps of the time-series data in each sample, and feature_nums is the feature dimension of the multi-sensor data. To shape the data set into this standard pattern, samples are drawn with a sliding time window.
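The sliding-time-window sampling that produces the (batch_size, time_steps, feature_nums) shape can be sketched as follows; the toy series and window length are illustrative:

```python
import numpy as np

def sliding_windows(series, time_steps):
    """Cut one unit's multi-sensor series of shape (T, feature_nums)
    into overlapping windows of length time_steps, giving the standard
    LSTM input shape (num_windows, time_steps, feature_nums)."""
    T = series.shape[0]
    return np.stack([series[s:s + time_steps]
                     for s in range(T - time_steps + 1)])

# Toy unit with T = 10 time steps and 3 sensor features.
series = np.arange(10 * 3, dtype=float).reshape(10, 3)
windows = sliding_windows(series, time_steps=4)
# windows.shape == (7, 4, 3): 7 samples, 4 time steps, 3 features
```

Stacking the windows of all units (and batching them) yields the batch_size axis of the standard input shape.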
Step two: input the training set into the MLP neural network, which compresses the multi-dimensional sensor features into a health index (HI), yielding a set of HI time series;
in the second step, a multi-sensor information multi-dimensional time sequence is input into the MLP, the MLP compresses the multi-dimensional data into one dimension, and finally a set comprising a health index HI time sequence is output;
the MLP building and pre-training process comprises the following steps:
The multi-sensor multi-dimensional time series is input into the MLP neural network, which compresses the multi-dimensional data into one dimension. During MLP forward propagation, each node is computed from all nodes of the previous layer: each previous-layer node is given a weight W, a bias b is added, and the value of the next-layer node is obtained through an activation function.

The value of node j in layer l+1 is:

a_j^{l+1} = f(Σ_i w_{ji}^l a_i^l + b_j^{l+1})   (5)

The output of the last MLP layer is the set H of HI time series:

H = {h_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}   (6)

where H is the set of the health indices h_i(t_j) at each time point, h_i(t_j) = f(x_i(t_j)), and f(x_i) is the function realized by the MLP neural network; T_i is the length of the time series; h denotes the health index HI; x_i(t_j) denotes the vector of sensor readings l at time t_j, x_i(t_j) = [l_{i,1}(t_j), l_{i,2}(t_j), ..., l_{i,p}(t_j)] ∈ R^{1×p}; x denotes a raw sample and p the number of sensors; x_i(t_j) belongs to X, X = {x_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T_i}.
Step three: input the health index HI time series into the deep LSTM neural network, which computes the RUL prediction;
the third step is specifically divided into the following substeps:
the depth LSTM network is formed by stacking a plurality of layers of LSTMs, and the vector dimension of each layer of LSTM is variable; the HI health index is decoded into a multidimensional sensor time sequence through a first layer LSTM, the output of the upper layer of a depth LSTM network is used as the input of the next layer, and the updating formula of the l layer is as follows:
Figure BDA0003439015030000081
Figure BDA0003439015030000082
Figure BDA0003439015030000083
Figure BDA0003439015030000084
Figure BDA0003439015030000085
where l represents the number of layers of the deep LSTM neural network, t is the tableThe number of cells at a time of the LSTM is shown,
Figure BDA0003439015030000086
an input unit indicating the time t of the l-th layer,
Figure BDA0003439015030000087
a forgetting unit indicating the time t of the l-th layer,
Figure BDA0003439015030000088
an output unit representing the time t of the l-th layer,
Figure BDA0003439015030000089
a status cell indicating the time t of the l-th layer,
Figure BDA00034390150300000810
indicating a hidden unit at the moment of the ith layer t,. Sigma.indicating a sigmoid activation function,. Alpha.indicating an element multiplication calculation,. Tanh indicating a tanh activation function,
Figure BDA00034390150300000811
representing the hidden unit weight at the time t of layer l-1,
Figure BDA00034390150300000812
representing the hidden unit weight at time t-1 of the l-th layer,
Figure BDA00034390150300000813
indicating a deviation;
and outputting the multidimensional characteristic vector by the last unit of the LSTM neural network of the last layer, and calculating by a linear layer to obtain the RUL predicted value.
Step four: compute the loss function from the error between the predicted and true RUL, and train the MLP-LSTM supervised joint model on the training set with RMSprop adaptive gradients; when the error obtained by feeding the training and validation sets into the current model, or its variation, falls below a set threshold, the training loss has converged, training ends, and the MLP-LSTM supervised joint model is saved;
the fourth step is specifically divided into the following substeps:
(1) The input layer of the deep LSTM neural network is the l-th layer of the MLP-LSTM supervised joint model and contains n neurons; the output layer of the MLP neural network is the (l−1)-th layer of the joint model and has only one neuron. Neuron errors δ^l and δ^{l−1} of layers l and l−1 of the joint model are designed to realize synchronous training of the MLP and deep LSTM neural networks:

δ^l = (w^{l+1})^T δ^{l+1}   (12)

δ^{l−1} = (1/B) Σ_{batch} (w^l)^T δ^l ⊙ f′(z^{l−1})   (13)
wherein w and B are weight parameters and batch sizes of the neural network, respectively;
(2) In the supervised joint training, a squared-error loss function with L2 regularization is used for gradient-adaptive parameter training; a score function evaluates the prediction accuracy of the MLP-LSTM supervised joint model and is added to the global loss function with a certain weight as a penalty, optimizing the squared-error loss to obtain an MLP-LSTM supervised joint model biased toward early prediction:
The squared-error loss function is computed as:

Loss_MSE(Θ) = (1/B) Σ_{i=1}^{B} (ŷ_i − y_i)² + λ Σ_w ‖w‖²   (14)

where Θ, w, B, λ, ŷ_i and y_i denote respectively the parameter set learned by the MLP-LSTM supervised joint model, the set of weight parameters in the model, the batch size, the regularization parameter, and the predicted and true RUL of the i-th sample;
The scoring function Score is computed as:

Loss_score = Σ_{i=1}^{B} s_i,   s_i = e^{−d_i/13} − 1 if d_i < 0,   s_i = e^{d_i/10} − 1 if d_i ≥ 0   (15)

d = RUL_pred − RUL_true   (16)
The global loss function Loss_total is computed as:

Loss_total = α·Loss_score + (1 − α)·Loss_MSE   (17)

where α is the relative weight of the two loss terms;
(3) The MLP-LSTM supervised joint model is trained on the labeled training set with RMSprop adaptive gradients, computed as:

r ← ρr + (1 − ρ) g ⊙ g,   θ ← θ − η g / (δ + √r)   (18)

where r is the accumulation variable of the historical gradient, ρ the shrinkage coefficient controlling how much historical information is kept, η the learning rate, δ a small constant, and g the gradient of Loss_total.
Step five: preprocess the equipment data to be predicted and feed it into the saved MLP-LSTM supervised joint model to obtain HI and RUL values output in real time.
The usefulness of the present invention is illustrated below with a specific industrial example. The example uses the open-source turbofan-engine degradation simulation data set C-MAPSS provided by NASA (National Aeronautics and Space Administration). The data comprise four sub-data-sets, FD001-FD004, with different operating conditions and failure modes; each contains three files, train_FD00X, test_FD00X and RUL_FD00X, which are respectively the training set, the test set, and the ground-truth RUL labels of the test set. Details are shown in the following table:
table 1: details of C-MAPSS dataset
(Table 1 appears only as an image in the source document.)
The data set FD002 is the main research object: compared with FD001 or FD003, its multi-sensor data cover 6 operating conditions, the external environment is more complex, there are more data, and RUL prediction is theoretically harder. The specific meaning of each sensor dimension is shown in the following table:
table 2: multi-sensor data specific representation of a turbomachine
(Table 2 appears only as an image in the source document.)
After the data are obtained, the data set is split to obtain an unlabeled training set, a labeled training set and a labeled validation set; the raw data are condition-normalized according to the 6 operating conditions of the data set, and then sliding-window processing yields the data slice sequences. The turbine's operating conditions strongly influence the sensor values: readings of the same sensor in different states lie in completely different value ranges. Global normalization ignores the operating conditions and normalizes all values of each sensor together, whereas condition normalization normalizes each sensor's data within the same operating condition. Fig. 3 shows the effect of processing sensors 4 and 7 of one turbine unit under the different normalization strategies. With global normalization, although RUL prediction accuracy is unaffected, the HI output of the data-fusion model becomes a globally normalized variable and hardly shows a degradation trend. A condition-normalization strategy is therefore used in preprocessing so that the health index HI exhibits the degradation trend.
The training and test data sets each contain 100 turbine units. A sliding window of size num_steps generates the input sequences within each unit's subset. The model structure itself is mainly influenced by two hyper-parameters, batch_size and num_steps; these are set to different values and different LSTM models are trained in order to compare the effects of the hyper-parameters. Appropriate parameters are then selected according to the Score each model obtains on the test set.
The Dropout and batch-normalization regularization parameters are adjusted according to the evolution and stabilization of train_loss and val_loss: too little regularization causes overfitting, too much degrades model accuracy. To prevent overfitting and reduce training time, an early-stopping strategy is used: a threshold on the change of the loss is set, and training stops when the change does not exceed the threshold for n consecutive epochs. The threshold parameter can be chosen according to the accuracy required of the model.
With the joint training neural network built, the output of the multi-sensor data fusion model, i.e. the time series of the health index HI, can be read from the intermediate network layer. The health-index degradation curve of each turbine is shown in Fig. 4. The HI time series are further smoothed with a Savitzky-Golay filter, a method that performs smoothing in the time domain by combining convolution with local polynomial regression; its characteristic is that it removes noise while keeping the shape and width of the signal unchanged. The filtered HI time series curves are shown in Fig. 5.
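In practice `scipy.signal.savgol_filter(x, window_length, polyorder)` is the usual library route; the minimal NumPy version below (window size and edge handling are illustrative assumptions) shows the local-polynomial idea behind the filter:

```python
import numpy as np

def savgol_smooth(y, window=11, order=2):
    """Minimal Savitzky-Golay smoothing: fit an `order`-degree polynomial by
    least squares inside each sliding window and take its centre value.
    Edges are handled by reflect-padding the series."""
    half = window // 2
    ypad = np.concatenate([y[half:0:-1], y, y[-2:-half - 2:-1]])
    x = np.arange(-half, half + 1)
    out = np.empty_like(y, dtype=float)
    for i in range(len(y)):
        coeffs = np.polyfit(x, ypad[i:i + window], order)
        out[i] = np.polyval(coeffs, 0)   # polynomial value at the window centre
    return out
```

Because each output is the centre of a locally fitted polynomial, signals that are locally polynomial of degree ≤ order pass through unchanged, which is why peak shape and width are preserved.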
RUL prediction on the test set is carried out with the neural network model on the FD002 sub-data set of the C-MAPSS data set; Fig. 6 shows the RUL prediction results of the deep learning model on the multi-sensor time series of the 100 test-set turbines. The MLP-LSTM joint model is found to predict the RUL of each turbine well from its current multi-dimensional sensor time series. Further, by taking each time-series slice as the state of one turbine unit, i.e. feeding the turbine's historical information at each moment as input, online real-time RUL prediction is realized. The prediction results for 4 of these turbines are shown in Fig. 7. Table 3 compares the prediction results with other methods:
table 3: comparison result of each RUL prediction algorithm model under RMSE and SCORE indexes
Methods RMSE Score
MLP 80.03 7.80×10 6
SVR 42.00 1.74×10 4
RVR 31.30 5.90×10 5
CNN 30.29 1.36×10 4
Depth LSTM 14.93 465
MLP-LSTM 12.74 335
As can be seen from Table 3, the LSTM neural network achieves better RUL prediction than conventional machine learning algorithms and other deep learning algorithms. The MLP-LSTM joint training model proposed in this work additionally provides a data-fusion health index HI. Owing to its more complex network structure, deeper model features can be extracted, so the prediction error is smaller and the predicted values are more accurate, while its error on the Score evaluation is of a lower order of magnitude.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and although the invention has been described in detail with reference to the foregoing examples, it will be apparent to those skilled in the art that various changes in the form and details of the embodiments may be made and equivalents may be substituted for elements thereof. All modifications, equivalents and the like which come within the spirit and principle of the invention are intended to be included within the scope of the invention.

Claims (4)

1. A remaining useful life prediction method based on an MLP-LSTM supervised joint model, characterized in that, in the MLP-LSTM supervised joint model, an MLP neural network is inserted between the input layer and a deep LSTM neural network; the MLP neural network is used for data fusion, and the deep LSTM neural network is used for predicting the remaining useful life, i.e. the RUL;
the method comprises the following steps:
the method comprises the following steps: collecting equipment data to form a data set, dividing the data set into a training set and a verification set, and preprocessing the data according to different working conditions;
step two: inputting the training set into an MLP neural network, compressing the multi-dimensional sensor characteristics into HI health characteristic indexes by the MLP neural network, and obtaining a plurality of HI time sequences of health indexes;
step three: inputting the health index HI time sequence into a depth LSTM neural network, and calculating by the depth LSTM neural network to obtain an RUL predicted value;
step four: based on an error calculation loss function between a predicted value and a true value of the RUL, training an MLP-LSTM supervised joint model by adopting a training set through RMSprop gradient self-adaptation; when the error result obtained after the training set and the verification set are input into the current model is smaller than a certain value or the variation of the error result is smaller than a certain value, the loss function of the model training is converged, the model training is finished, and the MLP-LSTM supervised combined model is stored;
the fourth step is specifically divided into the following substeps:
(1) The input layer of the deep LSTM neural network is the l-th layer of the MLP-LSTM supervised joint model and contains n neurons, while the output layer of the MLP neural network is the (l-1)-th layer of the MLP-LSTM supervised joint model and has only one neuron; the neuron errors δ^l and δ^{l-1} of the l-th and (l-1)-th layers of the MLP-LSTM supervised joint model are designed so that the MLP neural network and the deep LSTM neural network are trained synchronously:

δ^l = (w^{l+1})^T δ^{l+1}

δ^{l-1} = (w^l)^T δ^l

wherein w and B are the weight parameters of the neural network and the batch size, respectively;
(2) In the supervised joint training, a squared-error loss function constrained by L2 regularization is applied for gradient-adaptive training of the parameters. A score function is adopted to evaluate the prediction accuracy of the MLP-LSTM supervised joint model and is added to the global loss function, with a certain weight, as a penalty that optimizes the squared-error loss, yielding an MLP-LSTM supervised joint model biased towards early prediction:

wherein the squared-error loss function is computed as:

Loss_MSE(Θ) = (1/B) Σ_{i=1}^{B} (ŷ_i − y_i)^2 + λ‖w‖^2

wherein Θ, w, B, λ, ŷ_i and y_i respectively denote the parameter set learned by the MLP-LSTM supervised joint model, the set of weight parameters of the MLP-LSTM supervised joint model, the batch size, the regularization parameter, and the predicted and the true RUL of the i-th sample;
the calculation formula of the scoring function Score is as follows:
Figure FDA0003932298100000022
d=RUL pred -RUL true
The global loss function Loss_total is computed as:

Loss_total = α·Loss_score + (1 − α)·Loss_MSE

wherein α is the weight balancing the two loss terms;
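The combined loss can be sketched as below. Note the 13/10 time constants in the asymmetric score follow the common C-MAPSS convention and are an assumption here, since the patent's own formula image is not reproduced in this text:

```python
import numpy as np

def score_loss(rul_pred, rul_true):
    """Asymmetric score: late predictions (d >= 0) are penalised more heavily
    than early ones (d < 0), biasing the model towards early prediction."""
    d = rul_pred - rul_true
    return np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0).sum()

def mse_loss(rul_pred, rul_true):
    return np.mean((rul_pred - rul_true) ** 2)

def total_loss(rul_pred, rul_true, alpha=0.1):
    """Loss_total = alpha * Loss_score + (1 - alpha) * Loss_MSE."""
    return alpha * score_loss(rul_pred, rul_true) + (1 - alpha) * mse_loss(rul_pred, rul_true)
```

The exponential branches make an over-estimated RUL (a "late" warning) cost more than an equally wrong under-estimate, which is the stated reason for adding the score as a penalty.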
(3) The MLP-LSTM supervised joint model is trained with the labeled training set through RMSprop gradient adaptation, computed as:

r ← ρ·r + (1 − ρ)·g ⊙ g

θ ← θ − η/(δ + √r) ⊙ g

where r is the accumulation variable of the historical gradient, ρ is the decay coefficient controlling how much historical information is kept, η is the learning rate, δ is a small constant, and g is the gradient of Loss_total;
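A minimal scalar sketch of one RMSprop update (the function name and default hyper-parameters are illustrative, not from the patent):

```python
import numpy as np

def rmsprop_step(theta, g, r, eta=1e-3, rho=0.9, delta=1e-6):
    """One RMSprop update: accumulate the squared gradient into r, then scale
    the step by the root of that running average.

    r_new     = rho * r + (1 - rho) * g * g
    theta_new = theta - eta * g / (delta + sqrt(r_new))
    """
    r = rho * r + (1.0 - rho) * g * g
    theta = theta - eta * g / (delta + np.sqrt(r))
    return theta, r
```

Dividing by √r normalizes the step per parameter, so dimensions with persistently large gradients take smaller effective steps.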
step five: and preprocessing the equipment data to be predicted, and inputting the preprocessed equipment data into a stored MLP-LSTM supervised joint model to obtain HI and RUL values output in real time.
2. The MLP-LSTM supervised joint model-based remaining service life prediction method as recited in claim 1, wherein the labeled data set in the first step is:
X_o = {(x_{it}, rul_{it}) | i ≤ n, t ≤ T_i} (1)

wherein rul_{it} is the value of the remaining useful life at time t,

rul_{it} = T_i − t (2)

when the device is completely unusable, rul_{it} = 0, and every rul_{it} decreases monotonically along the time sequence;

x_{it} is the i-th sensor data sequence from the initial time to time t,

x_{it} = [x_i(1), x_i(2), ..., x_i(t)] (3)

wherein x_i is the i-th sensor data sequence from the initial time to time T_i,

x_i = [x_i(1), x_i(2), ..., x_i(T_i)] (4)
the preprocessing comprises normalization and sliding-time-window sampling; when the equipment data come from different working conditions, conditional normalization is carried out, otherwise global normalization is carried out.
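Per equations (1)-(2), the RUL labels of a unit that fails at time T_i count down from T_i − 1 to 0; a small sketch (the helper name is illustrative):

```python
import numpy as np

def make_rul_labels(unit_lengths):
    """For each unit with failure time T, build labels rul = T - t for
    t = 1..T, i.e. a countdown that reaches 0 at the failure point."""
    return [np.arange(T - 1, -1, -1) for T in unit_lengths]
```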
3. The MLP-LSTM supervised joint model-based remaining useful life prediction method as recited in claim 1, wherein in the second step the multi-dimensional multi-sensor time series is input into the MLP, the MLP compresses the multi-dimensional data into one dimension, and finally a set comprising the health index HI time series is output;
the MLP building and pre-training process comprises the following steps:
the multi-dimensional multi-sensor time series is input into the MLP neural network, which compresses the multi-dimensional data into one dimension; in the forward propagation of the MLP neural network, each node is computed from all nodes of the previous layer: each previous-layer node is given a weight W, a bias b is added, and the value of a node in the next layer is finally obtained through an activation function:

wherein the value of node j in layer l+1 is

h_j^{l+1} = f(Σ_i W_{ij}^l h_i^l + b_j^{l+1}) (5)

The output of the last layer of the MLP neural network is the set H of HI time sequences

H = {h_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T} (6)

wherein H is the set composed of the health index h_i(t_j) at each time point, h_i(t_j) = f(x_i(t_j)), f(x_i) being the function realized by the MLP neural network; T is the length of the time sequence; h denotes the health index HI; x_i(t_j) denotes the set of sensor readings l at time t_j, x_i(t_j) = [l_{i,1}(t_j), l_{i,2}(t_j), ..., l_{i,p}(t_j)] ∈ R^{1×p}; x denotes a raw sample and p the number of sensors; the set of all x_i(t_j) is X, X = {x_i(t_j) | i = 1, 2, ..., N; j = 1, 2, ..., T}.
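A minimal NumPy sketch of the MLP forward pass of equation (5) compressing a p-dimensional sensor vector to a scalar HI (the layer sizes, seed, and sigmoid activation are illustrative assumptions; the patent does not fix them):

```python
import numpy as np

def mlp_hi(x, weights, biases):
    """Forward pass: each layer computes f(W @ h + b); the activation f is
    taken to be a sigmoid here (an assumption). The final layer has one
    neuron, so the output is a scalar health index HI."""
    h = x
    for W, b in zip(weights, biases):
        h = 1.0 / (1.0 + np.exp(-(W @ h + b)))   # sigmoid activation
    return h

# hypothetical shapes: p = 14 sensors -> 8 hidden units -> 1 HI value
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 14)), rng.normal(size=(1, 8))]
biases = [np.zeros(8), np.zeros(1)]
hi = mlp_hi(rng.normal(size=14), weights, biases)
```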
4. The MLP-LSTM supervised joint model-based remaining service life prediction method as recited in claim 1, wherein the step III is specifically divided into the following sub-steps:
the depth LSTM neural network is formed by stacking a plurality of layers of LSTMs, and the vector dimension of each layer of LSTM is variable; the HI health index is decoded into a multidimensional sensor time sequence through a first layer LSTM, the output of an upper layer of a deep LSTM network is used as the input of a next layer, and an updating formula of a l layer is as follows:
Figure FDA0003932298100000032
Figure FDA0003932298100000033
Figure FDA0003932298100000034
Figure FDA0003932298100000035
Figure FDA0003932298100000036
wherein l represents the number of layers of the deep LSTM neural network, t represents the number of units at a certain time of the LSTM,
Figure FDA0003932298100000041
an input unit indicating the time t of the l-th layer,
Figure FDA0003932298100000042
representing a forgetting unit at the moment of the ith layer t,
Figure FDA0003932298100000043
an output unit indicating the time t of the l-th layer,
Figure FDA0003932298100000044
a status cell indicating the time t of the l-th layer,
Figure FDA0003932298100000045
indicating a hidden unit at the moment of the ith layer t,. Sigma.indicating a sigmoid activation function,. Alpha.indicating an element multiplication calculation,. Tanh indicating a tanh activation function,
Figure FDA0003932298100000046
representing the hidden unit weight at the time t of layer l-1,
Figure FDA0003932298100000047
representing the hidden unit weight at time t-1 of the l-th layer,
Figure FDA0003932298100000048
represents a deviation;
and outputting the multi-dimensional characteristic vector by the last unit of the LSTM neural network of the last layer, and obtaining the RUL predicted value through linear layer calculation.
CN202111623573.6A 2021-12-28 2021-12-28 Residual service life prediction method based on MLP-LSTM supervised joint model Active CN114282443B (en)

Publications (2)

Publication Number Publication Date
CN114282443A CN114282443A (en) 2022-04-05
CN114282443B true CN114282443B (en) 2023-03-17





