CN114792026A

CN114792026A - Method and system for predicting residual life of aircraft engine equipment

Info

Publication number: CN114792026A
Application number: CN202110545171.2A
Authority: CN
Inventors: 文一凭; 谭铮
Original assignee: Hunan University of Science and Technology
Current assignee: Hunan University of Science and Technology
Priority date: 2021-05-19
Filing date: 2021-05-19
Publication date: 2022-07-26

Abstract

The invention discloses a method and a system for predicting the residual life of aero-engine equipment. Historical degradation data of complex equipment, acquired by sensors, are first preprocessed; a stacked denoising autoencoder (SDAE) then performs adaptive feature extraction directly on the original monitoring signals; finally, a feature index is selected as the equipment health index (HI) and input to a similarity degradation model to obtain the life prediction result. The stacked denoising autoencoder addresses the problem that, for lack of sufficient prior knowledge, manual class labeling is difficult or too costly. Because the denoising autoencoder also has denoising capability, it can improve prediction accuracy when the monitoring data collected in a complex working environment are mixed with noise.

Description

Method and system for predicting residual life of aircraft engine equipment

Technical Field

The invention relates to the field of aero-engine health management and, by combining machine learning with a similarity degradation model, provides an engine residual life prediction method based on the fusion of an SDAE neural network and a similarity degradation model.

Background

With the rapid development of science and technology, the integration level, complexity and intelligence of mechanical equipment are increasing quickly, and traditional fault diagnosis and maintenance support technology is gradually failing to meet the new requirements. With the explosive growth of equipment monitoring data and the development of storage technology and computing power, data-driven prediction methods based on machine learning are receiving more and more attention in equipment residual life prediction.

Equipment carries many sensors of many types that sample at high frequency, so equipment data have high dimensionality, complex dependencies, variable patterns and large volume. Traditional machine learning algorithms currently lack the ability to extract equipment degradation features effectively, require manual participation and need labeled data. In addition, a prediction method based on a single machine learning model lacks generalization capability when the equipment is diverse and the working conditions are complex.

The aircraft engine is a core component of the aircraft, and its operating state has an important influence on the aircraft. Predicting the service life of the aircraft engine is therefore of great significance for improving its safety and reliability and for ensuring flight safety.

Disclosure of Invention

The technical problem to be solved by the invention is to address the shortcomings of the prior art by providing a method and a system for predicting the residual life of equipment, thereby solving the problem of large prediction errors in the prior art.

To solve this technical problem, the technical scheme adopted by the invention is a method for predicting the residual life of equipment, comprising the following steps:

S1, acquiring data collected by the equipment sensors, preprocessing the data, and splitting the preprocessed data into a training set and a test set;

S2, training the SDAE neural network with the training set, and using the trained SDAE neural network to obtain the equipment health index of each piece of equipment in the test set at each time point, thereby obtaining an equipment health index curve;

S3, predicting the residual life of the equipment by using the equipment health index curve.

In step S1, deleting redundant data reduces the data size and improves computational efficiency. Missing data are filled rather than discarded, so that the data features are preserved and important features are not lost. Normalizing the de-duplicated and filled data eliminates the dimensional (unit and scale) differences between the sensor channels.

In step S2, the SDAE neural network extracts features from the equipment data to obtain the equipment health index. The influence of each sensor dimension on the result is taken into account without relying on expert experience or a large amount of manual input. The SDAE neural network also avoids the poor feature extraction caused by noise in the original data and is therefore robust, and its deep structure avoids the insufficient information mining capability and low recognition accuracy of traditional shallow algorithms.

In step S3, a data-supported similarity degradation model is established. It makes full use of historical data, requires no complex mathematical model, and predicts the residual life from the degree of similarity between samples. Compared with a complex machine learning prediction model it is simpler and more general, and it addresses the problem that, because the degradation processes of different devices are random, such a model cannot adapt accurately to the test set and therefore produces large prediction errors.

In step S1, preprocessing the data comprises: deleting redundant data at the same time point and keeping only one usable record; filling each missing value with the average of the values at the preceding and following time points; and normalizing the de-duplicated and filled data to obtain the preprocessed data. Deleting redundant data reduces the data size and improves computational efficiency. Filling missing data instead of discarding them preserves the data features and prevents the loss of important features. Normalization eliminates the dimensional differences among the sensor channels and makes the data comparable.
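The three preprocessing operations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(time point, readings)`, the function names, the use of `None` for a missing reading, and the fallback to a single neighbour at the ends of a series are all hypothetical and not taken from the patent.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# One record: (time point, sensor readings); None marks a missing reading.
Record = Tuple[int, List[Optional[float]]]


def deduplicate(records: Sequence[Record]) -> List[Record]:
    """Keep the first record seen for each time point, ordered by time."""
    first_seen: Dict[int, List[Optional[float]]] = {}
    for t, values in records:
        if t not in first_seen:
            first_seen[t] = list(values)
    return [(t, first_seen[t]) for t in sorted(first_seen)]


def fill_missing(records: Sequence[Record]) -> List[Tuple[int, List[float]]]:
    """Fill a missing reading with the mean of its two neighbours in time.

    Assumption: at the ends of a series, or when one neighbour is also
    missing, the single available neighbour is used.
    """
    filled = []
    for idx, (t, values) in enumerate(records):
        row = []
        for k, value in enumerate(values):
            if value is None:
                candidates = []
                if idx > 0:
                    candidates.append(records[idx - 1][1][k])
                if idx + 1 < len(records):
                    candidates.append(records[idx + 1][1][k])
                known = [c for c in candidates if c is not None]
                if not known:
                    raise ValueError(f"no neighbour available at time {t}")
                value = sum(known) / len(known)
            row.append(value)
        filled.append((t, row))
    return filled


def min_max_normalise(records):
    """Linearly scale every sensor column to the interval [0, 1]."""
    columns = list(zip(*(values for _, values in records)))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        (t, [(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(values, lows, highs)])
        for t, values in records
    ]


# Time point 2 is recorded twice and has one missing reading.
raw = [(1, [10.0, 1.0]), (2, [12.0, None]), (2, [99.0, 9.0]), (3, [14.0, 3.0])]
clean = min_max_normalise(fill_missing(deduplicate(raw)))
```

In this toy input the duplicate record for time point 2 is dropped, the missing second reading becomes the mean of its neighbours, and both columns are then scaled to [0, 1].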

In step S1, the preprocessed data set is divided into two mutually exclusive sub-data sets, that is, the data set is divided into two sets without intersection, where one data set is used as a training set, the other data set is used as a test set, the data samples in the training set do not appear in the test set, and the data samples in the test set do not appear in the training set.
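A disjoint split of this kind might look like the sketch below. It assumes that the split is made at the level of whole devices and that about 30% of the devices go to the test set; the function name `split_devices`, the `test_fraction` default and the `seed` parameter are illustrative choices, not part of the claimed method.

```python
import random
from typing import List, Sequence, Tuple


def split_devices(device_ids: Sequence[int],
                  test_fraction: float = 0.3,
                  seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split device ids into two sets with no intersection."""
    ids = sorted(set(device_ids))
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    test_ids = sorted(ids[:n_test])
    train_ids = sorted(ids[n_test:])
    return train_ids, test_ids


train_ids, test_ids = split_devices(range(1, 11))
```

Calling the function with several different seeds gives several random divisions, which is one way to check that results do not depend on a single split.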

The specific implementation process of step S2 includes:

1) initializing an SDAE neural network structure;

2) selecting the m-dimensional sample at the j-th time point of the training set data, $x^{(j)} = [x^{(j,1)}, x^{(j,2)}, \dots, x^{(j,m)}]$, where $x^{(j)}$ is the m-dimensional sensor data at a time point extracted from the training set data, $j = 1, 2, \dots, T$, and $T$ is the number of time points; inputting the samples of all $T$ time points into the SDAE neural network;

3) adding noise to the data sample $x^{(j)}$ and training the denoising autoencoder $\mathrm{DAE}_1$, which contains the 1st hidden layer of the SDAE neural network, so that $x^{(j)}$ is encoded as $h_1^{(j)}$; using $h_1^{(j)}$ as input data to train the denoising autoencoder $\mathrm{DAE}_2$, which contains the 2nd hidden layer, so that $h_1^{(j)}$ is encoded as $h_2^{(j)}$; and so on, until the denoising autoencoder containing the N-th hidden layer has been trained, which yields the trained SDAE neural network;

4) inputting the test set data into the trained SDAE neural network and performing adaptive feature extraction through its hidden layers to obtain the equipment health index of each piece of equipment in the test set at each time point.

In step S2, the SDAE neural network extracts features from the equipment data to obtain the equipment health index. The influence of each sensor dimension on the result is taken into account without relying on expert experience or a large amount of manual input. The SDAE neural network also avoids the poor feature extraction caused by noise in the original data, is robust, and avoids the insufficient information mining capability and low recognition accuracy of traditional shallow algorithms.

The specific implementation process of step S3 includes: measuring the similarity degree of the test sample and the training sample according to the Euclidean distance of the equipment health index curve of the test sample and the equipment health index curve corresponding to the training sample, and predicting the residual life of the test sample based on the similarity degree and the residual life of the training sample; the test sample refers to a certain sample to be predicted extracted from the test set, and the training sample refers to all samples extracted from the training set.

The degree of similarity between the equipment health index curve of the test sample and the equipment health index curves of training sample i is judged by the minimum Euclidean distance between them. The index of the equipment health index curve of training sample i at which this distance is smallest is

$$d^i = \arg\min_{q} \lVert H^i_q - s \rVert_2, \qquad q = 1, 2, \dots, l_i - l_s + 1,$$

where:

$\lVert H^i_q - s \rVert_2 = \sqrt{\sum_{k=1}^{l_s} \left(t^i_{q+k-1} - s_k\right)^2}$ is the Euclidean distance between the equipment health index curve of the test sample and the q-th equipment health index curve of training sample i;

$t^i = [t^i_1, t^i_2, \dots, t^i_{l_i}]$ represents the equipment health index curve of training sample i, whose running time is $l_i$, and $t^i_{l_i}$ is the equipment health index value of training sample i at running time $l_i$;

$s = [s_1, s_2, \dots, s_{l_s}]$ represents the equipment health index curve of the test sample, whose running time is $l_s$, and $s_{l_s}$ is the equipment health index value of the test sample at running time $l_s$;

$H^i_q = [t^i_q, t^i_{q+1}, \dots, t^i_{q+l_s-1}]$ represents the q-th equipment health index curve in training sample i whose length equals the running time $l_s$ of the test sample, and $t^i_{q+l_s-1}$ is the equipment health index value of training sample i at running time $q + l_s - 1$;

$d^i$ is the value of q at which the Euclidean distance between the equipment health index curves in training sample i and the equipment health index curve of the test sample is smallest. The curve $H^i_{d^i} = [t^i_{d^i}, t^i_{d^i+1}, \dots, t^i_{d^i+l_s-1}]$ of training sample i is then the one most similar to the test sample curve $s$, where $t^i_{d^i+l_s-1}$ is the equipment health index value of training sample i at running time $d^i + l_s - 1$. That is, $H^i_{d^i}$ is the best matching curve in training sample i, and the Euclidean distance between $H^i_{d^i}$ and $s$ is $\lVert H^i_{d^i} - s \rVert_2$.

The method takes several equipment health index curves from each training sample and so makes full use of the historical data of the sample. The calculation involves none of the complicated steps, such as differentiation and error back-propagation, that are common in traditional machine learning algorithms; it is simple and offers higher computational efficiency and generality than a traditional machine learning model.

The predicted residual life of the test sample is calculated from the training samples as a weighted sum:

$$\mathrm{RUL} = \sum_{i=1}^{n} w^i \, \mathrm{RUL}^i,$$

where $w^i$ is the weight of training sample i, n is the number of training samples, and $\mathrm{RUL}^i = l_i - (d^i + l_s)$ is the predicted residual life of the test sample based on training sample i. This formula makes full use of the historical data of several similar samples and predicts the residual life from the degree of similarity between samples; weighting each individual life prediction by the similarity between samples makes the result more accurate and reasonable.

In the present invention, the weight formula ensures that the weights of all similar samples sum to 1. The smaller the Euclidean distance between the best matching curve in training sample i and the equipment health index curve of the test sample, the larger the weight $w^i$ given to the residual life predicted from training sample i. The calculation is simple and the resulting weight distribution is reasonable.
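The weighted combination can be illustrated with a short Python sketch. The exact weight formula of the invention is not recoverable from this text, so the sketch substitutes normalized inverse-distance weights, which is purely an assumption that happens to satisfy the two stated properties (weights sum to 1, smaller distance gives larger weight). The names `similarity_weights`, `weighted_rul` and the `eps` guard are hypothetical.

```python
from typing import List, Sequence


def similarity_weights(distances: Sequence[float], eps: float = 1e-12) -> List[float]:
    """ASSUMED weighting: inverse distance, normalised so the weights sum to 1.

    This is a stand-in that is monotonically decreasing in the distance;
    it is not claimed to be the formula used in the patent.
    """
    inverse = [1.0 / (d + eps) for d in distances]   # eps avoids division by zero
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_rul(ruls: Sequence[float], distances: Sequence[float]) -> float:
    """Combine the per-training-sample estimates RUL^i into one prediction."""
    weights = similarity_weights(distances)
    return sum(w * r for w, r in zip(weights, ruls))


# Three training samples: the closest one (distance 1.0) dominates the estimate.
weights = similarity_weights([1.0, 2.0, 4.0])
prediction = weighted_rul([30.0, 40.0, 80.0], [1.0, 2.0, 4.0])
```

Any other monotonically decreasing function of the distance, normalized in the same way, could be swapped in without changing `weighted_rul`.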

The invention also provides a system for predicting the residual life of the equipment, which comprises computer equipment; the computer device is configured or programmed for performing the steps of the above-described method.

Compared with the prior art, the invention has the beneficial effects that:

1) the invention provides a data-driven method that fuses machine learning with a similarity degradation model to predict the residual life of equipment; the strong feature extraction capability of the SDAE neural network greatly reduces manual involvement in the prediction process and improves the adaptability of the prediction method to different scenarios;

2) the model-fusion prediction method based on the similarity degradation model addresses the problem that, because the degradation processes of different devices are random, a model cannot adapt accurately to the test set and therefore produces large prediction errors;

3) the method predicts the residual life of equipment through an indirect prediction strategy, in which an equipment health index curve is constructed first and multi-step or iterative prediction is then carried out on that curve to obtain the residual life value, so that the prediction result is closer to the true life value.

Drawings

FIG. 1 is a block diagram of the stacked denoising autoencoder of the present invention.

FIG. 2 is a block diagram of the similarity degradation model of the present invention.

FIG. 3 is a flow chart of the feature extraction step of the present invention.

FIG. 4 is an algorithm flow diagram of the present invention.

Detailed Description

The method uses engine sensor data to extract the degradation features of the engine. Sensor data contain a large number of complex degradation features, and traditional feature extraction methods usually rely on manual processing. The stacked denoising autoencoder used here addresses the problem that, for lack of sufficient prior knowledge, manual class labeling is difficult or too costly. Because the denoising autoencoder also has denoising capability, it can improve prediction accuracy when monitoring data collected in a complex working environment are mixed with noise. In addition, the prediction method based on the similarity degradation model addresses the problem that, because the degradation processes of different devices are random, a model cannot adapt accurately to the test set and therefore produces large prediction errors.

The theoretical basis of the scheme of the invention is as follows:

1. Autoencoder:

A single autoencoder is an unsupervised three-layer neural network consisting of an input layer, a hidden layer and an output layer. Its aim is to reproduce the input signal as closely as possible, so that the output values equal the input values. The network is divided into an encoding network and a decoding network; when noise is added to the input data it becomes a denoising autoencoder, as shown in FIG. 1. During training, the denoising autoencoder encodes the noisy data through the encoding network, decodes the result with the decoding network, takes the difference between the reconstructed data and the original input as the reconstruction error, and is trained with a gradient descent algorithm.

For a single denoising autoencoder, given a set of equipment data samples $\{x^{(1)}, x^{(2)}, \dots, x^{(M)}\}$, where $M$ is the number of data samples and $x^{(j)}$ is the j-th input data sample, the input data $x^{(j)}$ are first corrupted by the random mapping function $q_D$, so that $x^{(j)}$ carries noise. The encoding network function $f_\theta$ then produces the hidden layer output $h^{(j)}$, and the decoding network function $g_{\theta'}$ maps $h^{(j)}$ to the reconstructed data $z^{(j)}$. The difference between the input data $x^{(j)}$ and the reconstructed data $z^{(j)}$ is used as the reconstruction error for training.

The random mapping function $q_D$ adds noise to the original data $x^{(j)}$ for a practical reason: the working environment of the equipment is complex and the working conditions fluctuate strongly, so the data acquisition process is easily disturbed and the collected data contain noise. To make the feature transformation learned by the autoencoder as robust as possible, so that it can resist a certain degree of contamination and loss in the original data, a constraint is first imposed on the autoencoder through the data, and the noise problem is then handled by reconstructing the noisy sample data. The main idea is as follows: noise with certain statistical characteristics is added to the sample data, the encoding network encodes the noisy sample to obtain the hidden layer data, and the decoding network estimates the original, uncorrupted form of the sample from the hidden layer data. In this way the autoencoder learns more robust features from the noisy samples and becomes less sensitive to small disturbances and to incomplete or erroneous data.

The encoding function $f_\theta$ converts a noisy sample $x^{(j)}$ in the given data set into the hidden layer output $h^{(j)}$:

$$h^{(j)} = f_\theta(x^{(j)}) = s_f(W x^{(j)} + b) \tag{15}$$

where $s_f$ is the activation function of the encoding network and $\theta$ is the parameter set of the encoding network:

$$\theta = \{W, b\} \tag{16}$$

$W$ is the weight from the input layer to the hidden layer, and $b$ is the bias term.

The hidden layer output $h^{(j)}$ is then transformed by the decoding network $g_{\theta'}$ into a reconstruction $z^{(j)}$ of the input data $x^{(j)}$:

$$z^{(j)} = g_{\theta'}(h^{(j)}) = s_g(W' h^{(j)} + b') \tag{17}$$

where $s_g$ is the activation function of the decoding network and $\theta'$ is the parameter set of the decoding network:

$$\theta' = \{W', b'\} \tag{18}$$

$W'$ is the weight from the hidden layer to the output layer, and $b'$ is the bias term. To simplify the gradient descent computation of the weights, $W' = W^T$.

The autoencoder completes the training of the whole network by minimizing the reconstruction error $L(x^{(j)}, z^{(j)})$ between $x^{(j)}$ and $z^{(j)}$:

$$L(x^{(j)}, z^{(j)}) = \lVert x^{(j)} - z^{(j)} \rVert \tag{19}$$

The mean squared error is used as the cost function; minimizing this error function over the data samples by gradient descent completes the training of the denoising autoencoder.
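To make the encode, decode and reconstruction-error steps concrete, here is a minimal forward-pass sketch of one denoising autoencoder with tied weights. It is an illustration only: the sigmoid activation, the masking-noise corruption used for $q_D$, the parameter values and all function names are assumptions, and no training loop is shown.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def corrupt(x: Vector, rate: float, rng: random.Random) -> Vector:
    """Assumed q_D: masking noise that zeroes each component with probability `rate`."""
    return [0.0 if rng.random() < rate else v for v in x]


def encode(x: Vector, W: Matrix, b: Vector) -> Vector:
    """h = s_f(W x + b), with s_f taken to be the sigmoid."""
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + bias)
            for row, bias in zip(W, b)]


def decode(h: Vector, W: Matrix, b_prime: Vector) -> Vector:
    """z = s_g(W' h + b') with tied weights W' = W^T."""
    n_inputs = len(W[0])
    return [sigmoid(sum(W[k][i] * h[k] for k in range(len(h))) + b_prime[i])
            for i in range(n_inputs)]


def reconstruction_error(x: Vector, z: Vector) -> float:
    """L(x, z) = ||x - z||, measured against the clean input x."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(x, z)))


# Hypothetical parameters: 3 inputs, 2 hidden nodes.
W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b = [0.0, 0.1]
b_prime = [0.0, 0.0, 0.0]

x = [0.2, 0.9, 0.4]
x_noisy = corrupt(x, rate=0.3, rng=random.Random(0))
h = encode(x_noisy, W, b)
z = decode(h, W, b_prime)
error = reconstruction_error(x, z)
```

Training would adjust `W`, `b` and `b_prime` by gradient descent so that `error`, averaged over the samples, becomes small; note that the error compares the reconstruction with the clean input, not with the corrupted one.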

2. Similarity degradation model:

the prediction method based on the similarity degradation model belongs to a data driving method, and the main idea can be expressed as follows: if the test sample and the reference sample have similar degradation performance, they may have similar remaining life. The test sample refers to a sample taken from a device that has not failed (test device), and the reference sample refers to a historical sample taken from a device that has failed (training device) operating under the same operating conditions.

FIG. 2 depicts the overall framework of residual life prediction based on the similarity degradation model. In the training stage, features are first extracted from the sensor data of the training equipment, the data are mapped to a one-dimensional equipment health index HI, and the equipment health index curves of all training units over time are fitted to represent the degradation process. Sub-curves are then extracted from the equipment health index curves and placed in a reference sample library for residual life estimation. In the testing stage, the sensor data of the test equipment are processed in the same way to extract a test sample, the test sample is matched against the reference sample library, and the residual life is predicted.

Specifically, the prediction method of the embodiment of the invention comprises the following steps:

1) Data preprocessing: the data are first preprocessed, including handling of redundant and missing data, K-means-based normalization, and division into a training set and a test set.

2) Feature extraction: the number of SDAE layers, the batch size and the noise rate are set; the SDAE neural network extracts features of the equipment degradation process and fuses the multi-dimensional sensor data into one-dimensional data that serve as the equipment health index of the training set equipment.

3) Residual life prediction: a moving average is applied to the obtained equipment health index curve to reduce local noise, the equipment performance degradation trend is fitted with the similarity degradation model, and the residual life of the equipment is predicted.
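The moving-average smoothing mentioned in step 3) can be sketched as follows. The trailing window, the window length of 3 and the function name `moving_average` are assumptions made for illustration; the text does not specify the window type or size.

```python
from typing import List, Sequence


def moving_average(curve: Sequence[float], window: int = 3) -> List[float]:
    """Trailing moving average; early points average over the values seen so far."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed: List[float] = []
    running = 0.0
    for idx, value in enumerate(curve):
        running += value
        if idx >= window:
            running -= curve[idx - window]   # drop the value that left the window
        smoothed.append(running / min(idx + 1, window))
    return smoothed


hi_curve = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed_curve = moving_average(hi_curve, window=3)
```

The output has the same length as the input, so the smoothed curve can be passed on to the similarity matching step without realigning time points.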

The data preprocessing in step 1) is as follows:

For the multi-dimensional numerical data from the equipment sensors, redundant data at the same time point are deleted directly and only one usable record is kept. Missing data are filled with the average of the data at the preceding and following time points. K-means normalization clusters the original data by working condition and then normalizes them to the interval [0,1], which eliminates the dimensional differences among the sensors. The data set is divided into two mutually exclusive parts, i.e. two sets without intersection: one serves as the training set and the other as the test set, and no data sample appears in both. The ratio of training set to test set is typically 70% to 30%. Two precautions apply to the division: 1. the consistency of the data distribution should be maintained as far as possible, so that the division does not introduce extra bias that affects the final result; 2. several random divisions should be used, to avoid the instability of a single hold-out split.
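The clustering-then-normalizing idea can be sketched with a small pure-Python K-means. Several things here are assumptions rather than statements from the text: the clustering is run on the operating-condition settings, normalization is done separately inside each cluster, the cluster centres are initialized from the first k distinct points, and the function names are hypothetical.

```python
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points: Sequence[Point], k: int, iterations: int = 20) -> List[int]:
    """Plain Lloyd iterations; centres start at the first k distinct points."""
    centres: List[List[float]] = []
    for p in points:
        if list(p) not in centres:
            centres.append(list(p))
        if len(centres) == k:
            break
    if len(centres) < k:
        raise ValueError("fewer than k distinct points")
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centre for every point.
        labels = [min(range(k), key=lambda c: squared_distance(p, centres[c]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def normalise_by_condition(values: Sequence[float], labels: Sequence[int]) -> List[float]:
    """Min-max scale one sensor channel to [0, 1] separately inside each cluster."""
    out = [0.0] * len(values)
    for c in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        lo = min(values[i] for i in idx)
        hi = max(values[i] for i in idx)
        for i in idx:
            out[i] = (values[i] - lo) / (hi - lo) if hi > lo else 0.0
    return out


# Two clearly separated operating conditions and one sensor channel.
settings = [[0.0, 0.0], [10.0, 10.0], [0.1, 0.0], [10.0, 10.1]]
sensor = [100.0, 500.0, 110.0, 520.0]
condition = kmeans_labels(settings, k=2)
scaled = normalise_by_condition(sensor, condition)
```

Without the per-condition step, the readings 100 and 110 would both map close to 0 and their difference would be swamped by the gap between operating conditions.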

The feature extraction in step 2) is as follows:

1) Initialize the SDAE network structure, for example the number of SDAE layers and the number of nodes in each layer. The SDAE network structure is formed by stacking several denoising autoencoders (DAEs).

2) Select the m-dimensional sample at the j-th time point of the normalized training set data, $x^{(j)} = [x^{(j,1)}, x^{(j,2)}, \dots, x^{(j,m)}]$, where $x^{(j)}$ is the m-dimensional sensor data at a time point extracted from the training set data, $j = 1, 2, \dots, T$, and $T$ is the number of time points. The samples at all $T$ time points are input into the SDAE network. The number of input layer nodes of the network equals the dimension of the sensor data, i.e. $m$.

3) Train the hidden layers with a layer-by-layer training algorithm to extract deep features. The training flow is as follows. Noise is added to the data sample $x^{(j)}$ to train $\mathrm{DAE}_1$, the denoising autoencoder containing the 1st hidden layer, which encodes $x^{(j)}$ as $h_1^{(j)}$, namely:

$$h_1^{(j)} = f_{\theta_1}(x^{(j)}) = s_{f_1}(w_1 x^{(j)} + b_1)$$

where $s_{f_1}$ is the activation function of the encoding network of $\mathrm{DAE}_1$, $\theta_1$ is the parameter set of the encoding network of $\mathrm{DAE}_1$, $w_1$ is the weight from the input layer to the hidden layer of $\mathrm{DAE}_1$, and $b_1$ is its bias term. Because $h_1^{(j)}$ can be reconstructed into the input data, it retains the main information of $x^{(j)}$. Then $h_1^{(j)}$ is used as input data to train $\mathrm{DAE}_2$, the denoising autoencoder containing the 2nd hidden layer, which encodes its input as $h_2^{(j)}$, namely:

$$h_2^{(j)} = f_{\theta_2}(h_1^{(j)}) = s_{f_2}(w_2 h_1^{(j)} + b_2)$$

where $s_{f_2}$ is the activation function of the encoding network of $\mathrm{DAE}_2$, $\theta_2$ is the parameter set of the encoding network of $\mathrm{DAE}_2$, $w_2$ is the weight from the input layer to the hidden layer of $\mathrm{DAE}_2$, and $b_2$ is its bias term.

This process is repeated until $\mathrm{DAE}_N$ has been pre-trained. The layer-wise trained DAEs are connected to form the SDAE stacked structure, and the output $h_N^{(j)}$ of hidden layer N completes the dimensionality reduction and denoising of the input data by the network, namely:

$$h_N^{(j)} = f_{\theta_N}(h_{N-1}^{(j)}) = s_{f_N}(w_N h_{N-1}^{(j)} + b_N)$$

where $s_{f_N}$ is the activation function of the encoding network of $\mathrm{DAE}_N$, $\theta_N$ is the parameter set of the encoding network of $\mathrm{DAE}_N$, $w_N$ is the weight from the input layer to the hidden layer of $\mathrm{DAE}_N$, and $b_N$ is its bias term.

The sensor data of the equipment contain the important information of the whole process from normal operation to degradation and can be regarded as a direct reflection of the equipment state. The output of hidden layer N is the optimal one-dimensional representation of the input multi-parameter sensor data and can serve as a representation of the equipment health state. The equipment health index HI of the equipment at the j-th time point is therefore:

$$HI^j = h_N^{(j)}$$

5) When verifying on the test set, the test data are input into the trained SDAE network, adaptive feature extraction is performed through the hidden layers, and the equipment health index of each piece of equipment in the test set at each time point is obtained.
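Once the layers have been trained, obtaining the health index is just a chain of encoders whose last layer has a single node. The sketch below shows only this forward pass; the sigmoid activation, the example weights and the names `encode_layer` and `health_index` are assumptions for illustration, and the layer-wise training itself is not shown.

```python
import math
from typing import List, Sequence, Tuple

# One trained encoder layer: (weights w_n, bias b_n).
Layer = Tuple[List[List[float]], List[float]]


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def encode_layer(x: Sequence[float], layer: Layer) -> List[float]:
    """h_n = s(w_n h_{n-1} + b_n) for one hidden layer (sigmoid assumed)."""
    weights, bias = layer
    return [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def health_index(x: Sequence[float], layers: Sequence[Layer]) -> float:
    """Pass a sample through hidden layers 1..N and return the scalar HI."""
    h = list(x)
    for layer in layers:
        h = encode_layer(h, layer)
    if len(h) != 1:
        raise ValueError("the last hidden layer must have exactly one node")
    return h[0]


# Hypothetical trained parameters: 3 sensors -> 2 hidden nodes -> 1 node.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
samples = [[0.2, 0.9, 0.4], [0.3, 0.8, 0.5], [0.6, 0.4, 0.9]]
hi_curve = [health_index(x, layers) for x in samples]
```

Applying `health_index` to the samples of one device in time order yields that device's health index curve.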

The residual life prediction in step 3) is as follows:

There are currently two types of data-driven residual life prediction strategies: direct prediction and indirect prediction. Direct prediction uses historical state monitoring data and various test data to establish a mapping directly between the original data and the residual life value. Indirect prediction first constructs an equipment health index curve and then performs multi-step or iterative prediction on that curve to obtain the residual life value.

$$y_{RUL\_direct} = f_{RUL}(x^{(j,1)}, x^{(j,2)}, \dots, x^{(j,m)}) \tag{5}$$

$$HI^j = f_{HI}(x^{(j,1)}, x^{(j,2)}, \dots, x^{(j,m)}) \tag{6}$$

$$y_{RUL\_indirect} = g_{RUL}(HI^j) \tag{7}$$

where $x^{(j)} = [x^{(j,1)}, x^{(j,2)}, \dots, x^{(j,m)}]$ is the state monitoring data of the equipment, $y_{RUL\_direct}$ and $y_{RUL\_indirect}$ are the residual life values predicted by the two strategies, $HI^j$ is the equipment health index of the equipment at time point j, and $f_{RUL}$, $f_{HI}$ and $g_{RUL}$ are the mapping functions.

The method combines the SDAE neural network with the similarity degradation model and adopts the indirect prediction strategy: the SDAE neural network models the degradation state of the system to obtain the equipment health index curve, the similarity degradation model estimates the residual life of the equipment, and together they form an indirect prediction model that extrapolates from the equipment health index curve to the final residual life value.

The main idea of the similarity degradation model applied to the life prediction in the step (3) can be expressed as follows: if the test sample and the training sample have similar degradation processes, they may have similar remaining lifetimes. The test sample refers to a certain sample to be predicted extracted from the test set, and the training sample refers to all samples extracted from the training set.

The distance relationship between the equipment health index curve of equipment i in the training samples and the equipment health index curve of the test sample is defined by the pair $S^i$, namely:

$$S^i = (t^i, s), \quad i = 1, 2, \dots, n \tag{8}$$

where $t^i = [t^i_1, t^i_2, \dots, t^i_{l_i}]$ denotes the equipment health index curve of training sample i with running time $l_i$, n is the number of training samples, and $s = [s_1, s_2, \dots, s_{l_s}]$ denotes the equipment health index curve of the test sample with running time $l_s$. Equipment health index curves of length $l_s$ are extracted from each training sample point by point, so that training sample $t^i$ can be represented as:

$$H^i = \begin{bmatrix} t^i_1 & t^i_2 & \cdots & t^i_{l_s} \\ t^i_2 & t^i_3 & \cdots & t^i_{l_s+1} \\ \vdots & \vdots & & \vdots \\ t^i_{l_i-l_s+1} & t^i_{l_i-l_s+2} & \cdots & t^i_{l_i} \end{bmatrix}$$

where the matrix $H^i$ represents the set of sub-curves of training sample $t^i$, and each row vector $H^i_q = [t^i_q, t^i_{q+1}, \dots, t^i_{q+l_s-1}]$, $q = 1, 2, \dots, l_i - l_s + 1$, is the q-th equipment health index curve in training sample i whose length equals the time length $l_s$ of the test sample. Based on the matrix $H^i$, the Euclidean distances between the equipment health index curves of every training sample i and the equipment health index curve of the test sample are calculated:

$$\lVert H^i_q - s \rVert_2 = \sqrt{\sum_{k=1}^{l_s} \left(t^i_{q+k-1} - s_k\right)^2}$$

which is the Euclidean distance between the equipment health index curve of the test sample and the q-th equipment health index curve of training sample i. The position of the best matching curve can be expressed as:

$$d^i = \arg\min_{q} \lVert H^i_q - s \rVert_2$$

$d^i$ is the value of q at which the distance between the equipment health index curves in training sample i and the equipment health index curve of the test sample is smallest.

The $d^i$-th equipment health index curve $H^i_{d^i}$ of training sample i is then the one most similar to the test sample curve $s$; that is, $H^i_{d^i}$ is the best matching curve in training sample i, and the Euclidean distance between $H^i_{d^i}$ and $s$ is $\lVert H^i_{d^i} - s \rVert_2$.

Because the equipment health index reflects the equipment health state, and similar equipment health index curves imply similar residual lives, the residual life of training sample i at running time $d^i$ can be taken as the residual life of the test sample at running time $l_s$.

The residual life of training sample i at running time $d^i$ is expressed as:

$$\mathrm{RUL}^i = l_i - (d^i + l_s) \tag{12}$$

This is the predicted residual life of the test sample based on training sample i.
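The sliding-window matching and formula (12) can be sketched as below. The function names are hypothetical, q is counted from 1 as in the text, and ties in distance are resolved in favour of the earliest window, which is an assumption.

```python
import math
from typing import Sequence, Tuple


def best_match(train_curve: Sequence[float],
               test_curve: Sequence[float]) -> Tuple[int, float]:
    """Return (d, distance): 1-based start q of the closest sub-curve and its distance."""
    l_i, l_s = len(train_curve), len(test_curve)
    if l_s > l_i:
        raise ValueError("test curve is longer than the training curve")
    best_q, best_dist = 1, math.inf
    for q in range(1, l_i - l_s + 2):            # q = 1 .. l_i - l_s + 1
        window = train_curve[q - 1:q - 1 + l_s]  # q-th sub-curve of length l_s
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, test_curve)))
        if dist < best_dist:                     # strict '<' keeps the earliest tie
            best_q, best_dist = q, dist
    return best_q, best_dist


def rul_from_training_sample(train_curve: Sequence[float],
                             test_curve: Sequence[float]) -> Tuple[int, float]:
    """RUL^i = l_i - (d^i + l_s), following formula (12), plus the match distance."""
    d, dist = best_match(train_curve, test_curve)
    return len(train_curve) - (d + len(test_curve)), dist


train_curve = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]   # l_i = 8, run to failure
test_curve = [0.8, 0.7, 0.6]                             # l_s = 3, still running
d_i, distance = best_match(train_curve, test_curve)
rul_i, _ = rul_from_training_sample(train_curve, test_curve)
```

Here the test curve coincides with the third sub-curve of the training curve, so `d_i` is 3, the distance is 0, and formula (12) gives 8 - (3 + 3) = 2.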

Each device $i$ in the training sample set yields one estimated remaining life for the test sample. The final remaining life of the test sample is obtained by weighting the remaining-life estimates from the $n$ training samples. The smaller the Euclidean distance between the best-matching curve in training sample $i$ and the equipment health index curve of the test sample,

the greater the weight given to the remaining life predicted from training sample $i$:

The weighting function should therefore be monotonically decreasing. In the present invention, the weighting function $w^i$ is:
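As a rough sketch of the weighting step, the Python below combines per-training-sample estimates into one value. Two points are assumptions rather than content of this description: the weight `exp(-dist)` is only one possible monotonically decreasing function, and the weights are normalized by their sum. The names `exp_weight` and `weighted_rul` are hypothetical.

```python
import math


def exp_weight(dist):
    """Hypothetical monotonically decreasing weight (an assumption)."""
    return math.exp(-dist)


def weighted_rul(ruls, dists, weight_fn=exp_weight):
    """Weighted average of the remaining-life estimates.

    ruls[i]  : estimate obtained from training sample i
    dists[i] : Euclidean distance of the best-matching curve in sample i
    Normalizing by the sum of the weights is an assumption of this sketch.
    """
    if not ruls or len(ruls) != len(dists):
        raise ValueError("ruls and dists must be non-empty and equal length")
    weights = [weight_fn(d) for d in dists]
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, ruls)) / total


# Hypothetical values: the closer match (distance 0.1) dominates the result.
print(weighted_rul([30, 50], [0.1, 2.0]))
```

Any other decreasing function can be passed as `weight_fn` without changing the rest of the sketch.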

The data set used in this embodiment comes from C-MAPSS, a simulation model of large commercial turbofan engines developed by NASA; the data cover each engine from the start of operation to failure.

Specifically, the acquired data has 26 columns, and each row holds the monitoring data of one engine at one time point. Column 1 is the engine unit number; column 2 is the time point; columns 3-5 are the 3 operating-condition settings; columns 6-26 are the monitored values of the 21 sensors on the engine.
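The column layout can be illustrated with a small parser. This is a sketch that assumes whitespace-separated text rows ordered as described (one engine identifier, one time point, three operating-condition settings, 21 sensor readings); the `Record` type, its field names and the sample row are hypothetical.

```python
from typing import List, NamedTuple

N_SETTINGS = 3   # operating-condition columns
N_SENSORS = 21   # sensor columns


class Record(NamedTuple):
    unit: int              # engine identifier (column 1)
    cycle: int             # time point (column 2)
    settings: List[float]  # operating-condition settings
    sensors: List[float]   # sensor readings


def parse_row(line: str) -> Record:
    """Split one whitespace-separated row into its named parts."""
    fields = line.split()
    needed = 2 + N_SETTINGS + N_SENSORS
    if len(fields) < needed:
        raise ValueError(f"expected at least {needed} fields, got {len(fields)}")
    unit, cycle = int(fields[0]), int(fields[1])
    settings = [float(v) for v in fields[2:2 + N_SETTINGS]]
    sensors = [float(v) for v in fields[2 + N_SETTINGS:needed]]
    return Record(unit, cycle, settings, sensors)


# Hypothetical row, built only to exercise the parser.
sample = "1 1 " + " ".join(["0.5"] * 3) + " " + " ".join(str(float(i)) for i in range(21))
rec = parse_row(sample)
print(rec.unit, rec.cycle, len(rec.settings), len(rec.sensors))
```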

First, this embodiment defines $x_k$ as all data from device $k$, and lets

denote the measured value of sensor $i$ in device $k$ at time $t$, with $k = 1, 2, \ldots, K$ and $t = 1, 2, \ldots, T_k$, where $K$ is the number of devices and $T_k$ is the total running time of device $k$, and lets

be the data sample composed of the monitored values of the sensors in device $k$ at time $t$,

where $m$ is the number of sensors. Redundant data are handled by deleting the duplicate records at the same time point and keeping only one usable record; that is, when there are multiple records

the other redundant records are deleted and only one usable record is kept. Missing data are filled with the average of the data at the time points immediately before and after the missing point; that is, if

is missing, then:
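The two cleaning rules can be sketched in Python as below. The sketch makes assumptions that are not stated in this description: the first record seen for a time point is the one kept, time points are consecutive integers, and a missing time point is filled only when both neighbouring time points are present. The function names and sample records are hypothetical.

```python
def deduplicate(records):
    """Keep one record per time point (here: the first one seen)."""
    seen = {}
    for t, values in records:
        seen.setdefault(t, values)
    return seen


def fill_missing(series):
    """Fill a missing time point with the mean of its two neighbours.

    `series` maps an integer time point to a list of sensor values.
    Gaps whose neighbours are also missing are left unfilled.
    """
    filled = dict(series)
    for t in range(min(series), max(series) + 1):
        if t not in series and (t - 1) in series and (t + 1) in series:
            before, after = series[t - 1], series[t + 1]
            filled[t] = [(a + b) / 2 for a, b in zip(before, after)]
    return filled


# Hypothetical records: time point 2 is duplicated, time point 3 is missing.
records = [(1, [1.0, 10.0]), (2, [2.0, 20.0]), (2, [9.0, 90.0]), (4, [4.0, 40.0])]
clean = fill_missing(deduplicate(records))
print(sorted(clean.items()))
```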

Because the equipment operates under different working conditions, and the operating data differ greatly between working conditions, the working conditions must be identified before normalization. K-means clustering is used for this because it is computationally fast, conceptually simple and easy to interpret: the input engine data are clustered by working condition and then normalized. Normalization maps the raw data to the interval [0,1] and eliminates the dimensional differences among the sensors. This embodiment adopts linear function (min-max) normalization, as follows:

$$N(x^{(t)}) = \frac{x^{(t)} - x_{max}\cdot 0 - x_{min}}{x_{max} - x_{min}}$$

The method of this embodiment normalizes the sensor data at time $t$ to [0,1], converting the raw data $x^{(t)}$ into the normalized data $N(x^{(t)})$, where $x_{max}$ and $x_{min}$ are the maximum and minimum of the data under the working condition to which the equipment data at time point $t$ belong. The training set and test set are divided by the hold-out method: 30% of the data are randomly selected as the test set, and the remaining 70% serve as the training set.
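A minimal Python sketch of per-condition normalization and the 70/30 hold-out split follows. It assumes the working-condition label of each reading is already known (for example from a prior clustering step), maps a constant-valued condition to 0.0, and splits by engine identifier; these choices, the function names and the fixed seed are assumptions of the sketch.

```python
import random


def normalize_by_condition(values, conditions):
    """Min-max scale each value with the min/max of its own condition."""
    bounds = {}
    for v, c in zip(values, conditions):
        lo, hi = bounds.get(c, (v, v))
        bounds[c] = (min(lo, v), max(hi, v))
    scaled = []
    for v, c in zip(values, conditions):
        lo, hi = bounds[c]
        # A condition with a single distinct value has no range; use 0.0.
        scaled.append(0.0 if hi == lo else (v - lo) / (hi - lo))
    return scaled


def holdout_split(unit_ids, test_fraction=0.3, seed=0):
    """Randomly assign engines to a training list and a test list."""
    ids = sorted(set(unit_ids))
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


# Hypothetical readings of one sensor under two working conditions.
print(normalize_by_condition([10, 20, 30, 100, 300], [0, 0, 0, 1, 1]))
train_units, test_units = holdout_split(range(1, 11))
print(len(train_units), len(test_units))
```

Scaling within each condition keeps a reading taken under one working condition from being compared with the range of another.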

Because the sensor data are 21-dimensional, the input dimension of the SDAE neural network is 21. The number of nodes in the $N$-th hidden layer is set to 1, the dimension of the single-parameter equipment health index. Only the number of intermediate layers and their node counts remain as hyperparameters that affect SDAE performance and need tuning, with the data reconstruction error of the SDAE model as the criterion: the smaller the reconstruction error, the better the feature extraction of the SDAE model. After multiple tests, the network structure is set to 21-10-5-1-5-10-21, the batch size to 256, and the data noise rate to 0.1; the loss function is MSE, the optimization algorithm is Adam, and the number of passes over the full data (epochs) is 50. The training-set equipment and test-set equipment are fed into the neural network in turn for training and testing.
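The structure 21-10-5-1-5-10-21 and the noise rate can be made concrete with a short Python sketch. It only derives the layer pairs that would be trained one after another and shows one way to corrupt an input; it does not train a network. Masking noise (zeroing each component with probability equal to the noise rate) is an assumption, since only the rate 0.1 is given, and all names are hypothetical.

```python
import random

LAYERS = [21, 10, 5, 1, 5, 10, 21]  # network structure from the embodiment
NOISE_RATE = 0.1                    # data noise rate from the embodiment


def encoder_sizes(layers):
    """Encoder half of a symmetric autoencoder: input through bottleneck."""
    return layers[: len(layers) // 2 + 1]


def dae_stages(layers):
    """(input_dim, hidden_dim) of each denoising autoencoder, in training order."""
    enc = encoder_sizes(layers)
    return list(zip(enc[:-1], enc[1:]))


def mask_noise(x, rate, rng):
    """Assumed noise model: zero each component with probability `rate`."""
    return [0.0 if rng.random() < rate else v for v in x]


print(dae_stages(LAYERS))
rng = random.Random(0)
noisy = mask_noise([1.0] * 21, NOISE_RATE, rng)
print(len(noisy))
```

Each stage's hidden-layer output would serve as the input of the next stage, and the final one-dimensional code is the equipment health index.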

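This embodiment uses three different prediction error measures. MAE is the mean of the absolute errors; it reflects the actual magnitude of the prediction error well and is defined as follows:

$$MAE = \frac{1}{n}\sum_{j=1}^{n}\left|ActRUL_j - RUL_j\right|$$

MSE is the mean of the squared errors; it evaluates the degree of variation of the data and is defined as follows:

$$MSE = \frac{1}{n}\sum_{j=1}^{n}\left(ActRUL_j - RUL_j\right)^2$$

where $n$ is the number of predictions, and $ActRUL_j$ and $RUL_j$ are the actual and predicted remaining life of the $j$-th prediction.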
Cumulative relative accuracy (CRA) evaluates the accuracy of a prediction method comprehensively by aggregating the relative prediction accuracy. Given the remaining-life prediction results, the CRA value is calculated as follows:

$$CRA = \sum_{k=1}^{K} w_k \, RA(t_k)$$

In the formula, $w_k$ is a weight coefficient. Equipment close to failure is in the final stage of operation, and accurately identifying the remaining life at this stage allows the equipment to be maintained or replaced just before scrapping, which helps improve equipment resource utilization and production safety. To give greater emphasis to prediction results in the final stage of operation, the weights are:

$K$ is the predicted duration from the prediction start time $t_k$ to the time of equipment failure, and $RA(t_k)$ is the relative prediction accuracy at prediction time point $t_k$, where:

$$RA(t_k) = 1 - \frac{\left|ActRUL(t_k) - RUL(t_k)\right|}{ActRUL(t_k)}$$

where $ActRUL(t_k)$ is the actual remaining life at the prediction time and $RUL(t_k)$ is the predicted remaining life. The closer the CRA value is to 1, the more accurate the remaining-life estimates of the prediction method.
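The three measures can be sketched in Python as follows. MAE and MSE use their standard definitions. The relative accuracy is taken as 1 minus the absolute error divided by the actual remaining life, and CRA as a weighted sum of relative accuracies; both forms, the caller-supplied weights and the function names are assumptions of this sketch, and the uniform weights in the example are illustrative only.

```python
def mae(actual, predicted):
    """Mean of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean of the squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def relative_accuracy(actual_rul, predicted_rul):
    """Assumed form: 1 - |actual - predicted| / actual."""
    if actual_rul == 0:
        raise ValueError("actual remaining life must be non-zero")
    return 1.0 - abs(actual_rul - predicted_rul) / actual_rul


def cra(actual, predicted, weights):
    """Assumed form: weighted sum of relative accuracies over time points."""
    return sum(w * relative_accuracy(a, p)
               for a, p, w in zip(actual, predicted, weights))


# Hypothetical actual and predicted remaining life at two prediction times.
actual = [40, 20]
predicted = [36, 20]
print(mae(actual, predicted), mse(actual, predicted))
print(cra(actual, predicted, [0.5, 0.5]))
```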

To check the model's ability to predict the remaining life of equipment, and considering that most engines in the test set run for 150-200 h and that equipment is more prone to failure at the end of its operation, which greatly affects production, the prediction start time $t_k$ = 160 h is selected for predicting the remaining life of the engines in the test set.

Meanwhile, to verify the effectiveness of the invention, this embodiment introduces for comparison two common data-driven methods for constructing the equipment health index, namely a PCA-based method and a BP-network-based method, and two commonly used RUL prediction models, namely an LSTM-network-based method and a BP-network-based method. The RUL prediction results for a number of engines in the test set are shown in Table 1.

Table 1. Remaining life prediction results for a number of engines

Claims

1. A method for predicting the residual life of equipment is characterized by comprising the following steps:

s1, acquiring data acquired by the equipment sensor, preprocessing the data, and splitting the preprocessed data into a training set and a test set;

2. The method for predicting the remaining life of equipment according to claim 1, wherein in step S1, preprocessing the data comprises: deleting redundant data at the same time point and keeping only one usable record; filling missing data with the average of the data at the time points immediately before and after the missing data; and normalizing the de-duplicated and filled data to obtain the preprocessed data.

3. The method of predicting remaining life of equipment according to claim 1 or 2, wherein in step S1, the preprocessed data set is divided into two mutually exclusive sub data sets, one of which is used as a training set, and the other is used as a test set.

4. The method for predicting the remaining life of equipment according to claim 1, wherein the step S2 is implemented by:

1) initializing an SDAE neural network structure;

3) adding noise to data sample $x^{(j)}$, training the denoising autoencoder $DAE_1$ of the SDAE neural network, which contains the 1st hidden layer, and encoding $x^{(j)}$ into

using

as input data to train the denoising autoencoder $DAE_2$, which contains the 2nd hidden layer, and encoding

into

and so on, until the denoising autoencoder containing the $N$-th hidden layer has been trained, so as to obtain the trained SDAE neural network;

5. The method for predicting the remaining life of equipment according to claim 1, wherein the step S3 is implemented by: measuring the similarity degree of the test sample and the training sample according to the Euclidean distance of the equipment health index curve of the test sample and the equipment health index curve corresponding to the training sample, and predicting the residual life of the test sample based on the similarity degree and the residual life of the training sample; the test sample refers to a certain sample to be predicted extracted from the test set, and the training sample refers to all samples extracted from the training set.

6. The method for predicting the remaining life of equipment according to claim 5, wherein the similarity between the equipment health index curve of the test sample and each equipment health index curve of the training sample is judged using the curve number corresponding to the minimum Euclidean distance between them; wherein the number of the training-sample equipment health index curve whose distance to the equipment health index curve of the test sample is smallest is

denotes the equipment health index curve of training sample $i$ with running time $l_i$,

denotes the equipment health index value of training sample $i$ at running time $l_i$;

denotes the equipment health index curve of the test sample with running time $l_s$,

denotes the equipment health index value of the test sample at running time $l_s$;

denotes the equipment health index value of training sample $i$ at running time $q + l_s - 1$; $d^i$ denotes the value of $q$ at which the Euclidean distance between an equipment health index curve in training sample $i$ and the equipment health index curve of the test sample is smallest, at which point the equipment health index curve of training sample $i$

and the equipment health index curve of the test sample

have the highest degree of similarity,

denotes the equipment health index value of training sample $i$ at running time $d^i + l_s - 1$; that is,

is the best-matching curve in training sample $i$, and the Euclidean distance between

and

is

7. The method for predicting the residual life of equipment according to claim 6, wherein the formula for predicting the residual life of the test samples based on the training samples is as follows:

wherein

$n$ is the number of training samples; $RUL^i = l_i - (d^i + l_s)$ is the predicted remaining life of the test sample based on training sample $i$.

8. The method of predicting remaining life of a device according to claim 7,

9. A system for predicting the residual life of equipment, characterized by comprising computer equipment, wherein the computer equipment is configured or programmed to perform the steps of the method according to any one of claims 1 to 8.