CN113642754B - Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network - Google Patents


Info

Publication number
CN113642754B (application CN202110442777.3A)
Authority
CN
China
Prior art keywords
data, fault, network, noise reduction, prediction
Prior art date
Legal status
Active
Application number
CN202110442777.3A
Other languages
Chinese (zh)
Other versions
CN113642754A
Inventor
高学金
马东阳
韩华云
高慧慧
Current Assignee
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202110442777.3A priority Critical patent/CN113642754B/en
Publication of CN113642754A publication Critical patent/CN113642754A/en
Application granted granted Critical
Publication of CN113642754B publication Critical patent/CN113642754B/en


Classifications

    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06F18/24323 Tree-organised classifiers
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08 Learning methods


Abstract

The invention discloses a method for predicting faults in complex industrial processes, comprising two stages: fault state feature extraction and fault prediction. Fault state feature extraction proceeds as follows: first, a random forest algorithm screens the complex-industrial-process data for features related to faults; a stacked denoising autoencoder network then reconstructs these features, the squared prediction error (SPE) statistic is constructed from the reconstruction residual as the fault state feature, and its control limit is determined by kernel density estimation; finally, new data are substituted into the model and their statistics are calculated to judge whether the process is normal. Fault prediction proceeds as follows: the SPE values are formed into a time series, and a prediction model based on an improved time convolution network (SFTCN) realizes trend prediction of the SPE. The random forest algorithm reduces the training cost of the stacked denoising autoencoder network, and the improved time convolution network effectively extracts the time-series characteristics of the fault state, so the fault prediction accuracy is higher.

Description

Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network
Technical Field
The invention relates to the technical field of data-driven fault prediction, and in particular to fault prediction for continuous processes. It describes a specific application of the data-driven method to fault prediction in complex industrial processes.
Background
With the rapid development of information and automation technology, the integration level and complexity of modern industrial systems keep increasing. The interactions among subsystems are also increasingly complex, so the probability of system faults and functional failures gradually rises; once a fault occurs, the damage can be severe and may in the worst case paralyze the entire system. Therefore, as reliability requirements grow, fault prediction technology has drawn attention from both industry and academia. Fault prediction refers to predicting the time at which a fault will occur, or judging whether the system will fail at a future moment, from the system's acquired past and present running states. Online fault prediction, also known as prediction of system life, aims to predict the remaining useful life (RUL) before a fault occurs and the running process collapses, given the current plant status and past operating conditions.
The accuracy of a prediction system is related to its ability to track fault degradation. At present, scholars at home and abroad have developed various prediction methods that have been successfully used to estimate different types of degradation processes. Fault prediction methods can be divided into two classes: model-based methods and data-driven methods. Model-based methods assume that an accurate mathematical model can be constructed from first principles of the system. However, they require specific knowledge of the failure mechanism and theory of the monitored device, and in many cases it is difficult or even impossible to capture the system's behavior this way. Data-driven approaches instead use measured data to approximate and track the degradation of the system without requiring any knowledge of the degradation mechanism.
In recent years, deep learning methods, particularly LSTM, have excelled at mining the time-varying features of time-series data. However, recent studies suggest that convolutional networks should replace recurrent networks as the first choice for sequence tasks, and the time convolution network (TCN) has been proposed and shown good performance on time-series prediction.
Existing fault prediction methods still have some problems, one of the most important being that they build the model from all measured variables regardless of their correlation with the fault. In fact, under some fault conditions not all process variables are disturbed: some are severely disturbed, while others remain similar to their normal behavior and therefore may contain no meaningful information about the fault. Since not all measured variables are related to the fault degradation process, an important issue is to eliminate redundant features.
Disclosure of Invention
The high dimensionality, nonlinearity and other characteristics of complex industrial process data cause drawbacks such as a large computational burden and long running times in fault monitoring systems. In order to discover the fault state of a complex industrial process as early as possible and to predict the fault trend accurately, so that staff can intervene early, a complex industrial fault state trend prediction method based on denoising-autoencoder information reconstruction and an SFTCN network is provided. For the data collected in the complex industrial process, a random forest algorithm first screens out the fault-related feature variables as the input data of the fault monitoring network, eliminating irrelevant features and reducing the dimensionality of the input data. A stacked denoising autoencoder network then extracts nonlinear fault features and reconstructs the input features, and the fault state feature SPE is obtained from the reconstruction residual. Finally, the obtained fault state features SPE are formed into a time series, and the state trend of the SPE is predicted with a prediction model based on an improved time convolution network (SFTCN). The random forest algorithm reduces the complexity of the model; the denoising autoencoder network improves the robustness of the reconstruction model, effectively reduces false alarms and missed alarms in process monitoring, and improves the accuracy of fault monitoring. In addition, introducing the Swish activation function and FRN normalization improves the prediction accuracy of the TCN network.
The invention adopts the following technical scheme and implementation steps:
fault state feature extraction is divided into two parts, namely offline training and online monitoring:
offline training stage:
1) Screening out characteristic variables related to faults in the industrial process by utilizing a random forest algorithm, and realizing dimension reduction of input data;
the specific process comprises the following steps:
For the feature set preliminarily selected in the complex industrial process, F = {f_1, f_2, …, f_N}, where N is the size of the feature set. When the random forest algorithm selects features from F, feature importance is taken as the criterion: a perturbation is applied to the out-of-bag (OOB) data that did not participate in training a decision tree, and the resulting change in classification accuracy is calculated. The random forest algorithm randomly draws Q bootstrap data sets, with Q corresponding OOB data sets, and ranks feature importance as follows:
(1) Initialize q = 1;
(2) Train a decision tree on the q-th bootstrap data set and calculate the classification accuracy A_q on the q-th OOB data set;
(3) For each feature f_u, u = 1, 2, …, N, apply a perturbation to that feature in the OOB data set and recalculate the classification accuracy A_q^u;
(4) Repeat steps (2), (3) for q = 2, 3, …, Q;
(5) Compute the importance of feature f_u as P_u = (1/Q) Σ_{q=1}^{Q} (A_q − A_q^u);
(6) Sort the P_u in descending order to obtain the feature importance ranking; the higher a feature ranks, the more important it is.
Arrange the feature set F in descending order of importance; the features with the J largest importance values are the fault-related feature variables screened out of the complex industrial process.
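The ranking procedure above can be sketched in Python (a minimal illustration under stated assumptions: the decision tree is replaced by a toy nearest-centroid classifier, and `train_fn`, `acc_fn` and `permutation_importance` are hypothetical helper names, not from the patent):

```python
import numpy as np

def train_fn(X, y):
    # Toy stand-in for a decision tree: one centroid per class.
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def acc_fn(model, X, y):
    # Classification accuracy of the nearest-centroid stand-in.
    classes = np.array(sorted(model))
    cents = np.stack([model[c] for c in classes])
    pred = classes[np.argmin(((X[:, None, :] - cents) ** 2).sum(-1), axis=1)]
    return (pred == y).mean()

def permutation_importance(train_fn, acc_fn, X, y, Q=10, rng=None):
    """Steps (1)-(6): draw Q bootstrap sets, train on each in-bag set,
    and score feature u by the mean OOB accuracy drop after permuting
    column u, i.e. P_u = (1/Q) * sum_q (A_q - A_q^u)."""
    rng = np.random.default_rng(rng)
    K, N = X.shape
    P = np.zeros(N)
    for q in range(Q):
        bag = rng.integers(0, K, size=K)           # bootstrap (in-bag) indices
        oob = np.setdiff1d(np.arange(K), bag)      # out-of-bag (OOB) indices
        model = train_fn(X[bag], y[bag])
        A_q = acc_fn(model, X[oob], y[oob])        # baseline OOB accuracy
        for u in range(N):
            Xp = X[oob].copy()
            Xp[:, u] = rng.permutation(Xp[:, u])   # perturb feature u
            P[u] += A_q - acc_fn(model, Xp, y[oob])
    return P / Q   # sort descending to obtain the importance ranking
```

Sorting `P` in descending order and keeping the first J indices reproduces the screening step.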
2) According to the feature selection result of step 1), collect historical data of the industrial process under normal working conditions; the data obtained by offline testing form a sample set X = (X_1, X_2, …, X_K)^T, where X_i denotes the data at the i-th sampling instant. There are K sampling instants, and J process variables are collected at each one, i.e. X_i = (x_{i,1}, x_{i,2}, …, x_{i,J}), where x_{i,j} is the measured value of the j-th process variable at the i-th sampling instant;
3) The historical data X are standardized as follows:
First, calculate the mean and standard deviation of each process variable over all sampling instants of the historical data X. The mean of the j-th process variable is x̄_j = (1/K) Σ_{k=1}^{K} x_{k,j}, where x_{k,j} is the measurement of the j-th process variable at the k-th sampling instant, k = 1, …, K, j = 1, …, J; the standard deviation of the j-th process variable is s_j = sqrt( (1/(K−1)) Σ_{k=1}^{K} (x_{k,j} − x̄_j)^2 ).
The historical data X are then standardized; the standardized value of the j-th process variable at the k-th sampling instant is x̃_{k,j} = (x_{k,j} − x̄_j) / s_j.
The standardized data are X̃ = (x̃_{k,j}), to which noise is added to obtain the corrupted input x̂_{k,j} = x̃_{k,j} + ε_{k,j}, j = 1, …, J, k = 1, …, K;
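Step 3) can be sketched as follows (a minimal NumPy illustration; the helper name and the noise level `sigma` are assumptions, since the noise variance is not fixed at this point in the patent):

```python
import numpy as np

def standardize_and_corrupt(X, sigma=0.1, rng=0):
    """Z-score each process variable over the K sampling instants, then
    add zero-mean Gaussian white noise to form the corrupted input that
    is fed to the denoising autoencoder."""
    mean = X.mean(axis=0)                  # per-variable mean
    std = X.std(axis=0, ddof=1)            # per-variable standard deviation
    X_tilde = (X - mean) / std             # clean standardized data
    noise = np.random.default_rng(rng).normal(0.0, sigma, size=X.shape)
    return X_tilde, X_tilde + noise, mean, std
```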
4) Input the noise-corrupted standardized data of step 3) into the stacked denoising autoencoder network for training to obtain the reconstructed features Z = (Z_1, Z_2, …, Z_K)^T, where the reconstructed data at the i-th instant are Z_i = (z_{i,1}, z_{i,2}, …, z_{i,J}), i = 1, …, K, j = 1, …, J;
The stacked denoising autoencoder network structure is as follows:
The network comprises an input layer, a hidden layer and an output layer; the input is the noise-corrupted standardized industrial process data X̂ of step 3). The encoding process from the input layer to the hidden layer is
Y = H(W_1 X̂ + B_1)
where H is a nonlinear activation function, here the Sigmoid function, W_1 and B_1 are the weight matrix and bias vector of the encoding network, and Y is the hidden-layer data, i.e. the extracted robust features.
The decoding process from the hidden layer to the output layer is
Z = H(W_2 Y + B_2)
where Z is the output-layer data and W_2, B_2 are the weight matrix and bias vector of the decoding network;
The loss function of the stacked denoising autoencoder network is the squared error between the reconstruction and the clean standardized data,
J(θ) = (1/2K) Σ_{i=1}^{K} ||Z_i − X̃_i||^2
Before training, the network parameters θ = {W_1, W_2, B_1, B_2} are randomly initialized, and the stacked denoising autoencoder network is trained iteratively with a gradient descent algorithm;
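A single encoder/decoder pair of the stacked network, with its gradient-descent training loop, might look like this (an illustrative NumPy sketch with Sigmoid activations on both layers as in the formulas above; layer sizes, learning rate and epoch count are arbitrary assumptions):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class DenoisingAE:
    """Y = H(W1 X^ + B1), Z = H(W2 Y + B2); trained by gradient descent
    on the squared reconstruction error against the clean data."""
    def __init__(self, n_in, n_hid, rng=0):
        g = np.random.default_rng(rng)
        self.W1 = g.normal(0.0, 0.1, (n_in, n_hid)); self.B1 = np.zeros(n_hid)
        self.W2 = g.normal(0.0, 0.1, (n_hid, n_in)); self.B2 = np.zeros(n_in)

    def forward(self, X):
        Y = sigmoid(X @ self.W1 + self.B1)   # encoding: robust features
        Z = sigmoid(Y @ self.W2 + self.B2)   # decoding: reconstruction
        return Y, Z

    def fit(self, X_noisy, X_clean, lr=0.5, epochs=2000):
        K = len(X_noisy)
        for _ in range(epochs):
            Y, Z = self.forward(X_noisy)
            dZ = (Z - X_clean) * Z * (1 - Z) / K    # gradient at output pre-activation
            dY = (dZ @ self.W2.T) * Y * (1 - Y)     # backpropagate to hidden layer
            self.W2 -= lr * (Y.T @ dZ); self.B2 -= lr * dZ.sum(axis=0)
            self.W1 -= lr * (X_noisy.T @ dY); self.B1 -= lr * dY.sum(axis=0)
```

Because the output layer is a Sigmoid, this toy version is demonstrated on data scaled into (0, 1); the patent's full stacked network is trained layer by layer on the standardized data.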
Since the denoising autoencoder is trained on the basis of the reconstruction error, the smaller the reconstruction error, the stronger the network's ability to recover the original signal: the denoising autoencoder can then recover the original training-sample data signal from the high-level features it has learned from the corrupted input.
The statistics corresponding to the modeling data are SPE = (SPE_1, …, SPE_K), k = 1, …, K; the SPE statistic at the k-th sampling instant is defined as
SPE_k = ||X̃_k − Z_k||^2 = Σ_{j=1}^{J} (x̃_{k,j} − z_{k,j})^2
where Z_k is the actual output obtained from the trained stacked denoising autoencoder network for the input X̂_k. Finally, a kernel density estimation method is used to estimate the value of the SPE statistic at a preset confidence limit, and this estimate is taken as the control limit of the SPE statistic;
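The SPE statistic and its kernel-density control limit can be sketched as follows (an illustrative implementation; Silverman's bandwidth rule and the bisection quantile search are assumptions, since the patent only states that kernel density estimation determines the control limit at a preset confidence level):

```python
import numpy as np
from math import erf, sqrt

def spe(X_tilde, Z):
    """SPE_k = ||X~_k - Z_k||^2, the squared reconstruction residual."""
    return ((np.asarray(X_tilde) - np.asarray(Z)) ** 2).sum(axis=1)

def kde_control_limit(spe_samples, conf=0.99, w=None):
    """Control limit = conf-quantile of the Gaussian kernel density
    estimate built on the training SPE values. The CDF of that kernel
    mixture is a mean of normal CDFs, so the quantile is found by
    bisection; the bandwidth w defaults to Silverman's rule of thumb."""
    s = np.asarray(spe_samples, float)
    K = s.size
    if w is None:
        w = 1.06 * s.std(ddof=1) * K ** (-0.2)   # Silverman's rule of thumb
    def cdf(c):
        return np.mean([0.5 * (1.0 + erf((c - x) / (w * sqrt(2)))) for x in s])
    lo, hi = s.min() - 5 * w, s.max() + 5 * w
    for _ in range(100):                          # bisection on the smooth CDF
        mid = 0.5 * (lo + hi)
        if cdf(mid) < conf:
            lo = mid
        else:
            hi = mid
    return hi
```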
on-line monitoring:
5) Collect the data t_m of the J process variables at the m-th sampling instant of the current industrial process, and standardize t_m with the mean and standard deviation obtained from the historical data in step 3), giving t̃_m;
6) Calculate the monitoring statistic TSPE_m of the data collected at the m-th instant of the current industrial process as
TSPE_m = ||t̃_m − z_m||^2
where z_m is the reconstructed feature obtained by feeding t̃_m into the stacked denoising autoencoder network trained in step 4);
7) Compare the monitoring statistic TSPE_m calculated in step 6) with the control limit determined in step 4); if the limit is exceeded, a fault is judged to have occurred and an alarm is raised; otherwise, the process is normal;
8) If the industrial process has finished, stop monitoring; otherwise, collect the data at the next instant, return to step 7), and continue monitoring the process. The TSPE_m acquired at each instant is the fault state feature.
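Steps 5) through 8) amount to the following loop (a minimal sketch; `reconstruct` is a hypothetical stand-in for the trained stacked denoising autoencoder network):

```python
import numpy as np

def monitor_stream(samples, mean, std, reconstruct, limit):
    """Standardize each new sample with the training-stage mean/std,
    reconstruct it, and raise an alarm whenever
    TSPE_m = ||t~_m - z_m||^2 exceeds the KDE control limit."""
    tspe, alarms = [], []
    for t_m in samples:
        t_norm = (t_m - mean) / std          # step 5: standardize
        z_m = reconstruct(t_norm)            # step 6: reconstruct
        stat = float(((t_norm - z_m) ** 2).sum())
        tspe.append(stat)
        alarms.append(stat > limit)          # step 7: compare with limit
    return np.array(tspe), np.array(alarms)  # step 8: TSPE series = fault state feature
```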
Fault prediction stage:
9) Combine the acquired fault state features into a time series TSPE = {TSPE_1, …, TSPE_V} and normalize it with a linear normalization function (min-max normalization); the normalized value of the state data TSPE_v at instant v is
TSPE'_v = A + (TSPE_v − min(TSPE)) (B − A) / (max(TSPE) − min(TSPE)), v = 1, …, V
where [A, B] is the normalization interval. The normalized series TSPE' is divided into a training data set and a test data set for fault prediction.
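The min-max normalization of step 9) is simply (illustrative helper name):

```python
import numpy as np

def min_max(x, A=0.0, B=1.0):
    """Linearly rescale the TSPE series into the interval [A, B]."""
    x = np.asarray(x, float)
    return A + (x - x.min()) * (B - A) / (x.max() - x.min())
```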
10) Before SFTCN modeling, a sliding window is applied to the data to obtain the input-output sample pairs of the prediction model:
X_train,i = [TSPE'_i, TSPE'_{i+1}, …, TSPE'_{i+h−1}], Y_train,i = TSPE'_{i+h}, i = 1, …, n
where X_train are the input samples of the prediction model, Y_train are the model labels, i.e. the actual output values of the prediction model, n is the number of training samples, and h is the size of the sliding window;
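The windowing of step 10) can be sketched as (illustrative; the function name is assumed):

```python
import numpy as np

def sliding_window(series, h):
    """Turn the normalized TSPE' series into (input, label) pairs:
    X_i = [TSPE'_i, ..., TSPE'_{i+h-1}], Y_i = TSPE'_{i+h}."""
    series = np.asarray(series, float)
    n = len(series) - h                       # number of usable samples
    X = np.stack([series[i:i + h] for i in range(n)])
    Y = series[h:h + n]                       # one-step-ahead targets
    return X, Y
```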
11) Model and train the improved time convolution network SFTCN with the processed training data;
The improved time convolution network SFTCN adopts the time convolution network TCN and improves its residual network in two respects: first, the ReLU activation function is replaced by the Swish function, which avoids the problem that ReLU zeroes the gradient on the negative interval so that neurons there may never be activated; second, a Filter Response Normalization (FRN) layer replaces the weight normalization layer, further improving the expressive capacity of the network;
The SFTCN network parameters are set as follows: the maximum number of iterations is 1000, the number of network layers is 4, the kernel size of the dilated causal convolution is 2, dropout is 0.05, the sliding window length is 10, the batch size is 256, the learning rate is 0.002, and the Adam optimizer is used.
The training data are input into the model, the mean squared error between the predicted data and the original data is calculated, and the network parameters are updated through backpropagation. Iteration continues until convergence, and the trained model is saved.
12) Apply the sliding-window sample processing of step 10) to the normalized test data, substitute the processed test data into the trained SFTCN network to predict the fault trend, and verify the accuracy of the model.
Advantageous effects
1) The method builds the monitoring model with a stacked denoising autoencoder and reconstructs the original features. By introducing noise-corruption and denoising steps on top of the plain autoencoder, it is more sensitive to faults than a conventional autoencoder, enhances the robustness of the monitoring model, improves the accuracy of fault monitoring, and reduces false alarms and missed alarms in process monitoring.
2) The method establishes the fault prediction model with a time convolution network, and introduces the Swish activation function and FRN normalization to improve the TCN, raising the fault prediction accuracy of the time convolution network.
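The two TCN modifications named in 2) can be illustrated in NumPy (a sketch under assumptions: Swish with β = 1, and Filter Response Normalization reduced to its per-filter root-mean-square rescaling with scalar learned parameters; this is not the patent's full SFTCN residual block):

```python
import numpy as np

def swish(x):
    """Swish activation x * sigmoid(x): unlike ReLU, its output and
    gradient are non-zero on the negative interval, so those neurons
    can still be updated."""
    return x / (1.0 + np.exp(-x))

def frn(x, gamma=1.0, beta=0.0, eps=1e-6):
    """Filter Response Normalization for a (batch, channels, length)
    tensor: divide each filter's activations by their root mean square
    (nu^2 is the mean squared activation), then apply a learned affine
    transform; gamma and beta are per-channel parameters in practice."""
    nu2 = (x ** 2).mean(axis=-1, keepdims=True)
    return gamma * x / np.sqrt(nu2 + eps) + beta
```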
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 shows the composition of the input data X ∈ R^{960×52};
FIG. 3 is a schematic diagram of a data composition of input data after random forest feature selection;
FIG. 4 is a fault condition feature extraction flow chart;
FIG. 5 is an automatic encoder configuration of the present invention;
FIG. 6 is a time convolution network architecture of the present invention;
FIG. 7 is a residual block of a time convolution network;
FIG. 8 is a residual module of the SFTCN proposed by the present invention;
FIG. 9 is a fault 13 (slow drift fault) random forest feature selection result;
FIG. 10 is a SPE monitoring graph of fault 13 (slow drift fault) for the method of the present invention;
FIG. 11 is a fault 8 (random fault) random forest feature selection result;
FIG. 12 is a SPE monitoring graph of the method of the present invention for fault 8 (random fault);
FIG. 13 is a graph showing the effect of the method of the present invention in predicting a fault 13 (slow drift fault);
FIG. 14 is a graph showing the effect of the method of the present invention on fault 8 (random fault) prediction;
fig. 15 (a) -15 (d) are graphs showing the effect of fault prediction on the fault 13 (slow drift fault) by the comparison method;
FIGS. 16 (a) -16 (d) are graphs showing the effect of fault prediction on fault 8 (random fault) by the comparison method;
Detailed Description
Based on an actual chemical reaction process, Eastman Chemical Company in the United States developed an open and challenging chemical model simulation platform, the Tennessee Eastman (TE) simulation platform. The TE chemical process is a prototype of an actual process flow; the whole process comprises 5 main operation units, namely a reactor, a condenser, a recycle compressor, a gas-liquid separator and a product stripper. The generated data have time-varying, strongly coupled and nonlinear characteristics and are widely used for testing control and fault diagnosis models of complex industrial processes.
The TE process has 41 total measured variables and 12 manipulated variables and sets 21 fault types. The 12 th manipulated variable XMV (12) remains unchanged throughout the process and is not used as input data. As the original observed signal vector at a particular moment, 52 variables [ XMEAS (1), …, XMEAS (41), XMV (1), …, XMV (11) ] are chosen. In the simulation experiment, the total reaction time of the TE process is set to 48h, and the sampling interval is set to 3min. All anomalies were introduced 8h after the start of the simulation. Thus, simulation of each type of fault may collect 960 data samples, each sample data containing 52-dimensional features, each type of abnormal data sample being a 960 x 52 sample matrix. The test is performed herein for fault 13 and fault 8, respectively.
The method of the invention is applied to the TE process simulation object; it comprises the two main stages of fault state feature extraction and fault prediction, specifically as follows:
fault state feature extraction is divided into two parts, namely offline training and online monitoring:
offline training stage:
step 1: screening out characteristic variables related to faults in the industrial process by utilizing a random forest algorithm, and realizing dimension reduction of input data;
Specifically: the feature set preliminarily selected for the complex industrial process in this implementation is F = {f_1, f_2, …, f_u, …, f_N} = {XMEAS(1), …, XMEAS(41), XMV(1), …, XMV(11)}, where N = 52 is the size of the feature set, containing 41 measured variables and 11 manipulated variables. When the random forest algorithm selects features from F, feature importance is taken as the criterion: a perturbation is applied to the out-of-bag (OOB) data that did not participate in decision tree training, and the change in classification accuracy is calculated. The random forest algorithm randomly draws Q = 30 bootstrap data sets, with 30 corresponding OOB data sets; feature importance is ranked as follows:
(1) Initialize q = 1;
(2) Train a decision tree on the q-th bootstrap data set and calculate the classification accuracy A_q on the q-th OOB data set;
(3) For each feature f_u, u = 1, 2, …, N, apply a perturbation to that feature in the OOB data set and recalculate the classification accuracy A_q^u;
(4) Repeat steps (2), (3) for q = 2, 3, …, Q;
(5) Compute the importance of feature f_u as P_u = (1/Q) Σ_{q=1}^{Q} (A_q − A_q^u);
(6) Sort the P_u in descending order to obtain the feature importance ranking; the higher a feature ranks, the more important it is.
Arrange the feature set F in descending order of importance; the 30 features with the largest importance values are the fault-related feature variables screened out of the complex industrial process.
For the third feature XMEAS(3), the feature importance is computed as
P_3 = (1/Q) Σ_{q=1}^{Q} (A_q − A_q^3)
where A_q is the classification accuracy on the q-th OOB data set and A_q^3 is the classification accuracy after adding a perturbation to feature f_3 in the q-th OOB data set. The importances are then ranked, and the first J features are selected as the screened features; in this embodiment J is 30.
Beyond this simulation, in an actual complex industrial process the candidate feature variables must first be selected according to the actual situation; the preliminary feature variables differ across industrial processes and engineering experience, and the important feature variables are then screened out by the random forest algorithm to serve as the input data of the subsequent steps.
Step 2: according to the characteristic selection result of the step 1, collecting historical data under normal working conditions of the industrial process, and forming a sample set X= (X) by data obtained by offline testing 1 ,X 2 ,...,X K ) T Wherein X is i Representing the ith sampling instant data, comprising K sampling instants, each sampling instant collecting J process variables, namely X i =(x i,1 ,x i,2 ,...,x i,J ) Wherein x is i,J A measurement value of the jth process variable representing the ith sample instant, k=960, j=30;
step 3: the historical data X are standardized as follows:
First, calculate the mean and standard deviation of each process variable over all sampling instants of the historical data X: the mean of the j-th process variable is x̄_j = (1/K) Σ_{k=1}^{K} x_{k,j}, where x_{k,j} is the measurement of the j-th process variable at the k-th sampling instant, k = 1, …, K, j = 1, …, J; the standard deviation is s_j = sqrt( (1/(K−1)) Σ_{k=1}^{K} (x_{k,j} − x̄_j)^2 ). The historical data X are then standardized, the standardized value of the j-th process variable at the k-th sampling instant being x̃_{k,j} = (x_{k,j} − x̄_j) / s_j.
In this embodiment Gaussian white noise is added, the corrupted input being x̂_{k,j} = x̃_{k,j} + ε_{k,j} with ε_{k,j} ~ N(0, σ²), j = 1, …, J, k = 1, …, K;
step 4: input the noise-corrupted standardized data of step 3 into the stacked denoising autoencoder network for training to obtain the reconstructed features Z = (Z_1, Z_2, …, Z_i, …, Z_K)^T, where the reconstructed data at the i-th instant are Z_i = (z_{i,1}, z_{i,2}, …, z_{i,J}), i = 1, …, K, j = 1, …, J, K = 960, J = 30;
The stacked denoising autoencoder network structure is as follows:
First, the network structure is determined: it comprises an input layer, a hidden layer and an output layer; the input is the noise-corrupted standardized industrial process data X̂ of step 3, and the output is their reconstruction. The encoding process from the input layer to the hidden layer is
Y = H(W_1 X̂ + B_1)
where H is a nonlinear activation function, here the Sigmoid function, W_1 and B_1 are the weight matrix and bias vector of the encoding network, and Y is the hidden-layer data, i.e. the extracted robust features.
The decoding process from the hidden layer to the output layer is
Z = H(W_2 Y + B_2)
where Z is the output-layer data and W_2, B_2 are the weight matrix and bias vector of the decoding network;
The loss function of the stacked denoising autoencoder network is
J(θ) = (1/2K) Σ_{i=1}^{K} ||Z_i − X̃_i||^2
Before training, the network parameters θ = {W_1, W_2, B_1, B_2} are randomly initialized; the stacked denoising autoencoder network is trained iteratively with a gradient descent algorithm, and the parameters after the final iteration are saved.
The statistics corresponding to the modeling data are SPE = (SPE_1, …, SPE_K); the SPE statistic at the k-th sampling instant is defined as
SPE_k = ||X̃_k − Z_k||^2 = Σ_{j=1}^{J} (x̃_{k,j} − z_{k,j})^2
where Z_k is the actual output obtained from the trained stacked denoising autoencoder network for the input X̂_k. Finally, a kernel density estimation method is used to estimate the value of the SPE statistic at a preset confidence limit, and this estimate is taken as the control limit of the SPE statistic;
A radial basis (Gaussian) function is selected as the kernel, and the kernel density estimate of the SPE is
f̂(x) = (1/(K w)) Σ_{k=1}^{K} G((x − SPE_k)/w)
where G is the Gaussian kernel, w > 0 is the bandwidth parameter to be estimated, and K = 960.
On-line monitoring:
step 5: collect the data t_m of the J process variables at the m-th sampling instant of the current industrial process, and standardize t_m with the mean and standard deviation obtained from the historical data in step 3, giving t̃_m;
Step 6: calculating a monitoring statistic TSPE of data collected at the mth moment of the current industrial process m The calculation formula is as follows:
wherein z is m To be used inInputting the reconstructed characteristics obtained by the stack noise reduction self-coding network trained in the step 4;
Step 7: compare the monitoring statistic TSPE_m calculated in step 6 with the control limit determined in step 4 of the modeling stage; if the limit is exceeded, a fault is considered to have occurred and an alarm is raised; otherwise the process is normal.
Step 8: if the industrial process has finished, stop monitoring; otherwise collect the data at the next moment, return to step 7 and continue monitoring; the TSPE_m obtained at each moment is the fault state feature.
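One on-line monitoring step (steps 5-7) can be sketched as below; `reconstruct` is a hypothetical stand-in for the trained stack noise reduction self-coding network:

```python
import numpy as np

def monitor_step(t_m, mean_m, std_m, reconstruct, control_limit):
    """One on-line monitoring step: normalize the new sample with the
    historical mean/std, reconstruct it, and compare TSPE to the limit.
    Returns (fault-state feature TSPE_m, alarm flag)."""
    t_norm = (np.asarray(t_m) - np.asarray(mean_m)) / np.asarray(std_m)
    z_m = reconstruct(t_norm)                    # trained network (stand-in)
    tspe = float(np.sum((t_norm - z_m) ** 2))    # monitoring statistic
    return tspe, tspe > control_limit
```

The TSPE values accumulated across sampling moments form the fault-state time series used in the prediction stage below.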
Fault prediction stage:
step 9: combining the acquired fault state features into a time series tspe= { TSPE 1 、...、TSPE V V=960. Normalization with a linear normalization function (Min-Max Normalization) in which the fault state characteristics TSPE at time 3 3 The normalized calculation formula of (c) is as follows,[A,B]is the normalized region. Here a=0, b=1, i.e. the TSPE value is attributed between 0 and 1. The normalized TSPE' is divided into a training data set and a test data set for fault prediction.
Step 10: before SFTCN modeling, a sliding window is applied to the data to obtain the input-output sample pairs of the prediction model:
X_train = [TSPE'_1, ..., TSPE'_h; TSPE'_2, ..., TSPE'_{h+1}; ...; TSPE'_n, ..., TSPE'_{n+h−1}],  Y_train = (TSPE'_{h+1}, TSPE'_{h+2}, ..., TSPE'_{n+h})^T
where X_train are the input samples of the prediction model and Y_train are the model labels, i.e. the actual output values of the prediction model; n = 400 is the number of training samples of the model and h = 10 is the sliding window size;
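The sliding-window construction of step 10 can be sketched as follows: each input sample is a window of h consecutive normalized fault-state features, and its label is the value that immediately follows the window:

```python
import numpy as np

def sliding_window_samples(series, h):
    """Build (X_train, Y_train) pairs for one-step-ahead prediction.
    X[i] = series[i : i+h], Y[i] = series[i+h]."""
    s = np.asarray(series, dtype=float)
    n = s.size - h                      # number of usable samples
    X = np.stack([s[i:i + h] for i in range(n)])
    Y = s[h:]
    return X, Y
```

With the document's settings (h = 10), a series of length V yields V − 10 input-output pairs.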
Step 11: the improved time convolution network SFTCN is modeled and trained on the processed data.
The improved time convolution network SFTCN is based on the time convolution network TCN with an improved residual block, in two respects: first, the ReLU activation function is replaced by the Swish function, which avoids the problem that ReLU zeroes out negative inputs, so that neurons with negative pre-activations receive no gradient and may never be activated; second, a Filter Response Normalization (FRN) layer replaces the weight normalization layer to further improve the expressive power of the network. The network structure before and after the improvement is shown in figs. 7 and 8.
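The two residual-block modifications can be sketched in isolation. The FRN parameter shapes and the thresholded linear unit (TLU) that commonly follows FRN are assumptions of this sketch, not details taken from the patent:

```python
import numpy as np

def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x). Unlike ReLU, its value and
    gradient are nonzero for x < 0, so negative pre-activations still learn."""
    return x / (1.0 + np.exp(-beta * x))

def filter_response_norm(x, gamma, beta, tau, eps=1e-6):
    """Filter Response Normalization followed by a thresholded linear unit.
    x: (batch, length, channels). Each channel is divided by the root of its
    mean squared activation (no batch statistics), then scaled and shifted."""
    nu2 = np.mean(x ** 2, axis=1, keepdims=True)   # per-sample, per-channel nu^2
    y = gamma * x / np.sqrt(nu2 + eps) + beta
    return np.maximum(y, tau)                      # TLU: max(y, tau)
```

In the improved residual block, `filter_response_norm` would replace the weight-normalization step after each dilated causal convolution, and `swish` would replace the ReLU nonlinearity.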
The SFTCN network parameters are set as follows: maximum iteration number 1000, 4 network layers, dilated causal convolution kernel size 2, dropout 0.05, sliding window length 10, batch size 256, learning rate 0.002, with the Adam optimizer.
The training data are input into the model, the mean squared error between the predicted data and the original data is calculated, and the network parameters are updated by back propagation. The iteration continues until convergence, and the trained model is saved.
Step 12: the normalized test data are processed with the sliding window of step 10 and fed into the trained SFTCN network to predict the fault trend.
The steps above are a specific application of the method to fault prediction on the TE simulation platform. To verify the effectiveness of the method, online-prediction experiments were performed on fault 8 (a random fault) and fault 13 (a slow drift fault). Fig. 9 shows the feature selection result of the random forest algorithm on the fault 13 data, and fig. 11 the corresponding result on the fault 8 data; the top 30 variables, ranked by feature weight, are taken as the input of the subsequent stack noise reduction self-coding network. The experimental results of the fault state feature extraction stage are shown in fig. 10 (fault 13) and fig. 12 (fault 8); each figure contains a dashed line parallel to the abscissa, the control limit determined by the kernel density estimation method, and a curve, the real-time monitoring value. If the curve exceeds the control limit, the industrial process has failed at that moment; otherwise the industrial process is operating normally. The experimental results of the fault prediction stage are shown in figs. 13 to 16, where the dashed black line is the predicted fault state and the solid black line is the true fault state. If the prediction curve is close to the true curve, the method predicts the fault trend accurately; if the deviation is large, it cannot predict the fault trend well.
Figs. 10 and 12 show the fault monitoring effect of the method of the present invention on the fault 13 and fault 8 data, where the dashed line parallel to the abscissa is the control limit and the curve is the real-time SPE monitoring value. In both figures the SPE monitoring value exceeds the limit from sample 161 onward, i.e. the fault is detected, and the monitoring effect is good. Fig. 13 shows the fault prediction effect of the method on fault 13 (the slow drift fault), and fig. 14 on fault 8; in both, the dashed black line is the predicted fault state and the solid black line is the true fault state. The method predicts both types of fault state data well. To verify the validity of the model, MAE, RMSE, MAPE and R² are used to evaluate the prediction effect; the evaluation indices are listed in table 1, and the prediction accuracy of the SFTCN network is high for both the slow drift fault and the random fault.
Table 1: SPE predictive evaluation index on SFTCN
To analyze the superiority of the method quantitatively, the STCN, TCN, LSTM and GRU networks are also used to predict the state trend of the industrial process faults. For fault 13, the comparative prediction effect graphs and evaluation indices are shown in figs. 15(a)-15(d) and table 2 respectively; the black "×" broken line is the predicted fault state and the solid black line without "×" is the true fault state. Fig. 15(a) shows the prediction effect of the STCN network, which fits the dynamic trend of the SPE completely, while figs. 15(b)-15(d) show the prediction effects of the TCN, LSTM and GRU networks, which deviate slightly at the peaks of the SPE dynamic trend.
Table 2: fault 13: comparison of algorithms on evaluation index
For fault 8, the comparative prediction effect graphs and evaluation indices are shown in figs. 16(a)-16(d) and table 3 respectively; the black "×" broken line is the predicted fault state and the solid black line without "×" is the true fault state.
Table 3: fault 8: comparison of algorithms on evaluation index
In summary, analysis of the prediction effect graphs (figs. 13-16) and of the data in tables 2 and 3 shows that, for both slow drift faults and random faults, the SFTCN network proposed by the present invention predicts fault states considerably more accurately than the TCN, LSTM and GRU networks.

Claims (2)

1. The complex industrial process fault prediction method based on the RF noise reduction self-coding information reconstruction and the time convolution network comprises two stages of fault state feature extraction and fault prediction, and is characterized by comprising the following steps of:
fault state feature extraction is divided into two parts, namely offline training and online monitoring:
offline training stage:
1) Screening out characteristic variables related to faults in the industrial process by utilizing a random forest algorithm, and realizing dimension reduction of input data;
2) According to the feature selection result of step 1), collect historical data of the industrial process under normal working conditions; the data obtained by the offline test form a sample set X = (X_1, X_2, ..., X_K)^T, where X_i is the data at the ith sampling moment; the set contains K sampling moments, and J process variables are collected at each sampling moment, i.e. X_i = (x_{i,1}, x_{i,2}, ..., x_{i,J}), where x_{i,j} is the measured value of the jth process variable at the ith sampling moment;
3) The historical data X is subjected to standardized processing in the following manner:
First, the mean and standard deviation of every process variable at every moment of the historical data X are calculated; x̄_{k,j} denotes the mean and s_{k,j} the standard deviation of the jth process variable at the kth sampling moment, where x_{k,j} is the measured value of the jth process variable at the kth sampling moment, k = 1,...,K, j = 1,...,J. The historical data X are then standardized, where the jth process variable at the kth sampling moment is normalized as:
x̃_{k,j} = (x_{k,j} − x̄_{k,j}) / s_{k,j}
The normalized data are X̃ = (x̃_{k,j}), j = 1,...,J, k = 1,...,K; noise is then added to X̃ to obtain the noise-corrupted input of the network;
4) Inputting the noise-added standardized data of step 3) into the stack noise reduction self-coding network for training to obtain the reconstructed features Z = (Z_1, ..., Z_i, ..., Z_K)^T, wherein the reconstructed feature at the ith moment is Z_i = (z_{i,1}, z_{i,2}, ..., z_{i,J}), i = 1,...,K, j = 1,...,J;
the stack noise reduction self-coding network structure is specifically as follows:
the stack noise reduction automatic coding network structure comprises an input layer, a hidden layer and an output layer; the input is the noise-added standardized industrial process data of step 3), written X̃*. The coding process from the input layer to the hidden layer takes the following form:
Y = H(W1·X̃* + B1)
where H is a nonlinear activation function, here the Sigmoid function; W1 and B1 are the weight matrix and bias vector of the coding network; Y is the hidden-layer data, namely the extracted robust features;
the decoding process from the hidden layer to the output layer comprises the following specific forms:
Z = H(W2·Y + B2)
where W2 and B2 are the weight matrix and bias vector of the decoding network;
the loss function of the stack noise reduction self-coding network is the mean squared reconstruction error between the clean standardized data and the network output:
J(θ) = (1/K) Σ_{k=1}^{K} ‖X̃_k − Z_k‖²
before training the model, the network parameters θ = {W1, W2, B1, B2} are randomly initialized, and the stack noise reduction automatic coding network is trained iteratively with a gradient descent algorithm;
the statistics SPE = (SPE_1, ..., SPE_K) corresponding to the modeling data are computed, where the SPE statistic at the kth sampling moment, k = 1,...,K, is defined as:
SPE_k = Σ_{j=1}^{J} (x̃_{k,j} − z_{k,j})²
where z_{k,j} is the actual output of the trained stack noise reduction self-coding network for the input x̃_k; finally, the kernel density estimation method is used to estimate the value of the SPE statistic at a preset confidence limit, and this estimate is taken as the control limit of the SPE statistic;
on-line monitoring:
5) Collect the data t_m of the J process variables at the mth sampling moment of the current industrial process, and standardize it with the mean and standard deviation at moment m obtained from the historical data of step 3) to obtain t̃_m;
6) Calculate the monitoring statistic TSPE_m of the data collected at the mth moment of the current industrial process:
TSPE_m = Σ_{j=1}^{J} (t̃_{m,j} − z_{m,j})²
where z_m is the reconstructed feature obtained by inputting t̃_m into the stack noise reduction self-coding network trained in step 4);
7) Compare the monitoring statistic TSPE_m calculated in step 6) with the control limit determined in step 4); if the limit is exceeded, a fault is considered to have occurred and an alarm is raised; otherwise the process is normal;
8) If the industrial process has finished, stop monitoring; otherwise collect the data at the next moment, return to step 7) and continue monitoring; the TSPE_m obtained at each moment is the fault state feature;
fault prediction stage:
9) Combine the collected fault state features into a time series TSPE = {TSPE_1, ..., TSPE_V} and normalize it with the linear Min-Max Normalization function, where the fault state feature TSPE_v at moment v, v = 1,...,V, is normalized as
TSPE'_v = (TSPE_v − min(TSPE)) / (max(TSPE) − min(TSPE)) × (B − A) + A
and [A, B] is the normalization interval; the normalized series TSPE' is divided into a training data set and a test data set for fault prediction;
10) Before SFTCN modeling, a sliding window is applied to the data to obtain the input-output sample pairs of the prediction model:
X_train = [TSPE'_1, ..., TSPE'_h; TSPE'_2, ..., TSPE'_{h+1}; ...; TSPE'_n, ..., TSPE'_{n+h−1}],  Y_train = (TSPE'_{h+1}, ..., TSPE'_{n+h})^T
where X_train are the input samples of the prediction model and Y_train are the model labels, i.e. the actual output values of the prediction model; n is the number of training samples of the model and h is the sliding window size;
11) Model and train the improved time convolution network SFTCN with the processed training data;
the improved time convolution network SFTCN is based on the time convolution network TCN with an improved residual block, in two respects: first, the ReLU activation function is replaced by the Swish function, which avoids the problem that ReLU zeroes out negative inputs, so that neurons with negative pre-activations receive no gradient and may never be activated; second, a Filter Response Normalization layer replaces the weight normalization layer to further improve the expressive power of the network;
12) Process the normalized test data with the sliding window of step 10), and feed the processed data into the trained SFTCN network to predict the fault trend.
2. The complex industrial process fault prediction method based on the RF noise reduction self-coding information reconstruction and the time convolution network according to claim 1, characterized in that the SFTCN network parameters are set as follows: maximum iteration number 1000, 4 network layers, dilated causal convolution kernel size 2, dropout 0.05, sliding window length 10, batch size 256, learning rate 0.002, with the Adam optimizer;
the training data are input into the model, the mean squared error between the predicted data and the original data is calculated, the network parameters are updated by back propagation, and the trained model is saved after continued iteration.
CN202110442777.3A 2021-04-23 2021-04-23 Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network Active CN113642754B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110442777.3A CN113642754B (en) 2021-04-23 2021-04-23 Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network


Publications (2)

Publication Number Publication Date
CN113642754A CN113642754A (en) 2021-11-12
CN113642754B true CN113642754B (en) 2023-10-10

Family

ID=78415700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110442777.3A Active CN113642754B (en) 2021-04-23 2021-04-23 Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network

Country Status (1)

Country Link
CN (1) CN113642754B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114403486B (en) * 2022-02-17 2022-11-22 四川大学 Intelligent control method of airflow type cut-tobacco drier based on local peak value coding circulation network
CN114785824B (en) * 2022-04-06 2024-05-14 深圳前海用友力合科技服务有限公司 Intelligent Internet of things big data transmission method and system
CN115083123B (en) * 2022-05-17 2023-05-02 中国矿业大学 Mine coal spontaneous combustion intelligent grading early warning method driven by measured data
CN115034312B (en) * 2022-06-14 2023-01-06 燕山大学 Fault diagnosis method for dual neural network model satellite power system
CN115047839B (en) * 2022-08-17 2022-11-04 北京化工大学 Fault monitoring method and system for industrial process of preparing olefin from methanol
CN115793590A (en) * 2023-01-30 2023-03-14 江苏达科数智技术有限公司 Data processing method and platform suitable for system safety operation and maintenance
CN116502141A (en) * 2023-06-26 2023-07-28 武汉新威奇科技有限公司 Data-driven-based electric screw press fault prediction method and system
CN117350609A (en) * 2023-09-21 2024-01-05 广东省有色工业建筑质量检测站有限公司 Construction method of intelligent transport control system of detection laboratory based on AGV

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740687A (en) * 2019-01-09 2019-05-10 北京工业大学 A kind of fermentation process fault monitoring method based on DLAE
CN111324110A (en) * 2020-03-20 2020-06-23 北京工业大学 Fermentation process fault monitoring method based on multiple shrinkage automatic encoders
CN111783531A (en) * 2020-05-27 2020-10-16 福建亿华源能源管理有限公司 Water turbine set fault diagnosis method based on SDAE-IELM

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110286279B (en) * 2019-06-05 2021-03-16 武汉大学 Power electronic circuit fault diagnosis method based on extreme tree and stack type sparse self-coding algorithm

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740687A (en) * 2019-01-09 2019-05-10 北京工业大学 A kind of fermentation process fault monitoring method based on DLAE
CN111324110A (en) * 2020-03-20 2020-06-23 北京工业大学 Fermentation process fault monitoring method based on multiple shrinkage automatic encoders
CN111783531A (en) * 2020-05-27 2020-10-16 福建亿华源能源管理有限公司 Water turbine set fault diagnosis method based on SDAE-IELM

Also Published As

Publication number Publication date
CN113642754A (en) 2021-11-12

Similar Documents

Publication Publication Date Title
CN113642754B (en) Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network
Su et al. An end-to-end framework for remaining useful life prediction of rolling bearing based on feature pre-extraction mechanism and deep adaptive transformer model
CN109146246B (en) Fault detection method based on automatic encoder and Bayesian network
CN111914873A (en) Two-stage cloud server unsupervised anomaly prediction method
CN111273623B (en) Fault diagnosis method based on Stacked LSTM
CN112904810B (en) Process industry nonlinear process monitoring method based on effective feature selection
CN111340110B (en) Fault early warning method based on industrial process running state trend analysis
Wei et al. A novel deep learning model based on target transformer for fault diagnosis of chemical process
CN111914897A (en) Fault diagnosis method based on twin long-short time memory network
Mathew et al. Regression kernel for prognostics with support vector machines
CN112327701B (en) Slow characteristic network monitoring method for nonlinear dynamic industrial process
CN111122811A (en) Sewage treatment process fault monitoring method of OICA and RNN fusion model
Liu et al. Deep & attention: A self-attention based neural network for remaining useful lifetime predictions
Kefalas et al. Automated machine learning for remaining useful life estimation of aircraft engines
Sadoughi et al. A deep learning approach for failure prognostics of rolling element bearings
CN114297921A (en) AM-TCN-based fault diagnosis method
Xu et al. Global attention mechanism based deep learning for remaining useful life prediction of aero-engine
CN111241629B (en) Intelligent prediction method for performance change trend of hydraulic pump of airplane based on data driving
CN117032165A (en) Industrial equipment fault diagnosis method
Bond et al. A hybrid learning approach to prognostics and health management applied to military ground vehicles using time-series and maintenance event data
CN116522993A (en) Chemical process fault detection method based on countermeasure self-coding network
CN115221942A (en) Equipment defect prediction method and system based on time sequence fusion and neural network
Singh et al. Predicting the remaining useful life of ball bearing under dynamic loading using supervised learning
Chan et al. Explainable health state prediction for social iots through multi-channel attention
CN115309736B (en) Time sequence data anomaly detection method based on self-supervision learning multi-head attention network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant