CN113642754B - Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network - Google Patents


Info

Publication number
CN113642754B (application CN202110442777.3A)
Authority
CN
China
Prior art keywords
data, fault, network, noise reduction, prediction
Prior art date
Legal status
Active
Application number
CN202110442777.3A
Other languages
Chinese (zh)
Other versions
CN113642754A
Inventor
高学金
马东阳
韩华云
高慧慧
Current Assignee
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202110442777.3A priority Critical patent/CN113642754B/en
Publication of CN113642754A publication Critical patent/CN113642754A/en
Application granted granted Critical
Publication of CN113642754B publication Critical patent/CN113642754B/en


Classifications

    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06F18/24323 Tree-organised classifiers
    • G06N3/045 Combinations of networks
    • G06N3/048 Activation functions
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08 Learning methods


Abstract

The invention discloses a method for predicting faults in complex industrial processes, comprising two stages: fault state feature extraction and fault prediction. Fault state feature extraction proceeds as follows: first, a random forest algorithm screens the complex-industrial-process data for features related to faults; a stacked denoising autoencoder network then reconstructs these features, the squared prediction error (SPE) statistic is constructed from the reconstruction residual as the fault state feature, and its control limit is determined by kernel density estimation; finally, new data are substituted into the model and their statistics are calculated to judge whether the process is normal. Fault prediction proceeds as follows: the SPE values are formed into a time series, and a prediction model based on an improved time convolution network (SFTCN) realizes trend prediction of the SPE. The random forest algorithm reduces the training cost of the stacked denoising autoencoder network, and the improved time convolution network effectively extracts the time-series characteristics of the fault state, so the fault prediction accuracy is higher.

Description

Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network
Technical Field
The invention relates to the technical field of data-driven fault prediction, and in particular to fault prediction for continuous processes. It describes a specific application of the data-driven method to fault prediction in complex industrial processes.
Background
With the rapid development of information and automation technology, the integration level and complexity of modern industrial systems keep increasing. The interactions among subsystems are also increasingly complex, so the probability of system faults and functional failures gradually rises; once a fault occurs, the damage can be severe and may in the worst case paralyze the entire system. Therefore, as reliability requirements grow, fault prediction technology has drawn attention from both industry and academia. Fault prediction refers to predicting the time at which a fault will occur, or judging whether the system will fail at a future moment, from the system's acquired past and present running states. Online fault prediction, also known as prediction of system life, aims to predict the remaining useful life (RUL) before a fault occurs and the running process collapses, given the current plant status and past operating conditions.
The accuracy of a prediction system is related to its ability to track fault degradation. At present, scholars at home and abroad have developed various prediction methods that have been successfully used to estimate different types of degradation processes. Fault prediction methods can be divided into two classes: model-based methods and data-driven methods. Model-based methods assume that an accurate mathematical model can be constructed from first principles of the system. However, they require specific knowledge of the failure mechanism and theory of the monitored device, and in many cases it is difficult or even impossible to capture the system's behavior this way. Data-driven approaches instead use measured data to approximate and track the degradation of the system without requiring any knowledge of the degradation mechanism.
In recent years, deep learning methods, particularly LSTM, have excelled at mining the time-varying features of time-series data. However, recent studies suggest that convolutional networks should replace recurrent networks as the first choice for sequence tasks, and the time convolution network (TCN) has been proposed and shown good performance on time-series prediction.
Existing fault prediction methods still have some problems, one of the most important being that they build the model from all measured variables regardless of their correlation with the fault. In fact, under some fault conditions not all process variables are disturbed: some are severely disturbed, while others remain similar to their normal behavior and therefore may contain no meaningful information about the fault. Since not all measured variables are related to the fault degradation process, an important issue is to eliminate redundant features.
Disclosure of Invention
The high dimensionality, nonlinearity and other characteristics of complex industrial process data cause drawbacks such as a large computational burden and long running times in fault monitoring systems. In order to discover the fault state of a complex industrial process as early as possible and to predict the fault trend accurately, so that staff can intervene early, a complex industrial fault state trend prediction method based on denoising-autoencoder information reconstruction and an SFTCN network is provided. For the data collected in the complex industrial process, a random forest algorithm first screens out the fault-related feature variables as the input data of the fault monitoring network, eliminating irrelevant features and reducing the dimensionality of the input data. A stacked denoising autoencoder network then extracts nonlinear fault features and reconstructs the input features, and the fault state feature SPE is obtained from the reconstruction residual. Finally, the obtained fault state features SPE are formed into a time series, and the state trend of the SPE is predicted with a prediction model based on an improved time convolution network (SFTCN). The random forest algorithm reduces the complexity of the model; the denoising autoencoder network improves the robustness of the reconstruction model, effectively reduces false alarms and missed alarms in process monitoring, and improves the accuracy of fault monitoring. In addition, introducing the Swish activation function and FRN normalization improves the prediction accuracy of the TCN network.
The invention adopts the following technical scheme and implementation steps:
fault state feature extraction is divided into two parts, namely offline training and online monitoring:
offline training stage:
1) Screening out characteristic variables related to faults in the industrial process by utilizing a random forest algorithm, and realizing dimension reduction of input data;
the specific process comprises the following steps:
For the feature set preliminarily selected in the complex industrial process, F = {f_1, f_2, …, f_N}, where N is the size of the feature set. When the random forest algorithm selects features from F, feature importance is taken as the criterion: a perturbation is applied to the out-of-bag (OOB) data that did not participate in training a decision tree, and the resulting change in classification accuracy is calculated. The random forest algorithm randomly draws Q bootstrap data sets, with Q corresponding OOB data sets, and ranks feature importance as follows:
(1) Initialize q = 1;
(2) Train a decision tree on the q-th bootstrap data set and calculate the classification accuracy A_q on the q-th OOB data set;
(3) For each feature f_u, u = 1, 2, …, N, apply a perturbation to that feature in the OOB data set and recalculate the classification accuracy A_q^u;
(4) Repeat steps (2), (3) for q = 2, 3, …, Q;
(5) Compute the importance of feature f_u as P_u = (1/Q) Σ_{q=1}^{Q} (A_q − A_q^u);
(6) Sort the P_u in descending order to obtain the feature importance ranking; the higher a feature ranks, the more important it is.
Arrange the feature set F in descending order of importance; the features with the J largest importance values are the fault-related feature variables screened out of the complex industrial process.
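The ranking procedure above can be sketched in Python (a minimal illustration under stated assumptions: the decision tree is replaced by a toy nearest-centroid classifier, and `train_fn`, `acc_fn` and `permutation_importance` are hypothetical helper names, not from the patent):

```python
import numpy as np

def train_fn(X, y):
    # Toy stand-in for a decision tree: one centroid per class.
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def acc_fn(model, X, y):
    # Classification accuracy of the nearest-centroid stand-in.
    classes = np.array(sorted(model))
    cents = np.stack([model[c] for c in classes])
    pred = classes[np.argmin(((X[:, None, :] - cents) ** 2).sum(-1), axis=1)]
    return (pred == y).mean()

def permutation_importance(train_fn, acc_fn, X, y, Q=10, rng=None):
    """Steps (1)-(6): draw Q bootstrap sets, train on each in-bag set,
    and score feature u by the mean OOB accuracy drop after permuting
    column u, i.e. P_u = (1/Q) * sum_q (A_q - A_q^u)."""
    rng = np.random.default_rng(rng)
    K, N = X.shape
    P = np.zeros(N)
    for q in range(Q):
        bag = rng.integers(0, K, size=K)           # bootstrap (in-bag) indices
        oob = np.setdiff1d(np.arange(K), bag)      # out-of-bag (OOB) indices
        model = train_fn(X[bag], y[bag])
        A_q = acc_fn(model, X[oob], y[oob])        # baseline OOB accuracy
        for u in range(N):
            Xp = X[oob].copy()
            Xp[:, u] = rng.permutation(Xp[:, u])   # perturb feature u
            P[u] += A_q - acc_fn(model, Xp, y[oob])
    return P / Q   # sort descending to obtain the importance ranking
```

Sorting `P` in descending order and keeping the first J indices reproduces the screening step.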
2) According to the feature selection result of step 1), collect historical data of the industrial process under normal working conditions; the data obtained by offline testing form a sample set X = (X_1, X_2, …, X_K)^T, where X_i denotes the data at the i-th sampling instant. There are K sampling instants, and J process variables are collected at each one, i.e. X_i = (x_{i,1}, x_{i,2}, …, x_{i,J}), where x_{i,j} is the measured value of the j-th process variable at the i-th sampling instant;
3) The historical data X are standardized as follows:
First, calculate the mean and standard deviation of each process variable over all sampling instants of the historical data X. The mean of the j-th process variable is x̄_j = (1/K) Σ_{k=1}^{K} x_{k,j}, where x_{k,j} is the measurement of the j-th process variable at the k-th sampling instant, k = 1, …, K, j = 1, …, J; the standard deviation of the j-th process variable is s_j = sqrt( (1/(K−1)) Σ_{k=1}^{K} (x_{k,j} − x̄_j)^2 ).
The historical data X are then standardized; the standardized value of the j-th process variable at the k-th sampling instant is x̃_{k,j} = (x_{k,j} − x̄_j) / s_j.
The standardized data are X̃ = (x̃_{k,j}), to which noise is added to obtain the corrupted input x̂_{k,j} = x̃_{k,j} + ε_{k,j}, j = 1, …, J, k = 1, …, K;
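Step 3) can be sketched as follows (a minimal NumPy illustration; the helper name and the noise level `sigma` are assumptions, since the noise variance is not fixed at this point in the patent):

```python
import numpy as np

def standardize_and_corrupt(X, sigma=0.1, rng=0):
    """Z-score each process variable over the K sampling instants, then
    add zero-mean Gaussian white noise to form the corrupted input that
    is fed to the denoising autoencoder."""
    mean = X.mean(axis=0)                  # per-variable mean
    std = X.std(axis=0, ddof=1)            # per-variable standard deviation
    X_tilde = (X - mean) / std             # clean standardized data
    noise = np.random.default_rng(rng).normal(0.0, sigma, size=X.shape)
    return X_tilde, X_tilde + noise, mean, std
```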
4) Input the noise-corrupted standardized data of step 3) into the stacked denoising autoencoder network for training to obtain the reconstructed features Z = (Z_1, Z_2, …, Z_K)^T, where the reconstructed data at the i-th instant are Z_i = (z_{i,1}, z_{i,2}, …, z_{i,J}), i = 1, …, K, j = 1, …, J;
The stacked denoising autoencoder network structure is as follows:
The network comprises an input layer, a hidden layer and an output layer; the input is the noise-corrupted standardized industrial process data X̂ of step 3). The encoding process from the input layer to the hidden layer is
Y = H(W_1 X̂ + B_1)
where H is a nonlinear activation function, here the Sigmoid function, W_1 and B_1 are the weight matrix and bias vector of the encoding network, and Y is the hidden-layer data, i.e. the extracted robust features.
The decoding process from the hidden layer to the output layer is
Z = H(W_2 Y + B_2)
where Z is the output-layer data and W_2, B_2 are the weight matrix and bias vector of the decoding network;
The loss function of the stacked denoising autoencoder network is the squared error between the reconstruction and the clean standardized data,
J(θ) = (1/2K) Σ_{i=1}^{K} ||Z_i − X̃_i||^2
Before training, the network parameters θ = {W_1, W_2, B_1, B_2} are randomly initialized, and the stacked denoising autoencoder network is trained iteratively with a gradient descent algorithm;
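A single encoder/decoder pair of the stacked network, with its gradient-descent training loop, might look like this (an illustrative NumPy sketch with Sigmoid activations on both layers as in the formulas above; layer sizes, learning rate and epoch count are arbitrary assumptions):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class DenoisingAE:
    """Y = H(W1 X^ + B1), Z = H(W2 Y + B2); trained by gradient descent
    on the squared reconstruction error against the clean data."""
    def __init__(self, n_in, n_hid, rng=0):
        g = np.random.default_rng(rng)
        self.W1 = g.normal(0.0, 0.1, (n_in, n_hid)); self.B1 = np.zeros(n_hid)
        self.W2 = g.normal(0.0, 0.1, (n_hid, n_in)); self.B2 = np.zeros(n_in)

    def forward(self, X):
        Y = sigmoid(X @ self.W1 + self.B1)   # encoding: robust features
        Z = sigmoid(Y @ self.W2 + self.B2)   # decoding: reconstruction
        return Y, Z

    def fit(self, X_noisy, X_clean, lr=0.5, epochs=2000):
        K = len(X_noisy)
        for _ in range(epochs):
            Y, Z = self.forward(X_noisy)
            dZ = (Z - X_clean) * Z * (1 - Z) / K    # gradient at output pre-activation
            dY = (dZ @ self.W2.T) * Y * (1 - Y)     # backpropagate to hidden layer
            self.W2 -= lr * (Y.T @ dZ); self.B2 -= lr * dZ.sum(axis=0)
            self.W1 -= lr * (X_noisy.T @ dY); self.B1 -= lr * dY.sum(axis=0)
```

Because the output layer is a Sigmoid, this toy version is demonstrated on data scaled into (0, 1); the patent's full stacked network is trained layer by layer on the standardized data.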
Since the denoising autoencoder is trained on the basis of the reconstruction error, the smaller the reconstruction error, the stronger the network's ability to recover the original signal: the denoising autoencoder can then recover the original training-sample data signal from the high-level features it has learned from the corrupted input.
The statistics corresponding to the modeling data are SPE = (SPE_1, …, SPE_K), k = 1, …, K; the SPE statistic at the k-th sampling instant is defined as
SPE_k = ||X̃_k − Z_k||^2 = Σ_{j=1}^{J} (x̃_{k,j} − z_{k,j})^2
where Z_k is the actual output obtained from the trained stacked denoising autoencoder network for the input X̂_k. Finally, a kernel density estimation method is used to estimate the value of the SPE statistic at a preset confidence limit, and this estimate is taken as the control limit of the SPE statistic;
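The SPE statistic and its kernel-density control limit can be sketched as follows (an illustrative implementation; Silverman's bandwidth rule and the bisection quantile search are assumptions, since the patent only states that kernel density estimation determines the control limit at a preset confidence level):

```python
import numpy as np
from math import erf, sqrt

def spe(X_tilde, Z):
    """SPE_k = ||X~_k - Z_k||^2, the squared reconstruction residual."""
    return ((np.asarray(X_tilde) - np.asarray(Z)) ** 2).sum(axis=1)

def kde_control_limit(spe_samples, conf=0.99, w=None):
    """Control limit = conf-quantile of the Gaussian kernel density
    estimate built on the training SPE values. The CDF of that kernel
    mixture is a mean of normal CDFs, so the quantile is found by
    bisection; the bandwidth w defaults to Silverman's rule of thumb."""
    s = np.asarray(spe_samples, float)
    K = s.size
    if w is None:
        w = 1.06 * s.std(ddof=1) * K ** (-0.2)   # Silverman's rule of thumb
    def cdf(c):
        return np.mean([0.5 * (1.0 + erf((c - x) / (w * sqrt(2)))) for x in s])
    lo, hi = s.min() - 5 * w, s.max() + 5 * w
    for _ in range(100):                          # bisection on the smooth CDF
        mid = 0.5 * (lo + hi)
        if cdf(mid) < conf:
            lo = mid
        else:
            hi = mid
    return hi
```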
on-line monitoring:
5) Collect the data t_m of the J process variables at the m-th sampling instant of the current industrial process, and standardize t_m with the mean and standard deviation obtained from the historical data in step 3), giving t̃_m;
6) Calculate the monitoring statistic TSPE_m of the data collected at the m-th instant of the current industrial process as
TSPE_m = ||t̃_m − z_m||^2
where z_m is the reconstructed feature obtained by feeding t̃_m into the stacked denoising autoencoder network trained in step 4);
7) Compare the monitoring statistic TSPE_m calculated in step 6) with the control limit determined in step 4); if the limit is exceeded, a fault is judged to have occurred and an alarm is raised; otherwise, the process is normal;
8) If the industrial process has finished, stop monitoring; otherwise, collect the data at the next instant, return to step 7), and continue monitoring the process. The TSPE_m acquired at each instant is the fault state feature.
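Steps 5) through 8) amount to the following loop (a minimal sketch; `reconstruct` is a hypothetical stand-in for the trained stacked denoising autoencoder network):

```python
import numpy as np

def monitor_stream(samples, mean, std, reconstruct, limit):
    """Standardize each new sample with the training-stage mean/std,
    reconstruct it, and raise an alarm whenever
    TSPE_m = ||t~_m - z_m||^2 exceeds the KDE control limit."""
    tspe, alarms = [], []
    for t_m in samples:
        t_norm = (t_m - mean) / std          # step 5: standardize
        z_m = reconstruct(t_norm)            # step 6: reconstruct
        stat = float(((t_norm - z_m) ** 2).sum())
        tspe.append(stat)
        alarms.append(stat > limit)          # step 7: compare with limit
    return np.array(tspe), np.array(alarms)  # step 8: TSPE series = fault state feature
```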
Fault prediction stage:
9) Combine the acquired fault state features into a time series TSPE = {TSPE_1, …, TSPE_V} and normalize it with a linear normalization function (min-max normalization); the normalized value of the state data TSPE_v at instant v is
TSPE'_v = A + (TSPE_v − min(TSPE)) (B − A) / (max(TSPE) − min(TSPE)), v = 1, …, V
where [A, B] is the normalization interval. The normalized series TSPE' is divided into a training data set and a test data set for fault prediction.
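The min-max normalization of step 9) is simply (illustrative helper name):

```python
import numpy as np

def min_max(x, A=0.0, B=1.0):
    """Linearly rescale the TSPE series into the interval [A, B]."""
    x = np.asarray(x, float)
    return A + (x - x.min()) * (B - A) / (x.max() - x.min())
```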
10) Before SFTCN modeling, a sliding window is applied to the data to obtain the input-output sample pairs of the prediction model:
X_train,i = [TSPE'_i, TSPE'_{i+1}, …, TSPE'_{i+h−1}], Y_train,i = TSPE'_{i+h}, i = 1, …, n
where X_train are the input samples of the prediction model, Y_train are the model labels, i.e. the actual output values of the prediction model, n is the number of training samples, and h is the size of the sliding window;
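The windowing of step 10) can be sketched as (illustrative; the function name is assumed):

```python
import numpy as np

def sliding_window(series, h):
    """Turn the normalized TSPE' series into (input, label) pairs:
    X_i = [TSPE'_i, ..., TSPE'_{i+h-1}], Y_i = TSPE'_{i+h}."""
    series = np.asarray(series, float)
    n = len(series) - h                       # number of usable samples
    X = np.stack([series[i:i + h] for i in range(n)])
    Y = series[h:h + n]                       # one-step-ahead targets
    return X, Y
```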
11) Model and train the improved time convolution network SFTCN with the processed training data;
The improved time convolution network SFTCN adopts the time convolution network TCN and improves its residual network in two respects: first, the ReLU activation function is replaced by the Swish function, which avoids the problem that ReLU zeroes the gradient on the negative interval so that neurons there may never be activated; second, a Filter Response Normalization (FRN) layer replaces the weight normalization layer, further improving the expressive capacity of the network;
The SFTCN network parameters are set as follows: the maximum number of iterations is 1000, the number of network layers is 4, the kernel size of the dilated causal convolution is 2, dropout is 0.05, the sliding window length is 10, the batch size is 256, the learning rate is 0.002, and the Adam optimizer is used.
The training data are input into the model, the mean squared error between the predicted data and the original data is calculated, and the network parameters are updated through backpropagation. Iteration continues until convergence, and the trained model is saved.
12) Apply the sliding-window sample processing of step 10) to the normalized test data, substitute the processed test data into the trained SFTCN network to predict the fault trend, and verify the accuracy of the model.
Advantageous effects
1) The method builds the monitoring model with a stacked denoising autoencoder and reconstructs the original features. By introducing noise-corruption and denoising steps on top of the plain autoencoder, it is more sensitive to faults than a conventional autoencoder, enhances the robustness of the monitoring model, improves the accuracy of fault monitoring, and reduces false alarms and missed alarms in process monitoring.
2) The method establishes the fault prediction model with a time convolution network, and introduces the Swish activation function and FRN normalization to improve the TCN, raising the fault prediction accuracy of the time convolution network.
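The two TCN modifications named in 2) can be illustrated in NumPy (a sketch under assumptions: Swish with β = 1, and Filter Response Normalization reduced to its per-filter root-mean-square rescaling with scalar learned parameters; this is not the patent's full SFTCN residual block):

```python
import numpy as np

def swish(x):
    """Swish activation x * sigmoid(x): unlike ReLU, its output and
    gradient are non-zero on the negative interval, so those neurons
    can still be updated."""
    return x / (1.0 + np.exp(-x))

def frn(x, gamma=1.0, beta=0.0, eps=1e-6):
    """Filter Response Normalization for a (batch, channels, length)
    tensor: divide each filter's activations by their root mean square
    (nu^2 is the mean squared activation), then apply a learned affine
    transform; gamma and beta are per-channel parameters in practice."""
    nu2 = (x ** 2).mean(axis=-1, keepdims=True)
    return gamma * x / np.sqrt(nu2 + eps) + beta
```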
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 shows the composition of the input data X ∈ R^{960×52};
FIG. 3 is a schematic diagram of a data composition of input data after random forest feature selection;
FIG. 4 is a fault condition feature extraction flow chart;
FIG. 5 is an automatic encoder configuration of the present invention;
FIG. 6 is a time convolution network architecture of the present invention;
FIG. 7 is a residual block of a time convolution network;
FIG. 8 is a residual module of the SFTCN proposed by the present invention;
FIG. 9 is a fault 13 (slow drift fault) random forest feature selection result;
FIG. 10 is a SPE monitoring graph of fault 13 (slow drift fault) for the method of the present invention;
FIG. 11 is a fault 8 (random fault) random forest feature selection result;
FIG. 12 is a SPE monitoring graph of the method of the present invention for fault 8 (random fault);
FIG. 13 is a graph showing the effect of the method of the present invention in predicting a fault 13 (slow drift fault);
FIG. 14 is a graph showing the effect of the method of the present invention on fault 8 (random fault) prediction;
fig. 15 (a) -15 (d) are graphs showing the effect of fault prediction on the fault 13 (slow drift fault) by the comparison method;
FIGS. 16 (a) -16 (d) are graphs showing the effect of fault prediction on fault 8 (random fault) by the comparison method;
Detailed Description
Based on an actual chemical reaction process, Eastman Chemical Company in the United States developed an open and challenging chemical model simulation platform, the Tennessee Eastman (TE) simulation platform. The TE chemical process is a prototype of an actual process flow; the whole process comprises 5 main operation units, namely a reactor, a condenser, a recycle compressor, a gas-liquid separator and a product stripper. The generated data have time-varying, strongly coupled and nonlinear characteristics and are widely used for testing control and fault diagnosis models of complex industrial processes.
The TE process has 41 total measured variables and 12 manipulated variables and sets 21 fault types. The 12 th manipulated variable XMV (12) remains unchanged throughout the process and is not used as input data. As the original observed signal vector at a particular moment, 52 variables [ XMEAS (1), …, XMEAS (41), XMV (1), …, XMV (11) ] are chosen. In the simulation experiment, the total reaction time of the TE process is set to 48h, and the sampling interval is set to 3min. All anomalies were introduced 8h after the start of the simulation. Thus, simulation of each type of fault may collect 960 data samples, each sample data containing 52-dimensional features, each type of abnormal data sample being a 960 x 52 sample matrix. The test is performed herein for fault 13 and fault 8, respectively.
The method of the invention is applied to the TE process simulation object; it comprises the two main stages of fault state feature extraction and fault prediction, specifically as follows:
fault state feature extraction is divided into two parts, namely offline training and online monitoring:
offline training stage:
step 1: screening out characteristic variables related to faults in the industrial process by utilizing a random forest algorithm, and realizing dimension reduction of input data;
Specifically: the feature set preliminarily selected for the complex industrial process in this implementation is F = {f_1, f_2, …, f_u, …, f_N} = {XMEAS(1), …, XMEAS(41), XMV(1), …, XMV(11)}, where N = 52 is the size of the feature set, containing 41 measured variables and 11 manipulated variables. When the random forest algorithm selects features from F, feature importance is taken as the criterion: a perturbation is applied to the out-of-bag (OOB) data that did not participate in decision tree training, and the change in classification accuracy is calculated. The random forest algorithm randomly draws Q = 30 bootstrap data sets, with 30 corresponding OOB data sets; feature importance is ranked as follows:
(1) Initialize q = 1;
(2) Train a decision tree on the q-th bootstrap data set and calculate the classification accuracy A_q on the q-th OOB data set;
(3) For each feature f_u, u = 1, 2, …, N, apply a perturbation to that feature in the OOB data set and recalculate the classification accuracy A_q^u;
(4) Repeat steps (2), (3) for q = 2, 3, …, Q;
(5) Compute the importance of feature f_u as P_u = (1/Q) Σ_{q=1}^{Q} (A_q − A_q^u);
(6) Sort the P_u in descending order to obtain the feature importance ranking; the higher a feature ranks, the more important it is.
Arrange the feature set F in descending order of importance; the 30 features with the largest importance values are the fault-related feature variables screened out of the complex industrial process.
For the third feature XMEAS(3), the feature importance is computed as
P_3 = (1/Q) Σ_{q=1}^{Q} (A_q − A_q^3)
where A_q is the classification accuracy on the q-th OOB data set and A_q^3 is the classification accuracy after adding a perturbation to feature f_3 in the q-th OOB data set. The importances are then ranked, and the first J features are selected as the screened features; in this embodiment J is 30.
Beyond this simulation, in an actual complex industrial process the candidate feature variables must first be selected according to the actual situation; the preliminary feature variables differ across industrial processes and engineering experience, and the important feature variables are then screened out by the random forest algorithm to serve as the input data of the subsequent steps.
Step 2: according to the characteristic selection result of the step 1, collecting historical data under normal working conditions of the industrial process, and forming a sample set X= (X) by data obtained by offline testing 1 ,X 2 ,...,X K ) T Wherein X is i Representing the ith sampling instant data, comprising K sampling instants, each sampling instant collecting J process variables, namely X i =(x i,1 ,x i,2 ,...,x i,J ) Wherein x is i,J A measurement value of the jth process variable representing the ith sample instant, k=960, j=30;
step 3: the historical data X are standardized as follows:
First, calculate the mean and standard deviation of each process variable over all sampling instants of the historical data X: the mean of the j-th process variable is x̄_j = (1/K) Σ_{k=1}^{K} x_{k,j}, where x_{k,j} is the measurement of the j-th process variable at the k-th sampling instant, k = 1, …, K, j = 1, …, J; the standard deviation is s_j = sqrt( (1/(K−1)) Σ_{k=1}^{K} (x_{k,j} − x̄_j)^2 ). The historical data X are then standardized, the standardized value of the j-th process variable at the k-th sampling instant being x̃_{k,j} = (x_{k,j} − x̄_j) / s_j.
In this embodiment Gaussian white noise is added, the corrupted input being x̂_{k,j} = x̃_{k,j} + ε_{k,j} with ε_{k,j} ~ N(0, σ²), j = 1, …, J, k = 1, …, K;
step 4: input the noise-corrupted standardized data of step 3 into the stacked denoising autoencoder network for training to obtain the reconstructed features Z = (Z_1, Z_2, …, Z_i, …, Z_K)^T, where the reconstructed data at the i-th instant are Z_i = (z_{i,1}, z_{i,2}, …, z_{i,J}), i = 1, …, K, j = 1, …, J, K = 960, J = 30;
The stacked denoising autoencoder network structure is as follows:
First, the network structure is determined: it comprises an input layer, a hidden layer and an output layer; the input is the noise-corrupted standardized industrial process data X̂ of step 3, and the output is their reconstruction. The encoding process from the input layer to the hidden layer is
Y = H(W_1 X̂ + B_1)
where H is a nonlinear activation function, here the Sigmoid function, W_1 and B_1 are the weight matrix and bias vector of the encoding network, and Y is the hidden-layer data, i.e. the extracted robust features.
The decoding process from the hidden layer to the output layer is
Z = H(W_2 Y + B_2)
where Z is the output-layer data and W_2, B_2 are the weight matrix and bias vector of the decoding network;
The loss function of the stacked denoising autoencoder network is
J(θ) = (1/2K) Σ_{i=1}^{K} ||Z_i − X̃_i||^2
Before training, the network parameters θ = {W_1, W_2, B_1, B_2} are randomly initialized; the stacked denoising autoencoder network is trained iteratively with a gradient descent algorithm, and the parameters after the final iteration are saved.
The statistics corresponding to the modeling data are SPE = (SPE_1, …, SPE_K); the SPE statistic at the k-th sampling instant is defined as
SPE_k = ||X̃_k − Z_k||^2 = Σ_{j=1}^{J} (x̃_{k,j} − z_{k,j})^2
where Z_k is the actual output obtained from the trained stacked denoising autoencoder network for the input X̂_k. Finally, a kernel density estimation method is used to estimate the value of the SPE statistic at a preset confidence limit, and this estimate is taken as the control limit of the SPE statistic;
A radial basis (Gaussian) function is selected as the kernel, and the kernel density estimate of the SPE is
f̂(x) = (1/(K w)) Σ_{k=1}^{K} G((x − SPE_k)/w)
where G is the Gaussian kernel, w > 0 is the bandwidth parameter to be estimated, and K = 960.
On-line monitoring:
step 5: collect the data t_m of the J process variables at the m-th sampling instant of the current industrial process, and standardize t_m with the mean and standard deviation obtained from the historical data in step 3, giving t̃_m;
Step 6: calculating a monitoring statistic TSPE of data collected at the mth moment of the current industrial process m The calculation formula is as follows:
wherein z is m To be used inInputting the reconstructed characteristics obtained by the stack noise reduction self-coding network trained in the step 4;
Step 7: compare the monitoring statistic TSPE_m calculated in step 6 with the control limit determined in step 4 of the modeling stage; if the limit is exceeded, a fault is considered to have occurred and an alarm is raised; otherwise the process is normal.
Step 8: if the industrial process has finished, stop monitoring; otherwise collect the data at the next moment, return to step 7 and continue monitoring; the TSPE_m obtained at each moment is the fault state feature.
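One on-line monitoring step (steps 5-7) can be sketched as below; `reconstruct` is a hypothetical stand-in for the trained stack noise reduction self-coding network:

```python
import numpy as np

def monitor_step(t_m, mean_m, std_m, reconstruct, control_limit):
    """One on-line monitoring step: normalize the new sample with the
    historical mean/std, reconstruct it, and compare TSPE to the limit.
    Returns (fault-state feature TSPE_m, alarm flag)."""
    t_norm = (np.asarray(t_m) - np.asarray(mean_m)) / np.asarray(std_m)
    z_m = reconstruct(t_norm)                    # trained network (stand-in)
    tspe = float(np.sum((t_norm - z_m) ** 2))    # monitoring statistic
    return tspe, tspe > control_limit
```

The TSPE values accumulated across sampling moments form the fault-state time series used in the prediction stage below.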
Fault prediction stage:
step 9: combining the acquired fault state features into a time series tspe= { TSPE 1 、...、TSPE V V=960. Normalization with a linear normalization function (Min-Max Normalization) in which the fault state characteristics TSPE at time 3 3 The normalized calculation formula of (c) is as follows,[A,B]is the normalized region. Here a=0, b=1, i.e. the TSPE value is attributed between 0 and 1. The normalized TSPE' is divided into a training data set and a test data set for fault prediction.
Step 10: before SFTCN modeling, a sliding window is applied to the data to obtain the input-output sample pairs of the prediction model:
X_train = [TSPE'_1, ..., TSPE'_h; TSPE'_2, ..., TSPE'_{h+1}; ...; TSPE'_n, ..., TSPE'_{n+h−1}],  Y_train = (TSPE'_{h+1}, TSPE'_{h+2}, ..., TSPE'_{n+h})^T
where X_train are the input samples of the prediction model and Y_train are the model labels, i.e. the actual output values of the prediction model; n = 400 is the number of training samples of the model and h = 10 is the sliding window size;
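The sliding-window construction of step 10 can be sketched as follows: each input sample is a window of h consecutive normalized fault-state features, and its label is the value that immediately follows the window:

```python
import numpy as np

def sliding_window_samples(series, h):
    """Build (X_train, Y_train) pairs for one-step-ahead prediction.
    X[i] = series[i : i+h], Y[i] = series[i+h]."""
    s = np.asarray(series, dtype=float)
    n = s.size - h                      # number of usable samples
    X = np.stack([s[i:i + h] for i in range(n)])
    Y = s[h:]
    return X, Y
```

With the document's settings (h = 10), a series of length V yields V − 10 input-output pairs.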
Step 11: the improved time convolution network SFTCN is modeled and trained on the processed data.
The improved time convolution network SFTCN is based on the time convolution network TCN with an improved residual block, in two respects: first, the ReLU activation function is replaced by the Swish function, which avoids the problem that ReLU zeroes out negative inputs, so that neurons with negative pre-activations receive no gradient and may never be activated; second, a Filter Response Normalization (FRN) layer replaces the weight normalization layer to further improve the expressive power of the network. The network structure before and after the improvement is shown in figs. 7 and 8.
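The two residual-block modifications can be sketched in isolation. The FRN parameter shapes and the thresholded linear unit (TLU) that commonly follows FRN are assumptions of this sketch, not details taken from the patent:

```python
import numpy as np

def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x). Unlike ReLU, its value and
    gradient are nonzero for x < 0, so negative pre-activations still learn."""
    return x / (1.0 + np.exp(-beta * x))

def filter_response_norm(x, gamma, beta, tau, eps=1e-6):
    """Filter Response Normalization followed by a thresholded linear unit.
    x: (batch, length, channels). Each channel is divided by the root of its
    mean squared activation (no batch statistics), then scaled and shifted."""
    nu2 = np.mean(x ** 2, axis=1, keepdims=True)   # per-sample, per-channel nu^2
    y = gamma * x / np.sqrt(nu2 + eps) + beta
    return np.maximum(y, tau)                      # TLU: max(y, tau)
```

In the improved residual block, `filter_response_norm` would replace the weight-normalization step after each dilated causal convolution, and `swish` would replace the ReLU nonlinearity.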
The SFTCN network parameters are set as follows: maximum iteration number 1000, 4 network layers, dilated causal convolution kernel size 2, dropout 0.05, sliding window length 10, batch size 256, learning rate 0.002, with the Adam optimizer.
The training data are input into the model, the mean squared error between the predicted data and the original data is calculated, and the network parameters are updated by back propagation. The iteration continues until convergence, and the trained model is saved.
Step 12: the normalized test data are processed with the sliding window of step 10 and fed into the trained SFTCN network to predict the fault trend.
The steps above are a specific application of the method to fault prediction on the TE simulation platform. To verify the effectiveness of the method, online-prediction experiments were performed on fault 8 (a random fault) and fault 13 (a slow drift fault). Fig. 9 shows the feature selection result of the random forest algorithm on the fault 13 data, and fig. 11 the corresponding result on the fault 8 data; the top 30 variables, ranked by feature weight, are taken as the input of the subsequent stack noise reduction self-coding network. The experimental results of the fault state feature extraction stage are shown in fig. 10 (fault 13) and fig. 12 (fault 8); each figure contains a dashed line parallel to the abscissa, the control limit determined by the kernel density estimation method, and a curve, the real-time monitoring value. If the curve exceeds the control limit, the industrial process has failed at that moment; otherwise the industrial process is operating normally. The experimental results of the fault prediction stage are shown in figs. 13 to 16, where the dashed black line is the predicted fault state and the solid black line is the true fault state. If the prediction curve is close to the true curve, the method predicts the fault trend accurately; if the deviation is large, it cannot predict the fault trend well.
Figs. 10 and 12 show the fault monitoring effect of the method of the present invention on the fault 13 and fault 8 data, where the dashed line parallel to the abscissa is the control limit and the curve is the real-time SPE monitoring value. In both figures the SPE monitoring value exceeds the limit from sample 161 onward, i.e. the fault is detected, and the monitoring effect is good. Fig. 13 shows the fault prediction effect of the method on fault 13 (the slow drift fault), and fig. 14 on fault 8; in both, the dashed black line is the predicted fault state and the solid black line is the true fault state. The method predicts both types of fault state data well. To verify the validity of the model, MAE, RMSE, MAPE and R² are used to evaluate the prediction effect; the evaluation indices are listed in table 1, and the prediction accuracy of the SFTCN network is high for both the slow drift fault and the random fault.
Table 1: SPE predictive evaluation index on SFTCN
To analyze the superiority of the method quantitatively, the STCN, TCN, LSTM and GRU networks are also used to predict the state trend of the industrial process faults. For fault 13, the comparative prediction effect graphs and evaluation indices are shown in figs. 15(a)-15(d) and table 2 respectively; the black "×" broken line is the predicted fault state and the solid black line without "×" is the true fault state. Fig. 15(a) shows the prediction effect of the STCN network, which fits the dynamic trend of the SPE completely, while figs. 15(b)-15(d) show the prediction effects of the TCN, LSTM and GRU networks, which deviate slightly at the peaks of the SPE dynamic trend.
Table 2: fault 13: comparison of algorithms on evaluation index
For fault 8, the comparative prediction effect graphs and evaluation indices are shown in figs. 16(a)-16(d) and table 3 respectively; the black "×" broken line is the predicted fault state and the solid black line without "×" is the true fault state.
Table 3: fault 8: comparison of algorithms on evaluation index
In summary, analysis of the prediction effect graphs (figs. 13-16) and of the data in tables 2 and 3 shows that, for both slow drift faults and random faults, the SFTCN network proposed by the present invention predicts fault states considerably more accurately than the TCN, LSTM and GRU networks.

Claims (2)

1. The complex industrial process fault prediction method based on the RF noise reduction self-coding information reconstruction and the time convolution network comprises two stages of fault state feature extraction and fault prediction, and is characterized by comprising the following steps of:
fault state feature extraction is divided into two parts, namely offline training and online monitoring:
offline training stage:
1) Screening out characteristic variables related to faults in the industrial process by utilizing a random forest algorithm, and realizing dimension reduction of input data;
2) According to the feature selection result of step 1), collect historical data of the industrial process under normal working conditions; the data obtained by the offline test form a sample set X = (X_1, X_2, ..., X_K)^T, where X_i is the data at the ith sampling moment; the set contains K sampling moments, and J process variables are collected at each sampling moment, i.e. X_i = (x_{i,1}, x_{i,2}, ..., x_{i,J}), where x_{i,j} is the measured value of the jth process variable at the ith sampling moment;
3) The historical data X is subjected to standardized processing in the following manner:
First, the mean and standard deviation of every process variable at every moment of the historical data X are calculated; x̄_{k,j} denotes the mean and s_{k,j} the standard deviation of the jth process variable at the kth sampling moment, where x_{k,j} is the measured value of the jth process variable at the kth sampling moment, k = 1,...,K, j = 1,...,J. The historical data X are then standardized, where the jth process variable at the kth sampling moment is normalized as:
x̃_{k,j} = (x_{k,j} − x̄_{k,j}) / s_{k,j}
The normalized data are X̃ = (x̃_{k,j}), j = 1,...,J, k = 1,...,K; noise is then added to X̃ to obtain the noise-corrupted input of the network;
4) Inputting the noise-added standardized data of step 3) into the stack noise reduction self-coding network for training to obtain the reconstructed features Z = (Z_1, ..., Z_i, ..., Z_K)^T, wherein the reconstructed feature at the ith moment is Z_i = (z_{i,1}, z_{i,2}, ..., z_{i,J}), i = 1,...,K, j = 1,...,J;
the stack noise reduction self-coding network structure is specifically as follows:
the stack noise reduction automatic coding network structure comprises an input layer, a hidden layer and an output layer; the input is the noise-added standardized industrial process data of step 3), written X̃*. The coding process from the input layer to the hidden layer takes the following form:
Y = H(W1·X̃* + B1)
where H is a nonlinear activation function, here the Sigmoid function; W1 and B1 are the weight matrix and bias vector of the coding network; Y is the hidden-layer data, namely the extracted robust features;
the decoding process from the hidden layer to the output layer comprises the following specific forms:
Z = H(W2·Y + B2)
where W2 and B2 are the weight matrix and bias vector of the decoding network;
the loss function of the stack noise reduction self-coding network is the mean squared reconstruction error between the clean standardized data and the network output:
J(θ) = (1/K) Σ_{k=1}^{K} ‖X̃_k − Z_k‖²
before training the model, the network parameters θ = {W1, W2, B1, B2} are randomly initialized, and the stack noise reduction automatic coding network is trained iteratively with a gradient descent algorithm;
the statistics SPE = (SPE_1, ..., SPE_K) corresponding to the modeling data are computed, where the SPE statistic at the kth sampling moment, k = 1,...,K, is defined as:
SPE_k = Σ_{j=1}^{J} (x̃_{k,j} − z_{k,j})²
where z_{k,j} is the actual output of the trained stack noise reduction self-coding network for the input x̃_k; finally, the kernel density estimation method is used to estimate the value of the SPE statistic at a preset confidence limit, and this estimate is taken as the control limit of the SPE statistic;
on-line monitoring:
5) Collect the data t_m of the J process variables at the mth sampling moment of the current industrial process, and standardize it with the mean and standard deviation at moment m obtained from the historical data of step 3) to obtain t̃_m;
6) Calculate the monitoring statistic TSPE_m of the data collected at the mth moment of the current industrial process:
TSPE_m = Σ_{j=1}^{J} (t̃_{m,j} − z_{m,j})²
where z_m is the reconstructed feature obtained by inputting t̃_m into the stack noise reduction self-coding network trained in step 4);
7) Compare the monitoring statistic TSPE_m calculated in step 6) with the control limit determined in step 4); if the limit is exceeded, a fault is considered to have occurred and an alarm is raised; otherwise the process is normal;
8) If the industrial process has finished, stop monitoring; otherwise collect the data at the next moment, return to step 7) and continue monitoring; the TSPE_m obtained at each moment is the fault state feature;
fault prediction stage:
9) Combine the collected fault state features into a time series TSPE = {TSPE_1, ..., TSPE_V} and normalize it with the linear Min-Max Normalization function, where the fault state feature TSPE_v at moment v, v = 1,...,V, is normalized as
TSPE'_v = (TSPE_v − min(TSPE)) / (max(TSPE) − min(TSPE)) × (B − A) + A
and [A, B] is the normalization interval; the normalized series TSPE' is divided into a training data set and a test data set for fault prediction;
10) Before SFTCN modeling, a sliding window is applied to the data to obtain the input-output sample pairs of the prediction model:
X_train = [TSPE'_1, ..., TSPE'_h; TSPE'_2, ..., TSPE'_{h+1}; ...; TSPE'_n, ..., TSPE'_{n+h−1}],  Y_train = (TSPE'_{h+1}, ..., TSPE'_{n+h})^T
where X_train are the input samples of the prediction model and Y_train are the model labels, i.e. the actual output values of the prediction model; n is the number of training samples of the model and h is the sliding window size;
11) Model and train the improved time convolution network SFTCN with the processed training data;
the improved time convolution network SFTCN is based on the time convolution network TCN with an improved residual block, in two respects: first, the ReLU activation function is replaced by the Swish function, which avoids the problem that ReLU zeroes out negative inputs, so that neurons with negative pre-activations receive no gradient and may never be activated; second, a Filter Response Normalization layer replaces the weight normalization layer to further improve the expressive power of the network;
12) Process the normalized test data with the sliding window of step 10), and feed the processed data into the trained SFTCN network to predict the fault trend.
2. The complex industrial process fault prediction method based on the RF noise reduction self-coding information reconstruction and the time convolution network according to claim 1, characterized in that the SFTCN network parameters are set as follows: maximum iteration number 1000, 4 network layers, dilated causal convolution kernel size 2, dropout 0.05, sliding window length 10, batch size 256, learning rate 0.002, with the Adam optimizer;
the training data are input into the model, the mean squared error between the predicted data and the original data is calculated, the network parameters are updated by back propagation, and the trained model is saved after continued iteration.
CN202110442777.3A 2021-04-23 2021-04-23 Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network Active CN113642754B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110442777.3A CN113642754B (en) 2021-04-23 2021-04-23 Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network


Publications (2)

Publication Number Publication Date
CN113642754A CN113642754A (en) 2021-11-12
CN113642754B true CN113642754B (en) 2023-10-10

Family

ID=78415700

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110442777.3A Active CN113642754B (en) 2021-04-23 2021-04-23 Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network

Country Status (1)

Country Link
CN (1) CN113642754B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114403486B (en) * 2022-02-17 2022-11-22 四川大学 Intelligent control method of airflow type cut-tobacco drier based on local peak value coding circulation network
CN114785824B (en) * 2022-04-06 2024-05-14 深圳前海用友力合科技服务有限公司 Intelligent Internet of things big data transmission method and system
CN115083123B (en) * 2022-05-17 2023-05-02 中国矿业大学 Mine coal spontaneous combustion intelligent grading early warning method driven by measured data
CN115034312B (en) * 2022-06-14 2023-01-06 燕山大学 Fault diagnosis method for dual neural network model satellite power system
CN115047839B (en) * 2022-08-17 2022-11-04 北京化工大学 Fault monitoring method and system for industrial process of preparing olefin from methanol
CN115793590A (en) * 2023-01-30 2023-03-14 江苏达科数智技术有限公司 Data processing method and platform suitable for system safety operation and maintenance
CN116502141A (en) * 2023-06-26 2023-07-28 武汉新威奇科技有限公司 Data-driven-based electric screw press fault prediction method and system
CN117350609A (en) * 2023-09-21 2024-01-05 广东省有色工业建筑质量检测站有限公司 Construction method of intelligent transport control system of detection laboratory based on AGV

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740687A (en) * 2019-01-09 2019-05-10 北京工业大学 A kind of fermentation process fault monitoring method based on DLAE
CN111324110A (en) * 2020-03-20 2020-06-23 北京工业大学 Fermentation process fault monitoring method based on multiple shrinkage automatic encoders
CN111783531A (en) * 2020-05-27 2020-10-16 福建亿华源能源管理有限公司 Water turbine set fault diagnosis method based on SDAE-IELM

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110286279B (en) * 2019-06-05 2021-03-16 武汉大学 Power electronic circuit fault diagnosis method based on extreme tree and stack type sparse self-coding algorithm

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740687A (en) * 2019-01-09 2019-05-10 北京工业大学 A kind of fermentation process fault monitoring method based on DLAE
CN111324110A (en) * 2020-03-20 2020-06-23 北京工业大学 Fermentation process fault monitoring method based on multiple shrinkage automatic encoders
CN111783531A (en) * 2020-05-27 2020-10-16 福建亿华源能源管理有限公司 Water turbine set fault diagnosis method based on SDAE-IELM

Also Published As

Publication number Publication date
CN113642754A (en) 2021-11-12

Similar Documents

Publication Publication Date Title
CN113642754B (en) Complex industrial process fault prediction method based on RF noise reduction self-coding information reconstruction and time convolution network
Su et al. An end-to-end framework for remaining useful life prediction of rolling bearing based on feature pre-extraction mechanism and deep adaptive transformer model
CN109146246B (en) Fault detection method based on automatic encoder and Bayesian network
CN111914873A (en) Two-stage cloud server unsupervised anomaly prediction method
CN111273623B (en) Fault diagnosis method based on Stacked LSTM
CN112904810B (en) Process industry nonlinear process monitoring method based on effective feature selection
CN111340110B (en) Fault early warning method based on industrial process running state trend analysis
Wei et al. A novel deep learning model based on target transformer for fault diagnosis of chemical process
CN111914897A (en) Fault diagnosis method based on twin long-short time memory network
Mathew et al. Regression kernel for prognostics with support vector machines
CN112327701B (en) Slow characteristic network monitoring method for nonlinear dynamic industrial process
CN111122811A (en) Sewage treatment process fault monitoring method of OICA and RNN fusion model
Liu et al. Deep & attention: A self-attention based neural network for remaining useful lifetime predictions
Kefalas et al. Automated machine learning for remaining useful life estimation of aircraft engines
Sadoughi et al. A deep learning approach for failure prognostics of rolling element bearings
CN114297921A (en) AM-TCN-based fault diagnosis method
Xu et al. Global attention mechanism based deep learning for remaining useful life prediction of aero-engine
CN111241629B (en) Intelligent prediction method for performance change trend of hydraulic pump of airplane based on data driving
CN117032165A (en) Industrial equipment fault diagnosis method
Bond et al. A hybrid learning approach to prognostics and health management applied to military ground vehicles using time-series and maintenance event data
CN116522993A (en) Chemical process fault detection method based on countermeasure self-coding network
CN115221942A (en) Equipment defect prediction method and system based on time sequence fusion and neural network
Singh et al. Predicting the remaining useful life of ball bearing under dynamic loading using supervised learning
Chan et al. Explainable health state prediction for social iots through multi-channel attention
CN115309736B (en) Time sequence data anomaly detection method based on self-supervision learning multi-head attention network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant