CN113591954A - Filling method of missing time sequence data in industrial system - Google Patents


Info

Publication number
CN113591954A
Authority
CN
China
Prior art keywords
data
missing
generator
original
loss
Prior art date
Legal status
Granted
Application number
CN202110818499.7A
Other languages
Chinese (zh)
Other versions
CN113591954B (en)
Inventor
戴运桃
梁源
彭立章
王淑娟
曾占魁
沈继红
谭思超
赵富龙
关昊夫
Current Assignee
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN202110818499.7A
Publication of CN113591954A
Application granted
Publication of CN113591954B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211: Selection of the most significant subset of features
    • G06F18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132: Feature extraction based on discrimination criteria, e.g. discriminant analysis
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods

Abstract

The invention discloses a method for filling missing time series data in an industrial system, comprising: step one, data preprocessing; step two, for the missing multivariate time series data, constructing a generative adversarial network model BiGRU-BEGAN based on the boundary equilibrium generative adversarial network (BEGAN); step three, training the BiGRU-BEGAN network model; step four, generating complete artificial data with the trained BiGRU-BEGAN model and filling the original missing data. By combining a generative adversarial network with a bidirectional recurrent neural network, the method makes maximal use of the data that actually exist, generates complete artificial data that follows the feature distribution of the original missing data, and then fills the missing data completely. The filled complete data can be used for subsequent fault classification tasks, improving classification accuracy on missing data.

Description

Filling method of missing time sequence data in industrial system
Technical Field
The invention belongs to the field of missing-data filling and relates to a method for filling missing time series data in an industrial system, in particular an algorithm based on a generative adversarial network and a bidirectional recurrent neural network.
Background
In complex large-scale industrial systems, device failures and various fault conditions cannot be avoided entirely. During operation, the system state must be monitored by observing instrument readings, and the various transient operating conditions must be judged. An effective fault diagnosis technique helps operators track the operating condition in real time, detect instrument faults promptly, and respond effectively, thereby improving operational safety.
When extracting historical data from an industrial system, data collected by some measuring instruments may be missing, which affects subsequent fault diagnosis. In particular, when the missing interval is long and the amount of missing data is large, large diagnosis errors result. A filling model for multivariate time series missing data must therefore be established before fault diagnosis. Judging from current research at home and abroad, existing missing-value filling methods mainly address specific patterns of missingness and rarely exploit the temporal information of the data, while the time series obtained in an industrial system usually contain many data types. The invention therefore improves the existing generative adversarial network model to account for the complex factors behind missing monitored values and their temporal structure, and combines it with a bidirectional recurrent neural network to build a filling model that fully exploits the information before and after each missing position.
Disclosure of Invention
In view of the above prior art, the technical problem to be solved by the present invention is to provide a method for filling missing time series data in an industrial system, so as to solve the problem of missing multivariate time series data in industrial fault diagnosis.
To solve this technical problem, the method of the present invention comprises the following steps:
Step one, data preprocessing: acquire complete data from the historical operation of an industrial system as the original fault-diagnosis data set X, which contains n types of fault data, and apply random missingness to the complete data, i.e., a loss event occurs with a specified probability p0 and does not occur with probability 1 - p0;
Step two: for the missing multivariate time series data, construct a generative adversarial network model BiGRU-BEGAN based on the boundary equilibrium generative adversarial network (BEGAN), comprising a discriminator D and a generator G. The discriminator D is an auto-encoder model consisting of an Encoder and a Decoder. The generator G adopts a bidirectional recurrent neural network BiGRU, divided into a forward recurrent layer and a backward recurrent layer: each training sequence is fed to two gated recurrent networks, one forward and one backward; the hidden state in each network depends on the hidden state and input of the previous step, and both networks are connected to the output layer;
Step three: train the BiGRU-BEGAN network model;
Step four: generate complete artificial data with the trained BiGRU-BEGAN model and fill the original missing data: input the feature vector z|t extracted from the original missing data into the trained model to generate a complete time series, and interpolate the generated data into the corresponding missing positions of the original missing data.
The invention also includes:
1. In step three, training the BiGRU-BEGAN network model specifically comprises:
S3.1: parameter setting: set the training batch size blocksize, the number of iterations epoch, the learning rate α, the Loss_1 loss-function coefficient θ, the update learning rate λ, and the weight coefficient k_t of the generated-data reconstruction loss;
S3.2: train discriminator D: extract the feature z|t of the original missing data as the low-dimensional generator input, let the generator produce G(z|t), take the original missing data x|t and the generated data G(z|t) as discriminator inputs, compute the discriminator loss L_D = L(x|t) - k_t·L(G(z|t)), and update the discriminator weights with the Adam optimization algorithm:
d_w ← ∇_w[L(x|t) - k_t·L(G(z|t))]
w_d ← w_d - α·Adam(w_d, d_w)
k_{t+1} ← k_t + λ·(r·L(x|t) - L(G(z|t)))
k_{t+1} ← min(max(k_{t+1}, 0), 1)
where k_t is the weight coefficient of the generated-data reconstruction loss and λ is the learning rate for updating k_t; x|t denotes the original missing data and G(z|t) the generated data; L(x|t) and L(G(z|t)) are the reconstruction losses of the original and generated data; w_d is the weight parameter of the discriminator and d_w the gradient with respect to w_d;
S3.3: train generator G: use the low-dimensional vector z|t produced from the original-data features as the generator input to generate data G(z|t), compute the generator loss L_G = L(G(z|t)) + θ·Loss_1, and update the generator weights with the Adam optimization algorithm:
g_w ← ∇_w[L(G(z|t)) + θ·Loss_1]
w_g ← w_g - α·Adam(w_g, g_w)
where L(G(z|t)) is the reconstruction loss of the generated data, θ is the weight coefficient, w_g is the weight parameter of the generator, and g_w the gradient with respect to the generator parameters. Loss_1 is the L_1 norm between the real data and the generated data:
Loss_1 = ||G(z|t) - x|t||_1
s3.4: training the arbiter and generator alternately to MgloableThe loss function value tends to be stable and does not decrease any more, MgloableThe loss function is as follows:
Mgloable=L(x|t)+||rL(x|t)-L(G(z|t))||1
where r represents a diversity ratio for adjusting the balance between the generator and the arbiter, the formula is as follows:
Figure BDA0003171124030000031
when E (L (x | t)) -E (L (G (z | t))), the arbiter and generator reach equilibrium.
2. In step two, the gated recurrent network comprises an update gate and a reset gate. The update gate z_t determines how much previously memorized information is carried to the current time: it controls how much of the history state h_{t-1} is kept in the current output state h_t and how much of the current candidate state h̃_t is retained. The reset gate r_t determines whether, and to what degree, the current candidate state h̃_t depends on the network state h_{t-1} of the previous time. The update formulas are:
h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t
h̃_t = tanh(W_c·x_t + U_c·(r_t ⊙ h_{t-1}) + b_c)
z_t = σ(W_z·x_t + U_z·h_{t-1} + b_z)
r_t = σ(W_r·x_t + U_r·h_{t-1} + b_r)
where h̃_t is the current candidate state, x_t the input at the current time, h_t the current hidden state, h_{t-1} the hidden state of the previous time, z_t the update gate, and r_t the reset gate; W_r, U_r, b_r are the weights and bias of the reset gate, W_z, U_z, b_z the weights and bias of the update gate, and W_c, U_c, b_c the weights and bias of the candidate state; σ is the sigmoid function. The tanh function is used when updating the memory cell:
tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x})
the invention has the beneficial effects that: compared with the prior art, the method aims at the missing condition of the multivariate time sequence data in the industrial fault diagnosis, adopts the mode of combining the generation countermeasure network and the bidirectional cyclic neural network, establishes the BiGRU-BEGAN model, generates complete artificial data, and completely fills the original missing data. The advantages are that: (1) the model includes an encoder and a decoder, wherein the encoder performs feature extraction on missing original data, and the extracted feature vectors are used as input when generating data, thereby eliminating the step of searching for suitable low-dimensional vectors. (2) When time sequence data are generated, the bidirectional cyclic neural network BiGRU is used, when the time sequence data are processed, the relation between current output and information at the previous moment can be considered, the relation between the current output and the information at the later moment can be considered, and the generation effect and the performance in subsequent classification tasks are superior to those of the gated neural network GRU. (3) The L1 norm between the original missing data and the generated data is added into the loss function, so that the generated data can be effectively close to the original data in the aspects of numerical distance and distribution trend, and the generated complete artificial data is more consistent with the expectation.
Drawings
FIG. 1 is a diagram of the BiGRU-BEGAN generative adversarial network framework of the present invention;
FIG. 2 is a schematic diagram of the auto-encoder model in the boundary equilibrium GAN (BEGAN);
FIG. 3 is a schematic diagram of the bidirectional recurrent neural network BiGRU unrolled in time;
FIG. 4 is a diagram of the missing time series data filling and fault diagnosis model architecture of the present invention;
FIG. 5(a) compares the mean of the data filled by generators using a GRU network and a BiGRU network, respectively, with that of the original data;
FIG. 5(b) compares the variance of the data filled by generators using a GRU network and a BiGRU network, respectively, with that of the original data;
FIG. 6 compares the MRE between the original data and the data filled by generators using a GRU network and a BiGRU network, respectively;
FIG. 7 compares the training-set classification accuracy over 2500 iterations for the filled data set and the original complete data set;
FIG. 8 compares the validation-set classification accuracy over 2500 iterations for the filled data set and the original complete data set.
Detailed Description
The invention is further described with reference to the drawings and the detailed description.
By combining a generative adversarial network with a bidirectional recurrent neural network, the method makes maximal use of the data that actually exist, generates complete artificial data that follows the feature distribution of the original missing data, and then fills the missing data completely. The filled complete data can be used for subsequent fault classification tasks, improving classification accuracy on missing data.
The invention is realized by the following steps:
Step one: data preprocessing. Acquire complete data from the historical operation of the industrial system as the original fault-diagnosis data set X, which contains n types of fault data, and apply random missingness to the complete data algorithmically: a loss event occurs with a specified probability p0 and does not occur with probability 1 - p0.
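The random-missingness preprocessing can be sketched as follows (a minimal illustration; the function name and the use of a boolean observation mask are my own, not from the patent):

```python
import numpy as np

def random_missing(X, p0, seed=None):
    """Drop each entry of a complete data set X independently with
    probability p0; missing positions are set to 0, as in step one.
    Returns the corrupted data and a mask (True = value observed)."""
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape) >= p0
    return np.where(mask, X, 0.0), mask
```

The mask is kept alongside the corrupted data so the filling step later knows which positions held real observations.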
Step two: for the missing multivariate time series data, construct the generative adversarial network model BiGRU-BEGAN on the basis of the boundary equilibrium GAN (BEGAN); the network structure is shown in FIG. 1. The main framework of BiGRU-BEGAN is the boundary equilibrium GAN, which comprises two parts, a discriminator D and a generator G, where the discriminator D is an auto-encoder model. As shown in FIG. 2, where Encoder denotes the encoder and Decoder the decoder, it is essentially a mapping process. BEGAN uses the auto-encoder as discriminator to compute the error between the reconstruction losses of the real data and the generated data: if the two reconstruction losses are similar, the real and generated data are similarly distributed. The model is trained by optimizing, via the Wasserstein distance, a lower bound on the distance between the pixel-wise losses of the real and generated data. The generator G is a bidirectional recurrent neural network (BiGRU) divided into a forward recurrent layer and a backward recurrent layer. The basic idea is that each training sequence is fed to two gated recurrent units (GRUs), one forward and one backward; the hidden state in each network depends on the hidden state and input of the previous step, and both networks are connected to the output layer. FIG. 3 shows the bidirectional recurrent network unrolled over time steps.
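The two components can be sketched in PyTorch as below. The class names, layer sizes, and the choice of linear encoder/decoder layers are illustrative assumptions; the patent does not specify these details:

```python
import torch
import torch.nn as nn

class BiGRUGenerator(nn.Module):
    """Generator G: a bidirectional GRU mapping the feature sequence z|t
    to a complete multivariate time series."""
    def __init__(self, n_features, hidden_size=32):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_size, num_layers=1,
                          batch_first=True, bidirectional=True)
        # forward and backward hidden states are concatenated, hence 2x
        self.out = nn.Linear(2 * hidden_size, n_features)

    def forward(self, z):              # z: (batch, seq_len, n_features)
        h, _ = self.gru(z)
        return self.out(h)

class AutoEncoderDiscriminator(nn.Module):
    """Discriminator D: an auto-encoder whose reconstruction error plays
    the role of the BEGAN score; its encoder also yields the feature z|t
    that is handed to the generator."""
    def __init__(self, n_features, latent_size=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, latent_size),
                                     nn.ReLU(),
                                     nn.Linear(latent_size, n_features))
        self.decoder = nn.Linear(n_features, n_features)

    def forward(self, x):
        return self.decoder(self.encoder(x))
```

Here the encoder output is kept at the input dimensionality so it can be fed to the generator directly; a genuinely low-dimensional latent would need a matching projection on the generator side.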
The gated recurrent unit (GRU) in the bidirectional BiGRU model consists of an update gate and a reset gate. The update gate z_t determines how much previously memorized information is carried to the current time: it controls how much of the history state h_{t-1} is kept in the current output state h_t and how much of the current candidate state h̃_t is retained. The reset gate r_t determines how new input information is combined with previous memory: it decides whether the current candidate state h̃_t depends on the network state h_{t-1} of the previous time, and the value of r_t determines the degree of that dependence. The update formulas are:
h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t
h̃_t = tanh(W_c·x_t + U_c·(r_t ⊙ h_{t-1}) + b_c)
z_t = σ(W_z·x_t + U_z·h_{t-1} + b_z)
r_t = σ(W_r·x_t + U_r·h_{t-1} + b_r)
where h̃_t is the current candidate state, x_t the input at the current time, h_t the current hidden state, h_{t-1} the hidden state of the previous time, z_t the update gate, and r_t the reset gate; W_r, U_r, b_r are the weights and bias of the reset gate, W_z, U_z, b_z the weights and bias of the update gate, and W_c, U_c, b_c the weights and bias of the candidate state; σ is the sigmoid function. To mitigate the vanishing-gradient problem, the tanh function is used when updating the memory cell:
tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x})
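The gate equations above can be checked numerically with a single-step sketch (parameter names follow the W, U, b notation of the text; the dictionary layout is my own):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W, U, b):
    """One GRU update following the formulas above. W, U, b are dicts with
    keys 'r' (reset gate), 'z' (update gate) and 'c' (candidate state)."""
    r_t = sigmoid(W['r'] @ x_t + U['r'] @ h_prev + b['r'])
    z_t = sigmoid(W['z'] @ x_t + U['z'] @ h_prev + b['z'])
    h_tilde = np.tanh(W['c'] @ x_t + U['c'] @ (r_t * h_prev) + b['c'])
    # the update gate blends the previous state with the candidate state
    return (1.0 - z_t) * h_prev + z_t * h_tilde
```

As a sanity check: with all parameters zero, both gates sigmoid to 0.5 and the candidate state is tanh(0) = 0, so the new state is exactly half the previous one.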
Step three: training a BiGRU-BEGAN network model, comprising the following steps:
(1) Parameter setting. Set the training batch size blocksize, the number of iterations epoch, the learning rate α, the Loss_1 loss-function coefficient θ, the update learning rate λ, and the weight coefficient k_t of the generated-data reconstruction loss.
(2) Train discriminator D. Extract the feature z|t of the original missing data as the low-dimensional generator input, let the generator produce G(z|t), take the original missing data x|t and the generated data G(z|t) as discriminator inputs, compute the discriminator loss L_D = L(x|t) - k_t·L(G(z|t)), and update the discriminator weights with the Adam optimization algorithm:
d_w ← ∇_w[L(x|t) - k_t·L(G(z|t))]
w_d ← w_d - α·Adam(w_d, d_w)
k_{t+1} ← k_t + λ·(r·L(x|t) - L(G(z|t)))
k_{t+1} ← min(max(k_{t+1}, 0), 1)
where k_t is the weight coefficient of the generated-data reconstruction loss and λ is the learning rate for updating k_t; x|t denotes the original missing data and G(z|t) the generated data; L(x|t) and L(G(z|t)) are the reconstruction losses of the original and generated data, respectively; w_d is the weight parameter of the discriminator and d_w the gradient with respect to w_d.
(3) Train generator G. Use the low-dimensional vector z|t produced from the original-data features as the generator input to generate data G(z|t), compute the generator loss L_G = L(G(z|t)) + θ·Loss_1, and update the generator weights with the Adam optimization algorithm:
g_w ← ∇_w[L(G(z|t)) + θ·Loss_1]
w_g ← w_g - α·Adam(w_g, g_w)
where L(G(z|t)) is the reconstruction loss of the generated data, θ is the weight coefficient, w_g is the weight parameter of the generator, and g_w the gradient with respect to the generator parameters. Loss_1 is the L_1 norm between the real data and the generated data:
Loss_1 = ||G(z|t) - x|t||_1
(4) Train the discriminator and generator alternately until the value of the M_global loss stabilizes and no longer decreases, where M_global is:
M_global = L(x|t) + ||r·L(x|t) - L(G(z|t))||_1
where r is the diversity ratio that adjusts the balance between generator and discriminator:
r = E[L(G(z|t))] / E[L(x|t)]
When E(L(x|t)) = E(L(G(z|t))), the discriminator and generator reach equilibrium.
Step four: generate complete artificial data with the trained BiGRU-BEGAN model and fill the original missing data. Input the feature vector z|t extracted from the original missing data into the trained model to generate a complete time series, and interpolate the generated data into the corresponding missing positions of the original missing data.
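The filling itself is a masked merge, sketched below (assuming the observation mask from the preprocessing step was kept; the function name is mine):

```python
import numpy as np

def fill_missing(x_missing, mask, generated):
    """Step four: keep observed values of the original series (mask True)
    and take the generator's output only at the missing positions."""
    return np.where(mask, x_missing, generated)
```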
Specific examples are given below with reference to specific parameters:
the invention provides a model BiGRU-BEGAN for generating an antagonistic network and a bidirectional recurrent neural network by combining boundary balance based on the generation of the antagonistic network BEGAN by the boundary balance and fully considering the upper and lower time information of time sequence data, and the overall structure is shown in figure 1. Boundary equilibrium generating countermeasureThe discriminator D used by the network bgan is a self-coder model, as shown in fig. 2. Based on the characteristic that the self-encoder can extract the data features, the original missing data x | t is input into the encoder, the output real data features p are used as implicit vectors and led to the input of a generator, and the distribution (Fc) of the guiding variable z | t can approximate to the distribution of the real data, so that the distribution trend of the generated data is close to the real data. The generator G uses a bidirectional recurrent neural network BiGRU, which is shown in fig. 3 as a bidirectional recurrent neural network that is spread along time. On the loss function, the L between the original missing data and the generated data1The norm is part of the loss function of the generator in order to approximate the generated data to the original data in terms of value and overall distribution trend. And finally, training missing data by using an end-to-end BiGRU-BEGAN model to obtain generated complete artificial data, filling original missing data, performing a classification task by using the completely filled data, and comparing the classification task with the classification effect of an original complete data set, wherein FIG. 4 is a model structure for missing time sequence data filling and fault diagnosis.
The filling method of the invention for missing industrial fault-diagnosis time series data comprises the following steps:
Step one: data preprocessing. Acquire complete data from the operation of the industrial system as the original fault-diagnosis data set X, which contains 6 data types. The Fault1 data set is chosen for missingness processing, and random missingness is applied algorithmically to the originally complete Fault1 data: a loss event occurs with a specified probability p0 and does not occur with probability 1 - p0 (here the random loss rate is set to p0 = 50%). To fully account for long-interval missingness, three groups of data covering 5 s, 10 s, and 15 s of consecutive samples, one group of five rows by five columns, and one group of ten rows by ten columns are additionally removed by hand, and the missing positions are filled with 0 to obtain the required training data set;
Step two: for the missing multivariate time series data, construct the BiGRU-BEGAN model on the basis of the boundary equilibrium GAN (BEGAN), which comprises a discriminator D and a generator G; the discriminator D is an auto-encoder model and the generator uses a bidirectional gated recurrent unit (BiGRU). The BiGRU network consists of a fully connected layer and a GRU layer. For the GRU layer, input_size (the number of features of the input data), hidden_size (the number of features of the hidden layer), and num_layers (the number of recurrent layers) are all set to 1 according to the input data, and bidirectional is set to True, giving the bidirectional recurrent network BiGRU used by the invention.
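The parameter names above match PyTorch's `nn.GRU`, so the described configuration can be reproduced directly (`batch_first=True` is my assumption about the data layout):

```python
import torch
import torch.nn as nn

# GRU layer as configured in the text: input_size, hidden_size and
# num_layers all set to 1, bidirectional=True
gru = nn.GRU(input_size=1, hidden_size=1, num_layers=1,
             batch_first=True, bidirectional=True)

x = torch.randn(8, 20, 1)       # (batch, seq_len, input_size)
out, h_n = gru(x)
# the bidirectional setting doubles the output feature dimension
print(out.shape)                # torch.Size([8, 20, 2])
```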
Step three: training a BiGRU-BEGAN network model, comprising the following steps:
(1) Parameter setting. Set the path of the training data, the training batch size blocksize (120), the number of iterations epoch (50000), the learning rate α (0.0001), the Loss_1 loss-function coefficient θ (1), the update learning rate λ (0.001), and the weight coefficient k_t of the generated-data reconstruction loss (initial value 0). During training, batch normalization is applied to the input data.
(2) Train the discriminator. Extract the low-dimensional feature vector z|t of the original missing data and input it to the generator, which produces G(z|t); take the original missing data x|t and the generated data G(z|t) as discriminator inputs, compute the discriminator loss L_D = L(x|t) - k_t·L(G(z|t)), and update the discriminator weights with the Adam optimization algorithm:
d_w ← ∇_w[L(x|t) - k_t·L(G(z|t))]
w_d ← w_d - α·Adam(w_d, d_w)
k_{t+1} ← k_t + λ·(r·L(x|t) - L(G(z|t)))
k_{t+1} ← min(max(k_{t+1}, 0), 1)
where k_t is the weight coefficient of the generated-data reconstruction loss, λ is the learning rate for updating k_t, x|t denotes the original missing data, G(z|t) the generated data, and L(x|t) and L(G(z|t)) the reconstruction losses of the original and generated data, respectively; w_d is the weight parameter of the discriminator and d_w the gradient with respect to w_d.
(3) Train generator G. Use the low-dimensional vector z|t produced from the original-data features as the generator input to generate data G(z|t), compute the generator loss L_G = L(G(z|t)) + θ·Loss_1, and update the generator weights with the Adam optimization algorithm:
g_w ← ∇_w[L(G(z|t)) + θ·Loss_1]
w_g ← w_g - α·Adam(w_g, g_w)
where L(G(z|t)) is the reconstruction loss of the generated data, θ is the weight coefficient, w_g is the weight parameter of the generator, and g_w the gradient with respect to the generator parameters. Loss_1 is the L_1 norm between the real data and the generated data:
Loss_1 = ||G(z|t) - x|t||_1
(4) Train the discriminator and generator alternately until the value of the M_global loss stabilizes and no longer decreases, where M_global is:
M_global = L(x|t) + ||r·L(x|t) - L(G(z|t))||_1
where r is the diversity ratio that adjusts the balance between generator and discriminator:
r = E[L(G(z|t))] / E[L(x|t)]
When E(L(x|t)) = E(L(G(z|t))), the discriminator and generator reach equilibrium.
Step four: generate complete data with the trained BiGRU-BEGAN model and fill the original missing data. Input the feature vector z|t extracted from the original missing data into the trained model to generate complete artificial time series data, fill the generated data into the corresponding missing positions of the original data, establish a 1D/2D-CNN fault diagnosis model on the filled data set, and observe the classification performance.
Analysis of the fault diagnosis experiment on the nuclear power plant thermal-hydraulic system:
Following the parameter settings above, the data come from a simulation data set for fault diagnosis of a nuclear power plant thermal-hydraulic system; as shown in Table 1, it comprises time series condition data for 5 fault states and a normal-state data set. The Fault1 data set serves as the filling experiment data set, and training follows the constructed BiGRU-BEGAN network model and training procedure. FIGS. 5(a) and 5(b) compare the mean and variance, respectively, of the data obtained after generators using a BiGRU network and a GRU network fill the original missing data, against the original complete data; FIG. 6 compares the corresponding MRE (mean relative error) against the original complete data. The MRE is computed as:
MRE = (1/n)·Σ_{i=1}^{n} |x'_i - x_i| / |x_i|
where x' is the generated data and x the original data. Table 2 shows the test-set classification accuracy of the data filled by generators using the BiGRU network and the GRU network, respectively, against the raw data. These indices show that the data filled by the present method is very close to the original data in mean and variance, and the MRE between the original data and the data filled with the BiGRU generator is smaller, reflecting a smaller error between the filled and original data. In summary, the BiGRU-BEGAN model constructed by the invention yields better data quality when generating and filling time series data.
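The MRE above can be computed directly (a sketch; the function name is mine):

```python
import numpy as np

def mre(generated, original):
    """Mean relative error between generated data x' and original data x."""
    generated = np.asarray(generated, dtype=float)
    original = np.asarray(original, dtype=float)
    return float(np.mean(np.abs(generated - original) / np.abs(original)))
```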
Based on the BiGRU-BEGAN model, the filled Fault1 data is substituted back into the original data set for a six-class classification task, and the fault classification accuracy is observed. Table 3 lists the numbers of data samples in the training, test, and validation sets used for classification. FIGS. 7 and 8 compare, over 2500 iterations, the six-class classification accuracy on the training set and the validation set, respectively, for the original complete data and for the data set in which Fault1 is replaced by the data filled by the invention; the dotted line is the classification accuracy of the original complete data and the solid line that of the filled data. The curves show that the classification accuracy stabilizes by 2500 iterations and that the accuracy on the filled data is very close to that on the original data. Thus, even with a 50% missing rate and long-interval missingness, data filled with this model classifies almost as accurately as the original complete data and can be used for fault classification tasks.
TABLE 1 Nuclear power plant data types and data sample quantities
[table rendered as an image in the source; contents not reproduced]
TABLE 2 Test-set classification accuracy comparison
[table rendered as an image in the source; contents not reproduced]
TABLE 3 Numbers of data samples
[table rendered as an image in the source; contents not reproduced]

Claims (3)

1. A method for filling missing time series data in an industrial system is characterized by comprising the following steps:
step one: data preprocessing: acquiring complete data from the historical operation of an industrial system as the original data set X for fault diagnosis, wherein X contains n types of fault data, and applying random loss processing to the complete data; that is, a loss event occurs with a specified probability p0 and does not occur with probability 1−p0;
step two: for the missing multivariate time-series data, constructing a generative adversarial network model BiGRU-BEGAN based on the boundary equilibrium generative adversarial network (BEGAN), comprising a discriminator D and a generator G, wherein the discriminator D is an autoencoder model comprising an Encoder and a Decoder, and the generator G adopts a bidirectional gated recurrent network BiGRU; the BiGRU network model is divided into a forward recurrent layer and a backward recurrent layer, each training sequence being processed by two gated recurrent networks, one forward and one backward, whose hidden states depend on the hidden state at the previous moment and the current input, with both networks connected to the same output layer;
step three: training a BiGRU-BEGAN network model;
step four: generating complete artificial data with the trained BiGRU-BEGAN model and filling the original missing data: inputting the feature vector z|t extracted from the original missing data into the trained model to generate complete time-series data, and interpolating the generated data into the corresponding missing positions of the original missing data.
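The random-loss preprocessing of step one can be sketched as below. This is an illustration under an assumption: each observation is dropped independently with probability p0 (a per-point Bernoulli loss event), with a mask recording surviving positions; the function name, `None` encoding, and seeding are not part of the claim.

```python
import random

def apply_random_missing(series, p0, seed=0):
    """Drop each value independently with probability p0.
    Returns (observed, mask): mask[i] == 1 keeps the value,
    mask[i] == 0 marks it missing (stored as None)."""
    rng = random.Random(seed)
    observed, mask = [], []
    for x in series:
        if rng.random() < p0:   # loss event occurs with probability p0
            observed.append(None)
            mask.append(0)
        else:                   # value retained with probability 1 - p0
            observed.append(x)
            mask.append(1)
    return observed, mask
```

The pair (observed, mask) is what the later training steps treat as the original missing data x|t.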
2. The method for filling missing time series data in an industrial system according to claim 1, wherein the training of the BiGRU-BEGAN network model in step three specifically comprises:
S3.1: setting parameters: the batch size, the number of iterations epoch, the hyper-parameter learning rate α, the Loss_1 loss-function coefficient θ, the update learning rate λ, and the weight coefficient k_t of the generated-data reconstruction loss;
S3.2: training the discriminator D: extracting the feature z|t of the original missing data as the low-dimensional vector input to the generator, generating data G(z|t) with the generator, taking the original missing data x|t and the generated data G(z|t) as inputs to the discriminator, and computing the discriminator loss function L_D = L(x|t) − k_t·L(G(z|t)); then updating the discriminator weight parameters according to the Adam optimization algorithm:
d_w ← ∇_{w_d}(L(x|t) − k_t·L(G(z|t)))
w_d ← w_d − α·Adam(w_d, d_w)
k_{t+1} ← k_t + λ·(r·L(x|t) − L(G(z|t)))
k_{t+1} = min(max(k_{t+1}, 0), 1)
where k_t is the weight coefficient of the generated-data reconstruction loss and λ is its update learning rate; x|t denotes the original missing data and G(z|t) the generated data; L(x|t) and L(G(z|t)) denote the reconstruction losses of the original and generated data; w_d is the weight parameter of the discriminator and d_w is the gradient with respect to the discriminator parameter w_d;
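The discriminator objective and the k_t update of S3.2 can be illustrated with plain scalars. This is a sketch only: L(x|t) and L(G(z|t)) stand for the autoencoder reconstruction losses and are supplied here as precomputed numbers; the function names are hypothetical.

```python
def discriminator_loss(loss_real, loss_fake, k_t):
    """BEGAN-style discriminator loss: L_D = L(x|t) - k_t * L(G(z|t))."""
    return loss_real - k_t * loss_fake

def update_k(k_t, loss_real, loss_fake, lam, r):
    """k_{t+1} = clip(k_t + lambda * (r * L(x|t) - L(G(z|t))), 0, 1)."""
    k_next = k_t + lam * (r * loss_real - loss_fake)
    return min(max(k_next, 0.0), 1.0)   # keep the weight coefficient in [0, 1]
```

The clipping keeps k_t a valid mixing weight: when the generated-data loss falls below r times the real-data loss, k_t grows and the discriminator pushes harder against the generator.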
S3.3: training the generator G: using the low-dimensional vector z|t produced from the original-data features as the generator input to generate data G(z|t), computing the generator loss function L_G = L(G(z|t)) + θ·Loss_1, and updating the generator weight parameters according to the Adam optimization algorithm:
g_w ← ∇_{w_g}(L(G(z|t)) + θ·Loss_1)
w_g ← w_g − α·Adam(w_g, g_w)
where L(G(z|t)) denotes the reconstruction loss of the generated data, θ is the weight coefficient, w_g is the weight parameter of the generator, and g_w is the gradient with respect to the generator parameters; Loss_1 denotes the L_1 norm between the real data and the generated data, with the mathematical expression:
Loss_1 = ||G(z|t) − x|t||_1
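The generator objective of S3.3 can be sketched as follows. One assumption is made explicitly: since missing positions of x|t have no ground truth, the L_1 term is computed only over the observed positions of the masked input; the function names and mask convention are illustrative.

```python
def l1_masked(generated, original, mask):
    """||G(z|t) - x|t||_1 over positions where mask == 1 (observed values).
    Missing positions are skipped -- an assumption, not stated in the claim."""
    return sum(abs(g - o) for g, o, m in zip(generated, original, mask) if m == 1)

def generator_loss(recon_loss_fake, generated, original, mask, theta):
    """L_G = L(G(z|t)) + theta * Loss_1."""
    return recon_loss_fake + theta * l1_masked(generated, original, mask)
```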
S3.4: training the discriminator and the generator alternately until the M_global loss function value stabilizes and no longer decreases, where the M_global loss function is:
M_global = L(x|t) + ||r·L(x|t) − L(G(z|t))||_1
where r denotes a diversity ratio used to adjust the balance between the generator and the discriminator, given by:
r = E(L(G(z|t))) / E(L(x|t))
When r·E(L(x|t)) = E(L(G(z|t))), the discriminator and generator reach equilibrium.
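The convergence measure of S3.4 can be computed as below. This is a scalar sketch: in practice the expectations are batch averages of the reconstruction losses, and the function names are hypothetical.

```python
def diversity_ratio(expected_loss_fake, expected_loss_real):
    """r = E[L(G(z|t))] / E[L(x|t)], balancing generator and discriminator."""
    return expected_loss_fake / expected_loss_real

def m_global(loss_real, loss_fake, r):
    """M_global = L(x|t) + |r * L(x|t) - L(G(z|t))|."""
    return loss_real + abs(r * loss_real - loss_fake)
```

At equilibrium, r·L(x|t) equals L(G(z|t)), the penalty term vanishes, and M_global reduces to the real-data reconstruction loss, which is why a stable, non-decreasing M_global is used as the stopping criterion.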
3. The method for filling missing time series data in an industrial system according to claim 1 or 2, wherein in step two the gated recurrent network comprises an update gate and a reset gate: the update gate z_t determines how much previously memorized information is carried to the current time, controlling how much of the previous hidden state h_{t−1} and how much of the current candidate state h̃_t are retained in the current output state h_t; the reset gate r_t determines whether the candidate state h̃_t at the current time t depends on the network state h_{t−1} of the previous moment, and to what degree. The update formulas are:
h̃_t = tanh(W_c·x_t + U_c·(r_t ⊙ h_{t−1}) + b_c)
h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t
z_t = σ(W_z·x_t + U_z·h_{t−1} + b_z)
r_t = σ(W_r·x_t + U_r·h_{t−1} + b_r)
where h̃_t is the current candidate state information, x_t is the input at the current time, h_t is the current hidden state information, h_{t−1} is the hidden state information of the previous moment, z_t denotes the update gate, and r_t denotes the reset gate; W_r, U_r, b_r are the weight and bias parameters of the reset gate, W_z, U_z, b_z those of the update gate, and W_c, U_c, b_c those of the current candidate state; σ is the sigmoid function, and the tanh function is used when updating the memory cell, with the specific formulas:
σ(x) = 1 / (1 + e^(−x)),  tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
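The gated recurrent unit of claim 3 can be sketched for the scalar case. This is a minimal illustration with scalar weights (real implementations use weight matrices and vectors, and the BiGRU runs one such unit forward and one backward); the function names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x_t, h_prev, W_z, U_z, b_z, W_r, U_r, b_r, W_c, U_c, b_c):
    """One scalar GRU step following the update-gate / reset-gate equations."""
    z_t = sigmoid(W_z * x_t + U_z * h_prev + b_z)              # update gate
    r_t = sigmoid(W_r * x_t + U_r * h_prev + b_r)              # reset gate
    h_cand = math.tanh(W_c * x_t + U_c * (r_t * h_prev) + b_c) # candidate state
    h_t = (1.0 - z_t) * h_prev + z_t * h_cand                  # blend old state and candidate
    return h_t
```

With the update gate driven to zero (large negative b_z), h_t simply copies h_{t−1}, showing how the gate controls how much history is retained.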
CN202110818499.7A 2021-07-20 2021-07-20 Filling method of missing time sequence data in industrial system Active CN113591954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110818499.7A CN113591954B (en) 2021-07-20 2021-07-20 Filling method of missing time sequence data in industrial system

Publications (2)

Publication Number Publication Date
CN113591954A true CN113591954A (en) 2021-11-02
CN113591954B CN113591954B (en) 2023-10-27





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant