CN115457307A

CN115457307A - Chemical process fault diagnosis method based on improved residual error network

Info

Publication number: CN115457307A
Application number: CN202210954740.3A
Authority: CN
Inventors: 陈晓兵; 卢佳祺; 包涵; 康丽; 张冰莹; 张润
Original assignee: Huaiyin Institute of Technology
Current assignee: Huaiyin Institute of Technology
Priority date: 2022-08-10
Filing date: 2022-08-10
Publication date: 2022-12-09

Abstract

The invention discloses a chemical process fault diagnosis method based on an improved residual network. Simulation data of the Tennessee Eastman chemical process are collected and organized; the data are preprocessed by applying polynomial feature expansion to each sample and converting the result into a two-dimensional grayscale image; the residual network is improved by replacing the original 3 × 3 convolution with a depthwise separable convolution, replacing the last 1 × 1 convolution with an Inception module, and adding a channel attention module and a spatial attention module; the new residual network model is built with the PyTorch framework and trained on the normal state and several fault states selected from the training set; and the test set data corresponding to the fault states used in training are selected to evaluate the model. Compared with the prior art, the method targets complex chemical processes that exhibit coupling, nonlinearity and similar characteristics, and improves the fault diagnosis capability of the residual network on such data by improving the network structure.

Description

Chemical process fault diagnosis method based on improved residual network

Technical Field

The invention belongs to the technical field of supervised learning algorithms and fault diagnosis, and particularly relates to a chemical process fault diagnosis method based on an improved residual network.

Background

In recent years, as modern industrial production has grown in scale, the operational complexity of chemical processes has increased continuously, and aging and corrosion of operating equipment have become more common. As a result, safety hazards that are difficult to predict, such as chemical leaks, fires and explosions, may occur and can lead to casualties, environmental pollution and property loss. Fault diagnosis of chemical processes can therefore help prevent catastrophic accidents, reduce casualties and safeguard product quality.

The residual network (ResNet) is a convolutional neural network proposed by four Chinese scholars from Microsoft Research; it took first place in image classification and object recognition in the 2015 ImageNet Large Scale Visual Recognition Challenge. The main idea of ResNet is to add a direct connection channel to the network, an idea it shares with the Highway Network. Earlier network architectures applied a nonlinear transformation to the layer input, whereas the Highway Network allows a certain proportion of the output of the previous layer to be preserved. ResNet is very similar: it allows the original input information to be passed directly to a later layer, so that the layer only needs to learn the residual relative to the previous output rather than the whole output, which is why ResNet is called a residual network.

The Tennessee Eastman (TE) chemical process, created by the Eastman Chemical Company, mainly serves to provide a realistic industrial process control data set for research on fault diagnosis and monitoring of chemical processes. To date, some progress has been made in fault detection and diagnosis on the TE data set using conventional fault diagnosis methods, including analytical-model-based methods and methods based on qualitative empirical knowledge.

Tianwende et al. proposed an online parameter estimation method based on a dynamic model for chemical process fault diagnosis, as well as a model-based method for diagnosing leakage faults in chemical pipelines. Simani et al. proposed a robust model-based chemical process fault diagnosis method, with a diagnosis system based on robust estimation of the process output. Iri et al. were the first to apply the signed directed graph (SDG) to chemical process fault diagnosis; building on that algorithm, Tsuge proposed a chemical process system fault diagnosis algorithm based on SDG and its improved variants, "SDG with delay" and "multiple-stage SDG (MSDG)". Shiozaki et al. also proposed an improved SDG-based chemical process fault diagnosis algorithm based on these studies. Lianfeng et al. of Beijing University of Chemical Technology proposed an SDG modeling method for chemical process diagnosis models. The method combines SDG with qualitative trend analysis (QTA): a bidirectional reasoning algorithm based on hypothesis and verification finds all possible fault causes and the corresponding consistent paths in the SDG model; an improved QTA algorithm then extracts and analyzes the trends of the nodes on those consistent paths; and a new consistency rule based on qualitative trends identifies the true fault cause among the candidate causes. Qian et al. proposed an expert system method for real-time fault diagnosis in complex chemical processes. Xu et al. proposed an expert system for online fault diagnosis in an industrial lubricant refining process.

For fault diagnosis in chemical processes, existing research has mainly focused on improving the traditional fault diagnosis methods, including analytical-model-based methods and methods based on qualitative empirical knowledge, by obtaining more comprehensive prior information or expert experience and knowledge in order to construct more complete mathematical models and systems. It makes little reasonable use of the large amount of chemical process data that already exists.

Traditional fault diagnosis methods suffer from long modeling time, low diagnosis accuracy, frequent missed alarms and false alarms, and poor generalization. With advances in technology, large amounts of fault data can now be stored, and building a model in a data-driven manner does not require an accurate mechanistic model, which makes the data-driven approach well suited to today's complex process industries.

Disclosure of Invention

Purpose of the invention: in view of the problems identified in the background, the invention provides a chemical process fault diagnosis method based on an improved residual network. A fault diagnosis model is built in a data-driven manner. Depthwise separable convolution is applied to the residual network, which reduces the number of parameters and shortens the model building time. The residual unit of the network is improved with an Inception module, so that the improved residual network can extract deep information from the data at different levels. A channel attention mechanism and a spatial attention mechanism are introduced into the residual unit, giving the model the ability to locate important information automatically. Together these measures address the problems of fault diagnosis in chemical processes and improve the fault diagnosis accuracy.

Technical scheme: the invention discloses a chemical process fault diagnosis method based on an improved residual network, which comprises the following steps:

Step 1: preparing a data set: collecting the data generated by the chemical process, and storing separately the data simulated in the normal state and under 21 different faults;

Step 2: preprocessing the collected data: expanding the feature vector of each sample with a polynomial feature expansion method, converting the one-dimensional feature vector into a two-dimensional grayscale image, and randomly dividing the data set into a training set and a test set;

Step 3: improving the architecture of the residual network, building the network model with the PyTorch framework, and selecting the normal state and several fault states from the training set to train the model;

the improved residual network has the following structure:

first, features are extracted from the grayscale image by the conv1 convolution layer and screened once by the max pooling layer pool1; next come 4 groups of improved residual units, namely conv2_x, conv3_x, conv4_x and conv5_x, which contain different numbers of bottleneck structures: conv2_x contains 2 residual units, conv3_x contains 4, conv4_x contains 6 and conv5_x contains 3; the features are then screened again by a global average pooling layer; two fully connected layers FC1 and FC2 follow to mine the patterns hidden in the features; finally, a softmax classifier performs fault diagnosis and outputs the result;

the improved residual unit comprises two branches: one branch consists, in sequence, of a 1 × 1 convolution layer followed by batch normalization and a ReLU activation function, a depthwise separable convolution followed by batch normalization and a ReLU activation function, an Inception module that extracts features at different scales, a channel attention layer and a spatial attention layer; the other branch is a 1 × 1 skipconv convolution layer; finally, the results of the two branches are summed and then processed by a ReLU activation function;

Step 4: selecting the test set data corresponding to the fault states used in training to evaluate the effect of the model.

Further, the specific method of step 1 is as follows:

Step 1.1: collecting the original data generated by the chemical process, storing the normal-state data in G00, and storing the data of the 21 faults in G01, G02, ..., G21 respectively;

Step 1.2: extracting 960 samples from the normal-state data set G00 and storing them in the data set G0_00;

Step 1.3: there are 21 fault states, and the fault signal is introduced from the 160th sample; the 160th to 960th samples of each fault state are extracted and stored in G0_01, ..., G0_21 respectively.
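
The extraction in steps 1.2 and 1.3 can be sketched as follows. This is a minimal illustration that assumes each simulation run is held in memory as a list of 960 rows and that "the 160th to 960th" samples are counted from 1 and inclusive; the function name `extract_fault_window` and the placeholder rows are hypothetical and do not come from the patent.

```python
def extract_fault_window(run, first=160, last=960):
    """Return samples first..last (1-indexed, inclusive) of one simulation run."""
    if len(run) < last:
        raise ValueError(f"run has {len(run)} samples, expected at least {last}")
    # Convert the 1-indexed inclusive range into a Python slice.
    return run[first - 1:last]


# Placeholder run: 960 samples with 52 features each (dummy values only).
fault_run = [[float(i)] * 52 for i in range(1, 961)]
window = extract_fault_window(fault_run)
```

Under this reading each fault data set holds 801 samples; if the range were meant to exclude one endpoint, the count would differ by one.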

Further, the specific method of step 2 is as follows:

Step 2.1: polynomial feature expansion is applied to each sample in the data sets G0_00, G0_01, ..., G0_21. Each original sample is a feature vector consisting of 52 features, whose s-th feature is indexed by s ∈ [1, 52]. The feature vector of the original sample is expanded with a second-order polynomial expansion method to obtain a new feature vector, whose s-th feature is indexed by s ∈ [1, 1430];

Step 2.2: the one-dimensional feature vector obtained after expansion is converted into a two-dimensional grayscale image of size 38 × 38, and the positions that receive no value are filled with 0;

Step 2.3: the processed two-dimensional grayscale image data are divided into a training set and a test set; the normal-state and fault-state data used to train the model are stored in the training sets G00_tr, G01_tr, ..., G21_tr respectively, and the normal-state and fault-state data used to test the model are stored in G00_te, G01_te, ..., G21_te respectively.
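
Steps 2.1 and 2.2 can be sketched as below. The sketch assumes that the second-order expansion keeps the 52 original features and adds all pairwise products (squares included) without a constant term, which is one scheme that yields exactly the 1430 features stated above; the ordering of the terms, the row-by-row image layout and the helper names `poly2_expand` and `to_gray_image` are assumptions, and any scaling to a grayscale value range is left out because the text does not specify it.

```python
from itertools import combinations_with_replacement


def poly2_expand(features):
    """Second-order polynomial expansion without a constant term."""
    expanded = list(features)  # degree-1 terms
    # Degree-2 terms: all products x_i * x_j with i <= j (squares included).
    expanded.extend(a * b for a, b in combinations_with_replacement(features, 2))
    return expanded


def to_gray_image(vector, side=38):
    """Lay a 1-D vector out row by row in a side x side grid, zero-padded."""
    if len(vector) > side * side:
        raise ValueError("vector does not fit in the image")
    padded = list(vector) + [0.0] * (side * side - len(vector))
    return [padded[r * side:(r + 1) * side] for r in range(side)]


sample = [0.01 * (s + 1) for s in range(52)]  # dummy 52-feature sample
expanded = poly2_expand(sample)
image = to_gray_image(expanded)
```

With 52 inputs the expansion gives 52 + 52 · 53 / 2 = 1430 values, and a 38 × 38 image has 1444 positions, so the last 14 positions are filled with 0.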

Further, in step 3, the convolution kernel size of the conv1 convolution layer is set to 3 × 3, the number of channels to 64, the stride to 2 and the padding to 1; the kernel size of the pool1 max pooling layer is set to 3 × 3, the stride to 2 and the padding to 1; the number of neurons in the fully connected layer FC1 is set to 1024; and the number of neurons in the fully connected layer FC2 is set to 5.
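
As a quick check of what these settings imply for a 38 × 38 input image, the spatial size after conv1 and pool1 can be computed with the usual output-size formula. The sketch assumes floor rounding, which is the default behaviour of convolution and pooling layers in PyTorch; the helper name `conv_output_size` is hypothetical.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# conv1: kernel 3, stride 2, padding 1, applied to the 38 x 38 grayscale image.
after_conv1 = conv_output_size(38, kernel=3, stride=2, padding=1)
# pool1: kernel 3, stride 2, padding 1, applied to the conv1 output.
after_pool1 = conv_output_size(after_conv1, kernel=3, stride=2, padding=1)
```

Under these assumptions the feature map is 19 × 19 after conv1 and 10 × 10 after pool1.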

Further, the depthwise separable convolution used in the improved residual unit first extracts feature map information with a depthwise convolution with a kernel size of 3 × 3, a stride of 1 and a padding of 1, then fuses the information with a pointwise convolution with a kernel size of 1 × 1 and a stride of 1, and finally outputs the feature map.
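
The parameter saving from this replacement can be illustrated with a small count. The sketch ignores bias terms, and the channel counts (64 in, 64 out) are hypothetical values chosen only for illustration; the function names are not from the patent.

```python
def standard_conv_params(c_in, c_out, kernel):
    """Weights of a standard convolution (bias terms ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(c_in, c_out, kernel):
    """Weights of a depthwise convolution plus a 1 x 1 pointwise convolution."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


standard = standard_conv_params(64, 64, 3)
separable = separable_conv_params(64, 64, 3)
```

For these example sizes the standard 3 × 3 convolution has 36864 weights and the depthwise separable version has 4672, roughly an eightfold reduction.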

Further, the Inception module used in the improved residual unit comprises four branches:

the first branch is a 1 × 1 convolution layer followed by batch normalization and a ReLU activation function;

the second branch consists, in sequence, of a 1 × 1 convolution layer followed by batch normalization and a ReLU activation function, and a 3 × 3 convolution layer followed by batch normalization and a ReLU activation function;

the third branch consists, in sequence, of a 1 × 1 convolution layer followed by batch normalization and a ReLU activation function, and a 5 × 5 convolution layer followed by batch normalization and a ReLU activation function;

the fourth branch consists, in sequence, of a 3 × 3 max pooling layer and a 1 × 1 convolution layer followed by batch normalization and a ReLU activation function;

the feature maps generated by the four branches are aggregated at the output to form a very deep feature map.

Further, in the Inception module in step 3, the 1 × 1 convolution layers have a stride of 1; the 3 × 3 convolution layer has a stride of 1 and a padding of 1; the 5 × 5 convolution layer has a stride of 1 and a padding of 2; and the 3 × 3 max pooling layer has a stride of 1 and a padding of 1.
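
These stride and padding choices keep the spatial size identical across the four branches, which is what allows their outputs to be aggregated. The sketch below checks this with the standard output-size formula. It assumes a padding of 0 for the 1 × 1 convolutions (the text gives only their stride), assumes that aggregation means concatenation along the channel axis as in the usual Inception design, and uses hypothetical input size and channel counts.

```python
def out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) of the distinct layer types used in the branches.
# Padding 0 for the 1 x 1 convolution is an assumption, not stated in the text.
layers = {
    "1x1 conv": (1, 1, 0),
    "3x3 conv": (3, 1, 1),
    "5x5 conv": (5, 1, 2),
    "3x3 max pool": (3, 1, 1),
}

input_size = 10  # hypothetical feature map height and width
sizes = {name: out_size(input_size, *cfg) for name, cfg in layers.items()}

# Hypothetical output channels per branch; concatenation adds them up.
branch_channels = {"branch1": 16, "branch2": 32, "branch3": 8, "branch4": 8}
total_channels = sum(branch_channels.values())
```

Every layer maps a 10 × 10 input to a 10 × 10 output, so the four branch outputs differ only in channel count and can be stacked into one deeper feature map.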

Further, the channel attention layer used in the improved residual unit comprises two branches: one branch retains the original feature map $F$, and the other contains a channel attention module that determines which channels carry the more important features, yielding a channel description $F'_c$. At the output, the contents of the two branches are multiplied element-wise to obtain a new feature map $F'$, i.e. $F' = F \times F'_c$. The spatial attention layer likewise comprises two branches: one branch retains the received feature map $F'$, and the other uses a spatial attention module to determine where the important information lies in the feature map, yielding a spatial description $F'_s$. At the output, the contents of the two branches are multiplied element-wise to obtain a new feature map $F''$, i.e. $F'' = F' \times F'_s$.

Further, the specific method of the channel attention module used by the channel attention layer is as follows:

Step 11.1: global average pooling AvgPool is applied to the input feature map $F$ of size $H \times W \times C$, i.e. $F_a = \mathrm{AvgPool}(F)$, yielding a channel description $F_a$ of size $1 \times 1 \times C$, where $H$ represents the height of the received feature map, $W$ its width and $C$ its depth;

Step 11.2: global max pooling MaxPool is applied to the input feature map $F$ of size $H \times W \times C$, i.e. $F_m = \mathrm{MaxPool}(F)$, yielding a channel description $F_m$ of size $1 \times 1 \times C$, with $H$, $W$ and $C$ as above;

Step 11.3: a parameter-shared multilayer perceptron MLP processes $F_a$, i.e. $F'_a = \mathrm{MLP}(F_a)$, yielding a new description $F'_a$;

Step 11.4: the same parameter-shared multilayer perceptron MLP processes $F_m$, i.e. $F'_m = \mathrm{MLP}(F_m)$, yielding a new description $F'_m$;

Step 11.5: the two new channel descriptions $F'_a$ and $F'_m$ are added to obtain a new description $F_{\mathrm{add}}$, i.e. $F_{\mathrm{add}} = F'_a + F'_m$;

Step 11.6: the Sigmoid activation function is applied to $F_{\mathrm{add}}$ to obtain the final channel description $F'_c$, i.e. $F'_c = \mathrm{Sigmoid}(F_{\mathrm{add}})$.
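
Steps 11.1 to 11.6, together with the final multiplication by the feature map, can be sketched in NumPy as follows. The text does not specify the structure of the shared MLP, so the sketch assumes one hidden layer with a ReLU activation and no bias; the tensor sizes, the random weights and the names `channel_attention`, `w1` and `w2` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feature_map, w1, w2):
    """Return (F'_c, F') for a feature map F of shape (H, W, C)."""
    f_a = feature_map.mean(axis=(0, 1))  # global average pooling, shape (C,)
    f_m = feature_map.max(axis=(0, 1))   # global max pooling, shape (C,)

    def shared_mlp(v):
        # The same weights process both descriptions (parameter sharing).
        return np.maximum(v @ w1, 0.0) @ w2

    f_add = shared_mlp(f_a) + shared_mlp(f_m)
    weights = sigmoid(f_add)             # F'_c: one weight per channel
    # Broadcasting multiplies every spatial position by its channel weight.
    return weights, feature_map * weights


rng = np.random.default_rng(0)
H, W, C, hidden = 10, 10, 16, 4          # hypothetical sizes
F = rng.standard_normal((H, W, C))
w1 = rng.standard_normal((C, hidden)) * 0.1
w2 = rng.standard_normal((hidden, C)) * 0.1
f_c, f_prime = channel_attention(F, w1, w2)
```

The output keeps the shape of the input feature map; only the relative strength of the channels changes, since each weight lies strictly between 0 and 1.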

Further, the specific method of the spatial attention module used by the spatial attention layer is as follows:

Step 12.1: average pooling AvgPool along the channel dimension is applied to the input feature map $F'$ of size $H \times W \times C$, i.e. $F_a = \mathrm{AvgPool}(F')$, yielding a spatial description $F_a$ of size $H \times W \times 1$, where $H$ represents the height of the received feature map, $W$ its width and $C$ its depth;

Step 12.2: max pooling MaxPool along the channel dimension is applied to the input feature map $F'$ of size $H \times W \times C$, i.e. $F_m = \mathrm{MaxPool}(F')$, yielding a spatial description $F_m$ of size $H \times W \times 1$, with $H$, $W$ and $C$ as above;

Step 12.3: $F_a$ and $F_m$ are concatenated along the channel dimension to obtain a new description $F_{\mathrm{con}}$ of size $H \times W \times 2$, where $H$ represents the height of the received feature map, $W$ its width, and the depth is 2;

Step 12.4: a convolution layer conv_spa with a kernel size of 3 × 3, a stride of 1 and a padding of 1 processes $F_{\mathrm{con}}$ to obtain a new description $F_s$, i.e. $F_s = \mathrm{conv\_spa}(F_{\mathrm{con}})$;

Step 12.5: the Sigmoid function is applied to $F_s$ to obtain the final weight coefficients $F'_s$, i.e. $F'_s = \mathrm{Sigmoid}(F_s)$.
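
Steps 12.1 to 12.5, followed by the multiplication with the received feature map, can be sketched in NumPy as below. The sketch assumes that conv_spa has a single output channel and no bias and that it is computed as a cross-correlation, as in common deep learning frameworks; the tensor sizes, random weights and function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv3x3_same(x, kernel):
    """3 x 3 convolution, stride 1, zero padding 1.

    x has shape (H, W, 2); kernel has shape (3, 3, 2); the result is (H, W).
    """
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))  # zero padding on H and W only
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3, :] * kernel)
    return out


def spatial_attention(f_prime, kernel):
    """Return (F'_s, F'') for a feature map F' of shape (H, W, C)."""
    f_a = f_prime.mean(axis=2, keepdims=True)    # (H, W, 1)
    f_m = f_prime.max(axis=2, keepdims=True)     # (H, W, 1)
    f_con = np.concatenate([f_a, f_m], axis=2)   # (H, W, 2)
    f_s = conv3x3_same(f_con, kernel)            # (H, W)
    weights = sigmoid(f_s)[..., np.newaxis]      # F'_s, shape (H, W, 1)
    # Broadcasting multiplies every channel by the weight of its position.
    return weights, f_prime * weights


rng = np.random.default_rng(0)
f_prime = rng.standard_normal((10, 10, 16))      # hypothetical F'
kernel = rng.standard_normal((3, 3, 2)) * 0.1    # hypothetical conv_spa weights
f_s_weights, f_double_prime = spatial_attention(f_prime, kernel)
```

Because the padding is 1 and the stride is 1, the weight map has the same height and width as the input, so each spatial position receives exactly one weight.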

Beneficial effects:

First, for the Tennessee Eastman chemical process, simulation data of the chemical process are collected and organized, and the data generated under normal conditions and under the various fault conditions are stored in separate data sets. The data are then preprocessed: feature expansion is applied to each sample, the sample is converted into a two-dimensional grayscale image, and the preprocessed data are randomly divided into a training set and a test set. Next, the residual network is improved to make it better suited to data generated by a chemical process: the original convolution of the residual network is replaced with a depthwise separable convolution, which reduces the number of parameters and shortens the model building time; the residual unit of the network is improved with an Inception module, so that the improved residual network can extract deep information from the data at different levels; and a channel attention mechanism and a spatial attention mechanism are introduced into the residual unit, giving the model the ability to locate important information automatically. These measures address the problems of fault diagnosis in chemical processes and improve the fault diagnosis accuracy.

Drawings

FIG. 1 is a general flow diagram of the present invention;

FIG. 2 is a flow chart of data set preparation;

FIG. 3 is a flow chart of dataset preprocessing;

FIG. 4 is a schematic diagram of a residual network model;

FIG. 5 is a schematic diagram of an improved residual unit;

FIG. 6 is a schematic diagram of a depth separable convolution;

FIG. 7 is a schematic structural diagram of an Inception module;

FIG. 8 is a schematic view of a channel attention layer;

FIG. 9 is a schematic structural diagram of a channel attention module;

FIG. 10 is a schematic view of a spatial attention layer;

FIG. 11 is a schematic structural diagram of a spatial attention module.

Detailed Description

The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, as various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure and fall within the scope of the appended claims.

Referring to FIGS. 1 to 11, the invention discloses a chemical process fault diagnosis method based on an improved residual network, which specifically comprises the following steps:

Step 1: data set preparation. Data are collected for the Tennessee Eastman chemical process, and the data simulated in the normal state and under 21 different faults are stored in the data sets G0_00, G0_01, ..., G0_21 respectively, as shown in FIG. 2:

Step 1.1: collecting the original data of the Tennessee Eastman chemical process, storing the normal-state data in G00, and storing the data of the 21 faults in G01, G02, ..., G21 respectively;

Step 1.2: extracting 960 samples from the normal-state data set G00 and storing them in the data set G0_00;

Step 1.3: there are 21 fault states, and the fault signal is introduced from the 160th sample; the 160th to 960th samples of each fault state are extracted and stored in G0_01, ..., G0_21 respectively.

Step 2: the collected data are preprocessed. First, the feature vector of each sample is expanded with a polynomial feature expansion method and the one-dimensional feature vector is converted into a two-dimensional grayscale image; the data set is then randomly divided into a training set and a test set. According to the normal state and the different fault types, the new training data are stored in the training sets G00_tr, G01_tr, ..., G21_tr respectively, and the test data are stored in the data sets G00_te, G01_te, ..., G21_te respectively, as shown in FIG. 3:

Step 2.1: polynomial feature expansion is applied to each sample in the data sets G0_00, G0_01, ..., G0_21. The original sample is a feature vector D consisting of 52 features, whose s-th feature is indexed by s ∈ [1, 52]. A second-order polynomial expansion is applied to D to obtain a new feature vector, whose m-th feature is indexed by m ∈ [1, 1430];

Step 2.2: the one-dimensional feature vector obtained after expansion is converted into a two-dimensional grayscale image of size 38 × 38, and the positions that receive no value are filled with 0;

Step 2.3: the processed two-dimensional grayscale image data are divided into a training set and a test set. The normal-state and fault-state data used to train the model are stored in the training sets G00_tr, G01_tr, ..., G21_tr respectively, and the normal-state and fault-state data used to test the model are stored in G00_te, G01_te, ..., G21_te respectively.
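
The random division in step 2.3 can be sketched as a shuffle followed by a cut. The text does not state the split ratio, so the 80/20 ratio, the fixed seed and the function name `split_dataset` below are assumptions made only for illustration; the image placeholders stand in for the 38 × 38 grayscale images.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle a list of samples and split it into training and test parts."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    cut = round(len(samples) * train_fraction)
    train = [samples[i] for i in indices[:cut]]
    test = [samples[i] for i in indices[cut:]]
    return train, test


# Placeholder for the 960 normal-state images of G0_00.
images = [f"image_{i}" for i in range(960)]
g00_tr, g00_te = split_dataset(images)
```

Applying the same function to each of G0_00, G0_01, ..., G0_21 would keep the per-class storage described above, with every sample ending up in exactly one of the two sets.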

Step 3: improving the architecture of the residual network, building the network model with the PyTorch framework, and selecting the normal state and several fault states from the training set to train the model. The specific method is as follows:

The structure of the improved residual network is shown in FIG. 4 and is described as follows. First, features are extracted from the grayscale image by the conv1 convolution layer and screened once by the max pooling layer pool1. Next come 4 groups of improved residual units, namely conv2_x, conv3_x, conv4_x and conv5_x, which contain different numbers of bottleneck structures: conv2_x contains 2 residual units, conv3_x contains 4, conv4_x contains 6 and conv5_x contains 3. The features are then screened again by the global average pooling layer AvgPool, after which two fully connected layers FC1 and FC2 mine the patterns in the features. Finally, a softmax classifier performs fault diagnosis and outputs the result.

In this embodiment, the convolution kernel size of the conv1 convolution layer is set to 3 × 3, the number of channels to 64, the stride to 2 and the padding to 1; the kernel size of the pool1 max pooling layer is set to 3 × 3, the stride to 2 and the padding to 1; the number of neurons in the fully connected layer FC1 is set to 1024; and the number of neurons in the fully connected layer FC2 is set to 5.

In this embodiment, the structure of the improved residual unit is shown in FIG. 5 and comprises two branches. One branch consists, in sequence, of a 1 × 1 convolution layer followed by batch normalization and a ReLU activation function, a depthwise separable convolution (DSC) layer followed by batch normalization and a ReLU activation function, an Inception module that extracts features at different scales, a channel attention layer and a spatial attention layer. The other branch is a 1 × 1 skipconv convolution layer. The results of the two branches are summed and then processed by a ReLU activation function.

The depthwise separable convolution, shown in FIG. 6, first extracts feature map information with a depthwise convolution layer with a kernel size of 3 × 3, a stride of 1 and a padding of 1, then fuses the information with a pointwise convolution layer with a kernel size of 1 × 1 and a stride of 1, and finally outputs the feature map.
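
The two stages can be sketched as a NumPy forward pass. The sketch assumes a channels-last layout, zero padding, no bias and cross-correlation semantics as in common deep learning frameworks; the function names, tensor sizes and weights are hypothetical, and batch normalization and ReLU are left out.

```python
import numpy as np


def depthwise_conv3x3(x, dw_kernels):
    """Depthwise 3 x 3 convolution, stride 1, zero padding 1.

    x has shape (H, W, C); dw_kernels has shape (3, 3, C), one filter per channel.
    """
    h, w, c = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, w, c))
    for di in range(3):
        for dj in range(3):
            # Each channel is filtered on its own; channels are never mixed here.
            out += padded[di:di + h, dj:dj + w, :] * dw_kernels[di, dj, :]
    return out


def pointwise_conv(x, pw_weights):
    """1 x 1 convolution, stride 1: mixes channels at every spatial position.

    x has shape (H, W, C_in); pw_weights has shape (C_in, C_out).
    """
    return x @ pw_weights


rng = np.random.default_rng(0)
x = rng.standard_normal((10, 10, 4))       # hypothetical input feature map
dw = rng.standard_normal((3, 3, 4))        # hypothetical depthwise filters
pw = rng.standard_normal((4, 8))           # hypothetical pointwise weights
y = pointwise_conv(depthwise_conv3x3(x, dw), pw)
```

The depthwise stage preserves both the spatial size and the channel count, and the pointwise stage is the only place where the number of channels changes (here from 4 to 8).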

The Inception module, shown in FIG. 7, comprises four branches. The first branch is a 1 × 1 convolution layer followed by batch normalization and a ReLU activation function. The second branch consists, in sequence, of a 1 × 1 convolution layer followed by batch normalization and a ReLU activation function, and a 3 × 3 convolution layer followed by batch normalization and a ReLU activation function. The third branch consists, in sequence, of a 1 × 1 convolution layer followed by batch normalization and a ReLU activation function, and a 5 × 5 convolution layer followed by batch normalization and a ReLU activation function. The fourth branch consists, in sequence, of a 3 × 3 max pooling layer and a 1 × 1 convolution layer followed by batch normalization and a ReLU activation function. The feature maps generated by the four branches are aggregated at the output to form a very deep feature map.

Specifically, in this embodiment, the 1 × 1 convolution layers used in the Inception module have a stride of 1; the 3 × 3 convolution layer has a stride of 1 and a padding of 1; the 5 × 5 convolution layer has a stride of 1 and a padding of 2; and the 3 × 3 max pooling layer has a stride of 1 and a padding of 1.

The channel attention layer used in the improved residual unit, shown in FIG. 8, comprises two branches. One branch retains the original feature map $F$, and the other contains a channel attention module that determines which channels carry the more important features, yielding a channel description $F'_c$. At the output, the contents of the two branches are multiplied element-wise to obtain a new feature map $F'$, i.e. $F' = F \times F'_c$.

The channel attention module used in the channel attention layer, shown in FIG. 9, works as follows. First, global average pooling AvgPool is applied to the input feature map $F$ of size $H \times W \times C$, i.e. $F_a = \mathrm{AvgPool}(F)$, yielding a channel description $F_a$ of size $1 \times 1 \times C$, where $H$ represents the height of the received feature map, $W$ its width and $C$ its depth; in parallel, global max pooling MaxPool is applied to the same feature map, i.e. $F_m = \mathrm{MaxPool}(F)$, yielding a channel description $F_m$ of size $1 \times 1 \times C$. Second, a parameter-shared multilayer perceptron MLP processes $F_a$, i.e. $F'_a = \mathrm{MLP}(F_a)$, yielding a new description $F'_a$, and the same parameter-shared MLP processes $F_m$, i.e. $F'_m = \mathrm{MLP}(F_m)$, yielding a new description $F'_m$. Next, the two new channel descriptions are added to obtain a new description $F_{\mathrm{add}}$, i.e. $F_{\mathrm{add}} = F'_a + F'_m$. Finally, the Sigmoid activation function is applied to $F_{\mathrm{add}}$ to obtain the final channel description $F'_c$, i.e. $F'_c = \mathrm{Sigmoid}(F_{\mathrm{add}})$.

The spatial attention layer used in the improved residual unit, shown in FIG. 10, comprises two branches. One branch retains the received feature map $F'$, and the other uses a spatial attention module to determine where the important information lies in the feature map, yielding a spatial description $F'_s$. At the output, the contents of the two branches are multiplied element-wise to obtain a new feature map $F''$, i.e. $F'' = F' \times F'_s$.

The spatial attention module used in the spatial attention layer, shown in FIG. 11, works as follows. First, average pooling AvgPool along the channel dimension is applied to the input feature map $F'$ of size $H \times W \times C$, yielding a spatial description $F_a$ of size $H \times W \times 1$, i.e. $F_a = \mathrm{AvgPool}(F')$, where $H$ represents the height of the received feature map, $W$ its width and $C$ its depth; in parallel, max pooling MaxPool along the channel dimension is applied to the same feature map, yielding a spatial description $F_m$ of size $H \times W \times 1$, i.e. $F_m = \mathrm{MaxPool}(F')$. Then $F_a$ and $F_m$ are concatenated along the channel dimension to obtain a new description $F_{\mathrm{con}}$ of size $H \times W \times 2$, where $H$ represents the height of the received feature map, $W$ its width, and the depth is 2. Next, a convolution layer conv_spa with a kernel size of 3 × 3, a stride of 1 and a padding of 1 processes $F_{\mathrm{con}}$ to obtain a new description $F_s$, i.e. $F_s = \mathrm{conv\_spa}(F_{\mathrm{con}})$. Finally, the Sigmoid function is applied to $F_s$ to obtain the final weight coefficients $F'_s$, i.e. $F'_s = \mathrm{Sigmoid}(F_s)$.

For step 3, the improved residual error network is trained as follows:

First, the chemical process training data set G00_tr in the normal state, the training data set G01_tr in the fault 1 state, the training data set G04_tr in the fault 4 state, the training data set G08_tr in the fault 8 state and the training data set G12_tr in the fault 12 state are selected as the training data of the model. Second, the model is optimized with an Adam optimizer and a cross entropy loss function; the learning rate of the Adam optimizer is set to 0.0002 and training runs for 100 rounds. The cross entropy loss function is

$$L = -\frac{1}{m}\sum_{i=1}^{m} y_i \log \hat{y}_i$$

where m represents the number of samples used in one training step; y_i represents the true probability distribution; and ŷ_i represents the probability distribution predicted by the model.
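The optimizer settings described above can be sketched as a short PyTorch training loop. This is an illustration only: the tiny linear `model`, the 38 × 38 image size and the random batch are hypothetical placeholders standing in for the improved residual error network and the real training sets, and `nn.CrossEntropyLoss` is assumed to be an acceptable stand-in for the cross entropy loss (it applies softmax and averages over the batch).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

NUM_CLASSES = 5        # normal state plus faults 1, 4, 8 and 12
LEARNING_RATE = 2e-4   # Adam learning rate stated in the text
EPOCHS = 100           # number of training rounds stated in the text

# Placeholder model (hypothetical); the real network is the improved ResNet.
model = nn.Sequential(nn.Flatten(), nn.Linear(38 * 38, NUM_CLASSES))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

# Synthetic batch: 16 single-channel "gray images" with random class labels.
images = torch.randn(16, 1, 38, 38)
labels = torch.randint(0, NUM_CLASSES, (16,))

losses = []
for epoch in range(EPOCHS):
    optimizer.zero_grad()
    loss = criterion(model(images), labels)   # mean cross entropy over batch
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

In practice the loop would iterate over mini-batches drawn from the selected training sets rather than a single fixed batch.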

Step 4: test set data corresponding to the fault states involved in training are selected to evaluate the effect of the model. The specific method is as follows:

The chemical process test data set G00_te in the normal state, the test data set G01_te in the fault 1 state, the test data set G04_te in the fault 4 state, the test data set G08_te in the fault 8 state and the test data set G12_te in the fault 12 state are selected to test the model. The experimental results show that the upper limit of the fault diagnosis accuracy of the improved residual error network is higher than 94.7% in every state, with few outliers, which indicates that the fault diagnosis performance of the model on the chemical process data set is stable and accurate. Fault 8 and fault 12 are generated under different disturbances, but both are faults related to random variation, which produce larger fluctuations; the upper limit of the accuracy of the improved residual error network model on them is nevertheless higher than 96.3%, which shows that the model performs well.
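A per-state accuracy of the kind reported above could be computed as in the following sketch. The function name `per_state_accuracy` and the label lists are hypothetical and made up purely for illustration; they are not the patent's experimental data.

```python
from collections import defaultdict


def per_state_accuracy(true_labels, predicted_labels):
    """Return the fraction of correctly diagnosed samples for each state."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    return {state: correct[state] / total[state] for state in total}


# Made-up labels for the five states used in testing.
y_true = ["G00", "G00", "G01", "G04", "G08", "G08", "G12", "G12"]
y_pred = ["G00", "G00", "G01", "G04", "G08", "G12", "G12", "G12"]
accuracy = per_state_accuracy(y_true, y_pred)
```

Repeating the evaluation over several runs would give the distribution of accuracies from which an upper limit and outliers can be read.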

In conclusion, by combining the data generated in a chemical process with the improved residual error network's strong ability to mine information from complex data, the method addresses the low accuracy, the frequent missed alarms and false alarms, and the weak generalization ability found in process-industry fault diagnosis.

The above description is only the preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims

1. A chemical process fault diagnosis method based on an improved residual error network is characterized by comprising the following steps:

Step 2: preprocessing the collected data, namely raising the dimension of each sample's feature vector using a polynomial dimension-raising method, processing the one-dimensional feature vector into a two-dimensional gray image, and randomly dividing the data set into a training set and a test set;

Step 3: improving the architecture of a residual error network, building a network model using the PyTorch framework, and selecting a normal state and a plurality of fault states from the training set to train the model;

the improved residual error network has the following structure:

the improved residual error unit comprises two branches, wherein one branch consists, in sequence, of a 1 × 1 convolution layer followed by a batch normalization algorithm and a ReLU activation function, a depthwise separable convolution followed by a batch normalization algorithm and a ReLU activation function, an Inception module that extracts features of different scales, a channel attention layer, and a spatial attention layer; the other branch is a 1 × 1 skipconv convolution layer; finally, the results of the two branches are summed and then processed by a ReLU activation function;

Step 4: selecting test set data corresponding to the plurality of fault states involved in training to evaluate the effect of the model.
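The two-branch residual unit recited above can be sketched in PyTorch as follows. This is a hedged outline rather than the claimed implementation: the class name `ImprovedResidualUnit`, the channel widths and the constructor arguments are hypothetical, and the Inception module and the two attention layers are passed in as optional sub-modules that default to `nn.Identity()` placeholders so the sketch stays self-contained.

```python
import torch
import torch.nn as nn


class ImprovedResidualUnit(nn.Module):
    def __init__(self, in_ch, out_ch, inception=None,
                 channel_attention=None, spatial_attention=None):
        super().__init__()
        # Main branch, part 1: 1x1 convolution + batch norm + ReLU.
        self.conv1x1 = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        )
        # Main branch, part 2: depthwise separable convolution + BN + ReLU.
        self.separable = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, kernel_size=3, stride=1, padding=1,
                      groups=out_ch),               # depthwise
            nn.Conv2d(out_ch, out_ch, kernel_size=1, stride=1),  # pointwise
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        )
        # Main branch, parts 3-5: placeholders unless real modules are given.
        # A supplied Inception module must keep out_ch channels.
        self.inception = inception if inception is not None else nn.Identity()
        self.channel_attention = (channel_attention
                                  if channel_attention is not None
                                  else nn.Identity())
        self.spatial_attention = (spatial_attention
                                  if spatial_attention is not None
                                  else nn.Identity())
        # Skip branch: 1x1 skipconv so channel counts match before the sum.
        self.skipconv = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        y = self.conv1x1(x)
        y = self.separable(y)
        y = self.inception(y)
        y = self.channel_attention(y)
        y = self.spatial_attention(y)
        return self.relu(y + self.skipconv(x))   # sum branches, then ReLU


unit = ImprovedResidualUnit(in_ch=64, out_ch=32)
out = unit(torch.randn(2, 64, 10, 10))
```

All layers here use stride 1, so the unit changes only the channel count; any downsampling between units is outside this sketch.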

2. The chemical process fault diagnosis method based on the improved residual error network as claimed in claim 1, wherein the specific method in step 1 is as follows:

3. The method for diagnosing the fault of the chemical process based on the improved residual error network as claimed in claim 2, wherein the specific method in the step 2 is as follows:

wherein the s-th feature of the original data sample is indexed by the variable s ∈ [1, 52]; the feature vector of the original sample is raised in dimension using a second-order polynomial dimension-raising method to obtain a new feature vector, wherein the s-th feature of the data sample after feature dimension raising is indexed by the variable s ∈ [1, 1430];

Step 2.3: the processed two-dimensional gray image data are divided into a training set and a test set; the normal state data and fault state data used for training the model are stored in the training sets G00_tr, G01_tr, …, G21_tr, and the normal state data and fault state data used for testing the model are stored in the test sets G00_te, G01_te, …, G21_te.
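The feature counts in this claim are consistent with a second-order polynomial expansion without a constant term: 52 first-order terms plus 52 × 53 / 2 = 1378 second-order products (squares and pairwise cross terms) give 1430 features. The following sketch illustrates that arithmetic; the function name `second_order_expand`, the ordering of the terms and the sample values are assumptions made for illustration.

```python
from itertools import combinations_with_replacement


def second_order_expand(features):
    """Return the first-order terms followed by all second-order products."""
    first_order = list(features)
    second_order = [a * b
                    for a, b in combinations_with_replacement(features, 2)]
    return first_order + second_order


# Hypothetical sample with 52 original features.
sample = [float(i) for i in range(1, 53)]
expanded = second_order_expand(sample)
print(len(expanded))  # 1430
```

For a two-feature input `[2.0, 3.0]` the same function returns `[2.0, 3.0, 4.0, 6.0, 9.0]`, i.e. x1, x2, x1², x1·x2, x2².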

4. The method for diagnosing the fault of the chemical process based on the improved residual error network as claimed in claim 1, wherein in step 3, the convolution kernel size of the conv1 convolution layer is set to 3 × 3, the number of channels to 64, the step size to 2 and the edge padding to 1; the size of the pool1 max pooling layer is set to 3 × 3, the step size to 2 and the edge padding to 1; the number of neurons of the FC1 fully connected layer is set to 1024; and the number of neurons of the FC2 fully connected layer is set to 5.
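The layer settings recited in this claim can be checked with a short PyTorch sketch. Several details are assumptions not stated in the claim: the single-channel 38 × 38 input size, the global average pooling placed before FC1, the ReLU between FC1 and FC2, and the omission of the residual units that sit between the stem and the head in the full network.

```python
import torch
import torch.nn as nn

# conv1 and pool1 with the stated kernel size, stride and padding.
stem = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=3, stride=2, padding=1),   # conv1
    nn.MaxPool2d(kernel_size=3, stride=2, padding=1),       # pool1
)

# FC1 and FC2 with the stated neuron counts.
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),   # assumption: pool to 1x1 before FC1
    nn.Flatten(),
    nn.Linear(64, 1024),       # FC1
    nn.ReLU(),                 # assumption: activation between FC layers
    nn.Linear(1024, 5),        # FC2: one output per diagnosed state
)

x = torch.randn(2, 1, 38, 38)  # hypothetical batch of gray images
features = stem(x)             # 38 -> 19 after conv1, 19 -> 10 after pool1
logits = head(features)
```

The five outputs of FC2 match the five states used in training: the normal state and faults 1, 4, 8 and 12.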

5. The improved residual error network-based chemical process fault diagnosis method as claimed in claim 1, wherein the depthwise separable convolution used in the improved residual error unit comprises: extracting feature map information using a DepthWise convolution with a convolution kernel size of 3 × 3, a step size of 1 and an edge padding of 1; fusing the information using a PointWise convolution with a convolution kernel size of 1 × 1 and a step size of 1; and finally outputting the feature map.
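A depthwise separable convolution with these settings can be sketched in PyTorch by setting `groups` equal to the number of input channels for the DepthWise stage. The class name `DepthwiseSeparableConv`, the helper `count_parameters` and the 64-channel width are hypothetical; the parameter comparison against an ordinary 3 × 3 convolution is included to show why the replacement reduces model size.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # DepthWise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=1,
                                   padding=1, groups=in_ch)
        # PointWise: 1x1 convolution that fuses information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


def count_parameters(module):
    return sum(p.numel() for p in module.parameters())


separable = DepthwiseSeparableConv(64, 64)
standard = nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1)

out = separable(torch.randn(2, 64, 10, 10))
separable_params = count_parameters(separable)   # 640 + 4160 = 4800
standard_params = count_parameters(standard)     # 36864 + 64 = 36928
```

With 64 input and output channels the separable version uses roughly one eighth of the parameters of the ordinary 3 × 3 convolution while keeping the same output shape.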

6. The improved residual error network-based chemical process fault diagnosis method according to claim 1, wherein the Inception module used in the improved residual error unit comprises four branches:

7. The improved residual error network-based chemical process fault diagnosis method according to claim 1, wherein in step 3, the step size of the 1 × 1 convolution layer used in the Inception module is 1; the step size of the 3 × 3 convolution layer used is 1 and its edge padding is 1; the step size of the 5 × 5 convolution layer used is 1 and its edge padding is 2; and the step size of the 3 × 3 max pooling layer used is 1 and its edge padding is 1.
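The stride and padding values above keep the height and width unchanged in every branch, so the branch outputs can be concatenated by channel. The following PyTorch sketch illustrates this. The arrangement of the four branches is an assumption modeled on the classic Inception layout (1 × 1; 1 × 1 then 3 × 3; 1 × 1 then 5 × 5; 3 × 3 max pooling then 1 × 1), and the class name `InceptionModule` and the channel counts are hypothetical.

```python
import torch
import torch.nn as nn


class InceptionModule(nn.Module):
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        # Branch 1: 1x1 convolution, stride 1.
        self.branch1 = nn.Conv2d(in_ch, branch_ch, kernel_size=1, stride=1)
        # Branch 2: 1x1 convolution, then 3x3 convolution (stride 1, pad 1).
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, branch_ch, kernel_size=1, stride=1),
            nn.Conv2d(branch_ch, branch_ch, kernel_size=3, stride=1,
                      padding=1),
        )
        # Branch 3: 1x1 convolution, then 5x5 convolution (stride 1, pad 2).
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, branch_ch, kernel_size=1, stride=1),
            nn.Conv2d(branch_ch, branch_ch, kernel_size=5, stride=1,
                      padding=2),
        )
        # Branch 4: 3x3 max pooling (stride 1, pad 1), then 1x1 convolution.
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, branch_ch, kernel_size=1, stride=1),
        )

    def forward(self, x):
        branches = [self.branch1(x), self.branch3(x),
                    self.branch5(x), self.branch_pool(x)]
        return torch.cat(branches, dim=1)   # concatenate along channels


module = InceptionModule(in_ch=64, branch_ch=16)
out = module(torch.randn(2, 64, 10, 10))
```

With four branches of 16 channels each, the output has 64 channels and the same 10 × 10 spatial size as the input.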

8. The method as claimed in claim 1, wherein the channel attention layer used in the improved residual error unit comprises two branches: one branch retains the original feature map F, and the other branch uses a channel attention module to distinguish which channel features are more important and obtains a channel description F'_c; at output, the contents of the two branches are multiplied element-wise to obtain a new feature map F', i.e. F' = F × F'_c. The spatial attention layer comprises two branches: one branch retains the received feature map F', and the other branch uses a spatial attention module to locate the important information in the feature map and obtains a description F'_s; at output, the contents of the two branches are multiplied element-wise to obtain a new feature map F'', i.e. F'' = F' × F'_s.

9. The improved residual error network-based chemical process fault diagnosis method according to claim 8, wherein the specific method of the channel attention module used by the channel attention layer is as follows:

step 11.2: maxpool (F) is processed by using a global maximum pooling Maxpool to input a feature map F with the size of H × W × C, and a channel description F with the size of 1 × 1 × C is obtained _m Wherein H represents the height of the acceptance feature map, W represents the width of the acceptance feature map, and C represents the depth of the acceptance feature map;

step 11.3: multi-layer perceptron MLP pair F using one parameter share _a To perform a treatment, i.e. MLP (F) _a ) To give a new description F' _a ；

Step 11.4: by means of oneParameter-shared multi-layer perceptron MLP pair F _m To perform a treatment, i.e. MLP (F) _m ) To give a new description F' _m ；

Step 11.5: two new channels are described as F' _a And F' _m Adding to obtain new description F _add I.e. F _add ＝F‘ _a +F′ _m ；

Step 11.6: using a Sigmoid activation function pair F _add Processed to give a final channel description F' _c I.e. F' _c ＝Sigmoid(F _add )。

10. The improved residual error network-based chemical process fault diagnosis method according to claim 8, wherein the spatial attention module used by the spatial attention layer is as follows:

Step 12.5: the description F_s is processed using the Sigmoid function to obtain the final weight coefficient F'_s, i.e. F'_s = Sigmoid(F_s).