CN113222045A - Semi-supervised fault classification method based on weighted feature alignment self-encoder - Google Patents

Semi-supervised fault classification method based on weighted feature alignment self-encoder

Info

Publication number: CN113222045A (granted as CN113222045B)
Application number: CN202110575307.4A
Authority: CN (China)
Other languages: Chinese (zh)
Inventors: 张新民 (Zhang Xinmin), 张宏毅 (Zhang Hongyi)
Applicant and current assignee: Zhejiang University (ZJU)
Priority date / filing date: 2021-05-26
Publication of CN113222045A: 2021-08-06; grant publication of CN113222045B: 2022-06-24
Legal status: Active (granted)

Classifications

    • G06F18/2321 — Pattern recognition; clustering; non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/045 — Neural networks; architectures; combinations of networks
    • G06N3/08 — Neural networks; learning methods


Abstract

The invention discloses a semi-supervised fault classification method based on a weighted feature alignment self-encoder. First, a stacked self-encoder is pre-trained for reconstruction using the labeled data, and the probability density distribution of its reconstruction errors is estimated. Then, the weight of each unlabeled sample is calculated from this probability density function. Further, a semi-supervised classification model based on the weighted feature alignment self-encoder is constructed using the labeled sample set, the unlabeled sample set, and the corresponding weights. The classification model is trained with a cross-entropy loss function based on the weighted Sinkhorn distance, which lets the model use both labeled and unlabeled data in the fine-tuning stage; this enables deep mining of the data information and improves the generalization ability of the network model. At the same time, the introduction of the weighting strategy markedly improves the robustness of the model.

Description

Semi-supervised fault classification method based on weighted feature alignment self-encoder
Technical Field
The invention belongs to the field of industrial process control, and particularly relates to a semi-supervised fault classification method based on a weighted feature alignment self-encoder.
Background
Modern industrial processes are moving toward large scale and high complexity. Ensuring the safety of the production process is one of the key problems in the field of industrial process control. Fault diagnosis is a key technology for guaranteeing the safe operation of an industrial process and is of great significance for improving product quality and production efficiency. Fault classification is one link in fault diagnosis: by learning from historical fault information, it automatically identifies and judges fault types, helping production personnel quickly locate and repair faults and thereby avoid further losses. With the continuous development of modern measurement techniques, a great deal of data has accumulated in industrial production. These data describe the actual conditions of each production stage, provide valuable resources for reading, analyzing, and optimizing the manufacturing process, and are the source of intelligence for intelligent manufacturing. How to reasonably use the data accumulated in the manufacturing process to establish data-driven intelligent analysis models that better serve intelligent decision-making and quality control is therefore a topic of great interest in industry. Data-driven fault classification methods use intelligent analysis technologies such as machine learning and deep learning to deeply mine, model, and analyze industrial data and provide users and industries with a data-driven mode of fault diagnosis. Most existing data-driven fault classification methods are supervised learning methods; when sufficient labeled data are available, such models achieve excellent performance. However, in certain industrial scenarios it is difficult to obtain large amounts of labeled data, so there is often a large amount of unlabeled data and only a small amount of labeled data. To effectively use the unlabeled data to improve the classification performance of the model, fault classification methods based on semi-supervised learning have gradually received attention. However, most existing semi-supervised fault classification methods rely on certain data assumptions. Semi-supervised methods based on statistical learning, graph-based methods, and methods that label unlabeled data via co-training, self-training, and the like all rely on one assumption: the labeled and unlabeled samples come from the same distribution. This assumption has its limitations. Data collected from an industrial process often contain a large amount of noise and outliers, and working conditions may drift; labeled data are usually screened and annotated manually by experts in the process field, while unlabeled samples are not screened, so abnormal data whose distribution differs from that of the labeled data are very likely to appear among the unlabeled data. When the distributions of the unlabeled and labeled data are inconsistent, the performance of a semi-supervised algorithm degrades and may even fall below that of a supervised algorithm trained only on the labeled data.
Therefore, a robust semi-supervised learning method is desirable, so that the model can still classify faults accurately when the labeled data and the unlabeled data have inconsistent distributions.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a semi-supervised fault classification method based on a weighted feature alignment self-encoder, comprising the following steps:
Step one: collecting normal working condition data and various fault data of an industrial process to obtain a training data set for modeling: a labeled sample set $D_l = \{(x_i^l,\, y_i^l)\}_{i=1}^{m}$ and an unlabeled sample set $D_u = \{x_j^u\}_{j=1}^{n}$, wherein $x$ represents an input sample, $y$ represents a sample label, $m$ represents the number of labeled samples, and $n$ represents the number of unlabeled samples;
Step two: constructing a stacked self-encoder model for reconstruction, and training it using the labeled sample set;
Step three: estimating the probability density distribution of the reconstruction errors of the training data, calculating the weights of the unlabeled samples, and further constructing the weighted feature alignment self-encoder classification model;
Step four: acquiring field operating data, inputting them into the weighted feature alignment self-encoder classification model, and outputting the corresponding fault category.
Further, the second step is specifically divided into the following sub-steps:
(2.1) constructing a stacked self-encoder model for reconstruction, comprising a multi-layer encoder and a decoder, wherein the output of the model is the reconstruction of the input; the calculation formulas are as follows:

$z_k = \sigma(w_k^{e} z_{k-1} + b_k^{e}), \quad z_0 = x$  (1)

$\hat{x} = \sigma(w^{d} z_K + b^{d})$  (2)

wherein $x$ represents the input, $z_k$ represents the extracted $k$-th layer features with $k$ indexing the layers of the stacked self-encoder and $K$ the number of encoder layers, $\{w^{e}, b^{e}\}$ and $\{w^{d}, b^{d}\}$ represent the weight vectors and bias vectors of the encoder and decoder respectively, $\sigma(\cdot)$ is the activation function, and $\hat{x}$ represents the model's reconstruction of the input;
(2.2) training the stacked self-encoder model on the labeled sample set constructed in step one using a stochastic gradient descent algorithm, wherein the model training loss function is defined as the reconstruction error of the input, given by the following formula:

$L_{rec} = \frac{1}{m}\sum_{i=1}^{m} \left\| x_i^{l} - \hat{x}_i^{l} \right\|^2$  (3)

wherein $x_i^{l}$ represents the $i$-th labeled input sample and $\hat{x}_i^{l}$ represents its reconstruction by the stacked self-encoder;
(2.3) calculating the reconstruction errors of the labeled samples $E_l = \{e_i^{l}\}_{i=1}^{m}$ with the trained stacked self-encoder model, wherein the reconstruction error of a single sample is calculated with reference to the following formula:

$e_i = \left\| x_i - \hat{x}_i \right\|^2$  (4)
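The following minimal PyTorch sketch (an illustration only; the layer sizes, sigmoid activations, learning rate, and epoch count are assumptions, not values taken from the patent) shows how steps (2.1)-(2.3) fit together: a stacked self-encoder trained by stochastic gradient descent on the reconstruction loss of formula (3), followed by per-sample reconstruction errors per formula (4).

```python
import torch
import torch.nn as nn

class StackedAutoencoder(nn.Module):
    """Multi-layer encoder/decoder per formulas (1)-(2)."""
    def __init__(self, in_dim=16, hidden=(32, 16, 8)):
        super().__init__()
        dims = [in_dim, *hidden]
        enc = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            enc += [nn.Linear(d_in, d_out), nn.Sigmoid()]
        rdims = dims[::-1]
        dec = []
        for d_in, d_out in zip(rdims[:-1], rdims[1:]):
            dec += [nn.Linear(d_in, d_out), nn.Sigmoid()]
        self.encoder = nn.Sequential(*enc)
        self.decoder = nn.Sequential(*dec[:-1])  # linear output for real-valued data

    def forward(self, x):
        z = self.encoder(x)      # deep features z_K
        return self.decoder(z)   # reconstruction x_hat

def pretrain(model, x_l, epochs=200, lr=1e-2):
    """Minimize the mean reconstruction error of formula (3) by SGD, step (2.2)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((x_l - model(x_l)) ** 2).sum(dim=1).mean()
        loss.backward()
        opt.step()

def sample_errors(model, x):
    """Per-sample squared reconstruction error, formula (4), step (2.3)."""
    with torch.no_grad():
        return ((x - model(x)) ** 2).sum(dim=1)
```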
further, the third step is specifically divided into the following sub-steps:
(3.1) calculating the distribution parameters $g$ and $h$ of the scaled $\chi^2$ distribution $E_l \sim g\,\chi^2(h)$ obeyed by the reconstruction errors $E_l$ of the labeled samples:

$g \cdot h = \mathrm{mean}(E_l)$  (5)

$2g^2 \cdot h = \mathrm{variance}(E_l)$  (6)
(3.2) calculating the reconstruction errors of the unlabeled samples $E_u = \{e_j^{u}\}_{j=1}^{n}$, where the reconstruction error of a single sample is calculated as in formula (4);
(3.3) calculating the probability $P_u$ of the reconstruction errors $E_u$ of the unlabeled samples occurring under the distribution fitted to $E_l$, and normalizing $P_u$ to obtain the weights of the unlabeled samples $\lambda^{u} = \{\lambda_j^{u}\}_{j=1}^{n}$;
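As a sketch of steps (3.1)-(3.3), the fragment below (assuming NumPy/SciPy; normalizing by the maximum density is one plausible reading of the unspecified normalization) moment-matches the scaled chi-square distribution per formulas (5)-(6) and turns the fitted densities into unlabeled-sample weights:

```python
import numpy as np
from scipy.stats import chi2

def unlabeled_weights(err_labeled, err_unlabeled):
    # Fit E_l ~ g * chi2(h) by moment matching: g*h = mean, 2*g^2*h = variance.
    g = np.var(err_labeled) / (2.0 * np.mean(err_labeled))
    h = np.mean(err_labeled) / g
    # Density of each unlabeled reconstruction error under the fitted distribution.
    p_u = chi2.pdf(err_unlabeled, df=h, scale=g)
    # Normalize the densities into weights in [0, 1] (assumed normalization).
    return p_u / p_u.max()
```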
(3.4) constructing the weighted feature alignment self-encoder classification model and training it with the labeled sample set, the unlabeled sample set, and the corresponding weights. The training process comprises unsupervised pre-training and supervised fine-tuning. In the unsupervised pre-training phase, the labeled samples and unlabeled samples are used together to train a stacked self-encoder; the unsupervised pre-training method is the same as steps (2.1)-(2.3). For supervised fine-tuning, a fully connected neural network layer is added on top of the stacked self-encoder obtained by unsupervised pre-training and used as the category output, yielding the deep features and predicted category labels of the labeled samples and the deep features and predicted category labels of the unlabeled samples; the specific calculation formulas are as follows:

$h_i^{l} = f_{enc}(x_i^{l})$  (7)

$\hat{y}_i^{l} = \mathrm{softmax}(w_c h_i^{l} + b_c)$  (8)

$h_j^{u} = f_{enc}(x_j^{u})$  (9)

$\hat{y}_j^{u} = \mathrm{softmax}(w_c h_j^{u} + b_c)$  (10)

wherein $f_{enc}(\cdot)$ denotes the stacked encoder, $h_i^{l}$ represents the deep features of the $i$-th labeled sample, $\hat{y}_i^{l}$ represents the predicted category label of the $i$-th labeled sample, $\{w_c, b_c\}$ represent the weight vector and bias vector of the fully connected neural network layer, $h_j^{u}$ represents the deep features of the $j$-th unlabeled sample, and $\hat{y}_j^{u}$ represents its predicted category label;
(3.7) assuming the number of classes is $F$, obtaining for each class $f \in F$ the deep features of the labeled samples and unlabeled samples assigned to that class, $H_f^{l} = \{h_i^{l,f}\}_{i=1}^{m_f}$ and $H_f^{u} = \{h_j^{u,f}\}_{j=1}^{n_f}$, and the weights of the unlabeled samples $\lambda_f^{u} = \{\lambda_j^{u,f}\}_{j=1}^{n_f}$;
(3.8) calculating the training loss function of the weighted feature alignment self-encoder classification model using the following formulas:

$L = \mathrm{crossentropy}(y^{l}, \hat{y}^{l}) + \alpha \sum_{f=1}^{F} S_w(H_f^{l}, H_f^{u}) + \beta\, \Omega(W)$  (11)

$S_w(H_f^{l}, H_f^{u}) = \sum_{i=1}^{m_f} \sum_{j=1}^{n_f} \lambda_j^{u,f}\, p_{ij}\, d_{ij}$  (12)

$d_{ij} = \left\| h_i^{l,f} - h_j^{u,f} \right\|^2$  (13)

wherein crossentropy represents the cross-entropy loss function; $S_w(\cdot,\cdot)$ represents the weighted Sinkhorn distance function, which measures the distance between the feature distribution of the labeled data and that of the unlabeled data belonging to the same category and, at the same time, down-weights abnormal unlabeled samples with large reconstruction errors; $\alpha$ is the weight of the Sinkhorn distance; $\Omega(W)$ is the $l_2$ regularization penalty term on the network parameters $W$ and $\beta$ is its weight; $p_{ij}$ represents the transition probability from the features $h_i^{l,f}$ of labeled sample $i$ of class $f$ to the features $h_j^{u,f}$ of unlabeled sample $j$; $d_{ij}$ represents the distance between those features; $\lambda_j^{u,f}$ represents the weight of unlabeled sample $j$ assigned to class $f$; and $m_f$ and $n_f$ represent the numbers of labeled and unlabeled samples of class $f$, respectively.
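A hedged PyTorch sketch of the fine-tuning objective of formulas (11)-(13) is given below. The entropic Sinkhorn iteration, the marginals (uniform over labeled features, proportional to the weights $\lambda^u$ over unlabeled features), the use of predicted labels to group unlabeled features by class, and the hyperparameters `eps`, `alpha`, `beta` are assumptions; the patent fixes only the overall form of the loss.

```python
import torch
import torch.nn.functional as F

def weighted_sinkhorn(h_l, h_u, lam_u, eps=0.1, iters=50):
    """Weighted entropic OT distance between the labeled and unlabeled
    feature sets of one class: sum_ij p_ij * d_ij, formulas (12)-(13)."""
    d = torch.cdist(h_l, h_u) ** 2                      # d_ij, formula (13)
    a = torch.full((h_l.size(0),), 1.0 / h_l.size(0))   # uniform labeled marginal
    b = lam_u / lam_u.sum()                             # weighted unlabeled marginal
    K = torch.exp(-d / eps)                             # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(iters):                              # Sinkhorn fixed-point updates
        v = b / (K.t() @ u)
        u = a / (K @ v)
    p = torch.diag(u) @ K @ torch.diag(v)               # transport plan p_ij
    return (p * d).sum()

def finetune_loss(model, head, x_l, y_l, x_u, lam_u, alpha=0.1, beta=1e-4):
    """Cross entropy + alpha * per-class weighted Sinkhorn + beta * L2, formula (11)."""
    h_l, h_u = model.encoder(x_l), model.encoder(x_u)   # formulas (7) and (9)
    ce = F.cross_entropy(head(h_l), y_l)                # formula (8) + cross entropy
    y_u = head(h_u).argmax(dim=1)                       # predicted labels, formula (10)
    sink = 0.0
    for f in y_l.unique():                              # align features class by class
        ml, mu = (y_l == f), (y_u == f)
        if ml.any() and mu.any():
            sink = sink + weighted_sinkhorn(h_l[ml], h_u[mu], lam_u[mu])
    params = list(model.parameters()) + list(head.parameters())
    l2 = sum((w ** 2).sum() for w in params)            # L2 penalty on network weights
    return ce + alpha * sink + beta * l2
```

Here `head` stands for the added fully connected layer, e.g. `nn.Linear(8, 6)` for 8-dimensional deep features and six fault classes.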
The invention has the following beneficial effects:
the invention provides a robust semi-supervised fault classification method based on a weighted feature alignment self-encoder, aiming at the problem of performance degradation of a traditional semi-supervised classification model when labeled data and unlabelled data are not distributed uniformly. The method designs a model training loss function based on a weighting and feature alignment strategy. The introduction of the weighting strategy improves the robustness of the semi-supervised classification model and reduces the problem of performance reduction of the classification model caused by inconsistent sample distribution. And the introduction of the characteristic alignment strategy enables the model to use the labeled data and the unlabeled data at the same time in the fine tuning stage, so that the deep mining of data information can be realized, and the generalization capability and classification performance of the network model can be improved.
Drawings
FIG. 1 is a schematic diagram of a stacked self-encoder;
FIG. 2 is a TE process flow diagram;
FIG. 3 is a schematic diagram of the log reconstruction errors of the data;
FIG. 4 is a graph illustrating classification accuracy of different algorithms.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and preferred embodiments, so that its objects and effects become more apparent. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
The semi-supervised fault classification method based on the weighted feature alignment self-encoder first uses the labeled data to pre-train a stacked self-encoder for reconstruction and estimates the probability density distribution of the reconstruction errors. Then, the weight of each unlabeled sample is calculated from this probability density function. Further, a semi-supervised classification model based on the weighted feature alignment self-encoder is constructed using the labeled sample set, the unlabeled sample set, and the corresponding weights. The classification model is trained with a cross-entropy loss function based on the weighted Sinkhorn distance, which lets the model use both labeled and unlabeled data in the fine-tuning stage, enabling deep mining of the data information and improving the generalization ability of the network model. At the same time, the introduction of the weighting strategy markedly improves the robustness of the model.
The method comprises the following specific steps:
Step one: collecting normal working condition data and various fault data of an industrial process to obtain a training data set for modeling: a labeled sample set $D_l = \{(x_i^l,\, y_i^l)\}_{i=1}^{m}$ and an unlabeled sample set $D_u = \{x_j^u\}_{j=1}^{n}$, wherein $x$ represents an input sample, $y$ represents a sample label, $m$ represents the number of labeled samples, and $n$ represents the number of unlabeled samples;
Step two: constructing a stacked self-encoder model for reconstruction and training it using the labeled sample set; this step is specifically divided into the following sub-steps:
(2.1) constructing a stacked self-encoder model for reconstruction, comprising a multi-layer encoder and a decoder, wherein the output of the model is the reconstruction of the input; the calculation formulas are as follows:

$z_k = \sigma(w_k^{e} z_{k-1} + b_k^{e}), \quad z_0 = x$  (1)

$\hat{x} = \sigma(w^{d} z_K + b^{d})$  (2)

wherein $x$ represents the input, $z_k$ represents the extracted $k$-th layer features with $k$ indexing the layers of the stacked self-encoder and $K$ the number of encoder layers, $\{w^{e}, b^{e}\}$ and $\{w^{d}, b^{d}\}$ represent the weight vectors and bias vectors of the encoder and decoder respectively, $\sigma(\cdot)$ is the activation function, and $\hat{x}$ represents the model's reconstruction of the input;
(2.2) training the stacked self-encoder model on the labeled sample set constructed in step one using a stochastic gradient descent algorithm, wherein the model training loss function is defined as the reconstruction error of the input, given by the following formula:

$L_{rec} = \frac{1}{m}\sum_{i=1}^{m} \left\| x_i^{l} - \hat{x}_i^{l} \right\|^2$  (3)

wherein $x_i^{l}$ represents the $i$-th labeled input sample and $\hat{x}_i^{l}$ represents its reconstruction by the stacked self-encoder;
(2.3) calculating the reconstruction errors of the labeled samples $E_l = \{e_i^{l}\}_{i=1}^{m}$ with the trained stacked self-encoder model, wherein the reconstruction error of a single sample is calculated with reference to the following formula:

$e_i = \left\| x_i - \hat{x}_i \right\|^2$  (4)
Step three: estimating the probability density distribution of the reconstruction errors of the training data, calculating the weights of the unlabeled samples, and further constructing the weighted feature alignment self-encoder classification model; this step is specifically divided into the following sub-steps:
(3.1) calculating the distribution parameters $g$ and $h$ of the scaled $\chi^2$ distribution $E_l \sim g\,\chi^2(h)$ obeyed by the reconstruction errors $E_l$ of the labeled samples:

$g \cdot h = \mathrm{mean}(E_l)$  (5)

$2g^2 \cdot h = \mathrm{variance}(E_l)$  (6)
(3.2) calculating the reconstruction errors of the unlabeled samples $E_u = \{e_j^{u}\}_{j=1}^{n}$, where the reconstruction error of a single sample is calculated as in formula (4);
(3.3) calculating the probability $P_u$ of the reconstruction errors $E_u$ of the unlabeled samples occurring under the distribution fitted to $E_l$, and normalizing $P_u$ to obtain the weights of the unlabeled samples $\lambda^{u} = \{\lambda_j^{u}\}_{j=1}^{n}$;
(3.4) constructing the weighted feature alignment self-encoder classification model and training it with the labeled sample set, the unlabeled sample set, and the corresponding weights. The training process can be divided into unsupervised pre-training and supervised fine-tuning.
In the unsupervised pre-training phase, the labeled samples and unlabeled samples are used together to train a stacked self-encoder. The unsupervised pre-training method is the same as steps (2.1)-(2.3): a stacked self-encoder model for reconstruction is first constructed and then trained using both the labeled and unlabeled samples;
For supervised fine-tuning, a fully connected neural network layer is added on top of the stacked self-encoder obtained by unsupervised pre-training and used as the category output, yielding the deep features and predicted category labels of the labeled samples and of the unlabeled samples; the specific calculation formulas are as follows:

$h_i^{l} = f_{enc}(x_i^{l})$  (7)

$\hat{y}_i^{l} = \mathrm{softmax}(w_c h_i^{l} + b_c)$  (8)

$h_j^{u} = f_{enc}(x_j^{u})$  (9)

$\hat{y}_j^{u} = \mathrm{softmax}(w_c h_j^{u} + b_c)$  (10)

wherein $f_{enc}(\cdot)$ denotes the stacked encoder, $h_i^{l}$ represents the deep features of the $i$-th labeled sample, $\hat{y}_i^{l}$ represents the predicted category label of the $i$-th labeled sample, $\{w_c, b_c\}$ represent the weight vector and bias vector of the fully connected neural network layer, $h_j^{u}$ represents the deep features of the $j$-th unlabeled sample, and $\hat{y}_j^{u}$ represents its predicted category label;
(3.7) assuming the number of classes is $F$, obtaining for each class $f \in F$ the deep features of the labeled samples and unlabeled samples assigned to that class, $H_f^{l} = \{h_i^{l,f}\}_{i=1}^{m_f}$ and $H_f^{u} = \{h_j^{u,f}\}_{j=1}^{n_f}$, and the weights of the unlabeled samples $\lambda_f^{u} = \{\lambda_j^{u,f}\}_{j=1}^{n_f}$;
(3.8) calculating the training loss function of the weighted feature alignment self-encoder classification model using the following formulas:

$L = \mathrm{crossentropy}(y^{l}, \hat{y}^{l}) + \alpha \sum_{f=1}^{F} S_w(H_f^{l}, H_f^{u}) + \beta\, \Omega(W)$  (11)

$S_w(H_f^{l}, H_f^{u}) = \sum_{i=1}^{m_f} \sum_{j=1}^{n_f} \lambda_j^{u,f}\, p_{ij}\, d_{ij}$  (12)

$d_{ij} = \left\| h_i^{l,f} - h_j^{u,f} \right\|^2$  (13)

wherein crossentropy represents the cross-entropy loss function; $S_w(\cdot,\cdot)$ represents the weighted Sinkhorn distance function; $\alpha$ is the weight of the Sinkhorn distance; $\Omega(W)$ is the $l_2$ regularization penalty term on the network parameters $W$ and $\beta$ is its weight; $p_{ij}$ represents the transition probability from the features $h_i^{l,f}$ of labeled sample $i$ of class $f$ to the features $h_j^{u,f}$ of unlabeled sample $j$; $d_{ij}$ represents the distance between those features; $\lambda_j^{u,f}$ represents the weight of unlabeled sample $j$ assigned to class $f$; and $m_f$ and $n_f$ represent the numbers of labeled and unlabeled samples of class $f$, respectively. The newly designed training loss function based on the weighted Sinkhorn distance serves two main purposes. One is to align, in the fine-tuning stage, the features extracted by the stacked self-encoder from labeled and unlabeled data belonging to the same class, so that their distributions become close. The other is to down-weight abnormal unlabeled samples with large reconstruction errors through the weighted Sinkhorn feature distance, which carries the unlabeled sample weights.
Step four: acquiring field operating data, inputting them into the weighted feature alignment self-encoder classification model, and outputting the corresponding fault category.
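Continuing the PyTorch sketches above, step four amounts to a single forward pass through the trained encoder and classification layer; `field_data`, `model`, and `head` are the hypothetical names used there.

```python
x_field = torch.as_tensor(field_data, dtype=torch.float32)    # online process samples
with torch.no_grad():
    fault_class = head(model.encoder(x_field)).argmax(dim=1)  # predicted fault category
```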
The validity of the method of the invention is verified below with a specific industrial process example. All data were collected on the Tennessee-Eastman (TE) chemical process simulation platform, which, as a typical chemical process research object, is widely used in the fields of fault diagnosis and fault classification. The TE process is shown schematically in FIG. 2; its main equipment includes a continuous stirred-tank reactor, a gas-liquid separation column, a centrifugal compressor, a partial condenser, and a reboiler. The modeled process data contain 16 process variables and 10 fault categories; the process variables and the fault information used here are described in Tables 1 and 2, respectively.
TABLE 1 — descriptions of the 16 modeled process variables.
TABLE 2

Fault number | Description                                      | Fault type
1            | A/C feed flow ratio variation (stream 4)         | Step change
5            | Condenser cooling water inlet temperature change | Step change
7            | Material C pressure loss (stream 4)              | Step change
10           | Material C temperature change (stream 4)         | Random variation
14           | Reactor cooling water valve                      | Sticking
The collected data contain 3600 samples in total from 6 classes, with 600 samples per class. The data were divided into training data (300 labeled and 3000 unlabeled samples) and test data (300 labeled samples). To simulate an inconsistent distribution between the unlabeled and labeled data, Gaussian noise was added to a certain proportion of the original unlabeled data.
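The corruption of the unlabeled set can be sketched as follows; the corrupted fraction and the noise scale `sigma` are not specified in the text and are chosen here only for illustration.

```python
import numpy as np

def corrupt_unlabeled(x_u, ratio=0.3, sigma=1.0, seed=0):
    """Add Gaussian noise to a random fraction of the unlabeled samples so that
    their distribution no longer matches the labeled data."""
    rng = np.random.default_rng(seed)
    x = x_u.copy()
    idx = rng.choice(len(x), size=int(ratio * len(x)), replace=False)
    x[idx] += rng.normal(0.0, sigma, size=x[idx].shape)
    return x
```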
Fig. 3 shows the log reconstruction errors, under the stacked self-encoder reconstruction model, of the labeled data, the normal unlabeled data, and the abnormal unlabeled data whose distribution differs from that of the labeled data. As is apparent from Fig. 3, the reconstruction errors of the labeled data and the normal unlabeled data are relatively close, while the reconstruction errors of the abnormal unlabeled data are significantly larger. This is the basis on which the weighted feature alignment self-encoder detects abnormally distributed unlabeled data.
Fig. 4 shows the classification accuracy of three algorithms under different proportions of distribution inconsistency between the labeled and unlabeled data. MLP is a supervised neural network classification model; Tri-training is a neural network classification model obtained by co-training; and Weighted FA-SAE is the weighted feature alignment self-encoder classification model proposed by the invention. Tri-training and Weighted FA-SAE are semi-supervised deep learning models. As the figure shows, the classification performance of most semi-supervised learning algorithms is superior to that of the supervised algorithm. However, as the proportion of distribution inconsistency between labeled and unlabeled data grows, the performance of the semi-supervised algorithms declines; when the inconsistency reaches 90%, the classification accuracy of Tri-training even falls below that of the supervised MLP. In contrast, the proposed Weighted FA-SAE method achieves better classification performance than both MLP and Tri-training at all levels of distribution inconsistency.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention and is not intended to limit it. Although the invention has been described in detail with reference to the foregoing examples, those skilled in the art may still modify the described technical solutions or substitute equivalents for some of their features. All modifications and equivalents that come within the spirit and principle of the invention are intended to be included within its scope.

Claims (3)

1. A semi-supervised fault classification method based on a weighted feature alignment self-encoder is characterized by comprising the following steps:
Step one: collecting normal working condition data and various fault data of an industrial process to obtain a training data set for modeling: a labeled sample set $D_l = \{(x_i^l,\, y_i^l)\}_{i=1}^{m}$ and an unlabeled sample set $D_u = \{x_j^u\}_{j=1}^{n}$, wherein $x$ represents an input sample, $y$ represents a sample label, $m$ represents the number of labeled samples, and $n$ represents the number of unlabeled samples;
Step two: constructing a stacking self-encoder model for reconstruction, and training the stacking self-encoder model by utilizing a labeled sample set;
step three: estimating the probability density distribution of the reconstruction error of the training data, calculating the weight of the label-free sample, and further constructing a weighted feature alignment self-encoder classification model;
step four: and acquiring field working data, inputting the weighted features to align the self-encoder classification model, and outputting corresponding fault categories.
2. The semi-supervised fault classification method based on weighted feature alignment self-encoder according to claim 1, wherein the second step is specifically divided into the following sub-steps:
(2.1) constructing a stacked self-encoder model for reconstruction, comprising a multi-layer encoder and a decoder, wherein the output of the model is the reconstruction of the input; the calculation formulas are as follows:

$z_k = \sigma(w_k^{e} z_{k-1} + b_k^{e}), \quad z_0 = x$  (1)

$\hat{x} = \sigma(w^{d} z_K + b^{d})$  (2)

wherein $x$ represents the input, $z_k$ represents the extracted $k$-th layer features with $k$ indexing the layers of the stacked self-encoder and $K$ the number of encoder layers, $\{w^{e}, b^{e}\}$ and $\{w^{d}, b^{d}\}$ represent the weight vectors and bias vectors of the encoder and decoder respectively, $\sigma(\cdot)$ is the activation function, and $\hat{x}$ represents the model's reconstruction of the input;
(2.2) training the stacked self-encoder model on the labeled sample set constructed in step one using a stochastic gradient descent algorithm, wherein the model training loss function is defined as the reconstruction error of the input, given by the following formula:

$L_{rec} = \frac{1}{m}\sum_{i=1}^{m} \left\| x_i^{l} - \hat{x}_i^{l} \right\|^2$  (3)

wherein $x_i^{l}$ represents the $i$-th labeled input sample and $\hat{x}_i^{l}$ represents its reconstruction by the stacked self-encoder;
(2.3) calculating the reconstruction errors of the labeled samples $E_l = \{e_i^{l}\}_{i=1}^{m}$ with the trained stacked self-encoder model, wherein the reconstruction error of a single sample is calculated with reference to the following formula:

$e_i = \left\| x_i - \hat{x}_i \right\|^2$  (4)
3. The semi-supervised fault classification method based on the weighted feature alignment self-encoder according to claim 2, wherein step three is specifically divided into the following sub-steps:
(3.1) calculating the distribution parameters $g$ and $h$ of the scaled $\chi^2$ distribution $E_l \sim g\,\chi^2(h)$ obeyed by the reconstruction errors $E_l$ of the labeled samples:

$g \cdot h = \mathrm{mean}(E_l)$  (5)

$2g^2 \cdot h = \mathrm{variance}(E_l)$  (6)
(3.2) calculating the reconstruction errors of the unlabeled samples $E_u = \{e_j^{u}\}_{j=1}^{n}$, where the reconstruction error of a single sample is calculated as in formula (4);
(3.3) calculating the probability $P_u$ of the reconstruction errors $E_u$ of the unlabeled samples occurring under the distribution fitted to $E_l$, and normalizing $P_u$ to obtain the weights of the unlabeled samples $\lambda^{u} = \{\lambda_j^{u}\}_{j=1}^{n}$;
(3.4) constructing the weighted feature alignment self-encoder classification model and training it with the labeled sample set, the unlabeled sample set, and the corresponding weights. The training process comprises unsupervised pre-training and supervised fine-tuning. In the unsupervised pre-training phase, the labeled samples and unlabeled samples are used together to train a stacked self-encoder; the unsupervised pre-training method is the same as steps (2.1)-(2.3). For supervised fine-tuning, a fully connected neural network layer is added on top of the stacked self-encoder obtained by unsupervised pre-training and used as the category output, yielding the deep features and predicted category labels of the labeled samples and of the unlabeled samples; the specific calculation formulas are as follows:

$h_i^{l} = f_{enc}(x_i^{l})$  (7)

$\hat{y}_i^{l} = \mathrm{softmax}(w_c h_i^{l} + b_c)$  (8)

$h_j^{u} = f_{enc}(x_j^{u})$  (9)

$\hat{y}_j^{u} = \mathrm{softmax}(w_c h_j^{u} + b_c)$  (10)

wherein $f_{enc}(\cdot)$ denotes the stacked encoder, $h_i^{l}$ represents the deep features of the $i$-th labeled sample, $\hat{y}_i^{l}$ represents the predicted category label of the $i$-th labeled sample, $\{w_c, b_c\}$ represent the weight vector and bias vector of the fully connected neural network layer, $h_j^{u}$ represents the deep features of the $j$-th unlabeled sample, and $\hat{y}_j^{u}$ represents its predicted category label;
(3.7) assuming the number of classes is $F$, obtaining for each class $f \in F$ the deep features of the labeled samples and unlabeled samples assigned to that class, $H_f^{l} = \{h_i^{l,f}\}_{i=1}^{m_f}$ and $H_f^{u} = \{h_j^{u,f}\}_{j=1}^{n_f}$, and the weights of the unlabeled samples $\lambda_f^{u} = \{\lambda_j^{u,f}\}_{j=1}^{n_f}$;
(3.8) calculating the training loss function of the weighted feature alignment self-encoder classification model using the following formulas:

$L = \mathrm{crossentropy}(y^{l}, \hat{y}^{l}) + \alpha \sum_{f=1}^{F} S_w(H_f^{l}, H_f^{u}) + \beta\, \Omega(W)$  (11)

$S_w(H_f^{l}, H_f^{u}) = \sum_{i=1}^{m_f} \sum_{j=1}^{n_f} \lambda_j^{u,f}\, p_{ij}\, d_{ij}$  (12)

$d_{ij} = \left\| h_i^{l,f} - h_j^{u,f} \right\|^2$  (13)

wherein crossentropy represents the cross-entropy loss function; $S_w(\cdot,\cdot)$ represents the weighted Sinkhorn distance function, which measures the distance between the feature distribution of the labeled data and that of the unlabeled data belonging to the same category and, at the same time, down-weights abnormal unlabeled samples with large reconstruction errors; $\alpha$ is the weight of the Sinkhorn distance; $\Omega(W)$ is the $l_2$ regularization penalty term on the network parameters $W$ and $\beta$ is its weight; $p_{ij}$ represents the transition probability from the features $h_i^{l,f}$ of labeled sample $i$ of class $f$ to the features $h_j^{u,f}$ of unlabeled sample $j$; $d_{ij}$ represents the distance between those features; $\lambda_j^{u,f}$ represents the weight of unlabeled sample $j$ assigned to class $f$; and $m_f$ and $n_f$ represent the numbers of labeled and unlabeled samples of class $f$, respectively.
CN202110575307.4A (filed 2021-05-26, priority 2021-05-26) Semi-supervised fault classification method based on weighted feature alignment self-encoder — Active, granted as CN113222045B

Priority Applications (1)

CN202110575307.4A — priority date 2021-05-26, filing date 2021-05-26 — Semi-supervised fault classification method based on weighted feature alignment self-encoder

Publications (2)

CN113222045A, published 2021-08-06
CN113222045B, published 2022-06-24

Family

ID: 77098569
Family application: CN202110575307.4A (filed 2021-05-26) — Active — granted as CN113222045B
Country: CN



Patent Citations (2)

CN111026058A (priority 2019-12-16, published 2020-04-17): Semi-supervised deep learning fault diagnosis method based on Wasserstein distance and self-encoder
CN112183581A (priority 2020-09-07, published 2021-01-05): Semi-supervised mechanical fault diagnosis method based on adaptive transfer neural network

Non-Patent Citations (2)

SHEN ZHANG ET AL.: "Semi-Supervised Bearing Fault Diagnosis and Classification Using Variational Autoencoder-Based Deep Generative Models", IEEE SENSORS JOURNAL
邵伟明 (Shao Weiming) et al.: "Semi-supervised dynamic soft sensor modeling method based on recurrent neural networks", Journal of Electronic Measurement and Instrumentation (电子测量与仪器学报)

Cited By (5)

CN113705729A (priority 2021-09-27, published 2021-11-26): Garbage classification model modeling method, garbage classification method, device, and medium
CN115184054A (priority 2022-05-30, published 2022-10-14): Semi-supervised fault detection and analysis method for mechanical equipment, device, terminal, and medium
CN115184054B (granted 2022-12-27)
WO2023231374A1 (priority 2022-05-30, published 2023-12-07): Semi-supervised fault detection and analysis method and apparatus for mechanical device, terminal, and medium
CN114819108A (priority 2022-06-22, published 2022-07-29): Fault identification method and device for integrated energy system


Similar Documents

CN113222045B (2022-06-24): Semi-supervised fault classification method based on weighted feature alignment self-encoder
CN109146246B: Fault detection method based on auto-encoder and Bayesian network
CN106649789B: Industrial process fault classification method based on integrated semi-supervised Fisher discriminant analysis
CN106843195B: Fault classification method based on adaptive integrated semi-supervised Fisher discriminant analysis
CN113642754B: Complex industrial process fault prediction method based on RF denoising auto-encoding information reconstruction and temporal convolutional network
CN101169623B: Nonlinear process fault identification method based on kernel principal component analysis contribution plots
CN109297689B: Intelligent diagnosis method for large-scale hydraulic machinery introducing weight factors
CN113222046B: Feature alignment self-encoder fault classification method based on a filtering strategy
CN108875771B: Fault classification model and method based on sparse Gaussian-Bernoulli restricted Boltzmann machine and recurrent neural network
CN111026058B: Semi-supervised deep learning fault diagnosis method based on Wasserstein distance and self-encoder
CN107153409A: Non-Gaussian process monitoring method based on missing-variable modeling
CN108345284A: Quality-related fault detection method based on two variable blocks
CN108375965A: Non-Gaussian process monitoring method based on removing cross-dependency among variable blocks
CN110765587A: Complex petrochemical process fault diagnosis method based on dynamic regularized discriminant locality preserving projection
WO2023273249A1: TSVM-model-based abnormality detection method for automatic verification system of smart electricity meters
CN108830006B: Linear-nonlinear industrial process fault detection method based on linear evaluation factors
CN112904810B: Nonlinear process monitoring method for the process industry based on effective feature selection
CN114757269A: Refined fault detection method for complex processes based on local subspace neighborhood preserving embedding
CN106022352A: Submersible piston pump fault diagnosis method based on support vector machine
CN111639304A: CSTR fault localization method based on an XGBoost regression model
CN111259523A: Process monitoring method based on a KPCA-CVA model and random algorithm
CN114492614A: Method and device for classifying faults in the hot rolling process of strip steel based on ensemble learning
CN116204825A: Data-driven fault detection method for production line equipment
CN115564021A: Fault root cause ranking method for the polyester fiber polymerization process
CN115618708A: Equipment health state prediction method based on an incremental Informer algorithm

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant