CN113222045B - Semi-supervised fault classification method based on weighted feature alignment self-encoder - Google Patents

Info

Publication number: CN113222045B
Authority: CN (China)
Prior art keywords: unlabeled, labeled, encoder, sample, model
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202110575307.4A
Other languages: Chinese (zh)
Other versions: CN113222045A
Inventors: 张新民 (Zhang Xinmin), 张宏毅 (Zhang Hongyi)
Current Assignee: Zhejiang University ZJU (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Zhejiang University ZJU
Priority application: CN202110575307.4A
Published as CN113222045A; granted as CN113222045B


Classifications

    • G06F18/2321 — Pattern recognition; analysing; clustering techniques; non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/214 — Pattern recognition; design or setup of recognition systems or techniques; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/045 — Computing arrangements based on biological models; neural networks; architecture; combinations of networks
    • G06N3/08 — Computing arrangements based on biological models; neural networks; learning methods

Abstract

The invention discloses a semi-supervised fault classification method based on a weighted feature alignment autoencoder. The method first pre-trains a stacked autoencoder for reconstruction on labeled data and estimates the probability density distribution of the reconstruction errors. The weights of the unlabeled samples are then computed from the probability density function of the training-data reconstruction errors. Next, a semi-supervised classification model based on the weighted feature alignment autoencoder is built from the labeled sample set, the unlabeled sample set, and the corresponding weights. The classification model uses a cross-entropy training loss function based on the weighted Sinkhorn distance, which lets the model use both labeled and unlabeled data in the fine-tuning stage; this not only mines the data information deeply but also improves the generalization ability of the network model. At the same time, the weighting strategy significantly improves the robustness of the model.

Description

A Semi-Supervised Fault Classification Method Based on a Weighted Feature Alignment Autoencoder

Technical Field

The invention belongs to the field of industrial process control, and in particular relates to a semi-supervised fault classification method based on a weighted feature alignment autoencoder.

Background Art

Modern industrial processes are developing toward large scale and high complexity. Ensuring the safety of the production process is one of the key problems that the field of industrial process control focuses on and needs to solve. Fault diagnosis is a key technology for guaranteeing the safe operation of industrial processes and is of great significance for improving product quality and production efficiency. Fault classification is one link in fault diagnosis: by learning from historical fault information, it automatically identifies and judges fault types, helping production personnel quickly locate and repair faults and avoid further losses. With the continuous development of modern measurement technology, industrial production processes have accumulated large amounts of data. These data describe the real situation at each production stage, provide valuable resources for understanding, analyzing, and optimizing the manufacturing process, and are the source of intelligence for intelligent manufacturing. How to reasonably exploit the data accumulated during manufacturing and build data-driven intelligent analysis models that better serve intelligent decision-making and quality control is therefore a hot issue of considerable industrial interest. Data-driven fault classification methods use intelligent analysis techniques such as machine learning and deep learning to mine, model, and analyze industrial data in depth, providing data-driven fault diagnosis for users and industry. Most existing data-driven fault classification methods are supervised: when sufficient labeled data are available, such models achieve excellent performance. However, in some industrial scenarios it is difficult to obtain a large amount of labeled data, so there are typically many unlabeled samples and only a few labeled ones. To exploit unlabeled data effectively and improve classification performance, fault classification methods based on semi-supervised learning have attracted growing attention. However, most existing semi-supervised fault classification methods rely on certain data assumptions. Semi-supervised methods based on statistical learning, graph-based methods, and methods that assign labels to unlabeled data via co-training or self-training all rely on the assumption that labeled and unlabeled samples come from the same distribution. This assumption has limitations: data collected from industrial processes often contain substantial noise and outliers, operating conditions may drift, and labeled data are usually manually screened and annotated by experts in the process domain, while unlabeled samples are not screened. Consequently, the unlabeled data are likely to contain abnormal samples whose distribution differs from that of the labeled data. When the distributions of unlabeled and labeled data are inconsistent, the performance of semi-supervised algorithms degrades and can even fall below that of supervised algorithms trained only on labeled data. There is therefore an urgent need for a robust semi-supervised learning method that can still classify faults accurately when the labeled and unlabeled data distributions are inconsistent.

Summary of the Invention

The object of the present invention is to address the deficiencies of the prior art by providing a semi-supervised fault classification method based on a weighted feature alignment autoencoder. The method comprises the following steps:

Step 1: Collect normal-operating-condition data and the various fault data of the industrial process to obtain the training data set for modeling: a labeled sample set D_l = {(x_i^l, y_i^l)}_{i=1}^m and an unlabeled sample set D_u = {x_j^u}_{j=1}^n, where x denotes an input sample, y denotes a sample label, m is the number of labeled samples, and n is the number of unlabeled samples;

Step 2: Construct a stacked autoencoder model for reconstruction, and train it with the labeled sample set;

Step 3: Estimate the probability density distribution of the training-data reconstruction errors, compute the weights of the unlabeled samples, and further construct the weighted feature alignment autoencoder classification model;

Step 4: Collect field working data, input it into the weighted feature alignment autoencoder classification model, and output the corresponding fault category.

Further, step 2 is divided into the following sub-steps:

(2.1) Construct a stacked autoencoder model for reconstruction, consisting of a multi-layer encoder and decoder; the model output is a reconstruction of the input, computed as:

z_k = σ(w_e^k · z_{k−1} + b_e^k),  z_0 = x   (1)

z̃_{k−1} = σ(w_d^k · z̃_k + b_d^k),  z̃_K = z_K,  x̂ = z̃_0   (2)

where x is the input; z_k is the feature extracted at layer k of the stacked autoencoder (K layers in total); {w_e^k, b_e^k} and {w_d^k, b_d^k} are the weight and bias vectors of the encoder and decoder, respectively; and x̂ is the model's reconstruction of the input;

(2.2) Using the labeled samples constructed in step 1, train the stacked autoencoder model with the stochastic gradient descent algorithm. The model training loss function is defined as the reconstruction error of the input:

L_rec = (1/m) · Σ_{i=1}^m ||x_i^l − x̂_i^l||²   (3)

where x_i^l denotes the i-th labeled input sample and x̂_i^l denotes its reconstruction by the stacked autoencoder;

(2.3) Using the trained stacked autoencoder model, compute the reconstruction errors of the labeled samples E_l = {e_i^l}_{i=1}^m, where the reconstruction error of a single sample is calculated as:

e_i = ||x_i − x̂_i||²   (4)

Further, step 3 is divided into the following sub-steps:

(3.1) The reconstruction errors E_l of the labeled samples are taken to follow the χ² distribution g·χ²(h); compute the distribution parameters g and h from

g·h = mean(E_l)   (5)

2g²·h = variance(E_l)   (6)

(3.2) Compute the reconstruction errors of the unlabeled samples E_u = {e_j^u}_{j=1}^n; the reconstruction error of a single sample is computed as in formula (4);

(3.3) Compute the probability P_u that the reconstruction errors E_u of the unlabeled samples occur under the distribution fitted to E_l, and normalize P_u to obtain the unlabeled sample weights λ_u = {λ_j^u}_{j=1}^n;

(3.4) Construct the weighted feature alignment autoencoder classification model and train it with the labeled sample set, the unlabeled sample set, and the corresponding weights. The training process is divided into unsupervised pre-training and supervised fine-tuning. In the unsupervised pre-training stage, a stacked autoencoder is trained on the labeled and unlabeled samples together; the pre-training procedure is the same as steps (2.1)–(2.3). The supervised fine-tuning adds a fully connected neural network layer on top of the pre-trained stacked autoencoder as the class output, yielding the deep features and class labels of the labeled samples as well as the deep features and predicted class labels of the unlabeled samples:

z_i^l = f_e(x_i^l)   (7)

ŷ_i^l = softmax(w_c · z_i^l + b_c)   (8)

z_j^u = f_e(x_j^u)   (9)

ŷ_j^u = softmax(w_c · z_j^u + b_c)   (10)

where f_e(·) denotes the stacked encoder, z_i^l is the deep feature of the i-th labeled sample, ŷ_i^l is its predicted class label, {w_c, b_c} are the weight and bias vectors of the fully connected layer, z_j^u is the deep feature of the j-th unlabeled sample, and ŷ_j^u is its predicted class label;

(3.5) Let the number of classes be F. For each class f ∈ F, obtain the deep features of the labeled and unlabeled samples, Z_f^l = {z_i^{l,f}}_{i=1}^{m_f} and Z_f^u = {z_j^{u,f}}_{j=1}^{n_f}, together with the corresponding unlabeled sample weights λ_f^u = {λ_j^{u,f}}_{j=1}^{n_f};

(3.6) Compute the training loss function of the weighted feature alignment autoencoder classification model as:

L = crossentropy(y^l, ŷ^l) + α · Σ_{f=1}^F S_w^f + β · ||θ||_2²   (11)

S_w^f = Σ_{i=1}^{m_f} Σ_{j=1}^{n_f} p_ij · d_ij · λ_j^{u,f}   (12)

d_ij = ||z_i^{l,f} − z_j^{u,f}||²   (13)

where crossentropy denotes the cross-entropy loss function; S_w^f denotes the weighted Sinkhorn distance, which measures the distance between the feature distributions of the labeled and unlabeled data belonging to the same class while down-weighting abnormal unlabeled samples with large reconstruction errors; α is the weight of the Sinkhorn distance; ||θ||_2² is the L2 regularization penalty on the network parameters and β is its weight; p_ij is the transition probability from the feature z_i^{l,f} of labeled sample i of class f to the feature z_j^{u,f} of unlabeled sample j; d_ij is the distance between these two features; λ_j^{u,f} is the weight of unlabeled sample j of class f; and m_f and n_f are the numbers of labeled and unlabeled samples of class f, respectively.

The beneficial effects of the present invention are as follows:

Aiming at the performance degradation of traditional semi-supervised classification models when the distributions of labeled and unlabeled data are inconsistent, the present invention proposes a robust semi-supervised fault classification method based on a weighted feature alignment autoencoder. The method designs a model training loss function based on weighting and feature alignment strategies. The weighting strategy improves the robustness of the semi-supervised classification model and reduces the performance degradation caused by inconsistent sample distributions. The feature alignment strategy lets the model use both labeled and unlabeled data in the fine-tuning stage, which not only mines the data information deeply but also improves the generalization ability and classification performance of the network model.

Description of the Drawings

Fig. 1 is a schematic diagram of the stacked autoencoder;

Fig. 2 is a flow chart of the TE process;

Fig. 3 is a schematic diagram of the logarithmic reconstruction errors of the data;

Fig. 4 is a schematic diagram of the classification accuracy of the different algorithms.

Detailed Description of the Embodiments

The present invention is described in detail below with reference to the accompanying drawings and preferred embodiments, so that its objects and effects become clearer. It should be understood that the specific embodiments described here are intended only to explain the invention, not to limit it.

In the semi-supervised fault classification method based on the weighted feature alignment autoencoder of the present invention, a stacked autoencoder is first pre-trained for reconstruction on labeled data, and the probability density distribution of the reconstruction errors is estimated. The weights of the unlabeled samples are then computed from the probability density function of the training-data reconstruction errors. Next, a semi-supervised classification model based on the weighted feature alignment autoencoder is built from the labeled sample set, the unlabeled sample set, and the corresponding weights. The classification model uses a cross-entropy training loss function based on the weighted Sinkhorn distance, which lets the model use both labeled and unlabeled data in the fine-tuning stage; this not only mines the data information deeply but also improves the generalization ability of the network model. At the same time, the weighting strategy significantly improves the robustness of the model.

The specific steps of the method of the present invention are as follows:

Step 1: Collect normal-operating-condition data and the various fault data of the industrial process to obtain the training data set for modeling: a labeled sample set D_l = {(x_i^l, y_i^l)}_{i=1}^m and an unlabeled sample set D_u = {x_j^u}_{j=1}^n, where x denotes an input sample, y denotes a sample label, m is the number of labeled samples, and n is the number of unlabeled samples;

Step 2: Construct a stacked autoencoder model for reconstruction, and train it with the labeled sample set. This step is divided into the following sub-steps:

(2.1) Construct a stacked autoencoder model for reconstruction, consisting of a multi-layer encoder and decoder; the model output is a reconstruction of the input, computed as:

z_k = σ(w_e^k · z_{k−1} + b_e^k),  z_0 = x   (1)

z̃_{k−1} = σ(w_d^k · z̃_k + b_d^k),  z̃_K = z_K,  x̂ = z̃_0   (2)

where x is the input; z_k is the feature extracted at layer k of the stacked autoencoder (K layers in total); {w_e^k, b_e^k} and {w_d^k, b_d^k} are the weight and bias vectors of the encoder and decoder, respectively; and x̂ is the model's reconstruction of the input;

(2.2) Using the labeled sample set constructed in step 1, train the stacked autoencoder model with the stochastic gradient descent algorithm. The model training loss function is defined as the reconstruction error of the input:

L_rec = (1/m) · Σ_{i=1}^m ||x_i^l − x̂_i^l||²   (3)

where x_i^l denotes the i-th labeled input sample and x̂_i^l denotes its reconstruction by the stacked autoencoder;

(2.3) Using the trained stacked autoencoder model, compute the reconstruction errors of the labeled samples E_l = {e_i^l}_{i=1}^m, where the reconstruction error of a single sample is calculated as:

e_i = ||x_i − x̂_i||²   (4)
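Steps (2.1)–(2.3) can be summarized in code. The following is a minimal sketch, not the patented implementation: the layer sizes, sigmoid activations, learning rate, and epoch count are illustrative assumptions; only the overall structure — a symmetric stacked encoder/decoder, SGD training on the reconstruction loss of formula (3), and per-sample errors per formula (4) — follows the text.

```python
# Minimal sketch of steps (2.1)-(2.3); layer sizes and hyperparameters are assumptions.
import torch
import torch.nn as nn

class StackedAutoencoder(nn.Module):
    def __init__(self, dims=(16, 32, 16, 8)):  # 16 inputs, as in the TE example below
        super().__init__()
        enc, dec = [], []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            enc += [nn.Linear(d_in, d_out), nn.Sigmoid()]   # z_k = sigma(w_e^k z_{k-1} + b_e^k)
        rev = dims[::-1]
        for d_in, d_out in zip(rev[:-1], rev[1:]):
            dec += [nn.Linear(d_in, d_out), nn.Sigmoid()]   # mirror-image decoder, formula (2)
        self.encoder = nn.Sequential(*enc)
        self.decoder = nn.Sequential(*dec)

    def forward(self, x):
        return self.decoder(self.encoder(x))                # x_hat, the reconstruction of x

def pretrain(model, x_l, epochs=200, lr=1e-2):
    """Train on the labeled inputs with the reconstruction loss of formula (3)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)        # stochastic gradient descent, step (2.2)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((model(x_l) - x_l) ** 2).sum(dim=1).mean()  # (1/m) sum_i ||x_i^l - x_hat_i^l||^2
        loss.backward()
        opt.step()
    return model

def reconstruction_errors(model, x):
    """Per-sample errors e_i = ||x_i - x_hat_i||^2, formula (4)."""
    with torch.no_grad():
        return ((model(x) - x) ** 2).sum(dim=1)
```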

Step 3: Estimate the probability density distribution of the training-data reconstruction errors, compute the weights of the unlabeled samples, and further construct the weighted feature alignment autoencoder classification model. Step 3 is divided into the following sub-steps:

(3.1) The reconstruction errors E_l of the labeled samples are taken to follow the χ² distribution g·χ²(h); compute the distribution parameters g and h from

g·h = mean(E_l)   (5)

2g²·h = variance(E_l)   (6)

(3.2) Compute the reconstruction errors of the unlabeled samples E_u = {e_j^u}_{j=1}^n; the reconstruction error of a single sample is computed as in formula (4);

(3.3) Compute the probability P_u that the reconstruction errors E_u of the unlabeled samples occur under the distribution fitted to E_l, and normalize P_u to obtain the unlabeled sample weights λ_u = {λ_j^u}_{j=1}^n;
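Steps (3.1)–(3.3) reduce to a method-of-moments fit of the scaled χ² distribution followed by a density evaluation. Solving formulas (5)–(6) gives g = variance/(2·mean) and h = 2·mean²/variance. A sketch follows; since the patent does not spell out the normalization, dividing by the sum is an assumption.

```python
# Sketch of steps (3.1)-(3.3): fit E_l ~ g*chi2(h) by moments, then weight unlabeled samples.
import numpy as np
from scipy.stats import chi2

def unlabeled_weights(errors_labeled, errors_unlabeled):
    mu = errors_labeled.mean()
    var = errors_labeled.var()
    g = var / (2.0 * mu)               # from g*h = mean(E_l), formula (5)
    h = 2.0 * mu ** 2 / var            # and 2*g^2*h = variance(E_l), formula (6)
    # Probability density of each unlabeled reconstruction error under the fitted distribution.
    p_u = chi2.pdf(errors_unlabeled, df=h, scale=g)
    return p_u / p_u.sum()             # sum-to-one normalization is an assumption

# Usage with the errors from the pretrained model above (converted to numpy):
# lam_u = unlabeled_weights(reconstruction_errors(sae, x_l).numpy(),
#                           reconstruction_errors(sae, x_u).numpy())
```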

(3.4) Construct the weighted feature alignment autoencoder classification model and train it with the labeled sample set, the unlabeled sample set, and the corresponding weights. The training process can be divided into unsupervised pre-training and supervised fine-tuning.

In the unsupervised pre-training stage, a stacked autoencoder is trained on the labeled and unlabeled samples together. The pre-training procedure is the same as steps (2.1)–(2.3): first a stacked autoencoder model for reconstruction is constructed, and then it is trained on the labeled and unlabeled samples together.

The supervised fine-tuning adds a fully connected neural network layer on top of the pre-trained stacked autoencoder as the class output, yielding the deep features and class labels of the labeled samples as well as the deep features and predicted class labels of the unlabeled samples:

z_i^l = f_e(x_i^l)   (7)

ŷ_i^l = softmax(w_c · z_i^l + b_c)   (8)

z_j^u = f_e(x_j^u)   (9)

ŷ_j^u = softmax(w_c · z_j^u + b_c)   (10)

where f_e(·) denotes the stacked encoder, z_i^l is the deep feature of the i-th labeled sample, ŷ_i^l is its predicted class label, {w_c, b_c} are the weight and bias vectors of the fully connected layer, z_j^u is the deep feature of the j-th unlabeled sample, and ŷ_j^u is its predicted class label;
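The fine-tuning head of formulas (7)–(10) is just the pre-trained encoder followed by one fully connected layer and a softmax. A sketch reusing the StackedAutoencoder above; the feature dimension and the class count of 6 (matching the TE example below) are assumptions.

```python
# Sketch of the supervised fine-tuning head, formulas (7)-(10).
import torch.nn as nn

class WeightedFASAE(nn.Module):
    def __init__(self, pretrained_sae, feat_dim=8, n_classes=6):
        super().__init__()
        self.encoder = pretrained_sae.encoder       # deep feature extractor f_e
        self.head = nn.Linear(feat_dim, n_classes)  # fully connected layer {w_c, b_c}

    def forward(self, x):
        z = self.encoder(x)                         # deep features z, formulas (7)/(9)
        return z, self.head(z)                      # logits; softmax yields y_hat, (8)/(10)
```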

(3.5) Let the number of classes be F. For each class f ∈ F, obtain the deep features of the labeled and unlabeled samples, Z_f^l = {z_i^{l,f}}_{i=1}^{m_f} and Z_f^u = {z_j^{u,f}}_{j=1}^{n_f}, together with the corresponding unlabeled sample weights λ_f^u = {λ_j^{u,f}}_{j=1}^{n_f};

(3.6) Compute the training loss function of the weighted feature alignment autoencoder classification model as:

L = crossentropy(y^l, ŷ^l) + α · Σ_{f=1}^F S_w^f + β · ||θ||_2²   (11)

S_w^f = Σ_{i=1}^{m_f} Σ_{j=1}^{n_f} p_ij · d_ij · λ_j^{u,f}   (12)

d_ij = ||z_i^{l,f} − z_j^{u,f}||²   (13)

where crossentropy denotes the cross-entropy loss function; S_w^f denotes the weighted Sinkhorn distance function; α is the weight of the Sinkhorn distance; ||θ||_2² is the L2 regularization penalty on the network parameters and β is its weight; p_ij is the transition probability from the feature z_i^{l,f} of labeled sample i of class f to the feature z_j^{u,f} of unlabeled sample j; d_ij is the distance between these two features; λ_j^{u,f} is the weight of unlabeled sample j of class f; and m_f and n_f are the numbers of labeled and unlabeled samples of class f, respectively. The newly designed training loss function based on the weighted Sinkhorn distance serves two main purposes. First, in the fine-tuning stage it aligns the features extracted by the stacked autoencoder from labeled and unlabeled data belonging to the same class, bringing their distributions close together. Second, through the unlabeled sample weights in the weighted Sinkhorn feature distance, it down-weights abnormal unlabeled samples with large reconstruction errors.

Step 4: Collect field working data, input it into the weighted feature alignment autoencoder classification model, and output the corresponding fault category.

The effectiveness of the method of the present invention is verified below on a concrete industrial process example. All data were collected from the Tennessee Eastman (TE) chemical simulation platform, which is widely used as a benchmark chemical process in the fields of fault diagnosis and fault classification. The flow chart of the TE process is shown in Fig. 2; its main equipment comprises a continuous stirred-tank reactor, a gas-liquid separation tower, a centrifugal compressor, a partial condenser, and a reboiler. The modeling data contain 16 process variables and 10 fault categories; the process variables and the fault information are described in Table 1 and Table 2, respectively.

Table 1: the 16 process variables (rendered as an image in the original document)

Table 2

Fault No. | Description                                      | Fault type
1         | A/C feed flow ratio change (stream 4)            | Step
5         | Condenser cooling water inlet temperature change | Step
7         | Material C pressure loss (stream 4)              | Step
10        | Temperature change of material C (stream 4)      | Random variable
14        | Reactor cooling water valve                      | Sticking

The collected data comprise 3600 samples in total, drawn from 6 categories with 600 samples collected per category. The data were divided into training data (300 labeled samples and 3000 unlabeled samples) and test data (300 labeled samples). To simulate unlabeled data whose distribution is inconsistent with that of the labeled data, Gaussian noise was added to a given proportion of the original unlabeled data.
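The corruption used to create the inconsistent unlabeled set can be sketched as follows; the noise scale and the exact corruption scheme are assumptions, as the text states only that Gaussian noise is added to a given proportion of the unlabeled data.

```python
# Sketch: corrupt a fraction of the unlabeled samples with Gaussian noise to
# simulate labeled/unlabeled distribution inconsistency (noise scale assumed).
import numpy as np

def corrupt_unlabeled(x_u, ratio, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x_out = x_u.copy()
    idx = rng.choice(len(x_u), size=int(ratio * len(x_u)), replace=False)
    x_out[idx] += rng.normal(0.0, sigma, size=x_out[idx].shape)
    return x_out
```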

Fig. 3 shows the logarithmic reconstruction errors, under the stacked autoencoder reconstruction model, of the labeled data, the normal unlabeled data, and the abnormal unlabeled data whose distribution is inconsistent with the labeled data. It is evident from Fig. 3 that the reconstruction errors of the labeled data and the normal unlabeled data are close, whereas the reconstruction errors of the abnormal unlabeled data are markedly larger than both. This is the basis on which the weighted feature alignment autoencoder detects abnormally distributed unlabeled data.

Fig. 4 shows the classification accuracy of three algorithms under different proportions of labeled/unlabeled distribution inconsistency. The MLP method is a supervised neural network classification model; the Tri-training method is a neural network classification model obtained by co-training; and the Weighted FA-SAE method is the weighted feature alignment autoencoder classification model proposed by the present invention. Tri-training and Weighted FA-SAE are semi-supervised deep learning network models. As the figure shows, the semi-supervised algorithms mostly outperform the supervised one; however, as the proportion of distribution inconsistency between the labeled and unlabeled data grows, the performance of the semi-supervised algorithms declines, and when the inconsistency rate reaches 90% the classification accuracy of the Tri-training method even falls below that of the supervised MLP method. In contrast, the proposed Weighted FA-SAE method outperforms both the MLP and Tri-training methods at all levels of distribution inconsistency.

Those of ordinary skill in the art will understand that the above are only preferred examples of the invention and are not intended to limit it. Although the invention has been described in detail with reference to the foregoing examples, those skilled in the art may still modify the technical solutions described in the foregoing examples or make equivalent replacements of some of their technical features. Any modifications, equivalent replacements, and the like made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (2)

1. A semi-supervised fault classification method based on a weighted feature alignment self-encoder, characterized by comprising the following steps:

Step 1: collecting normal-operating-condition data and various fault data of an industrial process to obtain a training data set for modeling: a labeled sample set D_l = {(x_i^l, y_i^l)}_{i=1}^m and an unlabeled sample set D_u = {x_j^u}_{j=1}^n, wherein x represents an input sample, y represents a sample label, m represents the number of labeled samples, and n represents the number of unlabeled samples;

Step 2: constructing a stacked self-encoder model for reconstruction, and training the stacked self-encoder model with the labeled sample set;

Step 3: estimating the probability density distribution of the training-data reconstruction errors, calculating the weights of the unlabeled samples, and further constructing a weighted feature alignment self-encoder classification model; step 3 is divided into the following sub-steps:

(3.1) the reconstruction errors E_l of the labeled samples are taken to follow the χ² distribution g·χ²(h), whose distribution parameters g and h are calculated from
g·h = mean(E_l)
2g²·h = variance(E_l)

(3.2) calculating the reconstruction errors of the unlabeled samples E_u = {e_j^u}_{j=1}^n, the reconstruction error of a single sample being
e_j = ||x_j − x̂_j||²
wherein x̂_j represents the model's reconstruction of the input;

(3.3) calculating the probability P_u that the reconstruction errors E_u of the unlabeled samples occur under the distribution fitted to E_l, and normalizing P_u to obtain the unlabeled sample weights λ_u;

(3.4) constructing the weighted feature alignment self-encoder classification model, and training it with the labeled sample set, the unlabeled sample set, and the corresponding weights; the training process comprises unsupervised pre-training and supervised fine-tuning; in the unsupervised pre-training stage, a stacked self-encoder is trained on the labeled and unlabeled samples together; the supervised fine-tuning adds a fully connected neural network layer on top of the pre-trained stacked self-encoder as the class output, thereby obtaining the deep features and class labels of the labeled samples and the deep features and predicted class labels of the unlabeled samples:
z_i^l = f_e(x_i^l)
ŷ_i^l = softmax(w_c · z_i^l + b_c)
z_j^u = f_e(x_j^u)
ŷ_j^u = softmax(w_c · z_j^u + b_c)
wherein f_e(·) denotes the stacked encoder, z_i^l represents the deep feature of the i-th labeled sample, ŷ_i^l represents the predicted class label of the i-th labeled sample, {w_c, b_c} represent the weight and bias vectors of the fully connected neural network layer, z_j^u represents the deep feature of an unlabeled sample, and ŷ_j^u represents the predicted class label output;

(3.5) the number of classes being F, obtaining for each class f ∈ F the deep features of the labeled and unlabeled samples, Z_f^l = {z_i^{l,f}}_{i=1}^{m_f} and Z_f^u = {z_j^{u,f}}_{j=1}^{n_f}, and the unlabeled sample weights λ_f^u = {λ_j^{u,f}}_{j=1}^{n_f};

(3.6) calculating the training loss function of the weighted feature alignment self-encoder classification model as:
L = crossentropy(y^l, ŷ^l) + α · Σ_{f=1}^F S_w^f + β · ||θ||_2²
S_w^f = Σ_{i=1}^{m_f} Σ_{j=1}^{n_f} p_ij · d_ij · λ_j^{u,f}
d_ij = ||z_i^{l,f} − z_j^{u,f}||²
wherein crossentropy represents the cross-entropy loss function; S_w^f represents the weighted Sinkhorn distance function, used to measure the distance between the feature distributions of labeled and unlabeled data belonging to the same class while down-weighting abnormal unlabeled samples with large reconstruction errors; α is the weight of the Sinkhorn distance; ||θ||_2² is the L2 regularization penalty on the network parameters and β is its weight; p_ij represents the transition probability from the feature z_i^{l,f} of labeled sample i of class f to the feature z_j^{u,f} of unlabeled sample j; d_ij represents the distance between these two features; λ_j^{u,f} represents the weight of unlabeled sample j of class f; and m_f and n_f represent the numbers of labeled and unlabeled samples of class f, respectively;

Step 4: collecting field working data, inputting it into the weighted feature alignment self-encoder classification model, and outputting the corresponding fault category.

2. The semi-supervised fault classification method based on a weighted feature alignment self-encoder according to claim 1, wherein step 2 is divided into the following sub-steps:

(2.1) constructing a stacked self-encoder model for reconstruction, comprising a multi-layer encoder and decoder, the model output being a reconstruction of the input, calculated as:
z_k = σ(w_e^k · z_{k−1} + b_e^k), z_0 = x
z̃_{k−1} = σ(w_d^k · z̃_k + b_d^k), z̃_K = z_K, x̂ = z̃_0
wherein x represents the input, z_k represents the feature extracted at layer k of the stacked self-encoder (K layers in total), {w_e^k, b_e^k} and {w_d^k, b_d^k} represent the weight and bias vectors of the encoder and decoder respectively, and x̂ represents the model's reconstruction of the input;

(2.2) training the stacked self-encoder model on the labeled samples constructed in step 1 using a stochastic gradient descent algorithm, the model training loss function being defined as the reconstruction error of the input:
L_rec = (1/m) · Σ_{i=1}^m ||x_i^l − x̂_i^l||²
wherein x_i^l represents the i-th labeled input sample and x̂_i^l represents its reconstruction by the stacked self-encoder;

(2.3) calculating the reconstruction errors of the labeled samples E_l = {e_i^l}_{i=1}^m with the trained stacked self-encoder model, the reconstruction error of a single sample being calculated as:
e_i = ||x_i − x̂_i||²

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110575307.4A CN113222045B (en) 2021-05-26 2021-05-26 Semi-supervised fault classification method based on weighted feature alignment self-encoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110575307.4A CN113222045B (en) 2021-05-26 2021-05-26 Semi-supervised fault classification method based on weighted feature alignment self-encoder

Publications (2)

Publication Number Publication Date
CN113222045A CN113222045A (en) 2021-08-06
CN113222045B true CN113222045B (en) 2022-06-24

Family

ID=77098569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110575307.4A Active CN113222045B (en) 2021-05-26 2021-05-26 Semi-supervised fault classification method based on weighted feature alignment self-encoder

Country Status (1)

Country Link
CN (1) CN113222045B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705729B (en) * 2021-09-27 2024-06-25 中原动力智能机器人有限公司 Garbage classification model modeling method, garbage classification device and medium
CN115184054B (en) * 2022-05-30 2022-12-27 深圳技术大学 Mechanical equipment semi-supervised fault detection and analysis method, device, terminal and medium
CN114819108B (en) * 2022-06-22 2022-10-04 中国电力科学研究院有限公司 A kind of comprehensive energy system fault identification method and device
CN117988823B (en) * 2024-01-31 2024-12-03 成都理工大学 Real-time warning method for casing damage during drilling process based on semi-supervised autoencoder

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111026058A (en) * 2019-12-16 2020-04-17 浙江大学 Semi-supervised deep learning fault diagnosis method based on Watherstein distance and self-encoder
CN112183581A (en) * 2020-09-07 2021-01-05 华南理工大学 Semi-supervised mechanical fault diagnosis method based on self-adaptive migration neural network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111026058A (en) * 2019-12-16 2020-04-17 浙江大学 Semi-supervised deep learning fault diagnosis method based on Watherstein distance and self-encoder
CN112183581A (en) * 2020-09-07 2021-01-05 华南理工大学 Semi-supervised mechanical fault diagnosis method based on self-adaptive migration neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Semi-Supervised Bearing Fault Diagnosis and Classification Using Variational Autoencoder-Based Deep Generative Models; Shen Zhang et al.; IEEE Sensors Journal; 2021-03-01; pp. 6476-6486 *
Semi-supervised dynamic soft sensor modeling method based on recurrent neural networks; Shao Weiming et al.; Journal of Electronic Measurement and Instrumentation; 2019-11-15 (No. 11); full text *

Also Published As

Publication number Publication date
CN113222045A (en) 2021-08-06

Similar Documents

Publication Publication Date Title
CN113222045B (en) Semi-supervised fault classification method based on weighted feature alignment self-encoder
CN113496262B (en) Data-driven active power distribution network abnormal state sensing method and system
CN107941537B (en) A method for evaluating the health status of mechanical equipment
CN107272644B (en) DBN Network Fault Diagnosis Method for Submersible Reciprocating Pumping Unit
CN111914897A (en) A fault diagnosis method based on twinning long short-term memory network
CN108875772A (en) A kind of failure modes model and method being limited Boltzmann machine and intensified learning based on the sparse Gauss Bernoulli Jacob of stacking
CN112904810B (en) Process industry nonlinear process monitoring method based on effective feature selection
CN114692507B (en) Soft-sensing modeling method for count data based on stacked Poisson autoencoder network
CN113222046B (en) Feature alignment self-encoder fault classification method based on filtering strategy
CN109298633A (en) Fault monitoring method in chemical production process based on adaptive block non-negative matrix decomposition
CN109829561B (en) Accident prediction method based on smoothing processing and network model machine learning
CN115563563A (en) Fault diagnosis method and device based on transformer oil chromatographic analysis
CN115618708A (en) Equipment health state prediction method based on incremental inform algorithm
CN113608968A (en) An abnormal detection method of power dispatch monitoring data based on comprehensive decision-making of density and distance
WO2023231374A1 (en) Semi-supervised fault detection and analysis method and apparatus for mechanical device, terminal, and medium
CN116661410A (en) Large-scale industrial process fault detection and diagnosis method based on weighted directed graph
CN117349786A (en) Evidence fusion transformer fault diagnosis method based on data balancing
CN108830006A (en) Linear-nonlinear industrial processes fault detection method based on the linear evaluation factor
CN113283546B (en) Furnace condition abnormity alarm method and system of heating furnace integrity management centralized control device
CN118940122A (en) A dual noise autoencoder process fault classification method
CN102830624A (en) Semi-supervised monitoring method of production process of polypropylene based on self-learning statistic analysis
CN102880151A (en) Double-layer data model-driven plant-level chemical process monitoring method
CN114298413B (en) A method for predicting the swing trend of hydropower units
CN116796894A (en) An efficient deep learning weather prediction model construction method
CN115293221A (en) Anomaly detection method of power dispatch monitoring data based on rate of change of directional density ratio

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant