CN111859798A - Process industry fault diagnosis method based on bidirectional long and short time neural network - Google Patents
- Publication number
- CN111859798A (application CN202010675680.2A)
- Authority
- CN
- China
- Prior art keywords
- fault
- model
- neural network
- short
- bidirectional long
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses a process-industry fault diagnosis method based on a bidirectional long short-term memory (BiLSTM) neural network, comprising the following steps. S1, data set preparation: in the TE model, faults are introduced starting from sample 160, and the data set is used to build the monitoring model. S2, feature extraction: data features are extracted with a gradient boosting machine, which searches, along the direction of gradient descent, for an additive model composed of multiple classifiers. S3, setting up the experimental platform. S4, running the experiment: a bidirectional long short-term memory network model is built with the Keras framework, with the Tennessee Eastman model as the research object. S5, experimental results. By exploiting the strong generalization ability of the bidirectional LSTM, which avoids the vanishing- and exploding-gradient problems of long sequences, the invention addresses the low accuracy, the frequent missed and false alarms, and the poor generalization of process-industry fault diagnosis.
Description
Technical Field
The invention relates to the technical field of fault diagnosis, and in particular to a process-industry fault diagnosis method based on a bidirectional long short-term memory neural network.
Background Art
With the rapid development of computer technology and modern industry, industrial processes are becoming increasingly intelligent and complex, placing higher demands on the safety of production. Failures can cause heavy casualties and property losses, so detecting or diagnosing system faults in time is particularly important. The development of fault diagnosis has gone through three stages. The first stage relied mainly on the experience, senses, and simple data of experts and maintenance personnel; since production equipment at that stage was relatively simple, fault diagnosis and monitoring were simple as well. In the second stage, with the development of sensor and signal technology, fault diagnosis and detection centered on instrumentation and was widely applied in maintenance. In the third stage, with the development of computer technology and artificial intelligence, fault diagnosis and detection entered the intelligent era.
By modeling approach, fault diagnosis techniques can be divided into quantitative models, qualitative models, and data-driven models. Quantitative models include state estimation, parameter estimation, and analytical redundancy; all of these require an accurate mechanistic model, yet in industrial processes such models are hard to build because of nonlinearity, time-varying behavior, variable coupling, temporal correlation, multiple operating modes, and intermittency. Qualitative models rely mainly on expert knowledge and perform fault diagnosis and localization through causal, deductive reasoning; as industry develops, ever more unknown faults appear for which complete knowledge cannot be obtained, so this approach is limited. Data-driven models build a discrete model from normal-operation data and refine it with large amounts of data so that it fits the process better; they require no precise mechanistic model and are therefore well suited to today's complex process industries. In recent years, advances in sensor technology and real-time data storage have preserved large volumes of data and thus laid the foundation for data-driven models, whose main branches are multivariate statistical analysis, machine learning, signal processing, and information fusion.
Traditional process-industry fault diagnosis methods suffer from low accuracy, frequent missed and false alarms, and poor generalization. As technical means have improved, large amounts of fault data can now be preserved; we therefore propose a process-industry fault diagnosis method based on a bidirectional long short-term memory neural network.
Summary of the Invention
The process-industry fault diagnosis method based on a bidirectional long short-term memory neural network proposed by the invention solves the problems raised in the background art above.
To achieve the above object, the invention adopts the following technical solution:
A process-industry fault diagnosis method based on a bidirectional long short-term memory neural network, comprising the following steps:
S1, data set preparation: the TE model introduces faults starting from sample 160, and the data set is used to build the monitoring model. The data comprise 21 predefined faults plus one normal-operation data set. The normal-condition test set is stored in d00_te.txt and its training set in d00.txt; the fault 1 test set is stored in d01_te.txt and its training set in d01.txt; …; the fault 21 test set is d21_te.txt and its training set d21.txt. The 520 normal-condition samples, records 1 to 520 of the training set d00.txt, are used for modeling, and faults 8, 12, 13, 17, and 20 are selected to verify the accuracy of the model;
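The labeling convention of S1 (a TE test run is normal up to sample 160 and faulty from sample 160 onward) can be sketched as follows. This is only a minimal illustration: the file contents are replaced by synthetic stand-in arrays, and the sample counts (520 training records, 960-sample test runs, 52 variables) are taken from the text and from the commonly used TE benchmark layout.

```python
import numpy as np

def label_te_run(data, fault_start=160):
    """Label one TE test run: samples before `fault_start` are normal (0),
    samples from `fault_start` onward are faulty (1)."""
    labels = np.zeros(len(data), dtype=int)
    labels[fault_start:] = 1
    return labels

# Synthetic stand-ins for d00.txt (normal training data, records 1-520)
# and a fault test run such as d08_te.txt (assumed 960 samples, 52 variables).
rng = np.random.default_rng(0)
d00 = rng.normal(size=(520, 52))      # normal-condition training set
d08_te = rng.normal(size=(960, 52))   # fault-8 test run (stand-in)

y08 = label_te_run(d08_te, fault_start=160)
print(y08[:160].sum(), y08.sum())  # 0 800
```

With real data, the stand-in arrays would be replaced by e.g. `np.loadtxt("d08_te.txt")`.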
S2, feature extraction: data features are extracted with a gradient boosting machine, which searches, along the direction of gradient descent, for an additive model composed of multiple classifiers;
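One common way to realize gradient-boosting-based feature extraction is to rank the process variables by the importance a fitted gradient boosting machine assigns them and keep the top-scoring ones. The patent does not spell out its exact procedure, so the sketch below, using scikit-learn's `GradientBoostingClassifier` on synthetic data, is only an illustrative assumption.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 52))                  # 52 TE-style process variables
y = (X[:, 3] + 0.5 * X[:, 17] > 0).astype(int)  # labels driven by vars 3 and 17

gbm = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)

top = np.argsort(gbm.feature_importances_)[::-1][:10]  # 10 most important vars
X_selected = X[:, top]                                 # reduced feature matrix
print(X_selected.shape)  # (400, 10)
```

The selected columns would then be the input sequence for the downstream classifier.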
S3, experimental platform: a Lenovo ThinkPad running Windows 10 Home Chinese Edition (64-bit), with an Intel(R) Core(TM) i5-8250U CPU @ 1.60 GHz (8 logical CPUs) and 8192 MB of RAM; the fault experiments are carried out in a Python 3.7 environment;
S4, experiment: a bidirectional long short-term memory network model is built with the Keras framework; the research object is the Tennessee Eastman model;
Faults 8, 12, 13, 17, and 20 are used for verification; features are extracted from the fault data sets and the normal-condition data set with the gradient boosting machine;
The extracted features are then used as the input of the bidirectional long short-term memory network, which performs binary classification; accuracy is the experimental criterion, and the results are presented as box plots;
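A minimal Keras sketch of the BiLSTM binary classifier described above. The layer width, optimizer, and feature count are illustrative assumptions, not values given in the patent; only the Time-Step of 350 and the sigmoid binary output follow the text.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dense

def build_bilstm(timesteps=350, n_features=10, units=32):
    """Bidirectional LSTM for binary fault/normal classification."""
    model = Sequential([
        Bidirectional(LSTM(units), input_shape=(timesteps, n_features)),
        Dense(1, activation="sigmoid"),  # binary output: fault vs. normal
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_bilstm()
print(model.output_shape)  # (None, 1)
```

Training would then call `model.fit` on windows of the extracted features and their 0/1 labels.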
S5, experimental results: for faults 8, 12, 17, and 20, the upper and lower accuracy bounds are almost 100%, and the relatively few outliers show that the accuracy is stable, fluctuates little, and is high. Fault 13, by its fault type, is a slow-drift fault in the reaction kinetics; such faults tend to fluctuate strongly, yet with the bidirectional LSTM the upper accuracy bound is 100%, the median 0.89, and the lower bound 0.55. The experimental results indicate good performance.
Preferably, in the selection of the sample data in step S4, faults 8, 12, 13, 17, and 20 of the Tennessee Eastman data set are selected as the fault set, and the fault samples are split into a training set and a test set at a ratio of 7:3.
Preferably, the sample data are modeled online with a Time-Step of 350. Each run in PyCharm yields a corresponding accuracy value; feeding the collected accuracies into the box-plot code produces the BiLSTM box plot, which depicts the accuracy distribution stably without being distorted by outliers.
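The box-plot step described above can be sketched with matplotlib. The per-run accuracy values here are invented for illustration (the fault 13 list is chosen so its median reproduces the 0.89 reported in the text).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; no display needed
import matplotlib.pyplot as plt

# Hypothetical per-run accuracies, one list per fault.
accuracies = {
    "fault 8":  [0.97, 0.99, 1.00, 1.00, 0.98],
    "fault 13": [0.55, 0.84, 0.89, 0.93, 1.00],
}

fig, ax = plt.subplots()
ax.boxplot(list(accuracies.values()))
ax.set_xticks([1, 2])
ax.set_xticklabels(accuracies.keys())
ax.set_ylabel("accuracy")
fig.savefig("bilstm_boxplot.png")
```

Each box then summarizes the spread of accuracies across repeated runs for one fault.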
The beneficial effects of the invention are as follows: the strong generalization ability of the bidirectional long short-term memory network, which avoids the vanishing and exploding gradients of long sequences, solves the low accuracy, frequent missed and false alarms, and poor generalization of process-industry fault diagnosis; applying the bidirectional LSTM in the process industry meets the requirements of technological innovation; and diagnosing process-industry faults in advance with the bidirectional LSTM, and formulating corresponding measures, can greatly reduce casualties and property losses.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the operation of the forget gate of the invention;
Fig. 2 is a schematic diagram of the operation of the input gate of the invention;
Fig. 3 is a schematic diagram of the operation of the output gate of the invention;
Fig. 4 is a schematic diagram of the bidirectional LSTM operation model of the invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings; obviously, the described embodiments are only some, not all, of the embodiments of the invention.
Referring to Figs. 1-4, the process-industry fault diagnosis method based on a bidirectional long short-term memory neural network comprises the following steps:
S1, data set preparation: the TE model introduces faults starting from sample 160, and the data set is used to build the monitoring model. The data comprise 21 predefined faults plus one normal-operation data set. The normal-condition test set is stored in d00_te.txt and its training set in d00.txt; the fault 1 test set is stored in d01_te.txt and its training set in d01.txt; …; the fault 21 test set is d21_te.txt and its training set d21.txt. The 520 normal-condition samples, records 1 to 520 of the training set d00.txt, are used for modeling, and faults 8, 12, 13, 17, and 20 are selected to verify the accuracy of the model;
S2, feature extraction: data features are extracted with a gradient boosting machine, which searches, along the direction of gradient descent, for an additive model composed of multiple classifiers;
S3, experimental platform: a Lenovo ThinkPad running Windows 10 Home Chinese Edition (64-bit), with an Intel(R) Core(TM) i5-8250U CPU @ 1.60 GHz (8 logical CPUs) and 8192 MB of RAM; the fault experiments are carried out in a Python 3.7 environment;
S4, experiment: a bidirectional long short-term memory network model is built with the Keras framework; the research object is the Tennessee Eastman model;
Faults 8, 12, 13, 17, and 20 are used for verification; features are extracted from the fault data sets and the normal-condition data set with the gradient boosting machine;
The extracted features are then used as the input of the bidirectional long short-term memory network, which performs binary classification; accuracy is the experimental criterion, and the results are presented as box plots;
S5, experimental results: for faults 8, 12, 17, and 20, the upper and lower accuracy bounds are almost 100%, and the relatively few outliers show that the accuracy is stable, fluctuates little, and is high. Fault 13, by its fault type, is a slow-drift fault in the reaction kinetics; such faults tend to fluctuate strongly, yet with the bidirectional LSTM the upper accuracy bound is 100%, the median 0.89, and the lower bound 0.55. The experimental results indicate good performance.
Specifically, in the experiment of step S4, the bidirectional long short-term memory network model is built on the basis of the long short-term memory network (LSTM). The LSTM is a variant of the recurrent neural network that overcomes the vanishing and exploding gradients that recurrent networks suffer on long sequences. It introduces a cell state on top of the hidden state and uses gates to select among the data, passing useful information on to the next neuron and discarding irrelevant information. The LSTM comprises a forget gate, an input gate, and an output gate; after passing through a sigmoid or tanh network layer, each gate takes a value in (0, 1), where an output of 1 means that all information of the current sample may be used for the next sample and an output of 0 means that the next sample is independent of the current sample;
The forget gate controls, with a certain probability, whether the hidden cell state of the previous layer is forgotten. Its inputs are the previous hidden state h(t-1) and the current sequence datum x(t); passed through the sigmoid activation function, they yield f(t), a value between 0 and 1. This mapping to [0, 1] is the memory decay coefficient: the closer to 0, the more the information should be discarded, and the closer to 1, the more it should be kept. The forget gate is what guarantees that the LSTM can retain long-term memory. Its operation is shown in Fig. 1;
The output of the forget gate is:
f_t = σ(W_f h_{t-1} + U_f X_t + b_f)  (1)
where σ denotes the sigmoid activation function and W_f, U_f, b_f are weights of the neural network. The forget gate first reads the hidden-layer output h_{t-1} of the previous sample and the current sample X_t, then outputs a value that acts on the cell state C_{t-1}.
Input gate: the LSTM has three inputs: the network input X(t) at the current time, the LSTM output at the previous time, and the previous cell state. The input gate handles the current sequence position and consists of two parts: 1) a sigmoid activation that outputs i(t) and decides which values will be updated; 2) a tanh activation that outputs a_t. The two results are multiplied to update the cell state. The operation of the input gate is shown in Fig. 2.
The outputs of the input gate are:
i_t = σ(W_i h_{t-1} + U_i X_t + b_i)  (2)
a_t = tanh(W_m h_{t-1} + U_m X_t + b_m)  (3)
Output gate: the LSTM has two outputs, one for the cell state and one for the hidden state.
The cell-state update is the joint result of the forget gate and the input gate and consists of two parts: the first is the product of c_{t-1} and the forget-gate output f_t, and the second is the product of the input-gate outputs i_t and a_t.
c_t = c_{t-1} f_t + i_t a_t  (4)
The hidden-state output consists of two parts: the first is O_t, obtained from the previous hidden state h_{t-1}, the current sequence datum X_t, and the sigmoid activation; the second is formed from the cell state c_t and the tanh activation. The operation of the output gate is shown in Fig. 3.
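Equations (1)-(4) can be checked with a few lines of NumPy. The weights here are random placeholders, and the output-gate equations, which the text describes but does not number, follow the standard form o_t = σ(W_o h_{t-1} + U_o X_t + b_o), h_t = o_t · tanh(c_t) as an assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step; W, U, b hold per-gate weights keyed 'f', 'i', 'a', 'o'."""
    f_t = sigmoid(W["f"] @ h_prev + U["f"] @ x_t + b["f"])  # forget gate (1)
    i_t = sigmoid(W["i"] @ h_prev + U["i"] @ x_t + b["i"])  # input gate  (2)
    a_t = np.tanh(W["a"] @ h_prev + U["a"] @ x_t + b["a"])  # candidate   (3)
    c_t = c_prev * f_t + i_t * a_t                          # cell state  (4)
    o_t = sigmoid(W["o"] @ h_prev + U["o"] @ x_t + b["o"])  # output gate (assumed)
    h_t = o_t * np.tanh(c_t)                                # hidden state (assumed)
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
W = {g: rng.normal(size=(n_hid, n_hid)) for g in "fiao"}
U = {g: rng.normal(size=(n_hid, n_in)) for g in "fiao"}
b = {g: np.zeros(n_hid) for g in "fiao"}

h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)
print(h.shape, c.shape)  # (3,) (3,)
```

Since h_t is a sigmoid output times a tanh, every component of h_t lies strictly between -1 and 1.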
The bidirectional LSTM (bi-directional LSTM, BiLSTM) builds on the LSTM by introducing a variant structure with forward and backward time directions, in analogy to the way humans draw on both preceding and following context when understanding text. The BiLSTM structure is shown in Fig. 4. Each hidden unit stores two pieces of information, A and A*: A takes part in the forward pass and A* in the backward pass; the forward LSTM processes X_1 to X_n while the backward LSTM processes X_n to X_1, and together they determine the output y. In this way the output draws on features from both past and future: in the forward pass the hidden unit S_t is correlated with S_{t-1}, and in the backward pass the hidden units are correlated in the reverse order. For time series without a causal constraint, the bidirectional network uses the known time series and the reversed sequence to extract features of the original sequence from both directions, which significantly improves model accuracy. On sequence problems, bidirectional long short-term memory networks therefore often outperform unidirectional ones, and they have greater potential on complex tasks. The bidirectional LSTM operation model is shown in Fig. 4.
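The forward/backward combination itself is independent of the cell internals. The sketch below runs a toy recurrent cell over the sequence in both directions and concatenates the two hidden sequences, which is how each BiLSTM output y_t sees both past (via A) and future (via A*). The cell here is a plain tanh recurrence standing in for a full LSTM, and a real BiLSTM would use separate weights per direction; both are simplifying assumptions.

```python
import numpy as np

def run_direction(xs, W, U, reverse=False):
    """Run a toy tanh recurrent cell over `xs`; optionally right to left."""
    seq = xs[::-1] if reverse else xs
    h = np.zeros(W.shape[0])
    hs = []
    for x in seq:
        h = np.tanh(W @ h + U @ x)
        hs.append(h)
    hs = hs[::-1] if reverse else hs           # re-align to original time order
    return np.stack(hs)

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 4))                   # sequence of 5 steps, 4 features
W = rng.normal(size=(3, 3))
U = rng.normal(size=(3, 4))

fwd = run_direction(xs, W, U)                  # A:  processes X_1 .. X_n
bwd = run_direction(xs, W, U, reverse=True)    # A*: processes X_n .. X_1
y = np.concatenate([fwd, bwd], axis=1)         # each y_t sees past and future
print(y.shape)  # (5, 6)
```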
Specifically, the gradient boosting algorithm used for data feature extraction in step S2 is an ensemble model; the algorithm proceeds as follows:
F*(x) = argmin_F E_{y,x} L(y, F(x))  (5)
Training a model on the existing data can be regarded as an optimization problem whose goal is the model F*(x) satisfying formula (5); the solution space of this problem lies in function space rather than in a numerical space. If F*(x) is rewritten as a model with parameters, i.e. F(x, p), the function-optimization problem turns into a parameter-optimization problem:
F*(x) = F(x, p*)  (7)
where p denotes the model parameters and the loss is regarded as a function of p. To estimate the optimal parameter p*, a stepwise search is usually adopted: an initial value p_0 is guessed and then improved over M iterations, so that the optimal parameter is written in cumulative form p* = p_0 + Σ_{m=1}^{M} p_m, where p_0 is the initial value and {p_1, p_2, …, p_M} are the increments of each iteration. Gradient descent, also known as the steepest descent method, is a classical numerical optimization method; the idea is that if a real-valued function is differentiable and defined at a point a, it decreases fastest in the direction opposite to the gradient at a, so the minimum of the function is easier to find. If gradient descent is used to solve for the optimal parameter p*, the steps are as follows:
For models that can be expressed in terms of parameters, the parameter-optimization procedure of gradient descent is a very effective way to solve the problem, but it is infeasible for non-parametric models. An alternative is to regard F(x) as a whole as the parameter and search for the optimal function in function space; however, since the sample data are finite, the range of F(x) is restricted to the sample set, so the data points of the sample cannot necessarily be estimated accurately. To overcome this limitation of function space, Friedman proposed the gradient boosting algorithm, whose flow is as follows. The boosting algorithm essentially builds a cumulative model composed of multiple basis functions, expressed as F(x) = Σ_{m=0}^{M} β_m b(x; γ_m)  (8)
where β_m are the expansion coefficients, b(x; γ_m) the basis functions, and b_0(x; γ_0) the initialized basis function. The optimal estimate F*(x) of F(x) is obtained through M boosting steps, with the requirement F*(x) = argmin_F Σ_{i=1}^{N} L(y_i, F(x_i))  (9)
Therefore, solving (8) can be transformed into finding the following parameters, i.e., for m = 1, 2, …, M, (β_m, γ_m) = argmin_{β,γ} Σ_{i=1}^{N} L(y_i, F_{m-1}(x_i) + β b(x_i; γ))  (10)
From (9) and (10), for the boosted model F_{m-1}(x), β_m b(x; γ_m) must satisfy the constraint of (8), i.e., minimize the loss function; in other words, the basis function added at each iteration is the one that decreases (8) the most. In this sense the newly added basis function corresponds to p_m = -ρ_m k_m in gradient descent, and the squared error is used to measure the agreement between the basis function and the negative gradient, namely γ_m = argmin_{β,γ} Σ_{i=1}^{N} [-k_m(x_i) - β b(x_i; γ)]²  (11)
Formula (11) means that β b(x_i; γ) is fitted to the negative gradient -k_m(x_i), with the squared error measuring the closeness; the basis-function parameter γ_m is obtained when the fitting error is minimal;
The gradient boosting algorithm uses decision trees as basis functions. A tree with V terminal nodes can be expressed as

T(x; θ) = Σ_{v=1}^{V} γ_v I(x ∈ R_v)    (12)

where θ = (R_v, γ_v) are the parameters of the tree: R_v is a region of the input-variable space x partitioned during tree construction, and γ_v is the value assigned to each terminal node; θ_m plays the role of the basis-function parameter γ_m above. In the model form of (8), the gradient boosting algorithm can then be expressed as an accumulation of M trees:

F_M(x) = Σ_{m=1}^{M} T(x; θ_m)
For each base tree T_m(x; θ_m) added to the model, θ_m is required to satisfy

θ_m = arg min_θ Σ_{i=1}^{N} L(y_i, F_{m-1}(x_i) + T(x_i; θ))
By the same reasoning applied to the tree form (12) as was applied in (11), we obtain:

θ_m = arg min_θ Σ_{i=1}^{N} [ -k_m(x_i) - T(x_i; θ) ]²
The base tree is thus fit to the gradient of the current loss function, taking the current gradient values together with the input variables x as a new training sample, and the gradient boosting model is then updated:

F_m(x) = F_{m-1}(x) + η T_m(x; θ_m)
where η is a shrinkage coefficient that acts as a regularization measure to prevent overfitting.
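The boosting loop described above can be sketched end to end. This is a minimal illustration, not the patent's implementation: it assumes squared loss (so the negative gradient is the residual), uses exhaustively searched depth-1 trees (stumps) as the base trees T_m, and applies the shrinkage η from the update rule. The data are synthetic.

```python
import numpy as np

def fit_stump(x, r):
    """Exhaustively search split thresholds for a depth-1 regression tree:
    two terminal regions with constant values, as in the tree form (12)."""
    best = (np.inf, x[0], r.mean(), r.mean())
    for t in np.unique(x)[:-1]:                 # last value would leave an empty right region
        left, right = r[x <= t], r[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]

def stump_predict(stump, x):
    t, left_val, right_val = stump
    return np.where(x <= t, left_val, right_val)

def gradient_boost(x, y, M=200, eta=0.1):
    """Gradient boosting for squared loss: each base tree is fit to the
    negative gradient (the residual) and added with shrinkage eta."""
    F = np.full_like(y, y.mean())               # F_0: constant initial model
    stumps = []
    for _ in range(M):
        residual = y - F                        # negative gradient of 1/2 (y - F)^2
        s = fit_stump(x, residual)
        F = F + eta * stump_predict(s, x)       # F_m = F_{m-1} + eta * T_m
        stumps.append(s)
    return F, stumps

# usage: fit a noisy 1-D curve
rng = np.random.default_rng(0)
x = np.linspace(0.0, 6.0, 200)
y = np.sin(x) + rng.normal(0.0, 0.1, size=200)
F, stumps = gradient_boost(x, y)
mse = float(np.mean((y - F) ** 2))
print(mse)   # training MSE approaches the noise variance
```

With enough rounds the training error approaches the noise floor of the data; the shrinkage η trades convergence speed for resistance to overfitting, which is exactly its regularization role in the text.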
Specifically, for the selection of sample data in step S4: fault 8, fault 12, fault 13, fault 17 and fault 20 of the Tennessee Eastman data set are chosen as the fault set, and the fault samples are divided into a training set and a test set in a 7:3 ratio. The fault numbers and fault types are shown in Table 1.
Table 1 Fault numbers and fault types
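A 7:3 split of the fault samples can be sketched as follows. This is an illustration only: the helper name, the random stand-in data, and the 100-samples-per-fault counts are assumptions, with the 52 columns standing in for the Tennessee Eastman process variables.

```python
import numpy as np

def split_7_3(samples, labels, seed=0):
    """Shuffle and split fault samples 7:3 into a training set and a
    test set, as described for step S4."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    cut = int(0.7 * len(samples))
    return (samples[idx[:cut]], labels[idx[:cut]],
            samples[idx[cut:]], labels[idx[cut:]])

# Stand-in data: 100 samples per class for the five TE faults named in
# the text; 52 columns stand for the TE process variables.
fault_ids = np.array([8, 12, 13, 17, 20])
X = np.random.default_rng(1).normal(size=(500, 52))
y = np.repeat(fault_ids, 100)
X_train, y_train, X_test, y_test = split_7_3(X, y)
print(X_train.shape, X_test.shape)   # (350, 52) (150, 52)
```

Shuffling before the split keeps each fault class represented in both sets, which matters when the raw samples are stored grouped by fault number.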
Specifically, the sample data are modeled online with a Time-Step length of 350. Each run in PyCharm yields a corresponding accuracy; feeding the collected accuracies into the box-plot code produces the BiLSTM box plot. Box plots are not unduly affected by outliers and can therefore depict the accuracy distribution stably.
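Preparing a multivariate series for a Time-Step of 350 can be sketched as below. This is a hypothetical helper, not the patent's code: it slices the series into overlapping windows of the (batch, time_step, features) shape a BiLSTM layer consumes, labelling each window by its final time step, on random stand-in data.

```python
import numpy as np

def make_windows(series, labels, time_step=350):
    """Slice a multivariate time series into overlapping windows of
    length time_step, labelling each window by its last time step."""
    X = np.stack([series[i:i + time_step]
                  for i in range(len(series) - time_step + 1)])
    y = labels[time_step - 1:]
    return X, y

# Hypothetical run of 1000 time steps over 52 process variables.
series = np.random.default_rng(0).normal(size=(1000, 52))
labels = np.zeros(1000, dtype=int)
X, y = make_windows(series, labels)
print(X.shape, y.shape)   # (651, 350, 52) (651,)
```

The resulting tensor can be passed directly to a bidirectional LSTM layer; one accuracy value per training run is then collected for the box plot described above.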
In summary, the bidirectional long short-term memory network has strong generalization ability and avoids the vanishing- and exploding-gradient problems of long sequences, thereby addressing the low accuracy, frequent missed and false alarms, and poor generalization of process-industry fault diagnosis. Bidirectional long short-term memory networks have achieved remarkable results in many fields but are so far rarely applied in the domestic process industry, so applying them in this field meets the demand for technological innovation. Diagnosing process-industry faults in advance with a bidirectional long short-term memory network and formulating corresponding measures can greatly reduce casualties and property losses.
The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any equivalent replacement or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to the technical solution and inventive concept of the present invention, shall fall within the protection scope of the present invention.
Claims (3)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010675680.2A CN111859798A (en) | 2020-07-14 | 2020-07-14 | Process industry fault diagnosis method based on bidirectional long and short time neural network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111859798A (en) | 2020-10-30 |
Family
ID=72983909
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010675680.2A Pending CN111859798A (en) | 2020-07-14 | 2020-07-14 | Process industry fault diagnosis method based on bidirectional long and short time neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111859798A (en) |
Citations (6)

Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107992597A (en) * | 2017-12-13 | 2018-05-04 | State Grid Shandong Electric Power Company Electric Power Research Institute | A text structuring method for power grid fault cases |
CN109783997A (en) * | 2019-03-12 | 2019-05-21 | North China Electric Power University | A Transient Stability Evaluation Method of Power System Based on Deep Neural Network |
CN109931678A (en) * | 2019-03-13 | 2019-06-25 | China Jiliang University | Air-conditioning fault diagnosis method based on deep learning LSTM |
CN110132598A (en) * | 2019-05-13 | 2019-08-16 | China University of Mining and Technology | Noise Diagnosis Algorithm for Rolling Bearing Faults in Rotating Equipment |
CN110232395A (en) * | 2019-03-01 | 2019-09-13 | State Grid Henan Electric Power Company Electric Power Research Institute | A power system fault diagnosis method based on fault Chinese text |
CN110261109A (en) * | 2019-04-28 | 2019-09-20 | Luoyang Zhongke Jingshang Intelligent Equipment Technology Co., Ltd. | A rolling bearing fault diagnosis method based on bidirectional memory recurrent neural network |
Non-Patent Citations (3)

Title |
---|
JEROME H. FRIEDMAN: "Greedy Function Approximation: A Gradient Boosting Machine", The Annals of Statistics * |
WANG Shaopeng: "Research on Advertisement Click Prediction Based on Deep Learning", China Masters' Theses Full-text Database * |
JIN Yufeng: "Research on Fault Diagnosis Methods for Rolling Bearings Based on Deep Learning", China Masters' Theses Full-text Database * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113027684A (en) * | 2021-03-24 | 2021-06-25 | 明阳智慧能源集团股份公司 | Intelligent control system for improving clearance state of wind generating set |
CN113027684B (en) * | 2021-03-24 | 2022-05-03 | 明阳智慧能源集团股份公司 | Intelligent control system for improving clearance state of wind generating set |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | |
Application publication date: 20201030 |