CN105739489B - A kind of batch process fault detection method based on ICA KNN - Google Patents

A batch process fault detection method based on ICA-KNN

Info

Publication number
CN105739489B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610313490.XA
Other languages
Chinese (zh)
Other versions
CN105739489A (en)
Inventor
何建
章文
邹见效
凡时财
张刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201610313490.XA
Publication of CN105739489A
Application granted
Publication of CN105739489B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00: Testing or monitoring of control systems or parts thereof
    • G05B23/02: Electric testing or monitoring
    • G05B23/0205: Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0218: Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults
    • G05B23/0221: Preprocessing measurements, e.g. data collection rate adjustment; Standardization of measurements; Time series or signal analysis, e.g. frequency analysis or wavelets; Trustworthiness of measurements; Indexes therefor; Measurements using easily measured parameters to estimate parameters difficult to measure; Virtual sensor creation; De-noising; Sensor fusion; Unconventional preprocessing inherently present in specific fault detection methods like PCA-based methods
    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B23/00: Testing or monitoring of control systems or parts thereof
    • G05B23/02: Electric testing or monitoring
    • G05B23/0205: Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults
    • G05B23/0218: Electric testing or monitoring by means of a monitoring system capable of detecting and responding to faults characterised by the fault detection method dealing with either existing or incipient faults
    • G05B23/0224: Process history based detection method, e.g. whereby history implies the availability of large amounts of data
    • G05B23/024: Quantitative history assessment, e.g. mathematical relationships between available data; Functions therefor; Principal component analysis [PCA]; Partial least square [PLS]; Statistical classifiers, e.g. Bayesian networks, linear regression or correlation analysis; Neural networks

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Complex Calculations (AREA)
  • Testing Or Measuring Of Semiconductors Or The Like (AREA)

Abstract

The invention discloses a batch process fault detection method based on ICA-KNN. ICA is applied to the training data set, and a small number of independent components are selected to replace the original high-dimensional data while retaining its main features. The KNN method is then applied in the independent component space to obtain the statistical control limit used for fault detection. This gives a high fault detection rate for non-Gaussian, nonlinear batch production processes while requiring less computation than KICA.

Description

A Batch Process Fault Detection Method Based on ICA-KNN

Technical Field

The invention belongs to the field of batch process technology and, more specifically, relates to a batch process fault detection method based on ICA-KNN.

Background

A batch process, also called an intermittent process, is widely used in the production of small-volume, high-value-added products because of its operational flexibility. Batch processing has become the main mode of production in industries such as fine chemicals, biopharmaceuticals, and the deep processing of agricultural products. Semiconductor batch production is characterized by unequal batch lengths, drift of the process center, nonlinear variables, and multiple operating conditions. Fault detection has therefore become a key topic for reducing the scrap rate in semiconductor wafer production.

Multivariate statistical analysis methods such as principal component analysis (PCA), partial least squares (PLS), and independent component analysis (ICA) are widely used in the chemical industry. PCA is an important tool for multivariate statistical process monitoring and an effective tool for data compression and information extraction. Because PCA assumes that the process is linear, its online monitoring results are unreliable for strongly nonlinear production processes, and its false alarm rate is too high. In particular, determining the control limits of the $T^2$ and SPE statistics that PCA uses for fault detection requires the variables in the training set to follow a multivariate Gaussian distribution, an assumption that does not hold for most semiconductor batch processes.

Unlike PCA, ICA does not require the observed variables to be Gaussian. It separates or estimates statistically independent source signals from higher-order statistics, which carries stronger statistical meaning, and these latent signals usually have a physical interpretation or reflect essential characteristics of the object under study. ICA therefore extracts features better when analyzing non-Gaussian process data. However, ICA is itself a linear method, so its monitoring performance on the nonlinear data found in batch processes is also unsatisfactory. For this reason, some researchers proposed kernel independent component analysis (KICA), based on kernel functions, for batch process fault detection and obtained good results. Its basic idea is to project the input data into a high-dimensional feature space through a nonlinear mapping and then apply linear ICA in that space. KICA, however, must compute a kernel matrix whose number of entries is the square of the number of samples, which increases the computational cost when the number of samples is large.

Q. P. He and J. Wang proposed a fault detection method based on the k-nearest-neighbor rule (FD-KNN). The method does not depend on whether the data are linear, can cope with the nonlinearity and multiple operating conditions of semiconductor data, and has achieved good results in practice. FD-KNN has its own drawbacks, however. When batch process data are unfolded, the number of variables grows rapidly, so FD-KNN spends a great deal of time on computation and occupies a large amount of storage, and the sheer data size makes FD-KNN difficult to apply.

Summary of the Invention

The purpose of the invention is to overcome the deficiencies of the prior art by providing a batch process fault detection method based on ICA-KNN that, for semiconductor production processes characterized by nonlinearity and multiple operating conditions, improves fault detection accuracy while reducing computational complexity.

To achieve this purpose, the batch process fault detection method based on ICA-KNN of the invention comprises the following steps:

(1) Data preprocessing

The three-dimensional sample matrix X(I×J×K) collected from the batch process is first unfolded along the batch direction into a two-dimensional matrix X(I×KJ). This matrix is then standardized along the batch direction so that every column has mean 0 and variance 1. Finally, the standardized matrix X(I×KJ) is rearranged vertically into a matrix X(KI×J). Here I is the number of batches, J is the number of observed variables, and K is the number of sampling instants;

(2) Apply ICA dimensionality reduction to the matrix X(KI×J) to obtain d independent components $S_d$ that reflect the batch process information and the main-part separation matrix $W_d$

(2.1) Whiten the matrix X(KI×J) to obtain the whitened vector Z:

$Z = QX$

where Q is the whitening matrix, $Q = \Lambda^{-1/2} U^T$, $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$, $\lambda_i$ (i = 1, …, n) are the first n eigenvalues of the covariance matrix $E\{XX^T\}$, and U is the matrix formed by the eigenvectors corresponding to those n eigenvalues;

(2.2) Decompose the whitened vector Z to obtain the d independent components $S_d$ that reflect the batch process information and the main-part separation matrix $W_d$;

(2.2.1) Construct an initial random vector $b_k$ and set k = 1, k ∈ [1, n];

$b_k = E\{Z g(b_k^T Z)\} - E\{g'(b_k^T Z)\} b_k$

where g(·) is the first derivative of the chosen non-quadratic function G, g'(·) is the derivative of g(·), and E{·} denotes expectation;

(2.2.2) Iterate $b_k$:

$b_k = b_k - \sum_{i=1}^{k-1} (b_k^T b_i) b_i$

(2.2.3) Normalize the iterated $b_k$, that is, $b_k = b_k / \|b_k\|$, where $\|b_k\|$ is the norm of $b_k$;

(2.2.4) Test the normalized $b_k$: if $|b_k^T b_k| = 1 \pm 5\%$, output the vector $b_k$ and go to step (2.2.5); otherwise set k = k + 1 and return to step (2.2.2), iterating until $|b_k^T b_k| = 1 \pm 5\%$ holds, then go to step (2.2.5);

(2.2.5) Construct the matrix $B = [b_1, \dots, b_n]^T$, obtain the independent components from $S = B^T Z$ and the separation matrix from $W = B^T Q$; then sort the independent components S by their degree of non-Gaussianity, take the first d as the independent components $S_d$, and take the corresponding first d of W as the main-part separation matrix $W_d$;

(3) Apply the KNN algorithm to the independent components $S_d$ to obtain the statistical control limit CL

In the independent components $S_d = [s_1, \dots, s_d]$, compute the sum-of-squares distance between every pair of rows, use these distances to determine the m nearest neighbors of each row, and compute each row's KNN squared distance $D_s$:

$D_s = \sum_{j=1}^{m} d_{ij}^2$

where $d_{ij}^2$ is the squared Euclidean distance between row i of $S_d$ and the row that is j-th closest to it;

Because $D_s$ approximately follows a $\chi^2$ distribution, the control limit CL can be determined from the significance level, where α is the confidence level and N is the number of rows of $S_d$;

(4) Standardize the data x' to be tested as in step (1) to obtain the data x, then compute the independent components of x using the main-part separation matrix $W_d$;

(5) Compute the KNN squared distance $D_x$ of these independent components as in step (3) and compare $D_x$ with the control limit CL: if $D_x > CL$, the sample is judged to be a fault sample; otherwise it is judged to be a normal sample.

The purpose of the invention is achieved as follows:

The ICA-KNN-based batch process fault detection method of the invention applies ICA to the training data set, replaces the original high-dimensional data with a small number of independent components that retain its main features, and then applies the KNN method in the independent component space to obtain the statistical control limit used for fault detection. This gives a high fault detection rate for non-Gaussian, nonlinear batch production processes while requiring less computation than KICA.

Description of the Drawings

Figure 1 is a flow chart of the batch process fault detection method based on ICA-KNN;

Figure 2 is the $I^2$ detection chart of the KICA method;

Figure 3 is the SPE detection chart of the KICA method;

Figure 4 is the detection chart of the KNN method;

Figure 5 is the detection chart of the ICA-KNN method.

Detailed Description

Specific embodiments of the invention are described below with reference to the drawings so that those skilled in the art can better understand the invention. Note that detailed descriptions of known functions and designs are omitted below where they would obscure the main content of the invention.

Embodiment

For ease of description, the technical terms used in the detailed description are explained first:

ICA: Independent Component Analysis;

KNN: K-Nearest Neighbor;

FD-KNN: Fault Detection based on K-Nearest Neighbor;

KICA: Kernel Independent Component Analysis;

Figure 1 is a flow chart of the batch process fault detection method based on ICA-KNN.

In this embodiment, as shown in Figure 1, the batch process fault detection method based on ICA-KNN of the invention comprises the following steps:

S1. Data preprocessing

The three-dimensional sample matrix X(I×J×K) collected from the batch process is first unfolded along the batch direction into a two-dimensional matrix X(I×KJ). This matrix is then standardized along the batch direction so that every column has mean 0 and variance 1, which removes the influence of measurement units and subtracts the average trajectory over all batches. Finally, the standardized matrix X(I×KJ) is rearranged vertically into a matrix X(KI×J). Here I is the number of batches, J is the number of observed variables, and K is the number of sampling instants.

In this embodiment, the data come from a semiconductor aluminum etch process run on a Lam 9600 and comprise 107 batches of normal data and 20 batches of fault data. 82 normal batches are used as training samples and 25 normal batches as test samples; the 20 fault batches are then tested to see whether they can be detected promptly.

The batch data of all training samples form a three-dimensional sample matrix of size (82×18×90). Unfolding along the batch direction gives a two-dimensional sample matrix (82×1620), which is standardized by subtracting the mean and dividing by the standard deviation. The standardized sample matrix (82×1620) is then rearranged vertically into a two-dimensional matrix (7380×18) for the subsequent ICA dimensionality reduction.
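
The unfolding, standardization, and rearrangement described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `preprocess_batches`, the random stand-in data, the time-major ordering of the stacked rows, and the guard for constant columns are all hypothetical choices.

```python
import numpy as np

def preprocess_batches(X3):
    """X3 has shape (I, J, K): batches x variables x sampling instants."""
    I, J, K = X3.shape
    # Batch-wise unfolding: one row per batch, K*J columns,
    # ordered as all J variables at instant 1, then instant 2, ...
    X2 = X3.transpose(0, 2, 1).reshape(I, K * J)
    # Standardize every column across the I batches.
    mean = X2.mean(axis=0)
    std = X2.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # assumption: leave constant columns unscaled
    Xn = (X2 - mean) / std
    # Rearrange to (K*I, J): stack the K time slices, each of shape (I, J).
    Xv = Xn.reshape(I, K, J).transpose(1, 0, 2).reshape(K * I, J)
    return Xv, mean, std

# Stand-in data with the dimensions quoted in the embodiment.
rng = np.random.default_rng(0)
X3 = rng.normal(size=(82, 18, 90))
Xv, mean, std = preprocess_batches(X3)
print(Xv.shape)  # (7380, 18)
```

The returned `mean` and `std` are kept because a sample to be tested has to be scaled with the training statistics rather than its own.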

S2. Apply ICA dimensionality reduction to the matrix X(KI×J) to obtain d independent components $S_d$ that reflect the batch process information and the main-part separation matrix $W_d$

S2.1. To remove the correlation among the sample data and simplify the extraction of independent components, the matrix X(KI×J) is whitened to obtain the whitened vector Z:

$Z = QX$

where Q is the whitening matrix, $Q = \Lambda^{-1/2} U^T$, $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$, $\lambda_i$ (i = 1, …, n) are the first n eigenvalues of the covariance matrix $E\{XX^T\}$, and U is the matrix formed by the eigenvectors corresponding to those n eigenvalues;
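
The whitening step can be illustrated with a short NumPy sketch. It assumes the data matrix stores one sample per row, so the column-vector relation $Z = QX$ becomes `X @ Q.T`; the function name `whiten` and the synthetic data are hypothetical, and all eigenvalues are assumed to be strictly positive.

```python
import numpy as np

def whiten(X):
    """X: (N, n) zero-mean data, one sample per row. Returns Z (N, n) and Q (n, n)."""
    cov = X.T @ X / X.shape[0]            # sample estimate of E{x x^T}
    eigvals, U = np.linalg.eigh(cov)      # eigendecomposition of a symmetric matrix
    Q = np.diag(eigvals ** -0.5) @ U.T    # Q = Lambda^{-1/2} U^T
    Z = X @ Q.T                           # row-wise form of z = Q x
    return Z, Q

# Correlated synthetic data, centered before whitening.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))
X -= X.mean(axis=0)
Z, Q = whiten(X)
# After whitening, the sample covariance of Z is the identity matrix.
print(np.round(Z.T @ Z / len(Z), 6))
```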

S2.2. Decompose the whitened vector Z to obtain the d independent components $S_d$ that reflect the batch process information and the main-part separation matrix $W_d$;

S2.2.1. Construct an initial random vector $b_k$, k ∈ [1, n];

$b_k = E\{Z g(b_k^T Z)\} - E\{g'(b_k^T Z)\} b_k$

where g(·) is the first derivative of the chosen non-quadratic function G, g'(·) is the derivative of g(·), and E{·} denotes expectation;

In this embodiment, the non-quadratic function G can take several forms, for example:

$G(x) = \log\cosh(a_1 x)/a_1$ or $G(x) = -\exp(-a_2 x^2/2)/a_2$

where $1 \le a_1 \le 2$, $a_2 = 1$, and cosh(·) is the hyperbolic cosine function
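
For reference, the derivatives g and g' that the update in S2.2.1 needs can be written out for the two contrast functions named in claim 2, $G(x) = \log\cosh(a_1 x)/a_1$ and $G(x) = -\exp(-a_2 x^2/2)/a_2$. The sketch below is illustrative only; the function names `g_logcosh` and `g_exp` are hypothetical, and the formulas follow from differentiating G directly.

```python
import numpy as np

def g_logcosh(u, a1=1.0):
    """For G(u) = log(cosh(a1*u)) / a1, return g(u) = G'(u) and g'(u)."""
    t = np.tanh(a1 * u)
    return t, a1 * (1.0 - t ** 2)

def g_exp(u, a2=1.0):
    """For G(u) = -exp(-a2*u**2/2) / a2, return g(u) = G'(u) and g'(u)."""
    e = np.exp(-a2 * u ** 2 / 2.0)
    return u * e, (1.0 - a2 * u ** 2) * e

u = np.linspace(-2.0, 2.0, 5)
print(g_logcosh(u, a1=1.5))
print(g_exp(u, a2=1.0))
```

Both g functions are bounded, which is one reason such contrast functions are commonly preferred over raw kurtosis when the data contain outliers.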

S2.2.2. Starting from k = 1, iterate $b_k$:

$b_k = b_k - \sum_{i=1}^{k-1} (b_k^T b_i) b_i$

S2.2.3. Normalize the iterated $b_k$, that is, $b_k = b_k / \|b_k\|$, where $\|b_k\|$ is the norm of $b_k$;

S2.2.4. Test the normalized $b_k$: if $|b_k^T b_k| = 1 \pm 5\%$, output the vector $b_k$ and go to step S2.2.5; otherwise set k = k + 1 and return to step S2.2.2, iterating until $|b_k^T b_k| = 1 \pm 5\%$ holds, then go to step S2.2.5;

S2.2.5. Construct the matrix $B = [b_1, \dots, b_n]^T$, obtain the independent components from $S = B^T Z$ and the separation matrix from $W = B^T Q$; then sort the independent components S by their degree of non-Gaussianity, take the first d as the independent components $S_d$, and take the corresponding first d of W as the main-part separation matrix $W_d$;
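
Steps S2.2.1 to S2.2.5 follow the general shape of a deflation-type fixed-point ICA, which can be sketched as below. Several points are assumptions rather than the patent's own choices: the function name `fastica_deflation`, the log cosh contrast, the reading of the 5% test as a comparison between successive iterates of the same vector, and the use of absolute excess kurtosis to rank non-Gaussianity (the source does not name a measure). Data are stored one sample per row.

```python
import numpy as np

def fastica_deflation(Z, d, a1=1.0, tol=0.05, max_iter=200, seed=0):
    """Z: (N, n) whitened data. Returns S_d (N, d) and B_d (d, n), rows of B_d being b_k."""
    rng = np.random.default_rng(seed)
    N, n = Z.shape
    B = np.zeros((n, n))
    for k in range(n):
        b = rng.normal(size=n)                 # initial random vector
        b -= B[:k].T @ (B[:k] @ b)             # remove already-found directions
        b /= np.linalg.norm(b)
        for _ in range(max_iter):
            t = np.tanh(a1 * (Z @ b))          # g(b^T z) for G = log cosh
            # b <- E{z g(b^T z)} - E{g'(b^T z)} b
            b_new = (Z * t[:, None]).mean(axis=0) - (a1 * (1 - t ** 2)).mean() * b
            b_new -= B[:k].T @ (B[:k] @ b_new) # b <- b - sum_i (b^T b_i) b_i
            b_new /= np.linalg.norm(b_new)     # b <- b / ||b||
            converged = abs(abs(b_new @ b) - 1.0) <= tol
            b = b_new
            if converged:
                break
        B[k] = b
    S = Z @ B.T                                # independent components, one per column
    kurt = np.abs((S ** 4).mean(axis=0) - 3.0) # assumption: rank by |excess kurtosis|
    order = np.argsort(kurt)[::-1][:d]
    return S[:, order], B[order]

# Three non-Gaussian sources, mixed linearly, then centered and whitened.
rng = np.random.default_rng(1)
src = np.column_stack([rng.uniform(-1, 1, 2000),
                       rng.laplace(size=2000),
                       rng.exponential(size=2000) - 1.0])
X = src @ rng.normal(size=(3, 3))
X -= X.mean(axis=0)
lam, U = np.linalg.eigh(X.T @ X / len(X))
Q = np.diag(lam ** -0.5) @ U.T
Z = X @ Q.T
S_d, B_d = fastica_deflation(Z, d=2)
W_d = B_d @ Q                                  # maps a standardized sample x to W_d @ x
print(S_d.shape, W_d.shape)
```

A 5% tolerance is loose; a tighter `tol` generally gives better-separated components at the cost of more iterations.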

S3. Apply the KNN algorithm to the independent components $S_d$ to obtain the statistical control limit CL

In the independent components $S_d = [s_1, \dots, s_d]$, compute the sum-of-squares distance between every pair of rows, use these distances to determine the m nearest neighbors of each row, and compute each row's KNN squared distance $D_s$:

$D_s = \sum_{j=1}^{m} d_{ij}^2$

where $d_{ij}^2$ is the squared Euclidean distance between row i of $S_d$ and the row that is j-th closest to it;

In the example of this embodiment, the sum-of-squares distance between row 1 and row 2 is:

$(1-1)^2 + (1-2)^2 + (1-1)^2 = 1$; likewise, the sum-of-squares distances between row 1 and the other rows are 1, 1, 1, 3, 12. In this embodiment m = 3, and the KNN squared distance $D_s$ of each row is then computed;

Because $D_s$ approximately follows a $\chi^2$ distribution, the control limit CL can be determined from the significance level, where α is the confidence level and N is the number of rows of $S_d$;
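
The KNN squared distance of step S3 can be sketched as follows. The six-row matrix is hypothetical: it was chosen only so that row 1 reproduces the distances 1, 1, 1, 3, 12 quoted above, and it is not the matrix used in the source. The control-limit formula is not available in this text, so `control_limit` uses an empirical quantile of the training distances as a stand-in assumption. The function names are likewise hypothetical.

```python
import numpy as np

def knn_squared_distance(S, m):
    """Sum of squared Euclidean distances from each row of S to its m nearest other rows."""
    diff = S[:, None, :] - S[None, :, :]
    d2 = (diff ** 2).sum(axis=2)          # all pairwise squared distances
    np.fill_diagonal(d2, np.inf)          # a row is not its own neighbor
    return np.sort(d2, axis=1)[:, :m].sum(axis=1)

def control_limit(D, confidence=0.99):
    """Assumption: take the control limit as an empirical quantile of D."""
    return np.quantile(D, confidence)

# Hypothetical example matrix (not from the source).
S = np.array([[1, 1, 1],
              [1, 2, 1],
              [2, 1, 1],
              [1, 1, 2],
              [2, 2, 2],
              [3, 3, 3]], dtype=float)
D = knn_squared_distance(S, m=3)
CL = control_limit(D, confidence=0.99)
print(D[0])  # row 1: 1 + 1 + 1 = 3
```

The dense pairwise computation needs memory proportional to the square of the number of rows, so for matrices as large as the 7380-row training set a tree-based or blockwise neighbor search would be a more practical choice.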

S4. Standardize the data x' to be tested as in step S1 to obtain the data x, then compute the independent components of x using the main-part separation matrix $W_d$;

S5. Compute the KNN squared distance $D_x$ of these independent components as in step S3 and compare $D_x$ with the control limit CL: if $D_x > CL$, the sample is judged to be a fault sample; otherwise it is judged to be a normal sample.
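
Steps S4 and S5 for a single sample can be sketched as below. The function name `detect_fault` and the toy numbers are hypothetical; `mean` and `std` are assumed to be the training statistics that apply to the sample's sampling instant, and the projection is assumed to be the product of $W_d$ with the standardized sample.

```python
import numpy as np

def detect_fault(x_raw, mean, std, W_d, S_train, m, CL):
    """Return the KNN squared distance of one sample and whether it exceeds CL."""
    x = (x_raw - mean) / std                 # S4: standardize with training statistics
    s = W_d @ x                              # S4: project onto the d independent components
    d2 = ((S_train - s) ** 2).sum(axis=1)    # squared distances to all training rows
    D_x = np.sort(d2)[:m].sum()              # S5: sum over the m nearest training rows
    return D_x, bool(D_x > CL)

# Toy setup: identity projection and four training points on a unit square.
mean, std = np.zeros(2), np.ones(2)
W_d = np.eye(2)
S_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

print(detect_fault(np.array([0.5, 0.5]), mean, std, W_d, S_train, m=2, CL=3.0))
print(detect_fault(np.array([5.0, 5.0]), mean, std, W_d, S_train, m=2, CL=3.0))
```

Unlike the training case, the sample being tested is not a row of `S_train`, so no self-distance has to be excluded.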

To verify the effectiveness of the proposed method, data from an aluminum stack etch process in semiconductor production were used for simulation, and the results were compared with those of the KICA and FD-KNN methods. Figure 2 shows the $I^2$ statistic detection chart of KICA, Figure 3 the SPE detection chart of KICA, Figure 4 the KNN detection chart, and Figure 5 the ICA-KNN detection chart. The comparison shows that the ICA-KNN method achieves a higher fault detection rate, while the false alarm rates of the methods differ little. In terms of running time, the ICA-KNN algorithm is less complex than both the FD-KNN and KICA algorithms, which demonstrates the advantage of the method. The simulation shows that the ICA-KNN method is simple and effective and has good application prospects.

Although illustrative embodiments of the invention have been described above so that those skilled in the art can understand it, the invention is not limited to the scope of those embodiments. To those of ordinary skill in the art, variations that fall within the spirit and scope of the invention as defined by the appended claims are obvious, and all inventions and creations that make use of the inventive concept are protected.

Claims (2)

1. A batch process fault detection method based on ICA-KNN, characterized by comprising the following steps:
(1) Data preprocessing
A three-dimensional sample matrix X(I×J×K) acquired from a batch process is unfolded along the batch direction to obtain a two-dimensional matrix X(I×KJ); the matrix X(I×KJ) is standardized in the batch direction so that each of its columns has mean 0 and variance 1; finally, the standardized matrix X(I×KJ) is rearranged vertically into a matrix X(KI×J); wherein I is the number of batches, J is the number of observed variables, and K is the number of sampling instants;
(2) ICA dimensionality reduction is applied to the matrix X(KI×J) to obtain d independent components $S_d$ reflecting the batch process information and a main-part separation matrix $W_d$
(2.1) The matrix X(KI×J) is whitened to obtain a whitened vector Z;
$Z = QX$
wherein Q is the whitening matrix, $Q = \Lambda^{-1/2} U^T$, $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$, $\lambda_i$ (i = 1, …, n) are the first n eigenvalues of the covariance matrix $E\{XX^T\}$, and U is the matrix formed by the eigenvectors corresponding to those n eigenvalues;
(2.2) The whitened vector Z is decomposed to obtain the d independent components $S_d$ reflecting the batch process information and the main-part separation matrix $W_d$
(2.2.1) An initial random vector $b_k$ is constructed, k ∈ [1, n];
$b_k = E\{Z g(b_k^T Z)\} - E\{g'(b_k^T Z)\} b_k$
wherein the function g(·) is the first derivative of the chosen non-quadratic function G, g'(·) is the derivative of g(·), and E{·} denotes expectation;
(2.2.2) Let k = 1 and iterate $b_k$;
$b_k = b_k - \sum_{i=1}^{k-1} (b_k^T b_i) b_i$
(2.2.3) The iterated $b_k$ is normalized, that is, $b_k = b_k / \|b_k\|$, wherein $\|b_k\|$ is the norm of $b_k$;
(2.2.4) The normalized $b_k$ is tested: if $|b_k^T b_k| = 1 \pm 5\%$, the vector $b_k$ is output and the method proceeds to step (2.2.5); otherwise k = k + 1 and the method returns to step (2.2.2) and continues iterating until $|b_k^T b_k| = 1 \pm 5\%$ is satisfied, then proceeds to step (2.2.5);
(2.2.5) The matrix $B = [b_1, \dots, b_n]^T$ is constructed, the independent components are obtained from $S = B^T Z$, and the separation matrix is obtained from $W = B^T Q$; the independent components S are then sorted by their degree of non-Gaussianity, the first d are selected as the independent components $S_d$, and the corresponding first d of W are taken as the main-part separation matrix $W_d$
(3) The KNN algorithm is applied to the independent components $S_d$ to obtain a statistical control limit CL
In the independent components $S_d = [s_1, \dots, s_d]$, the sum-of-squares distance between every pair of rows is computed, the m nearest neighbors of each row are determined from these distances, and the KNN squared distance $D_s$ of each row is computed
$D_s = \sum_{j=1}^{m} d_{ij}^2$
wherein $d_{ij}^2$ is the squared Euclidean distance between row i of $S_d$ and the row that is j-th closest to it;
Because $D_s$ approximately follows a $\chi^2$ distribution, the control limit CL can be determined from the significance level, wherein α is the confidence level and N is the number of rows of $S_d$;
(4) The data x' to be tested is standardized according to step (1) to obtain data x, and the independent components of x are then computed using the main-part separation matrix $W_d$
(5) The KNN squared distance $D_x$ of these independent components is computed according to step (3); $D_x$ is compared with the control limit CL, and if $D_x > CL$ the sample is judged to be a fault sample; otherwise the sample is judged to be a normal sample.
2. The batch process fault detection method based on ICA-KNN of claim 1, wherein the non-quadratic function G is selected from two forms:
$G(x) = \log\cosh(a_1 x)/a_1$ or $G(x) = -\exp(-a_2 x^2/2)/a_2$
wherein $a_1$ and $a_2$ are constants and cosh(·) is the hyperbolic cosine function.
CN201610313490.XA 2016-05-12 2016-05-12 A kind of batch process fault detection method based on ICA KNN Expired - Fee Related CN105739489B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610313490.XA CN105739489B (en) 2016-05-12 2016-05-12 A kind of batch process fault detection method based on ICA KNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610313490.XA CN105739489B (en) 2016-05-12 2016-05-12 A kind of batch process fault detection method based on ICA KNN

Publications (2)

Publication Number Publication Date
CN105739489A CN105739489A (en) 2016-07-06
CN105739489B CN105739489B (en) 2018-04-13

Family

ID=56288489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610313490.XA Expired - Fee Related CN105739489B (en) 2016-05-12 2016-05-12 A kind of batch process fault detection method based on ICA KNN

Country Status (1)

Country Link
CN (1) CN105739489B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106444665B (en) * 2016-09-22 2018-09-11 宁波大学 A kind of failure modes diagnostic method based on non-gaussian similarity mode
CN106599450B (en) * 2016-12-12 2020-04-07 东北大学 Priori knowledge-based fault monitoring method for flexible manifold embedded electric smelting magnesium furnace
CN107704863B (en) * 2017-05-22 2021-06-15 上海海事大学 Fault Feature Representation Method for PCA Pivot Rearrangement
CN107065843B (en) * 2017-06-09 2019-04-05 东北大学 Multi-direction KICA batch process fault monitoring method based on Independent subspace
CN107357275B (en) * 2017-07-27 2019-08-27 中南大学 Non-Gaussian industrial process fault detection method and system
CN107831662B (en) * 2017-11-13 2022-01-04 辽宁石油化工大学 Design method of random 2D controller for intermittent process with actuator fault
CN108255656B (en) * 2018-02-28 2020-12-22 湖州师范学院 A fault detection method applied to batch process
CN108759745B (en) * 2018-06-05 2020-02-18 上汽大众汽车有限公司 Body-in-white fault detection method and detection device based on multivariate statistical analysis
CN109116834B (en) * 2018-09-04 2021-02-19 湖州师范学院 A Deep Learning-Based Method for Intermittent Process Fault Detection
CN109669412B (en) * 2018-12-13 2021-03-26 宁波大学 A Non-Gaussian Process Monitoring Method Based on Novel Dynamic Independent Component Analysis
CN111898313B (en) * 2020-06-30 2022-05-20 电子科技大学 A fault detection method based on integrated learning of ICA and SVM

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6952657B2 (en) * 2003-09-10 2005-10-04 Peak Sensor Systems Llc Industrial process fault detection using principal component analysis
CN101403923A (en) * 2008-10-31 2009-04-08 浙江大学 Course monitoring method based on non-gauss component extraction and support vector description
CN103488091A (en) * 2013-09-27 2014-01-01 上海交通大学 Data-driving control process monitoring method based on dynamic component analysis
CN104062968A (en) * 2014-06-10 2014-09-24 华东理工大学 Continuous chemical process fault detection method
CN105425779A (en) * 2015-12-24 2016-03-23 江南大学 ICA-PCA multi-working condition fault diagnosis method based on local neighborhood standardization and Bayesian inference

Also Published As

Publication number Publication date
CN105739489A (en) 2016-07-06

Similar Documents

Publication Publication Date Title
CN105739489B (en) A kind of batch process fault detection method based on ICA KNN
Cai et al. A new fault detection method for non-Gaussian process based on robust independent component analysis
Li et al. Data-driven root cause diagnosis of faults in process industries
Huang et al. Double-layer distributed monitoring based on sequential correlation information for large-scale industrial processes in dynamic and static states
CN103150498B (en) Based on the hardware Trojan horse recognition method of single category support vector machines
Li et al. Diffusion maps based k-nearest-neighbor rule technique for semiconductor manufacturing process fault detection
Hsu et al. Intelligent ICA–SVM fault detector for non-Gaussian multivariate process monitoring
CN103488091A (en) Data-driving control process monitoring method based on dynamic component analysis
CN111949012A (en) An Intermittent Process Fault Detection Method Based on Double-weight Multi-Neighborhood Preserving Embedding Algorithm
CN108549908A (en) Chemical process fault detection method based on more sampled probability core principle component models
CN107861492A (en) A kind of broad sense Non-negative Matrix Factorization fault monitoring method based on nargin statistic
CN111506041A (en) Neighborhood preserving embedding intermittent process fault detection method based on diffusion distance improvement
Deng et al. Nonlinear multimode industrial process fault detection using modified kernel principal component analysis
US20220159021A1 (en) Anomaly detection method based on iot and apparatus thereof
CN104536996B (en) Calculate node method for detecting abnormality under a kind of homogeneous environment
CN105740212A (en) Sensor exception detection method based on regularized vector autoregression model
CN110263826A (en) The construction method and its detection method of Noise non-linear procedure fault detection model
Khalid et al. Tropical wood species recognition system based on multi-feature extractors and classifiers
CN104503436A (en) Quick fault detection method based on random projection and k-nearest neighbor method
He et al. Statistics pattern analysis: a statistical process monitoring tool for smart manufacturing
Rong et al. Fault diagnosis by locality preserving discriminant analysis and its kernel variation
Archimbaud et al. ICS for multivariate functional anomaly detection with applications to predictive maintenance and quality control
Liu et al. Siamese DeNPE network framework for fault detection of batch process
Hui et al. Batch process monitoring based on WGNPE–GSVDD related and independent variables
Zhang et al. Multiway principal polynomial analysis for semiconductor manufacturing process fault detection

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180413

Termination date: 20210512