CN116361722A - Multi-fault classification method for improving linear local tangent space alignment model


Publication number
CN116361722A
CN116361722A CN202310314884.7A CN202310314884A CN116361722A CN 116361722 A CN116361722 A CN 116361722A CN 202310314884 A CN202310314884 A CN 202310314884A CN 116361722 A CN116361722 A CN 116361722A
Authority
CN
China
Prior art keywords
samples
local
space
neighborhood
fault
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310314884.7A
Other languages
Chinese (zh)
Inventor
卢春红
章雅娟
王建祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong University
Original Assignee
Nantong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong University
Priority to CN202310314884.7A
Publication of CN116361722A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G06F18/24143 Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G06F18/256 Fusion techniques of classification results, e.g. of results related to same input data of results relating to different input data, e.g. multimodal recognition

Abstract

The invention relates to the technical field of industrial process monitoring, and in particular to a multi-fault classification method based on an improved linear local tangent space alignment model. A new weight matrix is introduced to represent the local positional relationship between samples with different class labels and samples with the same class label within a neighborhood on the latent process manifold. The improved WLLTSA preserves the local geometric characteristics of the multi-fault process data manifold and corrects the distorted manifold structure that the original WLLTSA produces because it does not distinguish the class labels of the samples within a neighborhood. Building on WLLTSA, the method combines the locality of the manifold geometry with the global discriminability of the data to build a fault classification model, captures the intrinsic characteristics of the different operating modes of the process, reflects the discriminative feature information of multi-fault process data, and identifies the fault type of a test sample by minimizing the Euclidean distance from the test sample to the known training samples of each class on the low-dimensional latent manifold. The method is suitable for multi-fault classification in high-dimensional, multi-mode industrial processes.

Description

Multi-fault classification method for improving linear local tangent space alignment model
Technical Field
The invention relates to the technical field of industrial process monitoring, and in particular to a multi-fault classification method based on an improved linear local tangent space alignment model.
Background
Process monitoring plays an important role in modern industrial production, for example in guaranteeing production safety and improving yield. With the development of distributed control systems, production scale and operational complexity have increased dramatically, and industrial processes now collect large amounts of high-dimensional process data. Moreover, because product grade and output are continually adjusted to market demand and seasonal effects, process parameters such as product composition, process set points, and feed ratios also fluctuate, and modern industrial processes switch among several different operating modes. These random variations cause the process data to exhibit non-Gaussian, multi-modal characteristics. Engineers need to classify the fault data generated during production in order to identify the different fault types and then locate the source of the fault when the process is out of control. Effective fault classification therefore helps keep the industrial process stable and the product quality consistent. Although data-driven multivariate statistical process control (MSPC) has been applied successfully to process monitoring, the mean and covariance of multi-modal, non-Gaussian process data change significantly, and conventional MSPC methods ignore the non-Gaussian distribution and multi-modal characteristics of the process variables, which may degrade monitoring performance. Some manifold learning methods are now used in process monitoring; they describe the local relationships among data samples well and allow a more accurate monitoring model to be built.
Local tangent space alignment (LTSA) is widely used because of its simple geometric interpretation and ease of implementation. LTSA first approximates the local tangent space in the neighborhood of each sample using principal component analysis (PCA), then aligns the resulting local coordinates into a global coordinate system, and finally constructs monitoring indices, achieving good monitoring performance. LTSA is an effective dimensionality reduction method for the training set, but it does not yield an explicit mapping for a new test set. Linear local tangent space alignment (LLTSA) is a linearized variant of LTSA that establishes an explicit mapping between the original data space and the feature space and achieves better monitoring performance than LTSA.
However, when the process data are sparse, non-Gaussian, or noisy, the local coordinate system obtained by approximating the local tangent space with PCA is not accurate enough. The resulting tangent space approximation cannot describe the geometric features of the latent manifold or preserve the local structural information in the low-dimensional space, which reduces the effectiveness of fault detection.
Recently, Zohaib et al. (IEEE Transactions on Industrial Informatics, 2023 (1)) proposed weighted linear local tangent space alignment (WLLTSA), which uses a heat kernel matrix as weights to describe the importance of the different samples in a neighborhood: neighbors far from the center receive small weights and neighbors close to the center receive large weights. This yields more reliable local tangent space coordinates, provides an explicit mapping for the test set, and gives a more accurate fault detection model with better monitoring performance. However, when multiple fault types occur in succession during production, assigning the same weight to same-class and different-class neighbors within a neighborhood projects them onto the same position in the tangent space, so the extracted local tangent space coordinates are no longer reliable. In addition, WLLTSA focuses mainly on the local relationships within a class and ignores the similarity measure between classes, so the trained model does not necessarily classify the test set accurately. Extracting reliable process operation information and important process data features is therefore particularly important for identifying the class to which a fault belongs.
Disclosure of Invention
To overcome the above shortcomings of the prior art, the invention provides a multi-fault classification method based on an improved linear local tangent space alignment model. The proposed supervised weighted linear local tangent space alignment (SWLLTSA) model fully mines the feature information of the various classes of data collected during production, introduces class label information, and extracts important multi-class process operation information according to the local structural relationships among samples and the divergence relationships among samples of different classes. Compared with the WLLTSA monitoring method, it achieves higher classification accuracy and stronger fault discrimination.
The technical solution adopted by the invention is a multi-fault classification method based on an improved linear local tangent space alignment model, comprising the following steps:
Step 1: identify the k nearest neighbors of a center sample by Euclidean distance to form the neighborhood of the center sample, and construct a neighborhood library; define a weight matrix for the samples in the neighborhood, in which the added class label information reflects the likelihood that a sample in the neighborhood belongs to a given class;
Step 2: use the defined weights to establish a local coordinate system of the tangent space and construct the improved WLLTSA method. Within a class, neighbors far from the center sample and neighbors close to it receive different weights, which refines the positional relationships among the neighbors; the dissimilarity between neighbors of different classes is also taken into account, so that same-class and different-class samples in the neighborhood occupy distinct positions in the local tangent space;
Step 3: establish an intra-class objective function and an inter-class objective function, preserving the local structural relationships of the production process data and mining the global divergence information among samples of different classes;
Step 4: construct the SWLLTSA fault classification model, which fuses the global divergence information and the local structural information of the process; design multi-fault classification indices, evaluate the operating mode of the production process, and identify the various fault types.
As a preferred technical solution of the invention, step 1 proceeds as follows:
Step 1.1: collect N standard samples to form a high-dimensional multi-mode process data set $X = [x_1, x_2, \ldots, x_N] \in \mathbb{R}^{P \times N}$. These samples lie on a latent manifold in the low-dimensional feature space $\mathbb{R}^{p}$, where $P$ and $p$ ($p < P$) denote the dimensions of the high-dimensional original input space and the low-dimensional feature mapping space, respectively. For each sample, its k nearest neighbors by Euclidean distance are determined, forming the neighborhood of the center sample $x_i$:
$X_i = [x_{i1}, x_{i2}, \ldots, x_{ik}]$
Step 1.2: the process data set has C+1 class labels, $\{l_1, l_2, \ldots, l_{C+1}\}$, which correspond to the normal operation samples collected in the industrial process and to C different process fault classes, respectively:
Figure BDA0004149960360000032
For a sample $x_i$, $l(x_i) \in \{1, 2, \ldots, C+1\}$ denotes its class label. The weight $w_{ij}$ between two samples $x_i$ and $x_j$ is defined as:
Figure BDA0004149960360000033
where the tuning parameter $\beta$ is set to the average Euclidean distance between pairs of samples in the neighborhood. Under this weighting, same-class neighbors close to the center sample receive larger weights than distant ones, so the neighborhood structure of the local tangent space is determined mainly by the nearby neighbors of the same class as the center sample, and the original process data can be mapped properly to the local tangent space. In addition, a dissimilarity measure between neighbors of different classes is included; it describes the dissimilarity between samples of different classes, stretches the distances between them, and gives same-class and different-class samples in the neighborhood different positions in the local tangent space. In short, the weights compress the distances between same-class samples in the neighborhood and stretch the distances between different-class samples, so that both are placed correctly in the local tangent space.
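The neighborhood construction and weighting of step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the formula of the invention: the weight expression (2) is not reproduced in this text, so the sketch assumes a heat-kernel weight exp(-d²/β), as used by WLLTSA, and scales it for different-class neighbors by a hypothetical factor `cross_class_scale`. The function names are also assumptions.

```python
import numpy as np

def knn_neighborhoods(X, k):
    """X is P x N (one sample per column). Returns an N x k array of neighbor indices."""
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                      # a sample is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]

def neighborhood_weights(X, labels, i, nbrs, cross_class_scale=0.5):
    """Assumed label-aware heat-kernel weights for the neighbors `nbrs` of sample i (k >= 2)."""
    Xi = X[:, nbrs]                                   # P x k neighborhood matrix
    d = np.linalg.norm(Xi - X[:, [i]], axis=0)        # distance of each neighbor to the center
    # beta: average Euclidean distance over all sample pairs in the neighborhood
    pair = np.linalg.norm(Xi[:, :, None] - Xi[:, None, :], axis=0)
    beta = pair[np.triu_indices(len(nbrs), 1)].mean()
    w = np.exp(-d ** 2 / beta)                        # heat-kernel weight (assumed form)
    same = labels[nbrs] == labels[i]
    # hypothetical treatment of different-class neighbors: shrink their weight
    return np.where(same, w, cross_class_scale * w)
```

With these assumptions, a nearby neighbor of the same class as the center sample always receives a larger weight than a more distant neighbor of a different class, which matches the qualitative behavior described above.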
As a preferred technical solution of the invention, the improved WLLTSA method in step 2 is constructed as follows:
Step a: based on the newly defined weights, apply local PCA to approximate the tangent space of each neighborhood. A local transformation matrix $Q_i$ is introduced to map the neighborhood of each $x_i$ to the local tangent space; an optimization function for the local tangent space is established, the local coordinate information is extracted, and the local tangent space coordinates $\Theta_i$ are solved for:
Figure BDA0004149960360000034
where $H_k = I - ee^T/k$ is the centering matrix, $I$ is the identity matrix, $e$ is the column vector whose elements are all 1, and $w_j = [w_1, w_2, \ldots, w_k]$ contains the weights of the neighbor samples in the neighborhood; $Q_i$ is an orthogonal basis matrix of the tangent space, composed of the eigenvectors corresponding to the $p$ largest eigenvalues in the decomposition of the matrix $X_i H_k W_i$, where $W_i$ is a $k \times k$ diagonal matrix with $W_i(j,j) = w_{ij}$; $\Theta_i$ is the local coordinate system information of $x_i$ and describes the process data:
Figure BDA0004149960360000041
Step b: after the local structure information has been extracted, align the local coordinates of all samples in a global low-dimensional feature space and solve for the global coordinates $Y$ of all samples $X$.
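Step a can be illustrated with a short sketch of weighted local PCA. Formulas (3) and (4) are not reproduced in this text, so the sketch assumes the usual LTSA construction: $Q_i$ is taken as the leading $p$ left singular vectors of $X_i H_k W_i$ (the eigenvectors of its Gram matrix), and the local coordinates are assumed to be $\Theta_i = Q_i^T X_i H_k W_i$. The function name is hypothetical.

```python
import numpy as np

def local_tangent_coordinates(Xi, w, p):
    """Xi: P x k neighborhood matrix, w: length-k weights, p: tangent dimension (p <= min(P, k))."""
    k = Xi.shape[1]
    H = np.eye(k) - np.ones((k, k)) / k       # centering matrix H_k = I - e e^T / k
    M = Xi @ H @ np.diag(w)                   # centered, weighted neighborhood X_i H_k W_i
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    Q = U[:, :p]                              # orthonormal basis of the local tangent space
    Theta = Q.T @ M                           # assumed local coordinates, p x k
    return Q, Theta
```

With unit weights the columns of `Theta` sum to zero, because the centering matrix removes the neighborhood mean before projection.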
As a preferred technical solution of the invention, the intra-class and inter-class objective functions in step 3 are established as follows:
Step A: align the local tangent space coordinates $\Theta_i$, solve for the global coordinates $Y$, and establish the intra-class objective function, which preserves the local geometric relationships:
Figure BDA0004149960360000042
where a selection matrix $S = [S_1, \ldots, S_N]$ is introduced, whose element $S_i$ ($i = 1, \ldots, N$) is a 0-1 selection vector such that $Y_i = Y S_i$;
Figure BDA0004149960360000043
denotes the global coordinates of $x_i$; $L_i$ is a global transformation matrix, whose optimal value is
Figure BDA0004149960360000044
where
Figure BDA0004149960360000045
is the Moore-Penrose generalized inverse of $\Theta_i$; $F = \mathrm{diag}(F_1, \ldots, F_N)$, where $F_i$ is obtained by solving
Figure BDA0004149960360000046
Step B: the improved WLLTSA method first uses the improved weights to describe accurately the local structure information of each sample neighborhood in the local tangent space, then aligns the local tangent spaces of all samples in a global low-dimensional feature space, and finally finds a projection matrix $A$ that maps the high-dimensional process data set $X$ to the low-dimensional data set
Figure BDA0004149960360000047
$Y = A^T X H_N$ (6);
Step C: preserve the local relationships among same-class samples in the low-dimensional feature space by minimizing the local structure information of the process data, so that similar original samples are mapped onto a more compact low-dimensional manifold;
According to equation (6), equation (5) can be transformed into:
Figure BDA0004149960360000051
where $B = S F F^T S^T$, and the constraint $A^T X H_N X^T A = I_p$, i.e. $Y Y^T = I_p$, is used to determine $Y$ uniquely;
Step D: maximize the global separability of the data, separating samples with different class labels in the low-dimensional feature space and enlarging the margins between samples of different classes in the feature space; the inter-class objective function is established as:
Figure BDA0004149960360000052
where
Figure BDA0004149960360000053
is the global inter-class divergence matrix of the low-dimensional feature space data set $Y$, expressed as:
Figure BDA0004149960360000054
where
Figure BDA0004149960360000058
and
Figure BDA0004149960360000059
are the mean of the class-g samples and the mean of all samples in the low-dimensional feature space, respectively; the elements of the matrix $G$ satisfy:
Figure BDA0004149960360000055
According to $Y Y^T = I_p$ and $Y = A^T X H_N$, the divergence matrix becomes:
Figure BDA0004149960360000056
To maximize the global inter-class divergence, formula (11) is rewritten as:
Figure BDA0004149960360000057
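The role of the matrix $G$ in step D can be checked numerically. Formulas (9) and (10) are not reproduced in this text, so the sketch below uses the textbook between-class scatter $\sum_g N_g (m_g - m)(m_g - m)^T$ and the textbook identity that it equals $Y G Y^T$ when $G_{ij} = 1/N_g - 1/N$ for samples $i, j$ in the same class $g$ and $G_{ij} = -1/N$ otherwise. Whether the $G$ of the invention has exactly this form is an assumption, and the function names are hypothetical.

```python
import numpy as np

def between_class_scatter(Y, labels):
    """Y: p x N low-dimensional data. Returns sum_g N_g (m_g - m)(m_g - m)^T."""
    labels = np.asarray(labels)
    m = Y.mean(axis=1, keepdims=True)                 # mean of all samples
    Sb = np.zeros((Y.shape[0], Y.shape[0]))
    for g in np.unique(labels):
        Yg = Y[:, labels == g]
        mg = Yg.mean(axis=1, keepdims=True)           # mean of the class-g samples
        Sb += Yg.shape[1] * (mg - m) @ (mg - m).T
    return Sb

def class_indicator_matrix(labels):
    """Assumed textbook form of G: 1/N_g - 1/N within class g, -1/N across classes."""
    labels = np.asarray(labels)
    N = labels.size
    same = labels[:, None] == labels[None, :]
    counts = np.array([np.sum(labels == l) for l in labels])  # class size per sample
    return np.where(same, 1.0 / counts[:, None], 0.0) - 1.0 / N
```

Writing the scatter as $Y G Y^T$ is what allows the inter-class objective to be expressed through the projection matrix $A$ after substituting $Y = A^T X H_N$.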
As a preferred technical solution of the invention, the SWLLTSA fault classification model in step 4 is constructed as follows:
S1: construct the SWLLTSA model, which fuses the local structure preservation information and the global separability information, extracts effective low-dimensional feature information, and solves the following optimization function:
Figure BDA0004149960360000061
The optimization problem of equation (13) is converted into the following generalized eigenvalue problem:
$X H_N (G - B) H_N X^T \alpha = \lambda X H_N X^T \alpha$ (14)
According to $Y = A^T X H_N$, the constrained optimization problem (13) can be restated as:
Figure BDA0004149960360000062
that is, solving the following eigenvalue problem:
$(G - B) y^T = \lambda y^T$ (16)
where $y^T$ is an eigenvector of problem (16) corresponding to the eigenvalue $\lambda$; if $\alpha^T X H_N = y$, then $\alpha$ is an eigenvector of problem (14) corresponding to the same eigenvalue $\lambda$.
If solving problem (16) yields the sorted eigenvalues $\lambda_1 > \lambda_2 > \ldots > \lambda_p$, the eigenvectors corresponding to these eigenvalues are
Figure BDA0004149960360000063
Then compute:
Figure BDA0004149960360000064
where $\delta \ge 0$ is a regularization parameter;
given
Figure BDA0004149960360000065
the projection matrix $A$ of the model is obtained as
$A = [\alpha_1, \alpha_2, \ldots, \alpha_p] = (X H_N X^T + \delta I)^{-1} X H_N Y_p$ (18)
$A$ determines the class directions of the SWLLTSA model, along which the separation between different classes decreases in order. For a test set $X_{new}$, the low-dimensional representation in the feature space is:
$Z = A^T X_{new} H_N$ (19);
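Equations (16), (18) and (19) can be sketched as follows. This is an illustration under assumptions: `M` stands in for the symmetric matrix $G - B$, the columns of `Yp` are the eigenvectors of the $p$ largest eigenvalues, and `delta` is chosen strictly positive so that the linear system is solvable. For test data the sketch subtracts the training mean instead of applying a centering matrix to the test batch, which is one possible reading of (19). The function names are hypothetical.

```python
import numpy as np

def fit_projection(X, M, p, delta=1e-3):
    """X: P x N training data; M: symmetric N x N matrix standing in for G - B."""
    P, N = X.shape
    H = np.eye(N) - np.ones((N, N)) / N               # centering matrix H_N
    vals, vecs = np.linalg.eigh((M + M.T) / 2.0)      # eigenproblem (16)
    Yp = vecs[:, np.argsort(vals)[::-1][:p]]          # eigenvectors of the p largest eigenvalues
    XH = X @ H
    # equation (18): A = (X H_N X^T + delta I)^{-1} X H_N Y_p
    A = np.linalg.solve(XH @ X.T + delta * np.eye(P), XH @ Yp)
    mean = X.mean(axis=1, keepdims=True)              # kept for centering test data (assumption)
    return A, Yp, mean

def project(A, X_new, mean):
    """Low-dimensional representation of new samples, centered with the training mean."""
    return A.T @ (X_new - mean)
```

Solving the regularized linear system instead of forming the inverse explicitly is a numerical convenience; it returns the same $A$ as equation (18).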
S2: design the multi-fault classification indices of the SWLLTSA model to implement process monitoring.
In the modeling process, normal operation samples and fault samples of several different classes are collected from the production process to form a training set
Figure BDA0004149960360000066
In the offline modeling stage, the proposed SWLLTSA model is used to preserve the local structure information in the original data and to separate process samples of different classes in the mapped space. The training set is projected to the optimized low-dimensional feature space, and normal operation samples and the different process fault classes are identified. Solving the objective function constructed according to formulas (13) to (17) yields the transformation matrix of the original space
Figure BDA0004149960360000067
and the low-dimensional representation of the training set $X$
Figure BDA0004149960360000071
In the real-time fault detection stage, a new measurement data set $X_{new}$ is first normalized, and the transformation matrix is then used to obtain its low-dimensional representation $Z = A^T X_{new} H_N$. To identify the operation type of the new data, the Euclidean distance between $Z_{new}$ and the low-dimensional representation of the training set in the mapped space is computed as:
Figure BDA0004149960360000072
where
Figure BDA0004149960360000073
denotes the i-th data sample with class label $l_r$. To determine the fault type of $Z_{new}$, the following discriminant function is designed:
Figure BDA0004149960360000074
If and only if the distance between $Z_{new}$ and
Figure BDA0004149960360000075
is the smallest, the new observation belongs to class label $l_r$. The operation type associated with each class label then identifies whether the process sample is a normal operation sample or a specific fault type, which realizes multi-fault classification of real-time data in the production process.
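The minimum-distance decision rule can be sketched as below. Formulas (20) and (21) are not reproduced in this text, so the sketch takes one reading of the description: each new sample receives the class label of the training sample that is closest to it in the low-dimensional space. The function name is hypothetical.

```python
import numpy as np

def classify_min_distance(Z_new, Y_train, labels):
    """Z_new: p x M new samples; Y_train: p x N training samples; labels: length N."""
    labels = np.asarray(labels)
    # Euclidean distance between every new sample and every training sample (M x N)
    d = np.linalg.norm(Z_new[:, :, None] - Y_train[:, None, :], axis=0)
    return labels[np.argmin(d, axis=1)]               # label of the closest training sample
```

If the class labels are encoded so that one value denotes normal operation, the returned label directly indicates whether a sample is normal or belongs to a specific fault class.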
As a preferred technical solution of the invention, the neighborhood size is set to k = 12.
Compared with the prior art, the invention has the following beneficial effects. A new weight matrix is introduced to represent the local positional relationship between samples with different class labels and samples with the same class label within a neighborhood on the latent process manifold. The improved WLLTSA preserves the local geometric characteristics of the multi-fault process data manifold and corrects the distorted manifold structure that the original WLLTSA produces because it does not distinguish the class labels of the samples within a neighborhood. Building on WLLTSA, the method combines the locality of the manifold geometry with the global discriminability of the data to build the SWLLTSA fault classification model, captures the intrinsic characteristics of the different operating modes of the process, reflects the discriminative feature information of multi-fault process data, and identifies the fault type of a test sample by minimizing the Euclidean distance from the test sample to the known training samples of each class on the low-dimensional latent manifold. The method is suitable for multi-fault classification in high-dimensional, multi-mode processes.
Description of the drawings
The accompanying drawings provide a further understanding of the invention and constitute a part of this specification; together with the embodiments, they serve to explain the invention.
FIG. 1 is a schematic diagram of the Tennessee Eastman industrial process;
FIG. 2 is a schematic diagram of a method embodiment of the present invention;
FIG. 3 shows the test results obtained on the TEP by the SWLLTSA method of the present invention;
FIG. 4 shows the test results obtained on the TEP by the WLLTSA method from the literature;
FIG. 5 shows the test results obtained on the TEP by the classical FDA method.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. Of course, the specific embodiments described herein are for purposes of illustration only and are not intended to limit the invention.
As shown in FIG. 2, a multi-fault classification method based on an improved linear local tangent space alignment model comprises the following steps:
Step 1: identify the k nearest neighbors of a center sample by Euclidean distance to form the neighborhood of the center sample, and construct a neighborhood library; define a weight matrix for the samples in the neighborhood, in which the added class label information reflects the likelihood that a sample in the neighborhood belongs to a given class;
Step 2: use the defined weights to establish a local coordinate system of the tangent space and construct the improved WLLTSA method. Within a class, neighbors far from the center sample and neighbors close to it receive different weights, which refines the positional relationships among the neighbors; the dissimilarity between neighbors of different classes is also taken into account, so that same-class and different-class samples in the neighborhood occupy distinct positions in the local tangent space;
Step 3: establish an intra-class objective function and an inter-class objective function, preserving the local structural relationships of the production process data and mining the global divergence information among samples of different classes;
Step 4: construct the SWLLTSA fault classification model, which fuses the global divergence information and the local structural information of the process; design multi-fault classification indices, evaluate the operating mode of the production process, and identify the various fault types.
Step 1 proceeds as follows.
Step 1.1: collect N standard samples to form a high-dimensional multi-mode process data set $X = [x_1, x_2, \ldots, x_N] \in \mathbb{R}^{P \times N}$. These samples lie on a latent manifold in the low-dimensional feature space $\mathbb{R}^{p}$, where $P$ and $p$ ($p < P$) denote the dimensions of the high-dimensional original input space and the low-dimensional feature mapping space, respectively. For each sample, its k nearest neighbors by Euclidean distance are determined, forming the neighborhood of the center sample $x_i$:
$X_i = [x_{i1}, x_{i2}, \ldots, x_{ik}]$
Step 1.2: the process data set has C+1 class labels, $\{l_1, l_2, \ldots, l_{C+1}\}$, which correspond to the normal operation samples collected in the industrial process and to C different process fault classes, respectively:
Figure BDA0004149960360000084
For a sample $x_i$, $l(x_i) \in \{1, 2, \ldots, C+1\}$ denotes its class label. The weight $w_{ij}$ between two samples $x_i$ and $x_j$ is defined as:
Figure BDA0004149960360000085
where the tuning parameter $\beta$ is set to the average Euclidean distance between pairs of samples in the neighborhood. Under this weighting, same-class neighbors close to the center sample receive larger weights than distant ones, so the neighborhood structure of the local tangent space is determined mainly by the nearby neighbors of the same class as the center sample, and the original process data can be mapped properly to the local tangent space. In addition, a dissimilarity measure between neighbors of different classes is included; it describes the dissimilarity between samples of different classes, stretches the distances between them, and gives same-class and different-class samples in the neighborhood different positions in the local tangent space. In short, the weights compress the distances between same-class samples in the neighborhood and stretch the distances between different-class samples, so that both are placed correctly in the local tangent space.
To preserve the local structure information of each neighborhood, an improved WLLTSA method is provided. The improved WLLTSA method in step 2 is constructed as follows:
Step a: based on the newly defined weights, apply local PCA to approximate the tangent space of each neighborhood. A local transformation matrix $Q_i$ is introduced to map the neighborhood of each $x_i$ to the local tangent space; an optimization function for the local tangent space is established, the local coordinate information is extracted, and the local tangent space coordinates $\Theta_i$ are solved for:
Figure BDA0004149960360000091
where $H_k = I - ee^T/k$ is the centering matrix, $I$ is the identity matrix, $e$ is the column vector whose elements are all 1, and $w_j = [w_1, w_2, \ldots, w_k]$ contains the weights of the neighbor samples in the neighborhood; $Q_i$ is an orthogonal basis matrix of the tangent space, composed of the eigenvectors corresponding to the $p$ largest eigenvalues in the decomposition of the matrix $X_i H_k W_i$, where $W_i$ is a $k \times k$ diagonal matrix with $W_i(j,j) = w_{ij}$; $\Theta_i$ is the local coordinate system information of $x_i$ and describes the process data:
Figure BDA0004149960360000092
Step b: after the local structure information has been extracted, align the local coordinates of all samples in a global low-dimensional feature space and solve for the global coordinates $Y$ of all samples $X$.
The intra-class and inter-class objective functions in step 3 are established as follows:
Step A: align the local tangent space coordinates $\Theta_i$, solve for the global coordinates $Y$, and establish the intra-class objective function, which preserves the local geometric relationships:
Figure BDA0004149960360000093
where a selection matrix $S = [S_1, \ldots, S_N]$ is introduced, whose element $S_i$ ($i = 1, \ldots, N$) is a 0-1 selection vector such that $Y_i = Y S_i$;
Figure BDA00041499603600001011
denotes the global coordinates of $x_i$; $L_i$ is a global transformation matrix, whose optimal value is
Figure BDA0004149960360000101
where
Figure BDA0004149960360000102
is the Moore-Penrose generalized inverse of $\Theta_i$; $F = \mathrm{diag}(F_1, \ldots, F_N)$, where $F_i$ is obtained by solving
Figure BDA0004149960360000103
Step B: the improved WLLTSA method first uses the improved weights to describe accurately the local structure information of each sample neighborhood in the local tangent space, then aligns the local tangent spaces of all samples in a global low-dimensional feature space, and finally finds a projection matrix $A$ that maps the high-dimensional process data set $X$ to the low-dimensional data set
Figure BDA0004149960360000104
$Y = A^T X H_N$ (6);
Step C: preserve the local relationships among same-class samples in the low-dimensional feature space by minimizing the local structure information of the process data, so that similar original samples are mapped onto a more compact low-dimensional manifold;
According to equation (6), equation (5) can be transformed into:
Figure BDA0004149960360000105
where $B = S F F^T S^T$, and the constraint $A^T X H_N X^T A = I_p$, i.e. $Y Y^T = I_p$, is used to determine $Y$ uniquely;
Step D: maximize the global separability of the data, separating samples with different class labels in the low-dimensional feature space and enlarging the margins between samples of different classes in the feature space; the inter-class objective function is established as:
Figure BDA0004149960360000106
where
Figure BDA0004149960360000107
is the global between-class scatter matrix of the low-dimensional feature space dataset Y, expressed as:
Figure BDA0004149960360000108
where
Figure BDA0004149960360000109
and
Figure BDA00041499603600001010
are, respectively, the mean of the samples of class g and the mean of all samples in the low-dimensional feature space; the elements of the matrix G satisfy:
Figure BDA0004149960360000111
Using $Y Y^{T} = I_p$ and $Y = A^{T} X H_N$, the scatter matrix becomes:
Figure BDA0004149960360000112
To maximize the global between-class scatter matrix, formula (11) is rewritten as:
Figure BDA0004149960360000113
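To illustrate how a between-class scatter matrix can be written through a label-dependent matrix G, the sketch below uses the textbook construction G_ij = 1/n_g - 1/N when samples i and j share class g, and -1/N otherwise. This choice is an assumption: the patent's own definition of the elements of G is not recoverable here and may differ by a scaling factor. The function name is hypothetical.

```python
import numpy as np

def between_class_matrix(labels):
    """Build G such that Y G Y^T = sum_g n_g (m_g - m)(m_g - m)^T (assumed form)."""
    labels = np.asarray(labels)
    N = labels.size
    same = (labels[:, None] == labels[None, :]).astype(float)  # 1 where classes match
    n_g = same.sum(axis=1)                                     # class size for each sample
    return same / n_g[:, None] - 1.0 / N

# Small check against the direct definition of the between-class scatter.
rng = np.random.default_rng(2)
labels = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
Y = rng.normal(size=(2, labels.size))          # low-dimensional samples as columns
m = Y.mean(axis=1, keepdims=True)
S_direct = np.zeros((2, 2))
for g in np.unique(labels):
    Y_g = Y[:, labels == g]
    d = Y_g.mean(axis=1, keepdims=True) - m
    S_direct += Y_g.shape[1] * (d @ d.T)
S_via_G = Y @ between_class_matrix(labels) @ Y.T
```

Writing the scatter as Y G Y^T is what allows the substitution Y = A^T X H_N to turn the inter-class objective into a trace expression in the projection matrix A.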
The SWLLTSA fault classification model of step 4 is constructed as follows:
S1: construct the SWLLTSA model, fusing the local structure preservation information with the global separability information to extract effective low-dimensional feature information, by solving the following optimization function:
Figure BDA0004149960360000114
The optimization problem of equation (13) is converted into solving the following generalized eigenvalue problem:
$X H_N (G - B) H_N X^{T} \alpha = \lambda X H_N X^{T} \alpha$ (14)
Using $Y = A^{T} X H_N$, the constrained optimization problem (13) is restated as:
Figure BDA0004149960360000115
i.e. solving the following eigenvalue problem:
$(G - B) y^{T} = \lambda y^{T}$ (16)
$y^{T}$ is the eigenvector of problem (16) associated with the eigenvalue $\lambda$; if $\alpha^{T} X H_N = y$, then $\alpha$ is the eigenvector of problem (14) associated with the same eigenvalue $\lambda$;
Solving the eigenproblem (16) and sorting the eigenvalues gives $\lambda_1 > \lambda_2 > \cdots > \lambda_p$, with corresponding eigenvectors
Figure BDA0004149960360000116
Then compute:
Figure BDA0004149960360000117
where $\delta \geq 0$ is a regularization parameter;
Given
Figure BDA0004149960360000121
the projection matrix A of the model is obtained as
$A = [\alpha_1, \alpha_2, \ldots, \alpha_p] = (X H_N X^{T} + \delta I)^{-1} X H_N Y_p$ (18)
A determines the discriminant directions of the SWLLTSA model, along which the separability between different class labels decreases in order; for a test set $X_{new}$, the low-dimensional representation in the feature space is:
$Z = A^{T} X_{new} H_N$ (19);
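The two-stage solution in equations (16) to (18) can be sketched as below: an eigendecomposition of G - B followed by a regularized least-squares solve for the projection matrix. This is a minimal illustration under stated assumptions: H_N is taken to be the N x N centering matrix I - e e^T / N, Y_p is taken to hold the p leading eigenvectors as columns, and the function name `swlltsa_projection` and the default value of delta are hypothetical.

```python
import numpy as np

def swlltsa_projection(X, G, B, p, delta=1e-3):
    """Eigen-solve (G - B), then regress to obtain A as in equation (18).

    X : (P, N) training data, samples as columns.
    G : (N, N) inter-class matrix.  B : (N, N) alignment matrix.
    p : number of discriminant directions.  delta : regularization (assumed value).
    """
    P, N = X.shape
    H_N = np.eye(N) - np.ones((N, N)) / N          # assumed centering matrix
    M = G - B
    M = (M + M.T) / 2.0                            # enforce symmetry numerically
    eigvals, eigvecs = np.linalg.eigh(M)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:p]          # indices of the p largest
    Y_p = eigvecs[:, order]                        # (N, p), columns are the y_i^T
    # A = (X H_N X^T + delta I)^{-1} X H_N Y_p, solved without an explicit inverse.
    A = np.linalg.solve(X @ H_N @ X.T + delta * np.eye(P), X @ H_N @ Y_p)
    return A, eigvals[order]
```

Solving the small N x N eigenproblem first and then a ridge-type linear system avoids a generalized eigenproblem in which X H_N X^T may be singular; the parameter delta keeps the linear system well conditioned.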
S2: design the multi-fault classification index of the SWLLTSA model to realize process monitoring;
During modeling, normal operation samples and fault samples with several different class labels are collected from the production process to form the training set
Figure BDA0004149960360000122
In the offline modeling stage, the proposed SWLLTSA model preserves the local structure information of the original data while separating process samples of different classes in the mapped space; the training set is projected into the optimized low-dimensional feature space, in which normal operation samples and the different process fault classes are identified. Solving the objective function constructed in equations (13) to (17) yields the transformation matrix of the original space
Figure BDA0004149960360000123
and the low-dimensional representation of the training set X
Figure BDA0004149960360000124
In the real-time fault detection stage, the new measurement dataset $X_{new}$ is first normalized, and the transformation matrix is then used to obtain its low-dimensional representation $Z = A^{T} X_{new} H_N$. To identify the operation type of the new data, the Euclidean distance between $Z_{new}$ and the low-dimensional representation of the training set in the mapped space is computed as:
Figure BDA0004149960360000125
where
Figure BDA0004149960360000126
denotes the i-th data sample with class label $l_r$. To determine the fault type of $Z_{new}$, the following discriminant function is designed:
Figure BDA0004149960360000127
If and only if the distance between $Z_{new}$ and
Figure BDA0004149960360000128
is the smallest does the new observation belong to class label $l_r$. The operation type associated with each class label then identifies whether the process sample is a normal operation sample or a specific fault type, thereby realizing multi-fault classification of real-time data from the production process.
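The minimum-distance decision rule can be sketched as follows. This is one possible reading, assuming that each new sample receives the class label of the training sample closest to it in the low-dimensional space; the exact distance and discriminant formulas are not recoverable from this text, and the function name `classify_nearest` is hypothetical.

```python
import numpy as np

def classify_nearest(Z_new, Y_train, labels):
    """Assign each embedded test sample the label of its nearest training sample.

    Z_new   : (p, M) embedded test samples as columns.
    Y_train : (p, N) embedded training samples as columns.
    labels  : (N,) class labels of the training samples.
    """
    labels = np.asarray(labels)
    diff = Z_new.T[:, None, :] - Y_train.T[None, :, :]   # (M, N, p) pairwise differences
    d2 = (diff ** 2).sum(axis=2)                         # squared Euclidean distances
    return labels[np.argmin(d2, axis=1)]                 # label of the closest sample

# Two well-separated classes: label 0 near the origin, label 1 near (10, 10).
Y_train = np.array([[0.0, 0.0, 10.0, 10.0],
                    [0.0, 1.0, 10.0, 11.0]])
labels = np.array([0, 0, 1, 1])
Z_new = np.array([[0.2, 9.5],
                  [0.1, 10.2]])
predicted = classify_nearest(Z_new, Y_train, labels)
```

If label 0 denotes normal operation and the remaining labels denote fault types, the returned label states directly whether a new observation is normal or which fault it most resembles.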
The neighborhood size is set to k = 12.
The effectiveness of the invention is described below using the Tennessee Eastman process (TEP) as an example. TEP is a simulation benchmark developed by the Eastman Chemical Company in the United States, modeled on an actual chemical reaction process. It comprises five main operating units: a continuous stirred-tank reactor, a condenser, a centrifugal compressor, a vapor/liquid separator and a stripper; a schematic diagram of the TEP is shown in Fig. 1. In the experiment, 22 continuous measured variables and 11 manipulated variables (excluding the agitator speed) were selected as process monitoring variables, and the sampling interval of the dataset was 3 minutes.
TABLE 1 TEP data collected and description of faults
Figure BDA0004149960360000131
From the TEP scenarios listed in Table 1, the typical faults F4, F8, F13 and F15 and normal operation samples are selected. Two operating modes are chosen, and two types of fault data are collected in each mode; the collected process data are used to evaluate and compare the classification performance of different monitoring models. In the training set 1600 data samples were collected for each mode, and in the test set 800 data samples were collected for each mode.
Table 2 presents the TEP classification results of the proposed SWLLTSA, the WLLTSA method from the literature, and the conventional LLTSA and FDA (Fisher discriminant analysis) methods.
TABLE 2 classification results of TEP by SWLLTSA, WLLTSA, LLTSA and FDA
Figure BDA0004149960360000132
Figure BDA0004149960360000141
Larger values in the table indicate better classification performance. The process data are non-Gaussian, multimodal and high-dimensional, and the FDA method classifies poorly when identifying these different fault types. By comparison, SWLLTSA and WLLTSA classify better because they preserve the local structural relationships among same-class samples, which strengthens classification. Moreover, SWLLTSA applies a scatter measure to samples of different classes while refining the geometric relationships between near and distant neighbor samples within same-class neighborhoods, so it discriminates more strongly than WLLTSA and achieves a better classification result. Fig. 3 shows the test results of SWLLTSA, WLLTSA and FDA. Overall, the SWLLTSA method misclassifies fewer samples.
The invention introduces a new weight matrix to represent the local positional relationships between samples with different class labels and samples with the same class label inside each neighborhood on the underlying process manifold. The improved WLLTSA preserves the local geometric features of the multi-fault process data manifold and corrects the distorted manifold structure that the original WLLTSA produces by not distinguishing the class labels of the samples in a neighborhood. Building on WLLTSA, the method combines the locality of the manifold geometry with the discriminability of the global data to construct a fault classification model that captures the intrinsic features of the different operating modes of the process, reflects the discriminative feature information of multi-fault process data, and identifies the fault type of a test sample by minimizing the Euclidean distance from the test sample to the known training samples of each class on the low-dimensional latent manifold. The method is therefore suitable for multi-fault classification in high-dimensional multimodal processes.
The method provided by the invention attends both to the similarity measure of same-class neighbor samples and to the dissimilarity measure of different-class neighbor samples. In particular, within a neighborhood it assigns different weights to distant samples far from the center sample and to close samples near the center sample, which further refines the geometric relationships among neighbor samples and matches the nonlinear, multimodal, multi-fault character of the original process data. It gives larger weights to neighbor samples of different classes and enlarges the margin between them, so that the positions of same-class and different-class samples of a neighborhood can be distinguished in the local tangent space. At the same time, a scatter measure between different class labels is introduced, so that the discriminative information of the different types of process data is fully used and the multi-fault identification capability of the low-dimensional mapped space is improved. Overall, the method uses class label information to generate a weight matrix in each local neighborhood, assigns different weights to the different kinds of neighbor samples, captures the feature information of each sample, strengthens the extraction of local structure information, and fuses the class label information of the data with the separability between different class labels, which improves the sensitivity and discriminability of multi-fault classification in the low-dimensional projection space. Compared with the WLLTSA monitoring method, it achieves higher diagnostic accuracy and stronger fault discrimination capability.
While the foregoing is directed to embodiments of the present invention, other and further details of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.

Claims (6)

1. A multi-fault classification method based on an improved linear local tangent space alignment model, comprising the following steps:
step 1, identifying the k nearest neighbors of a center sample by Euclidean distance to form the neighborhood of the center sample, and constructing a neighborhood library; defining a weight matrix for the samples in the neighborhood, in which the added class label information reflects the likelihood that samples in the neighborhood belong to a given class;
step 2, establishing the local coordinate system of the tangent space using the defined weights, and constructing an improved WLLTSA method in which, within a class, distant samples far from the center sample and close samples near the center sample receive different weights, thereby refining the positional relationships among neighbor samples in the neighborhood, and in which the dissimilarity measure of neighbor samples from different classes is emphasized, so as to distinguish the positions of same-class and different-class samples of the neighborhood in the local tangent space;
step 3, establishing an intra-class objective function and an inter-class objective function, preserving the local structural relationships of the production process data and mining the global scatter information between samples of different classes;
step 4, constructing an SWLLTSA fault classification model that fuses the global scatter information and the local structural relationship information of the process, and designing a multi-fault classification index for the process to evaluate the operating mode of the production process and identify the various fault types.
2. The multi-fault classification method based on an improved linear local tangent space alignment model according to claim 1, wherein step 1 proceeds as follows:
step 1.1, collecting N class-labeled samples to form the high-dimensional multimodal process dataset
Figure FDA0004149960340000011
these samples lie on a latent manifold in the low-dimensional feature space
Figure FDA0004149960340000012
where $P$ and $p$ ($p < P$) denote the dimensions of the high-dimensional original input space and of the low-dimensional feature mapping space, respectively; for each sample, its k neighbor samples are determined by smallest Euclidean distance, forming the neighborhood of the center sample $x_i$:
Figure FDA0004149960340000013
step 1.2, the process dataset has C+1 class labels $\{l_1, l_2, \ldots, l_{C+1}\}$, which correspond respectively to the normal operation samples collected in the industrial process and to C different process fault classes:
Figure FDA0004149960340000014
for a sample $x_i$, $l(x_i) \in \{1, 2, \ldots, C+1\}$ denotes the class label of $x_i$; the weight $w_{ij}$ between two samples $x_i$ and $x_j$ is defined as:
Figure FDA0004149960340000015
where the tuning parameter $\beta$ is set to the average Euclidean distance between sample pairs in the neighborhood.
3. The multi-fault classification method based on an improved linear local tangent space alignment model according to claim 2, wherein the improved WLLTSA method of step 2 is constructed as follows:
step a, based on the newly established weights, applying a local PCA method to approximate the tangent space of each neighborhood, introducing a local transformation matrix $Q_i$ that maps the neighborhood of each $x_i$ to its local tangent space, establishing the optimization function of the local tangent space, extracting the local coordinate information, and solving for the tangent-space local coordinates $\Theta_i$:
Figure FDA0004149960340000021
where $H_k = I - ee^{T}/k$ is the centering matrix, $I$ is the identity matrix, $e$ is the column vector whose elements are all 1, and $w_j = [w_1, w_2, \ldots, w_k]$ holds the weights of the neighbor samples in the neighborhood; $Q_i$ is an orthonormal basis matrix of the tangent space, composed of the eigenvectors associated with the $p$ largest eigenvalues in the decomposition of the matrix $X_i H_k W_i$ ($W_i$ is a $k \times k$ diagonal matrix with $W_i(j,j) = w_{ij}$); $\Theta_i$ is the local coordinates of $x_i$ and describes the local structure information of the process data:
Figure FDA0004149960340000022
step b, after the local structure information has been extracted, aligning the local coordinates of all samples into a global low-dimensional feature space, and solving for the global coordinates Y of all samples X.
4. The multi-fault classification method based on an improved linear local tangent space alignment model according to claim 3, wherein the intra-class objective function and the inter-class objective function of step 3 are established as follows:
step A, aligning the tangent-space local coordinates $\Theta_i$ to solve for the global coordinates Y, and establishing the intra-class objective function, which preserves the local geometric relationships:
Figure FDA0004149960340000023
where a selection matrix $S = [S_1, \ldots, S_N]$ is introduced, whose element $S_i$ ($i = 1, \ldots, N$) is a 0-1 selection vector, and $Y_i = Y S_i$,
Figure FDA0004149960340000027
is the global coordinate of $x_i$; $L_i$ is the global transformation matrix, whose optimal value is
Figure FDA0004149960340000024
where
Figure FDA0004149960340000025
is the Moore-Penrose generalized inverse of $\Theta_i$; $F = \mathrm{diag}(F_1, \ldots, F_N)$, and $F_i$ is obtained by solving:
Figure FDA0004149960340000026
step B, in the improved WLLTSA method, first using the improved weights to describe accurately the local structure of each sample's neighborhood in its local tangent space, then aligning the local tangent spaces of all samples into a global low-dimensional feature space, and finally finding a projection matrix A that maps the high-dimensional process dataset X to the low-dimensional dataset
Figure FDA0004149960340000031
$Y = A^{T} X H_N$ (6);
step C, preserving the local relationships among same-class samples in the low-dimensional feature space by minimizing the local structure objective of the process data, so that same-class original samples are mapped onto a more compact low-dimensional manifold;
substituting equation (6), equation (5) becomes:
Figure FDA0004149960340000032
where $B = S F F^{T} S^{T}$, and the constraint $A^{T} X H_N X^{T} A = I_p$, i.e. $Y Y^{T} = I_p$, is imposed to determine Y uniquely;
step D, maximizing the global separability of the data, separating samples with different class labels in the low-dimensional feature space and enlarging the margins between samples of different classes, by establishing the inter-class objective function:
Figure FDA0004149960340000033
where
Figure FDA0004149960340000034
is the global between-class scatter matrix of the low-dimensional feature space dataset Y, expressed as:
Figure FDA0004149960340000035
where
Figure FDA0004149960340000036
and
Figure FDA0004149960340000037
are, respectively, the mean of the samples of class g and the mean of all samples in the low-dimensional feature space; the elements of the matrix G satisfy:
Figure FDA0004149960340000038
using $Y Y^{T} = I_p$ and $Y = A^{T} X H_N$, the scatter matrix becomes:
Figure FDA0004149960340000041
to maximize the global between-class scatter matrix, formula (11) is rewritten as:
Figure FDA0004149960340000042
5. The multi-fault classification method based on an improved linear local tangent space alignment model according to claim 4, wherein the SWLLTSA fault classification model of step 4 is constructed as follows:
S1, constructing the SWLLTSA model, fusing the local structure preservation information with the global separability information to extract effective low-dimensional feature information, by solving the following optimization function:
Figure FDA0004149960340000043
the optimization problem of equation (13) is converted into solving the following generalized eigenvalue problem:
$X H_N (G - B) H_N X^{T} \alpha = \lambda X H_N X^{T} \alpha$ (14)
using $Y = A^{T} X H_N$, the constrained optimization problem (13) is restated as:
Figure FDA0004149960340000044
i.e. solving the following eigenvalue problem:
$(G - B) y^{T} = \lambda y^{T}$ (16)
$y^{T}$ is the eigenvector of problem (16) associated with the eigenvalue $\lambda$; if $\alpha^{T} X H_N = y$, then $\alpha$ is the eigenvector of problem (14) associated with the same eigenvalue $\lambda$;
solving the eigenproblem (16) and sorting the eigenvalues gives $\lambda_1 > \lambda_2 > \cdots > \lambda_p$, with corresponding eigenvectors
Figure FDA0004149960340000045
then computing:
Figure FDA0004149960340000046
where $\delta \geq 0$ is a regularization parameter;
given
Figure FDA0004149960340000047
the projection matrix A of the model is obtained as
$A = [\alpha_1, \alpha_2, \ldots, \alpha_p] = (X H_N X^{T} + \delta I)^{-1} X H_N Y_p$ (18)
A determines the discriminant directions of the SWLLTSA model, along which the separability between different class labels decreases in order; for a test set $X_{new}$, the low-dimensional representation in the feature space is:
$Z = A^{T} X_{new} H_N$ (19);
S2, designing the multi-fault classification index of the SWLLTSA model to realize process monitoring;
during modeling, normal operation samples and fault samples with several different class labels are collected from the production process to form the training set
Figure FDA0004149960340000051
in the offline modeling stage, the proposed SWLLTSA model preserves the local structure information of the original data while separating process samples of different classes in the mapped space; the training set is projected into the optimized low-dimensional feature space, in which normal operation samples and the different process fault classes are identified; solving the objective function constructed in equations (13) to (17) yields the transformation matrix of the original space
Figure FDA0004149960340000052
and the low-dimensional representation of the training set X
Figure FDA0004149960340000053
in the real-time fault detection stage, the new measurement dataset $X_{new}$ is first normalized, and the transformation matrix is then used to obtain its low-dimensional representation $Z = A^{T} X_{new} H_N$; to identify the operation type of the new data, the Euclidean distance between $Z_{new}$ and the low-dimensional representation of the training set in the mapped space is computed as:
Figure FDA0004149960340000054
where
Figure FDA0004149960340000055
denotes the i-th data sample with class label $l_r$; to determine the fault type of $Z_{new}$, the following discriminant function is designed:
Figure FDA0004149960340000056
if and only if the distance between $Z_{new}$ and
Figure FDA0004149960340000057
is the smallest does the new observation belong to class label $l_r$; the operation type associated with each class label then identifies whether the process sample is a normal operation sample or a specific fault type, thereby realizing multi-fault classification of real-time data from the production process.
6. The method according to claim 2, wherein the neighborhood size is k = 12.
CN202310314884.7A 2023-03-28 2023-03-28 Multi-fault classification method for improving linear local cut space arrangement model Pending CN116361722A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310314884.7A CN116361722A (en) 2023-03-28 2023-03-28 Multi-fault classification method for improving linear local cut space arrangement model

Publications (1)

Publication Number Publication Date
CN116361722A true CN116361722A (en) 2023-06-30

Family

ID=86935848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310314884.7A Pending CN116361722A (en) 2023-03-28 2023-03-28 Multi-fault classification method for improving linear local cut space arrangement model

Country Status (1)

Country Link
CN (1) CN116361722A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116610927A (en) * 2023-07-21 2023-08-18 傲拓科技股份有限公司 Fan gear box bearing fault diagnosis method and diagnosis module based on FPGA
CN116610927B (en) * 2023-07-21 2023-10-13 傲拓科技股份有限公司 Fan gear box bearing fault diagnosis method and diagnosis module based on FPGA

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination