CN113077013A - High-dimensional data fault anomaly detection method and system based on generative adversarial network - Google Patents


Info

Publication number
CN113077013A
Authority
CN
China
Prior art keywords
network
data
generating
distribution
generation
Prior art date
Legal status
Pending
Application number
CN202110468859.5A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed (不公告发明人)
Current Assignee
Shanghai Lianlu Semiconductor Technology Co ltd
Original Assignee
Shanghai Lianlu Semiconductor Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Lianlu Semiconductor Technology Co ltd filed Critical Shanghai Lianlu Semiconductor Technology Co ltd

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent


Abstract

The invention provides a high-dimensional data fault anomaly detection method and system based on a generative adversarial network, relating to the technical field of anomaly detection in multidimensional data. The method comprises the following steps: step S1: constructing a generative adversarial network architecture; step S2: after the architecture is constructed, stably training the generative adversarial network to obtain a trained model; step S3: defining a scoring function from the trained model and using the generative adversarial network to score samples for anomalies. The invention can detect anomalous data that has not appeared before and can handle anomaly detection on two-dimensional and three-dimensional image data of semiconductor wafers.

Description

High-dimensional data fault anomaly detection method and system based on generative adversarial network
Technical Field
The invention relates to the technical field of anomaly detection in multidimensional data, and in particular to a high-dimensional data fault anomaly detection method and system based on a generative adversarial network.
Background
Anomaly detection in multidimensional data is a problem of great practical significance, with a large number of real-world applications in network security, manufacturing, fraud detection, medical imaging and other domains. A typical anomaly detection method models the pattern of normal data in order to identify anomalous samples that do not conform to it. Although anomaly detection has been the subject of a great deal of research, developing efficient methods suitable for complex, high-dimensional data remains a significant challenge.
A generative adversarial network is a powerful framework for modeling high-dimensional data that can address this challenge. A standard generative adversarial network consists of two neural networks: a generating network (G) and a discriminating network (J). During training, the generating network learns a mapping from latent data variables z (assumed to follow a Gaussian or uniform distribution) to the space of synthetic "real-looking" data, while the discriminating network learns to distinguish real data samples from the synthetic samples produced by the generating network. Generative adversarial networks have enjoyed great success in virtual image generation and are increasingly used in speech and medical imaging applications.
The invention patent with publication number US6292582B1 discloses a method and system for identifying defects in semiconductors that can classify specific types of anomalies, including image acquisition and processing; however, it searches pre-extracted features against a nearest-neighbor database to find the closest known anomaly, and therefore fails to detect new anomalies.
The invention patent with publication number US8126681B2 discloses a method for identifying semiconductor outliers using a sequential combined data transformation process. It is based on simple statistical techniques, evaluates quickly and has some theoretical grounding, but it requires electrical test data that is costly to acquire, and its classical statistical outlier tests may not be sufficient to capture anomalies in complex image data.
Disclosure of Invention
In view of the defects in the prior art, the present invention provides a high-dimensional data fault anomaly detection method and system based on a generative adversarial network, so as to solve the above problems.
The high-dimensional data fault anomaly detection method and system based on a generative adversarial network provided by the invention are as follows:
in a first aspect, a high-dimensional data fault anomaly detection method based on a generative adversarial network is provided, the method comprising:
constructing a generative adversarial network architecture;
after the generative adversarial network architecture is constructed, stably training the generative adversarial network to obtain a trained model;
and defining a scoring function from the trained model, scoring samples for anomalies with the generative adversarial network, and using the generative adversarial network to perform anomaly detection on high-dimensional data.
Preferably, the specific steps of constructing the generative adversarial network architecture are as follows:
the standard generative adversarial network comprises a generating network G and a discriminating network J, trained on a set of M data samples {x^(i)}, i = 1, 2, ..., M;
the generating network G maps a random latent variable z, drawn from a latent data space that follows a specific distribution, to the input data space X;
the discriminating network J attempts to distinguish actual data samples x^(i) from the samples G(z) produced by G;
p_X(x) is defined as the distribution probability of the real data x in the sample space X, p_Z(z) as the distribution probability of the latent data z in the latent space Z, and p_G(x) as the distribution probability of the generating network G in the sample space X;
the generative adversarial network model matches the joint distributions p_G(x, z) = p_Z(z) p_G(x|z) and p_E(x, z) = p_X(x) p_E(z|x) with an adversarial discriminating network J_xz that takes (x, z) as input;
the generative adversarial network determines the discriminating network J_xz, the generating network G and the encoding network E as the solution of the saddle point problem MIN_{G,E} MAX_{J_xz} V(J_xz, E, G), where V(J_xz, E, G) is defined as:
V(J_xz, E, G) = E_{x~p_X}[ E_{z~p_E(.|x)}[ log J_xz(x, z) ] ] + E_{z~p_Z}[ E_{x~p_G(.|z)}[ log(1 - J_xz(x, z)) ] ]
where E_{p_X} and E_{p_Z} denote the probability expectation under the data distribution in the X and Z data spaces, respectively;
for fixed values of the encoding network E and the generating network G, the optimal discriminating network J*_xz is:
J*_xz(x, z) = p_E(x, z) / (p_E(x, z) + p_G(x, z))
and for the optimal discriminating network J*_xz, the global minimum of the training criterion C(E, G) = MAX_{J_xz} V(J_xz, E, G) is attained if and only if p_E(x, z) = p_G(x, z).
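The optimal-discriminator identity above can be sanity-checked numerically. The sketch below uses hypothetical discrete joint distributions over a tiny (x, z) grid (not outputs of trained networks) and confirms that J*_xz = p_E / (p_E + p_G) equals 1/2 everywhere exactly when the two joint distributions match:

```python
# Numeric sanity check of the optimal discriminator
#   J*_xz(x, z) = p_E(x, z) / (p_E(x, z) + p_G(x, z)).
# The dictionaries below are hypothetical discrete joint distributions over
# a tiny (x, z) grid, not outputs of trained networks.

p_E = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
p_G_matched = dict(p_E)                                      # p_G == p_E
p_G_mismatched = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.3, (1, 1): 0.2}

def optimal_j(pe, pg, point):
    """Value of the optimal discriminator at one (x, z) point."""
    return pe[point] / (pe[point] + pg[point])

# Matched distributions: the optimal discriminator outputs 1/2 everywhere,
# i.e. it can no longer tell encoder pairs from generator pairs.
values_matched = [optimal_j(p_E, p_G_matched, k) for k in p_E]

# Mismatched distributions: J* moves away from 1/2 at some points.
values_mismatched = [optimal_j(p_E, p_G_mismatched, k) for k in p_E]
```

When the two joint distributions agree, the discriminator carries no signal, which is exactly the fixed point that the adversarial training drives towards.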
Preferably, the specific steps of obtaining the trained model are as follows:
an additional adversarially learned discriminating network J_zz is used to regularize the latent-space conditional entropy H^π(z|x) = -E_{π(x,z)}[ log π(z|x) ], where π(x, z) is a joint distribution over x and z, with the following saddle point objective:
MIN_{G,E} MAX_{J_xz, J_xx, J_zz} V(J_xz, J_xx, J_zz, E, G)
where V(J_xz, J_xx, J_zz, E, G) is defined as
V(J_xz, J_xx, J_zz, E, G) = V(J_xz, E, G) + V(J_xx, E, G) + V(J_zz, E, G)
where J_zz, J_xz and J_xx each denote a discriminating network, G denotes the generating network, and E denotes the encoder.
Preferably, the specific steps of anomaly scoring are as follows:
model the data distribution effectively, using the generating network G to learn the distribution of normal data so that p_G(x) = p_X(x), where p_X is the distribution of the normal data in the sample space X;
learn the distribution of the data so as to accurately recover the representation in the latent data space;
ensure that normal samples can be accurately reconstructed.
Preferably, a normal sample is reconstructed as follows:
compute the distance between the two vectors projected into the feature space learned by J_xx:
A(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_1
where f_xx(., .) denotes the output of the last fully connected layer of the model J_xx;
after training the model on normal data to obtain E, G, J_xz, J_xx and J_zz, a scoring function U(x) is defined that measures the degree of abnormality of a sample x from the difference between the sample and its reconstruction:
U(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_2
samples with a large value of U(x) are considered to have a high probability of being anomalous data.
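A minimal sketch of the scoring function U(x), with toy stand-ins for E, G and f_xx (nothing here is a trained network): normal samples are assumed to be vectors whose two halves agree, so their reconstruction is exact, while anomalous samples reconstruct poorly:

```python
# Toy sketch of U(x) = ||f_xx(x, x) - f_xx(x, G(E(x)))||_2 with stand-in
# networks: E keeps the first half of the sample, G duplicates the code, and
# f_xx plays the role of the last fully connected layer of J_xx.

def E(x):          # encoder stand-in: latent code = first half of the sample
    return x[:2]

def G(z):          # generator stand-in: reconstruct by duplicating the code
    return z + z

def f_xx(u, v):    # feature-map stand-in for the discriminator's last layer
    return [a - b for a, b in zip(u, v)]

def U(x):
    diff = [a - b for a, b in zip(f_xx(x, x), f_xx(x, G(E(x))))]
    return sum(t * t for t in diff) ** 0.5

normal = [1.0, 2.0, 1.0, 2.0]    # halves agree -> reconstructed exactly
anomaly = [1.0, 2.0, 5.0, -1.0]  # halves disagree -> poor reconstruction

u_normal = U(normal)    # 0.0
u_anomaly = U(anomaly)  # 5.0
```

A threshold on U(x) (or a ranking of scores) then separates the anomalous samples from the normal ones.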
In a second aspect, a high-dimensional data fault anomaly detection system based on a generative adversarial network is provided, the system comprising:
module M1: constructing a generative adversarial network architecture;
module M2: after the generative adversarial network architecture is constructed, stably training the generative adversarial network to obtain a trained model;
module M3: defining a scoring function from the trained model, scoring samples for anomalies with the generative adversarial network, and using the generative adversarial network to perform anomaly detection on high-dimensional data.
Preferably, the module M1 comprises:
the standard generative adversarial network comprises a generating network G and a discriminating network J, trained on a set of M data samples {x^(i)}, i = 1, 2, ..., M;
the generating network G maps a random latent variable z, drawn from a latent data space that follows a specific distribution, to the input data space X;
the discriminating network J attempts to distinguish actual data samples x^(i) from the samples G(z) produced by G;
p_X(x) is defined as the distribution probability of the real data x in the sample space X, p_Z(z) as the distribution probability of the latent data z in the latent space Z, and p_G(x) as the distribution probability of the generating network G in the sample space X;
the generative adversarial network model matches the joint distributions p_G(x, z) = p_Z(z) p_G(x|z) and p_E(x, z) = p_X(x) p_E(z|x) with an adversarial discriminating network J_xz that takes (x, z) as input;
the generative adversarial network determines the discriminating network J_xz, the generating network G and the encoding network E as the solution of the saddle point problem MIN_{G,E} MAX_{J_xz} V(J_xz, E, G), where V(J_xz, E, G) is defined as:
V(J_xz, E, G) = E_{x~p_X}[ E_{z~p_E(.|x)}[ log J_xz(x, z) ] ] + E_{z~p_Z}[ E_{x~p_G(.|z)}[ log(1 - J_xz(x, z)) ] ]
where E_{p_X} and E_{p_Z} denote the probability expectation under the data distribution in the X and Z data spaces, respectively.
For fixed values of the encoding network E and the generating network G, the optimal discriminating network J*_xz is:
J*_xz(x, z) = p_E(x, z) / (p_E(x, z) + p_G(x, z))
and for the optimal discriminating network J*_xz, the global minimum of the training criterion C(E, G) = MAX_{J_xz} V(J_xz, E, G) is attained if and only if p_E(x, z) = p_G(x, z).
Preferably, the module M2 comprises:
an additional adversarially learned discriminating network J_zz is used to regularize the latent-space conditional entropy H^π(z|x) = -E_{π(x,z)}[ log π(z|x) ], where π(x, z) is a joint distribution over x and z, with the following saddle point objective:
MIN_{G,E} MAX_{J_xz, J_xx, J_zz} V(J_xz, J_xx, J_zz, E, G)
where V(J_xz, J_xx, J_zz, E, G) is defined as
V(J_xz, J_xx, J_zz, E, G) = V(J_xz, E, G) + V(J_xx, E, G) + V(J_zz, E, G)
where J_zz, J_xz and J_xx each denote a discriminating network, G denotes the generating network, and E denotes the encoder.
Preferably, the module M3 comprises:
model the data distribution effectively, using the generating network G to learn the distribution of normal data so that p_G(x) = p_X(x), where p_X is the distribution of the normal data in the sample space X;
learn the distribution of the data so as to accurately recover the representation in the latent data space;
ensure that normal samples can be accurately reconstructed.
Preferably, the method for ensuring that a normal sample is accurately reconstructed is as follows:
compute the distance between the two vectors projected into the feature space learned by J_xx:
A(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_1
where f_xx(., .) denotes the output of the last fully connected layer of the model J_xx;
after training the model on normal data to obtain E, G, J_xz, J_xx and J_zz, a scoring function U(x) is defined that measures the degree of abnormality of a sample x from the difference between the sample and its reconstruction:
U(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_2
samples with a large value of U(x) are considered to have a high probability of being anomalous data.
Compared with the prior art, the invention has the following beneficial effects:
1. the deep learning anomaly detection model provided by the invention does not require known anomalous data during training;
2. the invention can detect anomalous data that has not appeared before;
3. the invention can handle anomaly detection on two-dimensional and three-dimensional image data of semiconductor wafers.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is the final generative adversarial network model;
FIG. 2 shows test data with outliers, their encoded representations, and their reconstructions.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit it in any way. It should be noted that various changes and modifications can be made by those skilled in the art without departing from the spirit of the invention, all of which fall within the scope of the present invention.
The embodiment of the invention provides a high-dimensional data fault anomaly detection method based on a generative adversarial network. As shown in FIG. 1, the generative adversarial network architecture is first constructed:
the standard generative adversarial network comprises a generating network G and a discriminating network J, trained on a set of M data samples {x^(i)}, i = 1, 2, ..., M; the generating network G maps a random latent variable z, drawn from a latent data space that follows a specific distribution, to the input data space X; the discriminating network J attempts to distinguish actual data samples x^(i) from the samples G(z) produced by G.
In the overall structure, the two networks compete with each other: the generating network G attempts to generate samples similar to the real data, while the discriminating network J is used to distinguish the pseudo samples produced by the generating network from real data samples. Training the generative adversarial network then typically takes alternating gradient steps, so that the generating network G becomes better at "fooling" the discriminating network J, and the discriminating network J becomes better at recognizing the pseudo samples generated by G.
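The alternating gradient steps can be illustrated on a toy saddle-point objective V(j, g) = 2jg - j^2 with a single scalar "discriminator" parameter j (gradient ascent) and a single scalar "generator" parameter g (gradient descent). This is not a real GAN; it is a minimal sketch that only shows the alternating update pattern converging to the saddle point at (0, 0):

```python
# Toy saddle-point demo of alternating gradient steps on V(j, g) = 2*j*g - j**2.
# The "discriminator" parameter j takes an ASCENT step, then the "generator"
# parameter g takes a DESCENT step using the freshly updated j.

j, g = 1.0, 1.0   # arbitrary starting point away from the saddle
lr = 0.1          # step size

for _ in range(500):
    j = j + lr * (2.0 * g - 2.0 * j)   # ascent step:  dV/dj = 2g - 2j
    g = g - lr * (2.0 * j)             # descent step: dV/dg = 2j

# Both parameters spiral into the saddle point (0, 0).
```

With these alternating (rather than simultaneous) updates, the iterates contract towards the saddle point, which mirrors the intended equilibrium of GAN training where neither player can improve unilaterally.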
Formally, p_X(x) is defined as the distribution probability of the real data x in the sample space X, p_Z(z) as the distribution probability of the latent data z in the latent space Z, and p_G(x) as the distribution probability of the generating network G in the sample space X.
Training the discriminating network J and the generating network G requires solving the saddle point problem MIN_G MAX_J V(J, G), where
V(J, G) = E_{x~p_X}[ log J(x) ] + E_{z~p_Z}[ log(1 - J(G(z))) ]
The optimal generating network produces a distribution p_G(x) that matches the distribution p_X(x) of the true data.
For a fixed generating network G, the optimal discriminating network J* is:
J*(x) = p_X(x) / (p_X(x) + p_G(x))
and for the optimal discriminating network J*, the global minimum of the training criterion C(G) = MAX_J V(J, G) is attained if and only if p_G(x) = p_X(x).
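The if-and-only-if condition can be verified numerically for discrete distributions, for which the maximized criterion is C = sum over x of [ p_X log(p_X/(p_X+p_G)) + p_G log(p_G/(p_X+p_G)) ], with global minimum -2 log 2 at p_G = p_X. The distributions below are hypothetical toy examples:

```python
import math

# Numeric check that C(G) = MAX_J V(J, G) is minimized exactly at p_G = p_X.
# For discrete distributions the maximized criterion is
#   C = sum_x [ p_X*log(p_X/(p_X+p_G)) + p_G*log(p_G/(p_X+p_G)) ],
# whose global minimum is -2*log(2).

def criterion(p_x, p_g):
    total = 0.0
    for px, pg in zip(p_x, p_g):
        total += px * math.log(px / (px + pg)) + pg * math.log(pg / (px + pg))
    return total

p_x = [0.5, 0.3, 0.2]
c_matched = criterion(p_x, [0.5, 0.3, 0.2])      # p_G = p_X  -> -2*log(2)
c_mismatched = criterion(p_x, [0.2, 0.3, 0.5])   # p_G != p_X -> larger value
```

The matched case evaluates to exactly -2 log 2, and any mismatch between the two distributions strictly increases the criterion, which is what makes it a usable training objective for the generating network.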
In practical applications, the discriminating network J and the generating network G are usually trained by alternating gradient descent: the parameters of one network are fixed while V(J, G) is correspondingly maximized (with respect to J) or minimized (with respect to G). After the generative adversarial network has been trained, the generating network can be used to produce pseudo data samples G(z), with z ~ p_Z, that resemble real samples from p_X. It should be noted that, for a given data sample x, it is not possible to compute its distribution probability explicitly, nor the distribution probability of its latent code.
Anomaly detection with the generative adversarial network:
The standard generative adversarial network only supports sampling of valid data; it can be adapted in several ways to achieve anomaly detection. For example, for a data point x, samples can be used to estimate its probability under the model, thereby determining whether it is an outlier. While efficient sampling is possible from a generative adversarial network, accurate estimation of the probability typically requires a very large number of samples, making the computation prohibitively expensive. Another approach is to "invert" the generating network, finding the latent variable z by stochastic gradient descent that minimizes the reconstruction error or a related objective. Since each gradient computation must be backpropagated through the generating network, this method is computationally very expensive and difficult to apply in practice.
To improve computational efficiency, we build a generative adversarial network containing an encoding network E that maps data samples x to a latent code z during training. In such models, computing the (approximate) latent representation of a data point x only requires passing x through the encoding network E. Our model incorporates a further improvement to the encoding network: an additional discriminating network is added to enforce encoder-decoder consistency, i.e. G(E(x)) ≈ x.
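The efficiency argument can be sketched with a toy linear generator: inverting G by gradient descent on the reconstruction error takes many iterations, while a (hypothetical) matched encoder recovers the same latent code in one forward pass. Both G and E below are illustrative stand-ins, not trained networks:

```python
# Toy comparison: inverting a linear generator G(z) = 2*z by gradient descent
# versus applying a hypothetical matched encoder E(x) = x / 2. Both recover
# the latent code z = x / 2, but inversion needs many iterative steps.

def G(z):
    return 2.0 * z

def E(x):
    return x / 2.0

x = 3.0

# Inversion: gradient descent on the reconstruction error ||G(z) - x||^2.
z = 0.0
for _ in range(200):
    # d/dz of (2z - x)^2 is 4*(2z - x)
    z = z - 0.05 * 4.0 * (2.0 * z - x)

z_inverted = z       # converges towards 1.5 over many steps
z_encoded = E(x)     # 1.5 in a single forward pass
```

In the full model each inversion step would additionally require backpropagating through a deep generating network, which is exactly the cost the encoding network avoids.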
Theoretically, the generative adversarial network models of reference [1] Jeff Donahue, Philipp Krähenbühl, and Trevor Darrell. Adversarial feature learning. International Conference on Learning Representations, 2017, and reference [2] Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Alex Lamb, Martin Arjovsky, Olivier Mastropietro, and Aaron Courville. Adversarially learned inference. International Conference on Learning Representations, 2017, match the joint distributions p_G(x, z) = p_Z(z) p_G(x|z) and p_E(x, z) = p_X(x) p_E(z|x) with an adversarial discriminating network J_xz that takes (x, z) as input. The generative adversarial networks of references [1] and [2] determine the discriminating network J_xz, the generating network G and the encoding network E as the solution of the saddle point problem MIN_{G,E} MAX_{J_xz} V(J_xz, E, G), where V(J_xz, E, G) is defined as:
V(J_xz, E, G) = E_{x~p_X}[ E_{z~p_E(.|x)}[ log J_xz(x, z) ] ] + E_{z~p_Z}[ E_{x~p_G(.|z)}[ log(1 - J_xz(x, z)) ] ]
where E_{p_X} and E_{p_Z} denote the probability expectation under the data distribution in the X and Z data spaces, respectively.
For fixed values of the encoding network E and the generating network G, the optimal discriminating network J*_xz is:
J*_xz(x, z) = p_E(x, z) / (p_E(x, z) + p_G(x, z))
and for the optimal discriminating network J*_xz, the global minimum of the training criterion C(E, G) = MAX_{J_xz} V(J_xz, E, G) is attained if and only if p_E(x, z) = p_G(x, z).
Although in theory the joint distributions p_E(x, z) and p_G(x, z) should be equal, in practice this is often not the case, since training does not necessarily converge to the solution of the saddle point problem. This can violate encoder-decoder consistency, resulting in G(E(x)) ≠ x.
To solve this problem, the ALICE framework of reference [3] Chunyuan Li, Hao Liu, Changyou Chen, Yunchen Pu, Liqun Chen, Ricardo Henao, and Lawrence Carin. ALICE: Towards understanding adversarial learning for joint distribution matching. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 5495-5503. Curran Associates, Inc., 2017, estimates the conditional entropy H^π(x|z) = -E_{π(x,z)}[ log π(x|z) ] in an adversarial manner (where π(x, z) is the joint distribution over x and z), thereby restoring encoder-decoder consistency. The saddle point problem MIN_{G,E} MAX_{J_xz} V_ALICE(J_xz, E, G) adds a conditional entropy regularization term V_CE over the encoding network E and the generating network G:
V_ALICE(J_xz, E, G) = V(J_xz, E, G) + V_CE(E, G)
The conditional entropy regularization applied to the encoder E and the generating network G can be approximated adversarially using an additional discriminating network J_xx that compares a sample with its reconstruction:
V(J_xx, E, G) = E_{x~p_X}[ log J_xx(x, x) ] + E_{x~p_X}[ log(1 - J_xx(x, G(E(x)))) ]
and reference [3] proves that this discriminating network can effectively ensure encoder-decoder consistency.
Stably training the generative adversarial network:
Referring to FIG. 1, an additional adversarially learned discriminating network J_zz is used to regularize the latent-space conditional entropy H^π(z|x) = -E_{π(x,z)}[ log π(z|x) ], where π(x, z) is a joint distribution over x and z, with the corresponding value term:
V(J_zz, E, G) = E_{z~p_Z}[ log J_zz(z, z) ] + E_{z~p_Z}[ log(1 - J_zz(z, E(G(z)))) ]
In conclusion, our proposed adversarially learned anomaly detection strategy solves the following saddle point problem during training:
MIN_{G,E} MAX_{J_xz, J_xx, J_zz} V(J_xz, J_xx, J_zz, E, G)
where V(J_xz, J_xx, J_zz, E, G) is defined as
V(J_xz, J_xx, J_zz, E, G) = V(J_xz, E, G) + V(J_xx, E, G) + V(J_zz, E, G)
where J_zz, J_xz and J_xx each denote a discriminating network, G denotes the generating network, and E denotes the encoder.
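A structural sketch of how the three adversarial value terms are summed into the combined objective. The discriminator outputs below are fixed placeholder probabilities, not values from trained networks; the point is only the shape of the objective:

```python
import math

# Structural sketch: the combined objective sums three adversarial value
# terms, one per discriminator. Each term has the generic GAN form
#   log(D(real pair)) + log(1 - D(fake pair)).

def value_term(p_real, p_fake):
    """Generic adversarial value term for one discriminator."""
    return math.log(p_real) + math.log(1.0 - p_fake)

v_xz = value_term(0.7, 0.4)   # J_xz on (x, E(x)) versus (G(z), z) pairs
v_xx = value_term(0.6, 0.5)   # J_xx on (x, x) versus (x, G(E(x))) pairs
v_zz = value_term(0.8, 0.3)   # J_zz on (z, z) versus (z, E(G(z))) pairs

v_total = v_xz + v_xx + v_zz  # V(J_xz, J_xx, J_zz, E, G)
```

In training, the three discriminators jointly maximize v_total while E and G minimize it, so the matching, reconstruction-consistency and latent-consistency pressures are applied simultaneously.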
Anomaly scoring:
Referring to FIG. 2, for the anomaly detection task we want to model the data distribution effectively and use the generating network G to learn the distribution of normal data so that p_G(x) = p_X(x), where p_X is the distribution of the normal data.
In addition, the distribution of the data is learned so as to accurately recover the representation in the latent data space, and to ensure that normal samples can be accurately reconstructed.
Reconstruction-based anomaly detection techniques evaluate the distance between a sample and its reconstructed output: normal samples are reconstructed accurately, while the reconstructions of anomalous samples tend to be poor.
The generative adversarial network model ensures effective modeling of the data distribution, and the two symmetric conditional-entropy consistency regularizations ensure both that the data distribution is learned and that normal samples are reconstructed accurately.
Secondly, we need a good anomaly score to quantify the distance between a real sample and its reconstruction. We explain why the chosen metric should work well; this is confirmed by the ablation studies described in the experimental section.
The Euclidean distance between the original image and its reconstruction in image space is not a reliable measure of dissimilarity: images with similar visual characteristics are not necessarily close to each other in Euclidean distance, so the measure can contain a great deal of noise.
Therefore, the vectors must be projected into a feature space, and the reconstruction distance computed in this new space.
We compute the distance between the two vectors projected into the feature space learned by J_xx, A(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_1, where f_xx(., .) denotes the output of the last fully connected layer of the model J_xx.
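A toy illustration of why the projection matters: two "images" that differ only by a one-pixel shift are far apart in pixel space, yet identical under a simple pooled feature map. The 2-wide average pooling here is a hypothetical stand-in for the learned feature space of J_xx:

```python
# Toy illustration: a one-pixel shift makes two "images" far apart in pixel
# space but identical after a simple pooled feature projection.

x = [0.0, 1.0] * 8          # 16-"pixel" stripe pattern
x_shifted = [1.0, 0.0] * 8  # the same pattern shifted by one pixel

def l2(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def pooled_features(img):
    """Average-pool non-overlapping windows of width 2 (stand-in features)."""
    return [(img[i] + img[i + 1]) / 2.0 for i in range(0, len(img), 2)]

pixel_distance = l2(x, x_shifted)   # every pixel differs -> distance 4.0
feature_distance = l2(pooled_features(x), pooled_features(x_shifted))  # 0.0
```

The visually near-identical pair scores as maximally different in pixel space but as identical in the pooled feature space, which is the behavior the learned projection of J_xx is intended to provide.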
For the anomaly criterion, the features of J_xx are preferable to the simple output of the model J_xx. One could use the output of J_xx directly to measure the dissimilarity of the two images; but if the system reaches a stable equilibrium, the generated distribution fits the true distribution exactly, the prediction of J_xx becomes essentially random, and it is clearly not a suitable metric.
After training the model on normal data to obtain E, G, J_xz, J_xx and J_zz, a scoring function U(x) is defined that measures the degree of abnormality of a sample x from the difference between the sample and its reconstruction:
U(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_2
This exploits the features of the discriminating network: encoding and reconstructing a sample with our generating network yields a sample from the true data distribution, so samples with a large value of U(x) are considered to have a high probability of being anomalous data.
The specific experimental setup was as follows:
data set: the counterlearning anomaly detection method was evaluated on a publicly available image data set. We used the SVHN dataset [9] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng.reading digits in natural images with unsupervised features leaving.012011, which contains house number images, and the CIFAR10 dataset [8] Alex Krizhevsky.leaving multiple layers of features from animals & trucks, including animals or vehicles such as horses, dogs, cars and trucks. The statistics of the data set are shown in table 1.
Data quantity distribution: we generated 10 different datasets from the SVHN dataset [9] and the CIFAR10 dataset [8] by treating each category in turn as the normal category and the remaining 9 categories as anomalous instances.
For each dataset, we trained on 80% of the full official dataset, and the remainder was used as the test set.
Table 1: Common benchmark dataset statistics [table image not reproduced]
25% of the training set was held out as a validation set, and anomalous samples were removed from the training and validation sets for the novelty detection task. We compared the models using the area under the receiver operating characteristic curve (AUROC). For image data, we used early stopping on the validation set to determine the number of epochs for training the model; the validation loss used for early stopping is the reconstruction loss derived from the features of the reconstruction discriminating network.
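The AUROC used for model comparison can be computed with the rank-sum (Mann-Whitney U) identity. The scores and labels below are made-up examples (label 1 marks an anomaly), and the function assumes untied scores for simplicity:

```python
# AUROC via the rank-sum (Mann-Whitney U) identity: the probability that a
# randomly chosen anomaly is scored above a randomly chosen normal sample.

def auroc(scores, labels):
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # 1-based ranks of the positive (anomalous) samples in score order
    rank_sum = sum(rank + 1 for rank, i in enumerate(order) if labels[i] == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Anomalies receive the two highest scores -> perfect ranking.
score_perfect = auroc([0.1, 0.9, 0.2, 0.8, 0.3], [0, 1, 0, 1, 0])

# Anomalies interleaved evenly with normal samples -> chance level here.
score_chance = auroc([0.1, 0.2, 0.3, 0.4, 0.5], [0, 1, 0, 1, 0])
```

A detector whose scores perfectly separate anomalies from normal samples reaches AUROC 1.0, while a ranking no better than chance sits at 0.5.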
Comparison models:
one type of support vector machine (OC-SVM) [7 ]]Andonia creating and oil creating bharath.invoking the generator of a generating adaptive network. NIPS Workshop on adaptive Training, 2016: the method is a classic abnormal detection method, and a judgment boundary is learned around a normal example. We set the v parameter to the assumed known expected anomaly proportion in the dataset and the gamma parameter to 1/m using the radial basis function kernel, where m is the number of input features. After a grid search of this parameter in all experiments (γ ═ 1/m or 10)nWhere n ═ 3, -2, -1, 0, 1), we have found that setting in a completely unsupervised manner is a viable option.
Isolation Forest (IF) [10] Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou. Isolation forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, ICDM '08, pages 413-422: a more recent classical machine learning technique that looks for isolated anomalous data rather than modeling the normal data distribution. The method builds decision trees using randomly selected split values on randomly selected features. The anomaly score is then defined from the average path length from the root to a particular sample. In all experiments we used the standard parameters provided by scikit-learn [11] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825-2830, 2011.
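The Isolation Forest scoring rule described above can be sketched directly from its published formulas: the normalizer c(n) is the average path length of an unsuccessful binary-search-tree lookup over n samples, and the score is s(x, n) = 2^(-E[h(x)]/c(n)). The path lengths used below are illustrative values, not measurements from real trees:

```python
import math

# Sketch of the Isolation Forest anomaly score. Short average path lengths
# (samples isolated quickly) give scores near 1; average-depth samples score
# near 0.5.

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def c(n):
    """Average path length normalizer used by Isolation Forest."""
    if n <= 1:
        return 0.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def anomaly_score(avg_path_length, n):
    """s(x, n) = 2 ** (-E[h(x)] / c(n))."""
    return 2.0 ** (-avg_path_length / c(n))

n = 256
s_average = anomaly_score(c(n), n)    # average-depth sample -> score 0.5
s_isolated = anomaly_score(1.0, n)    # isolated almost immediately -> near 1
```

This is why isolated (anomalous) points, which need very few random splits to separate, receive scores close to 1.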
Deep structured energy-based model (DSEBM) [12] Shuangfei Zhai, Yu Cheng, Weining Lu, and Zhongfei Zhang. Deep structured energy based models for anomaly detection. International Conference on Machine Learning, 2016: one of the most advanced autoencoder-based methods. The main idea is to accumulate energy across layers, similarly to a denoising autoencoder. Two anomaly criteria are studied in this method: energy and reconstruction error. We included both criteria in the experiments, namely DSEBM-r (reconstruction) and DSEBM-e (energy).
AnoGAN (Ano generation countermeasure network) [5] Thomas Schlegl, Philipp Seeböck, Sebastian M. Waldstein, Ursula Schmidt-Erfurth, and Georg Langs. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. International Conference on Information Processing in Medical Imaging, pages 146-157, 2017: is the only published anomaly detection method based on a generation countermeasure network. It trains a DC generation countermeasure network [4] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng. Reading digits in natural images with unsupervised feature learning. 2011, whose weights are frozen during inference in order to recover a latent representation of the test data. The anomaly criterion is a combination of a reconstruction component and a discrimination component. The reconstruction component measures the ability of the generation countermeasure network to reconstruct the data through the generation network, while the discrimination component takes into account a score based on the discrimination network. Document [5] compares two anomaly scoring methods, and we selected the variant settings that work best here.
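The latent-recovery step described above can be sketched as follows; the linear generator G, the learning rate, and the iteration count are toy assumptions standing in for the frozen DC generation countermeasure network, which in practice requires many gradient steps per test sample.

```python
import numpy as np

# Toy illustration of AnoGAN-style latent recovery: the generator's weights are
# frozen and we search for the z whose reconstruction G(z) is closest to x.
# G here is a hypothetical linear generator, not the DCGAN from the text.
W = np.array([[2.0, 0.0],
              [0.0, 3.0]])

def G(z):
    return W @ z

x = np.array([4.0, 9.0])          # test point; its exact preimage is z = [2, 3]
z = np.zeros(2)
lr = 0.05
for _ in range(500):
    grad = 2.0 * W.T @ (G(z) - x)  # gradient of ||G(z) - x||^2 w.r.t. z
    z -= lr * grad

# Reconstruction part of the anomaly score: small for normal-looking inputs
recon_error = np.linalg.norm(G(z) - x)
```

This per-sample optimization is exactly what makes AnoGAN-style inference slow, motivating the encoder-based approach of the invention.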
Image data experiment:
On the SVHN dataset, we observed that our model outperformed all baselines. On the CIFAR10 dataset, however, our method is competitive with, rather than clearly better than, the other compared methods. The intuitive explanation is that when our model is trained on one class, it only learns how to reconstruct samples from that class, and may reconstruct an anomalous sample as the closest normal-class image, leading to false negatives when the features of the reconstruction are evaluated by the discrimination network.
Table 2: image dataset performance
Inference time comparisons between AnoGAN [5] and our model are reported. The inference experiments in Table 3 were run sequentially on the same GPU, which was used only for inference operations. Inference times are reported for the first class of SVHN [9] and CIFAR10 [8].
It is therefore observed that our model is orders of magnitude faster than the other anomaly detection method based on a generation countermeasure network.
Table 3: average inference time (ms) on GeForce GTX TITAN X
Details of the experiment:
CIFAR10 and SVHN experimental details
Preprocessing: the pixels are scaled to the range [-1, 1].
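For reference, this preprocessing step amounts to the following (assuming 8-bit pixel input):

```python
import numpy as np

def scale_pixels(x_uint8):
    """Map 8-bit pixel values in [0, 255] to the range [-1, 1]."""
    return x_uint8.astype(np.float32) / 127.5 - 1.0

img = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = scale_pixels(img)   # 0 -> -1.0, 255 -> 1.0
```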
DSEBM:
For CIFAR10 and SVHN, we use the following architecture: one convolutional layer with kernel size 3, stride 2, 64 filters and "same" padding, one max pooling layer, and one fully connected layer containing 128 units.
AnoGAN:
We performed these experiments using the official DC generation countermeasure network architecture and hyperparameters. For the anomaly detection task, we used the same hyperparameters as the original paper. Parameters were estimated with an exponential moving average using a decay rate of 0.999.
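The exponential moving average mentioned above can be sketched as follows; the single toy parameter array and the fixed update loop are illustrative assumptions:

```python
import numpy as np

def ema_update(shadow, params, decay=0.999):
    """One exponential-moving-average step over a list of parameter arrays."""
    return [decay * s + (1.0 - decay) * p for s, p in zip(shadow, params)]

# Toy example: shadow copies drift slowly toward the current parameters.
shadow = [np.zeros(3)]
params = [np.ones(3)]
for _ in range(1000):
    shadow = ema_update(shadow, params, decay=0.999)
# After n steps with constant params, shadow = 1 - 0.999**n (about 0.632 at n = 1000)
```

The high decay rate means the averaged weights change slowly, which is what stabilizes the evaluation network.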
The invention uses the output of the last layer of the discrimination network for anomaly scoring, and all convolutional layers use "same" padding.
The embodiment of the invention provides a high-dimensional data fault anomaly detection method based on a generation countermeasure network, which greatly improves the accuracy of fault anomaly detection and significantly improves the detection speed. The method uses a class of generation countermeasure networks that simultaneously learn an encoder network during training, enabling efficient inference at test time. In addition, recent techniques are employed to further improve the encoder network and to stabilize generation countermeasure network training; ablation studies show that these techniques improve performance on the anomaly detection task. Experiments on a series of high-dimensional image data demonstrate the efficiency and effectiveness of the method.
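A minimal numerical sketch of the resulting scoring pipeline; the encoder E, generator G and feature map f below are toy stand-ins for the trained networks (in the invention, fxx is the last fully connected layer of the Jxx discrimination network):

```python
import numpy as np

# Toy stand-ins for the trained networks; assumptions for illustration only.
def E(x):            # encoder: data space -> hidden space
    return 0.5 * x

def G(z):            # generator: hidden space -> data space; imperfect outside [-1, 1]
    return np.clip(2.0 * z, -1.0, 1.0)

def f(a, b):         # stand-in for f_xx, the last fully connected layer of J_xx
    return np.concatenate([a, b])

def anomaly_score(x):
    """U(x) = ||f(x, x) - f(x, G(E(x)))||_2: large values suggest anomalies."""
    return np.linalg.norm(f(x, x) - f(x, G(E(x))))

x_normal = np.array([0.5, -0.5])     # reconstructed exactly -> score near 0
x_anomalous = np.array([5.0, -5.0])  # clipped reconstruction -> large score
```

Because the encoder is learned jointly, scoring a test sample is a single forward pass, in contrast to the per-sample optimization required by AnoGAN.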
Those skilled in the art will appreciate that, in addition to implementing the system and its various devices, modules, units provided by the present invention as pure computer readable program code, the system and its various devices, modules, units provided by the present invention can be fully implemented by logically programming method steps in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and various devices, modules and units thereof provided by the invention can be regarded as a hardware component, and the devices, modules and units included in the system for realizing various functions can also be regarded as structures in the hardware component; means, modules, units for performing the various functions may also be regarded as structures within both software modules and hardware components for performing the method.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims (10)

1. A high-dimensional data fault anomaly detection method based on a generation countermeasure network is characterized by comprising the following steps:
step S1: constructing a generation countermeasure network architecture;
step S2: after the generation countermeasure network architecture is constructed, stably training the generation countermeasure network to obtain a training model;
step S3: setting a scoring function according to the training model, performing anomaly scoring with the generation countermeasure network, and performing anomaly detection on the high-dimensional data by using the generation countermeasure network.
2. The method for detecting fault and anomaly in high-dimensional data based on generation countermeasure network as claimed in claim 1, wherein the step S1 includes the following steps:
step S1.1: the standard generation countermeasure network comprises a generation network G and a discrimination network J, and the generation network G and the discrimination network J are trained on a set of M data samples {x(i)}, wherein i = 1, 2, …, M;
in a hidden data space that obeys a specific distribution, the generation network G maps a sampled random hidden variable z to the input data space X;
the discrimination network J attempts to distinguish actual data samples x(i) from samples G(z) generated by G;
pX(x) is defined as the distribution probability of the real data x in the sample space X, and pZ(z) as the distribution probability of the hidden data z in the hidden data space Z; pG(x) is defined as the distribution probability of the generation network G in the sample space X;
the generation countermeasure network model matches the joint distributions pG(x, z) = p(z)pG(x|z) and pE(x, z) = pX(x)pE(z|x) by means of a countermeasure discrimination network Jxz that takes x and z as inputs;
the generation countermeasure network determines the discrimination network Jxz, the generation network G and the coding network E as the solution of the saddle point problem MIN_{G,E} MAX_{Jxz} V(Jxz, E, G), where V(Jxz, E, G) is defined as:
V(Jxz, E, G) = E_{x~pX}[E_{z~pE(·|x)}[log Jxz(x, z)]] + E_{z~pZ}[E_{x~pG(·|z)}[log(1 − Jxz(x, z))]]
wherein E_{x~pX} and E_{z~pZ} represent probability expectation functions over the data distributions in the X and Z data spaces, respectively;
step S1.2: for fixed values of the coding network E and the generation network G, the optimal discrimination network J*xz is:
J*xz(x, z) = pE(x, z) / (pE(x, z) + pG(x, z))
for the optimal discrimination network J*xz, the training criterion C(E, G) = MAX_{Jxz} V(Jxz, E, G) attains its global minimum if and only if pE(x, z) = pG(x, z).
3. The method for detecting fault and anomaly in high-dimensional data based on generation countermeasure network as claimed in claim 1, wherein the step S2 includes the following steps:
the conditional entropy in the hidden space, H_π(x|z) = −E_{π(x,z)}[log π(x|z)], is regularized using an additional antagonistically learned discrimination network Jzz, where π(x, z) is a joint distribution over x and z, with the following saddle point objective:
MIN_{G,E} MAX_{Jxz, Jxx, Jzz} V(Jxz, Jxx, Jzz, E, G)
wherein V(Jxz, Jxx, Jzz, E, G) is defined as
V(Jxz, Jxx, Jzz, E, G) = V(Jxz, E, G) + V(Jxx, E, G) + V(Jzz, E, G)
wherein Jzz, Jxz and Jxx each represent a discrimination network, G represents a generation network, and E represents an encoder.
4. The method for detecting fault and anomaly in high-dimensional data based on generation countermeasure network as claimed in claim 1, wherein the step S3 includes the following steps:
step S3.1: performing effective modeling of the data distribution, using the generation network G to learn the normal data distribution pG(x) = pX(x), ∀x ∈ X;
Step S3.2: learning the distribution of the data so as to accurately recover the re-expression of the latent data space;
step S3.3: ensure that normal samples can be accurately reconstructed.
5. The method for detecting fault anomaly in high-dimensional data based on generation countermeasure network according to claim 4, characterized in that the step S3.3 is as follows:
the distance between two vectors projected into the feature space learned by Jxx is calculated: A(x) = ||fxx(x, x) − fxx(x, G(E(x)))||1,
wherein f(·) is the last fully connected layer of the model Jxx;
after the model is trained on normal data to provide E, G, Jxz, Jxx and Jzz, a scoring function U(x) is defined, which measures the degree of abnormality of an example x according to the difference between the sample and its reconstruction:
U(x) = ||fxx(x, x) − fxx(x, G(E(x)))||2
samples with large U(x) values are considered to have a high probability of being anomalous data.
6. A system for detecting fault abnormality of high-dimensional data based on a generation countermeasure network, comprising:
module M1: constructing a generation countermeasure network architecture;
module M2: after the generation countermeasure network architecture is constructed, stably training the generation countermeasure network to obtain a training model;
module M3: setting a scoring function according to the training model, performing anomaly scoring with the generation countermeasure network, and performing anomaly detection on the high-dimensional data by using the generation countermeasure network.
7. The system for high-dimensional data failure anomaly detection based on generation countermeasure network according to claim 6, wherein the module M1 comprises:
the standard generation countermeasure network comprises a generation network G and a discrimination network J, and the generation network G and the discrimination network J are trained on a set of M data samples {x(i)}, wherein i = 1, 2, …, M;
in a hidden data space that obeys a specific distribution, the generation network G maps a sampled random hidden variable z to the input data space X;
the discrimination network J attempts to distinguish actual data samples x(i) from samples G(z) generated by G;
pX(x) is defined as the distribution probability of the real data x in the sample space X, and pZ(z) as the distribution probability of the hidden data z in the hidden data space Z; pG(x) is defined as the distribution probability of the generation network G in the sample space X;
the generation countermeasure network model matches the joint distributions pG(x, z) = p(z)pG(x|z) and pE(x, z) = pX(x)pE(z|x) by means of a countermeasure discrimination network Jxz that takes x and z as inputs;
the generation countermeasure network determines the discrimination network Jxz, the generation network G and the coding network E as the solution of the saddle point problem MIN_{G,E} MAX_{Jxz} V(Jxz, E, G), where V(Jxz, E, G) is defined as:
V(Jxz, E, G) = E_{x~pX}[E_{z~pE(·|x)}[log Jxz(x, z)]] + E_{z~pZ}[E_{x~pG(·|z)}[log(1 − Jxz(x, z))]]
wherein E_{x~pX} and E_{z~pZ} represent probability expectation functions over the data distributions in the X and Z data spaces, respectively;
for fixed values of the coding network E and the generation network G, the optimal discrimination network J*xz is:
J*xz(x, z) = pE(x, z) / (pE(x, z) + pG(x, z))
for the optimal discrimination network J*xz, the training criterion C(E, G) = MAX_{Jxz} V(Jxz, E, G) attains its global minimum if and only if pE(x, z) = pG(x, z).
8. The system for high-dimensional data failure anomaly detection based on generation countermeasure network according to claim 6, wherein the module M2 comprises:
the conditional entropy in the hidden space, H_π(x|z) = −E_{π(x,z)}[log π(x|z)], is regularized using an additional antagonistically learned discrimination network Jzz, where π(x, z) is a joint distribution over x and z, with the following saddle point objective:
MIN_{G,E} MAX_{Jxz, Jxx, Jzz} V(Jxz, Jxx, Jzz, E, G)
wherein V(Jxz, Jxx, Jzz, E, G) is defined as
V(Jxz, Jxx, Jzz, E, G) = V(Jxz, E, G) + V(Jxx, E, G) + V(Jzz, E, G)
wherein Jzz, Jxz and Jxx each represent a discrimination network, G represents a generation network, and E represents an encoder.
9. The system for high-dimensional data failure anomaly detection based on generation countermeasure network according to claim 6, wherein the module M3 comprises:
performing effective modeling of the data distribution, using the generation network G to learn the normal data distribution pG(x) = pX(x), ∀x ∈ X;
Learning the distribution of the data so as to accurately recover the re-expression of the latent data space;
ensure that normal samples can be accurately reconstructed.
10. The system for detecting the high-dimensional data fault abnormality based on the generative countermeasure network according to claim 9, wherein the ensuring of the accurate reconstruction of the normal sample is as follows:
the distance between two vectors projected into the feature space learned by Jxx is calculated: A(x) = ||fxx(x, x) − fxx(x, G(E(x)))||1,
wherein f(·) is the last fully connected layer of the model Jxx;
after the model is trained on normal data to provide E, G, Jxz, Jxx and Jzz, a scoring function U(x) is defined, which measures the degree of abnormality of an example x according to the difference between the sample and its reconstruction:
U(x) = ||fxx(x, x) − fxx(x, G(E(x)))||2
samples with large U(x) values are considered to have a high probability of being anomalous data.
CN202110468859.5A 2021-04-28 2021-04-28 High-dimensional data fault anomaly detection method and system based on generation countermeasure network Pending CN113077013A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110468859.5A CN113077013A (en) 2021-04-28 2021-04-28 High-dimensional data fault anomaly detection method and system based on generation countermeasure network


Publications (1)

Publication Number Publication Date
CN113077013A true CN113077013A (en) 2021-07-06

Family

ID=76619031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110468859.5A Pending CN113077013A (en) 2021-04-28 2021-04-28 High-dimensional data fault anomaly detection method and system based on generation countermeasure network

Country Status (1)

Country Link
CN (1) CN113077013A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107330444A (en) * 2017-05-27 2017-11-07 苏州科技大学 A kind of image autotext mask method based on generation confrontation network
CN108009628A (en) * 2017-10-30 2018-05-08 杭州电子科技大学 A kind of method for detecting abnormality based on generation confrontation network
CN109165735A (en) * 2018-07-12 2019-01-08 杭州电子科技大学 Based on the method for generating confrontation network and adaptive ratio generation new samples
CN109410179A (en) * 2018-09-28 2019-03-01 合肥工业大学 A kind of image abnormity detection method based on generation confrontation network
CN109584221A (en) * 2018-11-16 2019-04-05 聚时科技(上海)有限公司 A kind of abnormal image detection method generating confrontation network based on supervised
CN109580215A (en) * 2018-11-30 2019-04-05 湖南科技大学 A kind of wind-powered electricity generation driving unit fault diagnostic method generating confrontation network based on depth
CN110991027A (en) * 2019-11-27 2020-04-10 华南理工大学 Robot simulation learning method based on virtual scene training
CN112435221A (en) * 2020-11-10 2021-03-02 东南大学 Image anomaly detection method based on generative confrontation network model


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HOUSSAM ZENATI 等: "Adversarially Learned Anomaly Detection", 《2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING》 *

Similar Documents

Publication Publication Date Title
Choi et al. Gan-based anomaly detection and localization of multivariate time series data for power plant
Sun et al. Robust co-training
CN111581405A (en) Cross-modal generalization zero sample retrieval method for generating confrontation network based on dual learning
CN113902926A (en) General image target detection method and device based on self-attention mechanism
Peng et al. Fault feature extractor based on bootstrap your own latent and data augmentation algorithm for unlabeled vibration signals
Hu et al. You only segment once: Towards real-time panoptic segmentation
CN112785526B (en) Three-dimensional point cloud restoration method for graphic processing
EP4246458A1 (en) System for three-dimensional geometric guided student-teacher feature matching (3dg-stfm)
Xu et al. A zero-shot fault semantics learning model for compound fault diagnosis
Alawieh et al. Identifying wafer-level systematic failure patterns via unsupervised learning
Rahul et al. Detection and correction of abnormal data with optimized dirty data: a new data cleaning model
Abu-Gellban et al. Livedi: An anti-theft model based on driving behavior
Han et al. L-Net: lightweight and fast object detector-based ShuffleNetV2
CN115587335A (en) Training method of abnormal value detection model, abnormal value detection method and system
Zhao et al. Fault diagnosis based on space mapping and deformable convolution networks
Kang et al. Htnet: Anchor-free temporal action localization with hierarchical transformers
Haurum et al. Multi-scale hybrid vision transformer and Sinkhorn tokenizer for sewer defect classification
CN113077013A (en) High-dimensional data fault anomaly detection method and system based on generation countermeasure network
CN115222998B (en) Image classification method
Malik et al. Teacher-class network: A neural network compression mechanism
He et al. A diffusion-based framework for multi-class anomaly detection
Wang et al. Unsupervised anomaly detection with local-sensitive VQVAE and global-sensitive transformers
Wang et al. Deep embedded clustering with asymmetric residual autoencoder
Wickramasinghe et al. Deep embedded clustering with ResNets
Olin-Ammentorp et al. Bridge networks: Relating inputs through vector-symbolic manipulations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210706