CN113077013A - High-dimensional data fault anomaly detection method and system based on generation countermeasure network - Google Patents
High-dimensional data fault anomaly detection method and system based on generation countermeasure network
- Publication number
- CN113077013A CN113077013A CN202110468859.5A CN202110468859A CN113077013A CN 113077013 A CN113077013 A CN 113077013A CN 202110468859 A CN202110468859 A CN 202110468859A CN 113077013 A CN113077013 A CN 113077013A
- Authority
- CN
- China
- Prior art keywords: network, data, generating, distribution, generation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/04: Neural networks; architecture, e.g. interconnection topology
- G06N3/08: Neural networks; learning methods
- G06N3/084: Backpropagation, e.g. using gradient descent
Abstract
The invention provides a high-dimensional data fault anomaly detection method and system based on a generative adversarial network (GAN), relating to the technical field of anomaly detection in multidimensional data. The method comprises the following steps: step S1: constructing a generative adversarial network architecture; step S2: after the architecture is constructed, stably training the generative adversarial network to obtain a trained model; step S3: setting a scoring function according to the trained model and using the network to score samples for anomalies. The invention can detect anomalous data that has not appeared before, and can handle anomaly detection on two-dimensional and three-dimensional image data of semiconductor wafers.
Description
Technical Field
The invention relates to the technical field of anomaly detection in multidimensional data, in particular to a high-dimensional data fault anomaly detection method and system based on a generative adversarial network (GAN).
Background
Anomaly detection in multidimensional data is a problem of great practical significance, with a large number of real-world applications including network security, manufacturing, fraud detection, and medical imaging. A typical anomaly detection method models the pattern of normal data in order to identify anomalous samples that do not conform to that pattern. Although anomaly detection has been the subject of a great deal of research, developing efficient methods suitable for complex, high-dimensional data remains a significant challenge.
A generative adversarial network (GAN) is a powerful framework for modeling high-dimensional data that can address this challenge. The standard GAN consists of two neural networks: a generative network (G) and a discriminative network (J). During training, the generative network learns a mapping from latent variables z (assumed to follow a Gaussian or uniform distribution) to a synthetic data space, while the discriminative network learns to distinguish real data samples from the synthetic samples produced by the generative network. GANs have enjoyed great success in virtual image generation and are increasingly used in speech and medical imaging applications.
The invention patent publication US6292582B1 discloses a method and system for identifying defects in semiconductors that can classify specific types of anomalies, including image acquisition and processing; however, it searches a database of pre-extracted features for nearest-neighbor anomalies and therefore fails to detect new anomaly types.
The invention patent with publication number US8126681B2 discloses a method for identifying semiconductor outliers using a sequential combination of data transformations. It is based on simple statistical techniques, evaluates quickly, and has some theoretical basis, but it requires electrical test data that is costly to acquire, and the classical statistical outlier tests it uses may not be sufficient to capture anomalies in complex image data.
Disclosure of Invention
In view of the defects in the prior art, the present invention provides a method and system for detecting fault anomalies in high-dimensional data based on a generative adversarial network, so as to solve the above problems.
According to the high-dimensional data fault anomaly detection method and system based on a generative adversarial network provided by the invention, the scheme is as follows:
In a first aspect, a method for detecting fault anomalies in high-dimensional data based on a generative adversarial network is provided, the method including:
constructing a generative adversarial network architecture;
after the generative adversarial network architecture is constructed, stably training the generative adversarial network to obtain a trained model;
and setting a scoring function according to the trained model, scoring samples for anomalies with the generative adversarial network, and performing anomaly detection on the high-dimensional data using the generative adversarial network.
Preferably, the specific steps of constructing the generative adversarial network architecture are as follows:
the standard generative adversarial network comprises a generative network G and a discriminative network J, trained on a set of M data samples {x^(i)}, where i = 1, 2, ..., M;
in a latent data space that obeys a specific distribution, the generative network G maps a sampled random latent variable z to the input data space X;
the discriminative network J attempts to distinguish actual data samples x^(i) from samples G(z) generated by G;
p_X(x) is defined as the distribution of the real data in the sample space X, p_Z(z) as the distribution of the latent variable z in the latent space Z, and p_G(x) as the distribution induced by the generative network G in the sample space X;
the generative adversarial network model matches the joint distributions p_G(x, z) = p_Z(z) p_G(x|z) and p_E(x, z) = p_X(x) p_E(z|x) using an adversarial discriminative network J_xz that takes x and z as inputs;
the generative adversarial network determines the discriminative network J_xz, the generative network G, and the encoding network E as the solution of the saddle-point problem min_{G,E} max_{J_xz} V(J_xz, E, G), where V(J_xz, E, G) is defined as:
V(J_xz, E, G) = E_{x~p_X}[log J_xz(x, E(x))] + E_{z~p_Z}[log(1 - J_xz(G(z), z))]
where E_{x~p_X} and E_{z~p_Z} denote the probability expectation functions over the data distributions in the X and Z data spaces, respectively;
for fixed values of the encoding network E and the generative network G, the optimal discriminative network is:
J*_xz(x, z) = p_E(x, z) / (p_E(x, z) + p_G(x, z));
for the optimal discriminative network J*_xz, the training criterion C(E, G) = max_{J_xz} V(J_xz, E, G) attains its global minimum if and only if p_E(x, z) = p_G(x, z).
Preferably, the specific steps of obtaining the trained model are as follows:
the conditional entropy in the latent space, H^pi(x|z) = -E_{pi(x,z)}[log pi(x|z)], is regularized using an additional adversarially learned discriminative network J_zz, where pi(x, z) is a joint distribution over x and z, giving the following saddle-point target:
min_{G,E} max_{J_xz, J_xx, J_zz} V(J_xz, J_xx, J_zz, E, G)
where V(J_xz, J_xx, J_zz, E, G) is defined as
V(J_xz, J_xx, J_zz, E, G) = V(J_xz, E, G) + V(J_xx, E, G) + V(J_zz, E, G)
in which J_zz, J_xz, and J_xx each denote a discriminative network, G denotes the generative network, and E denotes the encoder.
Preferably, the specific steps of anomaly scoring are as follows:
modeling the data distribution effectively, using the generative network G to learn the normal data distribution p_G(x) = p_X(x), with the latent variable z following p_Z(z);
learning the distribution of the data so as to accurately recover the re-expression of the latent data space;
ensuring that normal samples can be accurately reconstructed.
Preferably, a normal sample is reconstructed as follows:
compute the distance between the two vectors projected into the feature space learned by J_xx:
A(x) = ||f_xx(x, x) - f_xx(x, G(E(x)))||_1
where f_xx(., .) is the last fully connected layer of the model J_xx;
after training the model on normal data to obtain E, G, J_xz, J_xx, and J_zz, a scoring function U(x) is defined, which measures the degree of abnormality of an example x from the difference between the sample and its reconstruction:
U(x) = ||f_xx(x, x) - f_xx(x, G(E(x)))||_2
samples with large values of U(x) are considered to have a high probability of being anomalous data.
In a second aspect, a system for detecting fault anomalies in high-dimensional data based on a generative adversarial network is provided, the system comprising:
Module M1: constructing a generative adversarial network architecture;
Module M2: after the generative adversarial network architecture is constructed, stably training the generative adversarial network to obtain a trained model;
Module M3: setting a scoring function according to the trained model, scoring samples for anomalies with the generative adversarial network, and performing anomaly detection on the high-dimensional data using the generative adversarial network.
Preferably, the module M1 includes:
the standard generative adversarial network comprises a generative network G and a discriminative network J, trained on a set of M data samples {x^(i)}, where i = 1, 2, ..., M;
in a latent data space that obeys a specific distribution, the generative network G maps a sampled random latent variable z to the input data space X;
the discriminative network J attempts to distinguish actual data samples x^(i) from samples G(z) generated by G;
p_X(x) is defined as the distribution of the real data in the sample space X, p_Z(z) as the distribution of the latent variable z in the latent space Z, and p_G(x) as the distribution induced by the generative network G in the sample space X;
the generative adversarial network model matches the joint distributions p_G(x, z) = p_Z(z) p_G(x|z) and p_E(x, z) = p_X(x) p_E(z|x) using an adversarial discriminative network J_xz that takes x and z as inputs;
the generative adversarial network determines the discriminative network J_xz, the generative network G, and the encoding network E as the solution of the saddle-point problem min_{G,E} max_{J_xz} V(J_xz, E, G), where V(J_xz, E, G) is defined as:
V(J_xz, E, G) = E_{x~p_X}[log J_xz(x, E(x))] + E_{z~p_Z}[log(1 - J_xz(G(z), z))]
where E_{x~p_X} and E_{z~p_Z} are the expectation functions over the data distributions in the X and Z data spaces, respectively.
For fixed values of the encoding network E and the generative network G, the optimal discriminative network is:
J*_xz(x, z) = p_E(x, z) / (p_E(x, z) + p_G(x, z));
for the optimal discriminative network J*_xz, the training criterion C(E, G) = max_{J_xz} V(J_xz, E, G) attains its global minimum if and only if p_E(x, z) = p_G(x, z).
Preferably, the module M2 includes:
the conditional entropy in the latent space, H^pi(x|z) = -E_{pi(x,z)}[log pi(x|z)], is regularized using an additional adversarially learned discriminative network J_zz, where pi(x, z) is a joint distribution over x and z, giving the following saddle-point target:
min_{G,E} max_{J_xz, J_xx, J_zz} V(J_xz, J_xx, J_zz, E, G)
where V(J_xz, J_xx, J_zz, E, G) is defined as
V(J_xz, J_xx, J_zz, E, G) = V(J_xz, E, G) + V(J_xx, E, G) + V(J_zz, E, G)
in which J_zz, J_xz, and J_xx each denote a discriminative network, G denotes the generative network, and E denotes the encoder.
Preferably, the module M3 includes:
modeling the data distribution effectively, using the generative network G to learn the normal data distribution p_G(x) = p_X(x), with the latent variable z following p_Z(z);
learning the distribution of the data so as to accurately recover the re-expression of the latent data space;
ensuring that normal samples can be accurately reconstructed.
Preferably, the method for ensuring accurate reconstruction of a normal sample specifically comprises the following steps:
compute the distance between the two vectors projected into the feature space learned by J_xx:
A(x) = ||f_xx(x, x) - f_xx(x, G(E(x)))||_1
where f_xx(., .) is the last fully connected layer of the model J_xx;
after training the model on normal data to obtain E, G, J_xz, J_xx, and J_zz, a scoring function U(x) is defined, which measures the degree of abnormality of an example x from the difference between the sample and its reconstruction:
U(x) = ||f_xx(x, x) - f_xx(x, G(E(x)))||_2
samples with large values of U(x) are considered to have a high probability of being anomalous data.
Compared with the prior art, the invention has the following beneficial effects:
1. the deep learning anomaly detection model provided by the invention does not need known anomaly data during training;
2. the invention can detect the abnormal data which does not appear before;
3. the invention can process the abnormal detection of two-dimensional and three-dimensional image data of the semiconductor wafer.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 shows the final generative adversarial network model;
FIG. 2 shows test data with outliers, their encoded representations, and their reconstructions.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention, all of which fall within the scope of the present invention.
The embodiment of the invention provides a high-dimensional data fault anomaly detection method based on a generative adversarial network. As shown in FIG. 1, a generative adversarial network architecture is first constructed:
the standard generation countermeasure network comprises a generation network G and a discrimination network J, and the generation network G and the discrimination network J are arranged in a group of M data samplesTraining, wherein i ═ 1, 2, …, M; in a hidden data space which obeys specific distribution, a generated network G maps a collected random hidden variable z to an input data space X; discrimination network J attempts to combine actual data samples x(i)And G (z) is distinguished from the sample G (z) generated by G.
In the overall structure, the two networks compete with each other: the generative network G attempts to generate samples similar to the real data, while the discriminative network J is used to distinguish the pseudo samples generated by G from real data samples. Training the generative adversarial network then typically alternates gradient steps, so that the generative network G learns to better "fool" the discriminative network J, while J learns to better identify the pseudo samples generated by G.
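The alternating-gradient competition described above can be sketched on a one-dimensional toy problem. This is a minimal illustration under our own assumptions (an affine generator, a logistic discriminator, and hand-derived gradients), not the patent's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: real data ~ N(3, 1); latent z ~ N(0, 1).
# Generator G(z) = a*z + b, discriminator D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0          # generator parameters
w, c = 0.1, 0.0          # discriminator parameters
lr, steps, batch = 0.05, 3000, 64

def sigmoid(t):
    t = np.clip(t, -50.0, 50.0)   # avoid overflow in exp
    return 1.0 / (1.0 + np.exp(-t))

for _ in range(steps):
    x_real = rng.normal(3.0, 1.0, batch)
    z = rng.normal(0.0, 1.0, batch)
    x_fake = a * z + b

    # Discriminator step: ascend V = E[log D(real)] + E[log(1 - D(fake))]
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    grad_w = np.mean((1 - d_real) * x_real) + np.mean(-d_fake * x_fake)
    grad_c = np.mean(1 - d_real) + np.mean(-d_fake)
    w += lr * grad_w
    c += lr * grad_c

    # Generator step: ascend the non-saturating objective E[log D(G(z))]
    x_fake = a * z + b
    d_fake = sigmoid(w * x_fake + c)
    g_grad = (1 - d_fake) * w          # d log D / d x_fake
    a += lr * np.mean(g_grad * z)      # chain rule: d x_fake / d a = z
    b += lr * np.mean(g_grad)          # d x_fake / d b = 1

fake_mean = float(np.mean(a * rng.normal(0.0, 1.0, 10000) + b))
print(fake_mean)   # should drift toward the real mean of 3
```

The generator has no access to real samples; it improves only through the discriminator's gradient signal, which is exactly the alternating scheme used to train the full model.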
Formally, p_X(x) is defined as the distribution of the real data in the sample space X, p_Z(z) as the distribution of the latent variable z in the latent space Z, and p_G(x) as the distribution induced by the generative network G in the sample space X.
Training the discriminative network J and the generative network G requires solving the saddle-point problem min_G max_J V(J, G), where
V(J, G) = E_{x~p_X}[log J(x)] + E_{z~p_Z}[log(1 - J(G(z)))].
The optimal generative network produces a distribution p_G(x) that matches the real data distribution p_X(x).
For the optimal discriminative network J*(x) = p_X(x) / (p_X(x) + p_G(x)), the training criterion C(G) = max_J V(J, G) attains its global minimum if and only if p_G(x) = p_X(x).
In practical applications, J and G are usually trained by alternating gradient steps: the parameters of one network are fixed while V(J, G) is maximized (with respect to J) or minimized (with respect to G) accordingly. After training the generative adversarial network, the generative network can be used to draw synthetic samples G(z) that resemble real samples from p_X. It should be noted that, for a given data sample x, it is not possible to compute its probability under the model explicitly, nor to compute the distribution of its latent code.
Anomaly detection with the generative adversarial network:
The standard generative adversarial network supports only sampling of data, and it can be adapted in several ways for anomaly detection. For example, for a data point x, samples could be used to estimate its probability under the model and thereby decide whether it is an outlier. While sampling from a generative adversarial network is efficient, accurately estimating probabilities this way typically requires a very large number of samples, making the computation prohibitively expensive. Another approach is to "invert" the generative network, finding the latent variable z by stochastic gradient descent on the reconstruction error or a related objective; since every gradient step must be back-propagated through the generative network, this method is also computationally expensive and difficult to apply in practice.
To improve computational efficiency, we build a generative adversarial network containing an encoding network E that maps data samples x to the latent space z during training. In such a model, computing the (approximate) latent representation of a data point x only requires a single forward pass of x through the encoding network E. Our model incorporates a further improvement: an additional discriminative network is added to enforce encoder-decoder consistency, i.e. G(E(x)) ≈ x.
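The efficiency gap between encoder-based and optimization-based latent recovery can be illustrated on a toy one-dimensional example. G, E, and all constants below are hypothetical stand-ins of our own construction, not the patent's networks:

```python
# Toy illustration: scalar generator G(z) = 2*z + 1 and an encoder E
# that has learned to invert it (here, the exact closed-form inverse).
def G(z):
    return 2.0 * z + 1.0

def E(x):                      # one forward pass: cheap latent recovery
    return (x - 1.0) / 2.0

x = 7.0
z_enc = E(x)                   # single forward pass through the encoder
assert abs(G(z_enc) - x) < 1e-9   # codec consistency: G(E(x)) ≈ x

# Without an encoder, "inverting" G needs iterative gradient descent on
# the reconstruction error (G(z) - x)^2 -- one backward pass per step.
z = 0.0
for _ in range(200):
    grad = 2.0 * (G(z) - x) * 2.0   # d/dz (G(z) - x)^2, with dG/dz = 2
    z -= 0.05 * grad
print(z_enc, z)   # both recover the latent code, at very different cost
```

Both routes arrive at the same latent code, but the encoder does it in one pass, while the inversion route pays a full gradient computation per iteration, which is the motivation stated above.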
Theoretically, the adversarial feature learning model of reference [1] (Jeff Donahue, Philipp Krähenbühl, and Trevor Darrell. Adversarial feature learning. International Conference on Learning Representations, 2017) and the adversarially learned inference model of reference [2] (Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Alex Lamb, Martin Arjovsky, Olivier Mastropietro, and Aaron Courville. Adversarially learned inference. International Conference on Learning Representations, 2017) match the joint distributions p_G(x, z) = p_Z(z) p_G(x|z) and p_E(x, z) = p_X(x) p_E(z|x) using an adversarial discriminative network J_xz that takes x and z as inputs. The models of [1] and [2] determine the discriminative network J_xz, the generative network G, and the encoding network E as the solution of the saddle-point problem min_{G,E} max_{J_xz} V(J_xz, E, G), where V(J_xz, E, G) is defined as:
V(J_xz, E, G) = E_{x~p_X}[log J_xz(x, E(x))] + E_{z~p_Z}[log(1 - J_xz(G(z), z))]
where E_{x~p_X} and E_{z~p_Z} denote the probability expectation functions over the data distributions in the X and Z data spaces, respectively.
For fixed values of the encoding network E and the generative network G, the optimal discriminative network is:
J*_xz(x, z) = p_E(x, z) / (p_E(x, z) + p_G(x, z));
for the optimal discriminative network J*_xz, the training criterion C(E, G) = max_{J_xz} V(J_xz, E, G) attains its global minimum if and only if p_E(x, z) = p_G(x, z).
Although in theory the joint distributions p_E(x, z) and p_G(x, z) should be equal at the solution, in practice this is often not the case, because training does not necessarily converge to the solution of the saddle-point problem. This can violate encoder-decoder consistency, resulting in G(E(x)) ≠ x.
To solve this problem, the ALICE framework [3] (Chunyuan Li, Hao Liu, Changyou Chen, Yunchen Pu, Liqun Chen, Ricardo Henao, and Lawrence Carin. ALICE: Towards understanding adversarial learning for joint distribution matching. In Advances in Neural Information Processing Systems 30, pages 5495-5503. Curran Associates, Inc., 2017) estimates the conditional entropy H^pi(x|z) = -E_{pi(x,z)}[log pi(x|z)] in an adversarial manner (where pi(x, z) is the joint distribution over x and z), thereby enforcing encoder-decoder consistency. The saddle-point problem min_{G,E} max_{J_xz} V_ALICE(J_xz, E, G) augments the objective with a conditional entropy regularization term (V_CE) on the encoding network E and the generative network G:
V_ALICE(J_xz, E, G) = V(J_xz, E, G) + V_CE(E, G)
The conditional entropy regularization applied to the encoder E and the generative network G can be approximated using an additional discriminative network J_xx that takes a pair of data samples as input, and reference [3] proves that this discriminative network effectively ensures encoder-decoder consistency.
Stable training of the generative adversarial network:
Referring to FIG. 1, an additional adversarially learned discriminative network J_zz is used to regularize the conditional entropy in the latent space, H^pi(x|z) = -E_{pi(x,z)}[log pi(x|z)], where pi(x, z) is a joint distribution over x and z.
In conclusion, our proposed adversarial learning anomaly detection strategy solves the following saddle-point problem during training:
min_{G,E} max_{J_xz, J_xx, J_zz} V(J_xz, J_xx, J_zz, E, G)
where V(J_xz, J_xx, J_zz, E, G) is defined as
V(J_xz, J_xx, J_zz, E, G) = V(J_xz, E, G) + V(J_xx, E, G) + V(J_zz, E, G)
in which J_zz, J_xz, and J_xx each denote a discriminative network, G denotes the generative network, and E denotes the encoder.
Anomaly scoring:
Referring to FIG. 2, for the anomaly detection task we want to model the data distribution effectively and use the generative network G to learn the normal data distribution p_G(x) = p_X(x), with the latent variable z following p_Z(z). In addition, the distribution of the data is learned so as to accurately recover the re-expression of the latent data space, and to ensure that normal samples can be accurately reconstructed.
Reconstruction-based anomaly detection techniques evaluate the distance between a sample and its reconstructed output: normal samples will be reconstructed accurately, while the reconstructions of anomalous samples are likely to be poor.
Our generative adversarial network model ensures effective modeling of the data distribution, and uses two symmetric conditional-entropy consistency regularizations to guarantee both that the data distribution is learned and that normal samples are reconstructed accurately.
Secondly, we need a good anomaly score to quantify the distance between a real sample and its reconstruction. We give an explanation of why the chosen metric should work well; this is confirmed by the ablation studies described in the experimental section.
The Euclidean distance between an original image and its reconstruction in image space is not a reliable measure of dissimilarity: images with similar visual characteristics are not necessarily close to each other in Euclidean distance, and the distance may contain a great deal of noise.
Therefore, the two vectors must be projected into a feature space, and the reconstruction distance computed in that new space.
We compute the distance between the two vectors projected into the feature space learned by J_xx: A(x) = ||f_xx(x, x) - f_xx(x, G(E(x)))||_1, where f_xx(., .) is the last fully connected layer of the model J_xx.
For the anomaly criterion, the feature layer of J_xx is preferred over the raw output of the model J_xx. One could use the output of J_xx to measure the dissimilarity of the two images; but if the system reaches a stable equilibrium, the generated distribution fits the true distribution exactly, the prediction of J_xx becomes essentially random, and it is clearly not a suitable metric.
After training the model on normal data to obtain E, G, J_xz, J_xx, and J_zz, a scoring function U(x) is defined, which measures the degree of abnormality of an example x from the difference between the sample and its reconstruction:
U(x) = ||f_xx(x, x) - f_xx(x, G(E(x)))||_2
This exploits the feature space of the discriminative network: encoding and reconstructing a sample with our generative network yields a sample from the true data distribution only when the input was normal. Samples with large values of U(x) are considered to have a high probability of being anomalous data.
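As a concrete sketch of this scoring rule, the snippet below computes a feature-space reconstruction distance on synthetic two-dimensional data. The encoder E, generator G, and feature map f here are hypothetical linear stand-ins for the trained E, G, and f_xx (a projection onto a line plays the role of a model trained only on normal data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Normal data lies near the line spanned by unit vector u;
# E projects onto it, G maps the latent coordinate back.
u = np.array([1.0, 2.0]) / np.sqrt(5.0)

def E(p):                 # encoder stand-in: coordinate along u
    return p @ u

def G(t):                 # generator stand-in: latent code -> data space
    return np.outer(t, u)

W = rng.normal(size=(2, 4))   # stand-in for f_xx, the discriminator's
def f(p):                     # last fully connected (feature) layer
    return p @ W

def U(p):                 # anomaly score: feature-space reconstruction distance
    return np.linalg.norm(f(p) - f(G(E(p))), axis=1)

t = rng.uniform(0.0, 1.0, 200)
normal = np.outer(t, u) + rng.normal(scale=0.01, size=(200, 2))
anomalies = rng.uniform(-1.0, 1.0, size=(20, 2))   # scattered off the line

print(U(normal).mean(), U(anomalies).mean())   # anomalies score much higher
```

Normal points reconstruct almost exactly (small U), while points off the learned manifold reconstruct poorly (large U), which is the behavior the scoring function relies on.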
The specific test set-up was as follows:
data set: the counterlearning anomaly detection method was evaluated on a publicly available image data set. We used the SVHN dataset [9] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng.reading digits in natural images with unsupervised features leaving.012011, which contains house number images, and the CIFAR10 dataset [8] Alex Krizhevsky.leaving multiple layers of features from animals & trucks, including animals or vehicles such as horses, dogs, cars and trucks. The statistics of the data set are shown in table 1.
Data quantity distribution: we generated 10 different datasets from the SVHN dataset [9] and the CIFAR10 dataset [8] by treating one category as the normal category and the remaining 9 categories as the abnormal instances in turn.
For each data set, we first trained 80% of the full formal data set, and the rest was used for the test set.
Table 1: common reference dataset statistics
A further 25% of the training set was held out as a validation set, and anomalous samples were removed from the training and validation sets for the novelty detection task. We compared models using the area under the receiver operating characteristic curve (AUROC). For image data, we used early stopping on the validation set to determine the number of epochs for training the model, with the reconstruction loss derived from the features of the reconstruction discriminative network serving as the validation loss for early stopping.
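The AUROC used for comparison can be computed directly from anomaly scores via the rank-based (Mann-Whitney) formulation; the sketch below uses our own toy scores and is independent of any model:

```python
import numpy as np

def auroc(scores, labels):
    """Area under the ROC curve from anomaly scores.

    Mann-Whitney formulation: the probability that a randomly chosen
    anomalous sample (label 1) scores higher than a randomly chosen
    normal sample (label 0), with ties counted as 1/2 via average ranks.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    for s in np.unique(scores):          # average ranks over ties
        mask = scores == s
        ranks[mask] = ranks[mask].mean()
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(auroc([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]))   # 1.0: perfect separation
print(auroc([0.9, 0.2, 0.8, 0.1], [0, 0, 1, 1]))   # 0.25: mostly inverted
```

AUROC is threshold-free, which is why it is the standard metric for comparing anomaly scoring functions whose scales differ across models.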
Comparing models:
one type of support vector machine (OC-SVM) [7 ]]Andonia creating and oil creating bharath.invoking the generator of a generating adaptive network. NIPS Workshop on adaptive Training, 2016: the method is a classic abnormal detection method, and a judgment boundary is learned around a normal example. We set the v parameter to the assumed known expected anomaly proportion in the dataset and the gamma parameter to 1/m using the radial basis function kernel, where m is the number of input features. After a grid search of this parameter in all experiments (γ ═ 1/m or 10)nWhere n ═ 3, -2, -1, 0, 1), we have found that setting in a completely unsupervised manner is a viable option.
Isolation Forest (IF) [10] (Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou. Isolation Forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, ICDM '08, pages 413-422, 2008): a more recent classical machine learning technique that isolates anomalous data points rather than modeling the normal data distribution. The method builds decision trees using randomly selected split values on randomly selected features. The anomaly score is then defined as the average path length from a particular sample to the root. In all experiments we used the standard parameters provided by scikit-learn [11] (F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12:2825-2830, 2011).
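The corresponding scikit-learn sketch with its standard parameters (toy data hypothetical):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
x_train = rng.standard_normal((256, 4))
iso = IsolationForest(random_state=0).fit(x_train)   # default scikit-learn parameters

# score_samples is lower for anomalies; negate so that higher = more anomalous.
x_test = np.array([[0.0, 0.0, 0.0, 0.0],
                   [8.0, 8.0, 8.0, 8.0]])            # far outside the training data
scores = -iso.score_samples(x_test)
```

The distant point is isolated in few splits, so it receives the larger anomaly score.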
Deep structured energy-based model (DSEBM) [12] (Shuangfei Zhai, Yu Cheng, Weining Lu, and Zhongfei Zhang. Deep structured energy based models for anomaly detection. International Conference on Machine Learning, pages 1100-1109, 2016): one of the most advanced autoencoder-based methods. The main idea is to accumulate energy across layers, similarly to a denoising autoencoder. Two anomaly criteria are studied in this method: energy and reconstruction error. We included both criteria in the experiments, namely DSEBM-r (reconstruction) and DSEBM-e (energy).
AnoGAN [5] (Thomas Schlegl, Philipp Seeböck, Sebastian M. Waldstein, Ursula Schmidt-Erfurth, and Georg Langs. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. International Conference on Information Processing in Medical Imaging, pages 146-157, 2017): the only published anomaly detection method based on generative adversarial networks. It trains a DCGAN [4] and, during inference, freezes the weights of the network in order to recover a latent representation of the test data. The anomaly criterion is a combination of a reconstruction component and a discrimination component. The reconstruction component measures the ability of the generative adversarial network to reconstruct data through the generating network, while the discrimination component considers a score based on the discrimination network. Document [5] compared two anomaly scoring methods, and we selected the variant settings that work best here.
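The expensive part of AnoGAN inference is this iterative recovery of a latent code with frozen generator weights. A minimal numpy sketch with a hypothetical linear "generator" G(z) = Wz (a stand-in for a trained network) illustrates the idea:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 4))            # frozen "generator" weights (stand-in)
z_true = np.array([1.0, -0.5, 0.3, 2.0])
x = W @ z_true                              # a test sample the generator can represent

z = np.zeros(4)                             # initial latent guess
lr = 0.01
for _ in range(500):                        # gradient descent on ||x - G(z)||^2
    grad = 2.0 * W.T @ (W @ z - x)          # only z is updated; W stays frozen
    z -= lr * grad

reconstruction_error = np.linalg.norm(W @ z - x)   # reconstruction part of the score
```

This per-sample optimization loop is why AnoGAN inference is slow; the encoder network learned by the present method replaces it with a single forward pass E(x).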
Image data experiment:
On the SVHN dataset, we observed that our model outperformed all baselines. On the CIFAR10 dataset, our method remains competitive with the other compared methods. The intuitive explanation is that when our model is trained on one class, it only learns how to reconstruct samples from that class, and may reconstruct an anomalous sample as the closest image from the normal class, producing false negatives when the features of the reconstruction discrimination network are evaluated.
Table 2: image dataset performance
We report inference time comparisons between AnoGAN [5] and our model. The inference experiments in Table 3 were run sequentially on the same GPU, which was used only for inference operations. Inference times for the first class of SVHN [9] and CIFAR10 [8] are shown.
We therefore observe that this model is orders of magnitude faster than the other anomaly detection method based on generative adversarial networks.
Table 3: average inference time (ms) on GeForce GTX TITAN X
Details of the experiment:
CIFAR10 and SVHN experimental details
Preprocessing: pixel values are scaled to the range [-1, 1].
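A one-line sketch of this preprocessing step for 8-bit images:

```python
import numpy as np

def scale_pixels(x_uint8):
    """Map 8-bit pixel values in [0, 255] to the range [-1, 1]."""
    return x_uint8.astype(np.float32) / 127.5 - 1.0

img = np.array([0, 127, 255], dtype=np.uint8)
scaled = scale_pixels(img)   # endpoints map exactly to -1.0 and 1.0
```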
DSEBM:
For CIFAR10 and SVHN, we used the following architecture: one convolutional layer with kernel size 3, stride 2, 64 filters and "same" padding, one max pooling layer, and one fully connected layer with 128 units.
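Under these layer hyperparameters the feature-map sizes work out as follows (a framework-free shape sketch; the 2x2 pool size is an assumption, since the text does not specify it):

```python
import math

def same_conv_out(size, stride):
    """Output spatial size of a 'same'-padded convolution with the given stride."""
    return math.ceil(size / stride)

size = 32                               # CIFAR10 / SVHN input resolution
size = same_conv_out(size, stride=2)    # conv: kernel 3, stride 2, 64 filters -> 16x16x64
size = size // 2                        # assumed 2x2 max pooling -> 8x8x64
flat = size * size * 64                 # flattened feature count
dense_units = 128                       # final fully connected layer
```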
AnoGAN:
We ran these experiments using the official DCGAN architecture and hyperparameters. For the anomaly detection task, we used the same hyperparameters as the original paper. Weights were estimated with an exponential moving average with decay rate 0.999.
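The exponential moving average with decay 0.999 can be sketched generically (in practice the update runs over all network parameters at every training step):

```python
import numpy as np

def ema_update(shadow, weights, decay=0.999):
    """One EMA step: shadow <- decay * shadow + (1 - decay) * weights."""
    return decay * shadow + (1.0 - decay) * weights

shadow = np.zeros(3)                    # EMA copy, initialized at zero
weights = np.ones(3)                    # stand-in for the current trained weights
for _ in range(5):
    shadow = ema_update(shadow, weights)
# After 5 steps toward constant weights 1.0, shadow = 1 - 0.999**5
```

With decay this close to 1, the shadow weights change slowly, which is what stabilizes the estimates used at inference time.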
The invention performs anomaly scoring using the output of the last layer of the discrimination network, and all convolutional layers use the same padding.
The embodiment of the invention provides a method for fault anomaly detection in high-dimensional data based on a generative adversarial network, which greatly improves the accuracy of fault anomaly detection and significantly improves detection speed. The method uses a class of generative adversarial networks that learn an encoder network simultaneously during training, enabling efficient inference at test time. In addition, recent techniques are employed to further improve the encoder network and to stabilize adversarial training, and ablation studies show that these techniques improve performance on the anomaly detection task. Experiments on a range of high-dimensional image data demonstrate the efficiency and effectiveness of the method.
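The feature-matching anomaly score described in the embodiments can be sketched with hypothetical stand-ins: here E projects onto a learned subspace, G maps back, and f plays the role of the last hidden layer of the J_xx discrimination network; all three are toy substitutes for trained networks, not the actual models of the invention.

```python
import numpy as np

rng = np.random.default_rng(0)
B, _ = np.linalg.qr(rng.standard_normal((16, 4)))  # basis of the "normal" subspace

def E(x):                 # toy encoder: data space -> latent space
    return B.T @ x

def G(z):                 # toy generator: latent space -> data space
    return B @ z

F = rng.standard_normal((8, 32))
def f(a, b):              # stand-in for the last layer of J_xx on the pair (a, b)
    return F @ np.concatenate([a, b])

def anomaly_score(x):     # A(x) = || f(x, x) - f(x, G(E(x))) ||_1
    return np.abs(f(x, x) - f(x, G(E(x)))).sum()

x_normal = G(rng.standard_normal(4))    # lies in the subspace: reconstructed exactly
x_anomalous = rng.standard_normal(16)   # generic point: large reconstruction residual
```

Normal samples reconstruct almost perfectly and score near zero, while anomalous samples leave a residual that the feature distance exposes, matching the scoring behavior described above.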
Those skilled in the art will appreciate that, in addition to implementing the system and its various devices, modules, units provided by the present invention as pure computer readable program code, the system and its various devices, modules, units provided by the present invention can be fully implemented by logically programming method steps in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and various devices, modules and units thereof provided by the invention can be regarded as a hardware component, and the devices, modules and units included in the system for realizing various functions can also be regarded as structures in the hardware component; means, modules, units for performing the various functions may also be regarded as structures within both software modules and hardware components for performing the method.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.
Claims (10)
1. A high-dimensional data fault anomaly detection method based on a generation countermeasure network is characterized by comprising the following steps:
step S1: constructing and generating an antagonistic network architecture;
step S2: after the generation countermeasure network architecture is constructed, stably training the generation countermeasure network to obtain a training model;
step S3: setting a scoring function according to the training model, performing anomaly scoring on the generation countermeasure network, and performing anomaly detection on the high-dimensional data using the generation countermeasure network.
2. The method for detecting fault and anomaly in high-dimensional data based on generation countermeasure network as claimed in claim 1, wherein the step S1 includes the following steps:
step S1.1: the standard generation countermeasure network comprises a generating network G and a discrimination network J, trained on a set of M data samples {x^(i)}, where i = 1, 2, ..., M;
in a hidden data space obeying a specific distribution, the generating network G maps a sampled random hidden variable z to the input data space X;
the discrimination network J attempts to distinguish the actual data samples x^(i) from the samples G(z) generated by G;
p_X(x) is defined as the distribution probability of the real data x in the sample space X, and p_Z(z) as the distribution probability of the hidden data z in the hidden data space Z; p_G(x) is defined as the distribution probability of the generating network G in the sample space X;
the generation countermeasure network model matches the joint distributions p_G(x, z) = p(z) p_G(x|z) and p_E(x, z) = p_X(x) p_E(z|x) using the countermeasure discrimination network J_xz with x and z as inputs;
the generation countermeasure network determines the discrimination network J_xz, the generating network G and the coding network E as the solution of the saddle point problem min_{G,E} max_{J_xz} V(J_xz, E, G), where V(J_xz, E, G) is defined as:
V(J_xz, E, G) = E_{x~p_X}[ E_{z~p_E(.|x)}[ log J_xz(x, z) ] ] + E_{z~p_Z}[ E_{x~p_G(.|z)}[ log(1 - J_xz(x, z)) ] ]
where E_{x~p_X} and E_{z~p_Z} denote the probability expectation functions over the data distributions in the X and Z data spaces, respectively;
step S1.2: for fixed values of the coding network E and the generating network G, the optimal discrimination network J*_xz is:
J*_xz(x, z) = p_E(x, z) / (p_E(x, z) + p_G(x, z))
3. The method for detecting fault and anomaly in high-dimensional data based on generation countermeasure network as claimed in claim 1, wherein the step S2 includes the following steps:
the conditional entropy in the hidden space, H_pi(x|z) = -E_{pi(x,z)}[log pi(x|z)], is regularized using an additional adversarially learned discrimination network J_zz, where pi(x, z) is a joint distribution over x and z, with the following saddle point objective:
min_{G,E} max_{J_xz, J_xx, J_zz} V(J_xz, J_xx, J_zz, E, G)
where V(J_xz, J_xx, J_zz, E, G) is defined as
V(J_xz, J_xx, J_zz, E, G) = V(J_xz, E, G) + V(J_xx, E, G) + V(J_zz, E, G)
where J_zz, J_xz and J_xx each denote a discrimination network, G denotes the generating network, and E denotes the encoder.
4. The method for detecting fault and anomaly in high-dimensional data based on generation countermeasure network as claimed in claim 1, wherein the step S3 includes the following steps:
step S3.1: effectively modeling the data distribution by using the generating network G to learn the distribution of normal data, p_G(x) = p_X(x), where p_G(x) = integral of p(z) p_G(x|z) dz;
Step S3.2: learning the distribution of the data so as to accurately recover the re-expression of the latent data space;
step S3.3: ensure that normal samples can be accurately reconstructed.
5. The method for detecting fault anomaly in high-dimensional data based on generation countermeasure network according to claim 4, characterized in that the step S3.3 is as follows:
the distance between two vectors projected in the feature space learned by J_xx is calculated: A(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_1;
where f_xx(., .) denotes the activations of the last fully connected layer of the model J_xx;
after training the model on normal data to obtain E, G, J_xz, J_xx and J_zz, a scoring function U(x) is defined, which measures the degree of abnormality of an example x from the difference between the sample and its reconstruction:
U(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_2
samples with large U(x) values are considered likely to be anomalous data.
6. A system for detecting fault abnormality of high-dimensional data based on a generation countermeasure network, comprising:
module M1: constructing a generation countermeasure network architecture;
module M2: after the generation countermeasure network architecture is constructed, stably training the generation countermeasure network to obtain a training model;
module M3: setting a scoring function according to the training model, performing anomaly scoring on the generation countermeasure network, and performing anomaly detection on the high-dimensional data using the generation countermeasure network.
7. The system for high-dimensional data failure anomaly detection based on generation countermeasure network according to claim 6, wherein the module M1 comprises:
the standard generation countermeasure network comprises a generating network G and a discrimination network J, trained on a set of M data samples {x^(i)}, where i = 1, 2, ..., M;
in a hidden data space obeying a specific distribution, the generating network G maps a sampled random hidden variable z to the input data space X;
the discrimination network J attempts to distinguish the actual data samples x^(i) from the samples G(z) generated by G;
p_X(x) is defined as the distribution probability of the real data x in the sample space X, and p_Z(z) as the distribution probability of the hidden data z in the hidden data space Z; p_G(x) is defined as the distribution probability of the generating network G in the sample space X;
the generation countermeasure network model matches the joint distributions p_G(x, z) = p(z) p_G(x|z) and p_E(x, z) = p_X(x) p_E(z|x) using the countermeasure discrimination network J_xz with x and z as inputs;
the generation countermeasure network determines the discrimination network J_xz, the generating network G and the coding network E as the solution of the saddle point problem min_{G,E} max_{J_xz} V(J_xz, E, G), where V(J_xz, E, G) is defined as:
V(J_xz, E, G) = E_{x~p_X}[ E_{z~p_E(.|x)}[ log J_xz(x, z) ] ] + E_{z~p_Z}[ E_{x~p_G(.|z)}[ log(1 - J_xz(x, z)) ] ]
where E_{x~p_X} and E_{z~p_Z} denote the probability expectation functions over the data distributions in the X and Z data spaces, respectively;
for fixed values of the coding network E and the generating network G, the optimal discrimination network J*_xz is:
J*_xz(x, z) = p_E(x, z) / (p_E(x, z) + p_G(x, z))
8. The system for high-dimensional data failure anomaly detection based on generation countermeasure network according to claim 6, wherein the module M2 comprises:
the conditional entropy in the hidden space, H_pi(x|z) = -E_{pi(x,z)}[log pi(x|z)], is regularized using an additional adversarially learned discrimination network J_zz, where pi(x, z) is a joint distribution over x and z, with the following saddle point objective:
min_{G,E} max_{J_xz, J_xx, J_zz} V(J_xz, J_xx, J_zz, E, G)
where V(J_xz, J_xx, J_zz, E, G) is defined as
V(J_xz, J_xx, J_zz, E, G) = V(J_xz, E, G) + V(J_xx, E, G) + V(J_zz, E, G)
where J_zz, J_xz and J_xx each denote a discrimination network, G denotes the generating network, and E denotes the encoder.
9. The system for high-dimensional data failure anomaly detection based on generation countermeasure network according to claim 6, wherein the module M3 comprises:
effectively modeling the data distribution by using the generating network G to learn the distribution of normal data, p_G(x) = p_X(x), where p_G(x) = integral of p(z) p_G(x|z) dz;
Learning the distribution of the data so as to accurately recover the re-expression of the latent data space;
ensure that normal samples can be accurately reconstructed.
10. The system for detecting the high-dimensional data fault abnormality based on the generative countermeasure network according to claim 9, wherein the ensuring of the accurate reconstruction of the normal sample is as follows:
the distance between two vectors projected in the feature space learned by J_xx is calculated: A(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_1;
where f_xx(., .) denotes the activations of the last fully connected layer of the model J_xx;
after training the model on normal data to obtain E, G, J_xz, J_xx and J_zz, a scoring function U(x) is defined, which measures the degree of abnormality of an example x from the difference between the sample and its reconstruction:
U(x) = || f_xx(x, x) - f_xx(x, G(E(x))) ||_2
samples with large U(x) values are considered likely to be anomalous data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110468859.5A CN113077013A (en) | 2021-04-28 | 2021-04-28 | High-dimensional data fault anomaly detection method and system based on generation countermeasure network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113077013A true CN113077013A (en) | 2021-07-06 |
Family
ID=76619031
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110468859.5A Pending CN113077013A (en) | 2021-04-28 | 2021-04-28 | High-dimensional data fault anomaly detection method and system based on generation countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113077013A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107330444A (en) * | 2017-05-27 | 2017-11-07 | 苏州科技大学 | A kind of image autotext mask method based on generation confrontation network |
CN108009628A (en) * | 2017-10-30 | 2018-05-08 | 杭州电子科技大学 | A kind of method for detecting abnormality based on generation confrontation network |
CN109165735A (en) * | 2018-07-12 | 2019-01-08 | 杭州电子科技大学 | Based on the method for generating confrontation network and adaptive ratio generation new samples |
CN109410179A (en) * | 2018-09-28 | 2019-03-01 | 合肥工业大学 | A kind of image abnormity detection method based on generation confrontation network |
CN109584221A (en) * | 2018-11-16 | 2019-04-05 | 聚时科技(上海)有限公司 | A kind of abnormal image detection method generating confrontation network based on supervised |
CN109580215A (en) * | 2018-11-30 | 2019-04-05 | 湖南科技大学 | A kind of wind-powered electricity generation driving unit fault diagnostic method generating confrontation network based on depth |
CN110991027A (en) * | 2019-11-27 | 2020-04-10 | 华南理工大学 | Robot simulation learning method based on virtual scene training |
CN112435221A (en) * | 2020-11-10 | 2021-03-02 | 东南大学 | Image anomaly detection method based on generative confrontation network model |
-
2021
- 2021-04-28 CN CN202110468859.5A patent/CN113077013A/en active Pending
Non-Patent Citations (1)
Title |
---|
HOUSSAM ZENATI et al.: "Adversarially Learned Anomaly Detection", 2018 IEEE International Conference on Data Mining *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Choi et al. | Gan-based anomaly detection and localization of multivariate time series data for power plant | |
Sun et al. | Robust co-training | |
CN111581405A (en) | Cross-modal generalization zero sample retrieval method for generating confrontation network based on dual learning | |
CN113902926A (en) | General image target detection method and device based on self-attention mechanism | |
Peng et al. | Fault feature extractor based on bootstrap your own latent and data augmentation algorithm for unlabeled vibration signals | |
Hu et al. | You only segment once: Towards real-time panoptic segmentation | |
CN112785526B (en) | Three-dimensional point cloud restoration method for graphic processing | |
EP4246458A1 (en) | System for three-dimensional geometric guided student-teacher feature matching (3dg-stfm) | |
Xu et al. | A zero-shot fault semantics learning model for compound fault diagnosis | |
Alawieh et al. | Identifying wafer-level systematic failure patterns via unsupervised learning | |
Rahul et al. | Detection and correction of abnormal data with optimized dirty data: a new data cleaning model | |
Abu-Gellban et al. | Livedi: An anti-theft model based on driving behavior | |
Han et al. | L-Net: lightweight and fast object detector-based ShuffleNetV2 | |
CN115587335A (en) | Training method of abnormal value detection model, abnormal value detection method and system | |
Zhao et al. | Fault diagnosis based on space mapping and deformable convolution networks | |
Kang et al. | Htnet: Anchor-free temporal action localization with hierarchical transformers | |
Haurum et al. | Multi-scale hybrid vision transformer and Sinkhorn tokenizer for sewer defect classification | |
CN113077013A (en) | High-dimensional data fault anomaly detection method and system based on generation countermeasure network | |
CN115222998B (en) | Image classification method | |
Malik et al. | Teacher-class network: A neural network compression mechanism | |
He et al. | A diffusion-based framework for multi-class anomaly detection | |
Wang et al. | Unsupervised anomaly detection with local-sensitive VQVAE and global-sensitive transformers | |
Wang et al. | Deep embedded clustering with asymmetric residual autoencoder | |
Wickramasinghe et al. | Deep embedded clustering with ResNets | |
Olin-Ammentorp et al. | Bridge networks: Relating inputs through vector-symbolic manipulations |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20210706 |