CN116108755A - Anti-fact confidence data generation method based on fault dictionary - Google Patents
- Publication number: CN116108755A
- Application number: CN202310221540.1A
- Authority: CN (China)
- Prior art keywords: fault, sample, counterfactual, original, data
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F30/27 — Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model (G—Physics; G06F—Electric digital data processing; G06F30/00—Computer-aided design [CAD]; G06F30/20—Design optimisation, verification or simulation)
- G06N3/08 — Learning methods (G06N—Computing arrangements based on specific computational models; G06N3/00—Computing arrangements based on biological models; G06N3/02—Neural networks)
- G06F2111/04 — Constraint-based CAD (G06F2111/00—Details relating to CAD techniques)
- G06F2111/08 — Probabilistic or stochastic CAD (G06F2111/00—Details relating to CAD techniques)
Abstract
The invention relates to the technical field of new-generation information, in particular to a fault-dictionary-based counterfactual confidence data generation method, which comprises the following steps: collecting an original fault sample X of a faulty machine; decomposing the original fault sample with a CNN and a VAE to generate fault semantics F and operating-condition attributes S; inputting the fault semantics F and the operating-condition attributes S into a generator to produce a counterfactual sample X̂; inputting the counterfactual sample X̂ into a discriminator and training the model until the discriminator can no longer distinguish the original fault sample from the counterfactual sample X̂; and then adding the counterfactual confidence samples to the original fault samples to enlarge the sample set. By means of the generated confident counterfactual samples, the invention greatly reduces the time and economic cost of physical experiments, alleviates the imbalance and "out-of-distribution" problems of the original data, and effectively improves the accuracy of fault diagnosis.
Description
Technical Field
The invention relates to the technical field of new-generation information, in particular to a counterfactual confidence data generation method based on a fault dictionary.
Background
Mechanical equipment is affected by its own material properties, manufacturing processes, external environment and so on; many mechanical failures therefore occur during operation, causing the equipment to stop and forcing the corresponding workflow to be interrupted. To avoid the disruption that mechanical failure causes to the normal operation of the equipment, it is necessary to diagnose the health of the machine in advance, during non-working time.
With the progress of technology, more and more deep learning methods are being applied in the field of mechanical fault diagnosis. Common problems remain, however. Most deep-learning fault diagnosis methods assume a balanced training set, i.e. that the amounts of data for the various fault types are balanced, and require the collected experimental data to satisfy the statistical assumption of being independent and identically distributed; when the training set is unbalanced, the diagnosis results carry large errors.

Meanwhile, constrained by experiment time and economic cost, the collected data are often limited to certain specific working conditions, such as a particular rotating speed or loading torque, and an exhaustive combination of different working conditions and faults cannot be covered by repeated experiments. This causes the "out-of-distribution" phenomenon in deep learning, in which a deep-learning model tends to select the experimental working-condition characterisation, rather than the fault-mechanism characterisation, as the basis for judging the fault type.

In actual industrial production, a single fault type is relatively easy to judge, whereas a compound fault type involves the concurrence and coupling of multiple fault types, which increases the difficulty of identifying the fault type. This in turn leads to fewer samples with a determined fault type, to non-uniform sample distributions and to out-of-distribution data, severely limiting the stability and accuracy of compound mechanical fault diagnosis; this problem therefore needs to be solved.
Disclosure of Invention
In order to avoid and overcome the technical problems in the prior art, the invention provides a method for generating counterfactual confidence data based on a fault dictionary. Through the generated confident counterfactual samples, the invention can solve the imbalance problem of the original data and effectively improve the accuracy of fault diagnosis.
In order to achieve the above purpose, the present invention provides the following technical solutions:
A method for generating counterfactual confidence data based on a fault dictionary comprises the following steps:
S1, collecting an original fault sample X of a faulty machine;
S2, decomposing the original fault sample with a CNN and a VAE to generate fault semantics F and operating-condition attributes S;
S3, inputting the fault semantics F and the operating-condition attributes S into a generator to produce a counterfactual sample X̂;
S4, inputting the counterfactual sample X̂ into a discriminator and training the model until the discriminator cannot distinguish between the original fault sample and the counterfactual sample X̂;
S5, adding the counterfactual confidence samples to the original fault samples to enlarge the sample set.
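The steps S1-S5 above can be sketched end to end. Everything below is a toy stand-in: the `encode`, `generate` and `discriminate` functions are hypothetical placeholders, not the patent's CNN/VAE/GAN models, and are meant only to show how the pieces fit together.

```python
# Toy sketch of steps S1-S5; all three model functions are hypothetical
# stand-ins for the CNN+VAE encoder, the generator and the discriminator.
import numpy as np

rng = np.random.default_rng(0)

def encode(x):
    """Stand-in for the CNN+VAE decomposition of a raw fault sample X
    into fault semantics F and operating-condition attributes S."""
    f = (x > x.mean()).astype(float)   # crude binary semantic code
    s = np.array([x.mean(), x.std()])  # crude condition attributes
    return f, s

def generate(f, s):
    """Stand-in generator: combines semantics F and conditions S
    into a counterfactual sample."""
    return f * s[1] + s[0]

def discriminate(x_real, x_fake):
    """Stand-in discriminator score in (0, 1]; closer means harder
    to distinguish the two samples."""
    d = np.abs(x_real.mean() - x_fake.mean())
    return 1.0 / (1.0 + d)

# S1: collect an original fault sample X (synthetic here)
X = rng.normal(size=64)
# S2: decompose into fault semantics F and condition attributes S
F, S = encode(X)
# S3: generate a counterfactual sample from (F, S)
X_cf = generate(F, S)
# S4: the discriminator scores real vs. counterfactual
score = discriminate(X, X_cf)
# S5: append the counterfactual sample to enlarge the data set
dataset = np.vstack([X, X_cf])
print(dataset.shape)  # -> (2, 64)
```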
As a further scheme of the invention: a generated counterfactual sample X̂ is counterfactually confident if and only if the operating-condition attribute S is group-decoupled from the fault semantics F; when the discriminator performs group decoupling, the overall loss function L_tot is used to constrain the target. The overall loss function L_tot is as follows:

L_tot = L_S + ν·L_F + ρ·L_d

wherein L_S is the operating-condition attribute loss function; L_F is the fault semantic loss function; L_d is the third loss function; ν and ρ are trade-off parameters; θ and φ are the training models of the CNN and the VAE; ω is the training model in the decoder.
As a still further scheme of the invention: the operating-condition attribute loss function L_S is expressed as:

L_S = (1/M)·Σ_{i=1..M} E_{Q_φ(S|x^(i))}[log P_θ(x^(i)|S,F)] − β·D_KL(Q_φ(S|x^(i)) ‖ P(S))

wherein β is a weight factor; M is the number of original fault samples X; D_KL denotes the KL divergence between the prior P(S) and the posterior condition attribute Q_φ(S|x^(i)) encoded from the original fault sample; P_θ(x^(i)|S,F) represents the probability of the sample data, Q_φ(S|x^(i)) the probability of the posterior distribution of the operating-condition attribute, and both P_θ(x^(i)|S,F) and Q_φ(S|x^(i)) are implemented with a deep Gaussian family.
As a still further scheme of the invention: the fault semantic loss function L_F is expressed as:

L_F = −log( exp(−dist(X̂, X)) / ( exp(−dist(X̂, X)) + exp(−dist(X̂, X′)) ) )

wherein dist denotes the Euclidean distance and exp denotes the exponential function with base e; X′ represents a counterfactual sample generated by combining the operating-condition attribute S with the fault semantics F′ of another fault type.
As a still further scheme of the invention: the third loss function L_d is expressed as:

L_d = E[ ‖Dec(X̂) − F‖₁ ]

wherein ‖·‖₁ denotes the l₁ norm, i.e. the sum of the absolute values of the elements of a vector, also called the "sparse rule operator"; here the reconstruction penalty is used as the constraint; E denotes the mathematical expectation, and Dec(·) denotes the semantic embedding decoder.
As a still further scheme of the invention: a value v = (S, F) is taken from the data space X; let ε denote the index set of the fault semantics F and ε̄ that of the operating-condition attribute S, so that Z_ε and Z_ε̄ denote the spaces of the fault semantics F and the operating-condition attributes S respectively; let g: Z → X denote the injective mapping from the feature space Z to the data space X, where g corresponds to sampling from P_θ(X|S,F) and is a continuous injective function with continuous inverse g⁻¹.

A counterfactual sample X̂ is confident if and only if it is essentially group-decoupled with respect to the subset ε; the counterfactual sample X̂ can then be decomposed as:

X̂ = (g ∘ T′ ∘ g⁻¹)(x)

wherein ∘ denotes function composition, T′ denotes the counterfactual generating transformation, and g⁻¹ is the continuous inverse of g.

Keeping the operating-condition attribute S unchanged and replacing the fault semantic value F by f, for any x ∈ X the counterfactual X_f[S(x)] can be decomposed in the same way:

X_f[S(x)] = (g ∘ T′ ∘ g⁻¹)(x)
as still further aspects of the invention: the fault semantics F are the vibration characteristics, namely the vibration frequency and the vibration amplitude,
F=(a 1 ,a 2 ,…,a k ,…,a R )
wherein a is i Is a semantic element in F, and i is more than or equal to 1 and less than or equal to R;
the plurality of fault semantics are F to form a fault semantic set W:
W={F 1 ,F 2 ,…,F i ,…,F N }
where N is the number of fault samples;
the R samples of the original fault samples are selected to constitute a fault vibration signal set g:
g=(v 1 ,v 2 ,…,v k ,…,v R )
wherein R is selected to be substantially greater than one period of the vibration signal, wherein v k Represents the kth data point;
defining the threshold value of the fault vibration signal set g as lambda, if the dimension value v of the fault vibration signal set g k Greater than lambda, then a k The value of (2) is set to 1, otherwise alpha k The value set to 0 is calculated as follows:
wherein λ is calculated as follows:
where α is an empirically determined hyper-parameter, α is chosen to preserve the characteristics of the fault vibration signal.
As a still further scheme of the invention: the operating-condition attributes S comprise the rotating speed, the load, the sampling frequency, the sampling environment and the sampled machine model of the original fault sample;

the condition attributes are extracted with the CNN and the VAE: the original fault sample is subjected to a wavelet transform to obtain a two-dimensional time-frequency matrix, which is input into the CNN for feature extraction; the VAE then learns the distribution characteristics of the input data through encoding and reconstruction by its encoder, and the encoded latent vector Y is used as the posterior distribution of the original fault sample:

Y = μ + σ ⊙ eps

where U is the probability density function of Y, σ² denotes the variance of the distribution, μ denotes the expectation of the distribution, and eps denotes a matrix with the same dimensions as the original-fault-sample expectation μ whose entries follow the standard normal distribution.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention retains the operating-condition attributes S extracted from the original data in place of the "factual" fault semantics of the original sample; combined with counterfactual fault semantics F, a higher-quality balanced counterfactual sample set can be generated, which solves the imbalance problem of the original data and effectively improves the accuracy of fault diagnosis.
2. The invention organically combines the CNN, VAE and GAN neural networks, automatically extracts features by exploiting the strong capabilities of neural networks, and analyses and processes the data efficiently, effectively alleviating the high professional threshold and low efficiency of the traditional fault diagnosis field.
3. The invention designs a method for generating pseudo samples counterfactually. Unlike the traditional approach of generating pseudo samples from random noise, it learns the information of the original signal through a semantic embedding decoder and a feedback module, which provide auxiliary information during generation, so that pseudo samples that are closer to the original data distribution and more diverse can be generated.
Drawings
FIG. 1 is a schematic diagram of the new-character generation process of the present invention.
Fig. 2 is a diagram of the group-theoretic orbits of counterfactual generation of the present invention.
Fig. 3 is a schematic diagram of the basic structure of the CNN of the present invention.
Fig. 4 is a schematic structural view of the VAE of the present invention.
Fig. 5 is a schematic diagram of a GAN structural model according to the present invention.
Fig. 6 is a schematic diagram of a process for generating a counterfactual sample according to the present invention.
Fig. 7 is a flowchart of the generation of a counterfactual sample according to the present invention.
Fig. 8 is a schematic diagram of a model framework of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
1. Data generation - generating fault data based on the idea of a fault dictionary:
a. fault dictionary
First, a dictionary is a reference book that provides pronunciation and meaning for characters. Combining "fault" with "dictionary" means that machine fault diagnosis can also form a look-up system similar to a dictionary: the idea of disassembling and recombining characters is used to expand the entries in the fault dictionary, and these entries serve as the basis for judging machine faults. Following the idea of breaking up and recombining Chinese characters, a character can in general be broken into two parts. As shown in Fig. 1, "故" can be disassembled into "古" and "攵", and "障" can be disassembled into "阝" and "章". These disassembled parts can be combined with other corresponding parts to form new Chinese characters: for example, "古" and "亻" form the character "估", and "章" and "木" form "樟". However, there can also be failures; for example, the combination of "古" and "广" is not a Chinese character, i.e. the generated character is meaningless, which corresponds to newly generated fault data being out of distribution. Therefore the counterfactual confidence theory (part 2) is used to require the generated data to lie within the real data distribution, so that useful fault data are generated.
b. Disassembling the characters

It was mentioned above that a character can typically be broken into two parts, and the same holds for fault data. In general, fault data cannot be decomposed by manual analysis alone; they must be decomposed with the CNN and the VAE. The fault data can be decomposed into fault semantics and operating-condition attributes, as follows:
fault semantics: we can use the vibration characteristics (vibration frequency and vibration amplitude) of the original fault sample as the fault semantics F:
F=(a 1 ,a 2 ,…,a k ,…,a R )
wherein a is i Is a semantic element in F, and i is more than or equal to 1 and less than or equal to R;
the plurality of fault semantics are F to form a fault semantic set W:
W={F 1 ,F 2 ,…,F i ,…,f N }
where N is the number of failure samples.
The R samples of the original fault samples are selected to constitute a fault vibration signal set g:
g=(v 1 ,v 2 ,…,v k ,…,v R )
wherein R is selected to be substantially greater than one period of the vibration signal, wherein v k Representing the kth data point.
Defining the threshold value of the fault vibration signal set g as lambda, if the dimension value v of the fault vibration signal set g k Greater than lambda, then a k The value of (2) is set to 1, otherwise alpha k The value set to 0 is calculated as follows:
wherein λ is calculated as follows:
where α is an empirically determined hyper-parameter, α is chosen to preserve the characteristics of the fault vibration signal.
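The binarisation just described can be sketched as follows. The concrete formula for the threshold λ (here, a fraction α of the peak amplitude) is an assumption, since the text only states that α is an empirically determined hyper-parameter.

```python
# Sketch of the fault-semantic binarisation: each point v_k of the
# vibration window g is compared with a threshold lambda, yielding
# semantic element a_k in {0, 1}.  The threshold formula (alpha times
# the peak absolute amplitude) is an assumed form.
import numpy as np

def fault_semantics(g, alpha=0.5):
    g = np.asarray(g, dtype=float)
    lam = alpha * np.max(np.abs(g))   # assumed form of the threshold lambda
    return (g > lam).astype(int)      # a_k = 1 if v_k > lambda else 0

sig = np.array([0.1, 0.9, -0.3, 0.7, 0.2])
F = fault_semantics(sig, alpha=0.5)   # lambda = 0.45 here
print(F)  # -> [0 1 0 1 0]
```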
Operating-condition attributes: in general, the original fault samples include characteristic attributes of the faults, such as rotating speed, load, sampling frequency, sampling environment and sampled machine model, which are referred to herein as operating-condition attributes. The condition attributes are extracted with the CNN and the VAE: the original fault sample is subjected to a wavelet transform to obtain a two-dimensional time-frequency matrix, which is input into the CNN for feature extraction; the VAE then learns the distribution characteristics of the input data through encoding and reconstruction by its encoder, and the encoded latent vector Y is used as the posterior distribution of the original fault sample:

Y = μ + σ ⊙ eps

where U is the probability density function of Y, σ² denotes the variance of the distribution, μ the expectation of the distribution, and eps a matrix with the same dimensions as the expectation μ of the original fault sample whose entries follow the standard normal distribution. The condition attributes are designed so that the obtained U satisfies an isotropic Gaussian distribution, and the group-decoupled condition attributes can be completely decoupled from the fault semantics.
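The sampling of the latent vector Y can be sketched as the usual reparameterisation step; the log-variance parameterisation of σ is an assumption following common VAE practice.

```python
# Sketch of the reparameterisation Y = mu + sigma * eps with
# eps ~ N(0, I): Y follows N(mu, sigma^2) while gradients can flow
# through mu and sigma.  The log-variance input is an assumed convention.
import numpy as np

def reparameterize(mu, log_var, rng):
    sigma = np.exp(0.5 * log_var)            # sigma from log-variance
    eps = rng.standard_normal(mu.shape)      # standard normal, same shape as mu
    return mu + sigma * eps

rng = np.random.default_rng(42)
mu = np.zeros(4)
log_var = np.zeros(4)                        # sigma = 1
Y = reparameterize(mu, log_var, rng)
print(Y.shape)  # -> (4,)
```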
2. Counterfactual confidence - guaranteeing the trustworthiness of the generated data through counterfactual confidence:

For counterfactual confidence: given x ∈ X (x an original fault sample, X the original fault sample space), if the generated counterfactual sample X_F[S(x)] still lies within the original fault sample space X, the counterfactual sample generation method is counterfactually confident.

For a fault sample x of class X, the counterfactual sample can be expressed as X_F[S(x)]. In counterfactual reasoning, the "facts" given for one sample are S = S(x) and F = F(x) (the fault semantics and condition attributes when X = x for a fault sample), and the "counterfactual" assumption is F ≠ F(x). In reality, if a machine fails under a certain working condition, it is completely impossible to observe the vibration characteristics it would have shown had it failed under different conditions at that moment; but through counterfactual confidence, by asking what data the machine would have produced had the failure occurred under other conditions, we can generate pseudo samples that approximate the distribution of the original fault data X. The operating-condition attributes of the original fault sample are retained, and counterfactual samples of other fault types are generated by replacing the fault semantics of the original fault sample with those of other fault types. The data generated in this way are based on the operating-condition attributes of the original fault samples rather than on random noise, and are part of the samples within the original fault distribution. Because our framework is centred on counterfactual confidence, some theorems are required to ensure that the generated fault data are trustworthy.
Theorem 1: the counterfactually generated sample X_F[S(x)] is counterfactually confident if and only if the operating-condition attribute S and the fault semantics F are group-decoupled.
When the operating-condition attribute S and the fault semantics F are not group-decoupled, the generated fault data are fuzzy, unclear and difficult to classify. Therefore, to realise counterfactual confidence, the condition attribute S and the fault semantics F must be group-decoupled, with the total loss function L_tot as the constraint target, i.e.

L_tot = L_S + ν·L_F + ρ·L_d

wherein L_S is the operating-condition attribute loss function; L_F is the fault semantic loss function; L_d is the third loss function; ν and ρ are trade-off parameters; θ and φ are the training models of the CNN and the VAE; ω is the training model in the decoder.
First, the operating-condition attribute loss function L_S is expressed as:

L_S = (1/M)·Σ_{i=1..M} E_{Q_φ(S|x^(i))}[log P_θ(x^(i)|S,F)] − β·D_KL(Q_φ(S|x^(i)) ‖ P(S))

wherein D_KL denotes the KL divergence between the prior P(S) and the posterior condition attribute Q_φ(S|x^(i)) encoded from the original fault sample (KL divergence, also called relative entropy, measures the difference between two data distributions: when the two distributions are identical their relative entropy is zero, and as the difference between them increases so does their relative entropy, so relative entropy can be used to compare similarity). P_θ(x^(i)|S,F) represents the probability of the sample data; Q_φ(S|x^(i)) is the probability of the posterior distribution of the condition attribute; both P_θ(x^(i)|S,F) and Q_φ(S|x^(i)) are implemented with a deep Gaussian family. β is a weight factor. The expectation of log P_θ(x^(i)|S,F) is taken with respect to Q_φ(S|x^(i)), and M represents the number of training samples. By imposing a strict constraint that S overall follows the endogenous prior P(S), the operating-condition attribute S is not affected by the fault semantics F, i.e. S is disentangled from F.
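The KL term of L_S has a closed form for diagonal Gaussians, which can be sketched as follows; the β weighting mirrors the weight factor in the text, and the value β = 4 is an arbitrary illustration.

```python
# Sketch of the beta-weighted KL term in L_S: closed-form KL divergence
# between the encoded posterior N(mu, sigma^2) of the condition attribute
# S and the standard-normal prior P(S).
import numpy as np

def kl_to_standard_normal(mu, log_var):
    # D_KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def ls_penalty(mu, log_var, beta=4.0):
    return beta * kl_to_standard_normal(mu, log_var)

mu = np.zeros(3)
log_var = np.zeros(3)            # posterior equals the prior
print(ls_penalty(mu, log_var))   # -> 0.0
```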
The fault semantic loss function L_F is expressed as:

L_F = −log( exp(−dist(X̂, X)) / ( exp(−dist(X̂, X)) + exp(−dist(X̂, X′)) ) )

where dist denotes the Euclidean distance and exp denotes the exponential function with base e.

During data generation, the over-parameterised model P_θ(X|S,F) may ignore the fault semantics and generate counterfactual samples using only the condition attributes; GAN models have shown that large numbers of photo-realistic images can be generated from condition attributes alone. This loss prevents the generator in the GAN network from using only the operating-condition attributes S while ignoring the fault semantics F when generating counterfactual samples. We believe such generation occurs because the information in the fault semantics F may be entirely contained in the operating-condition attributes S; condition attributes that are not completely decoupled do not satisfy counterfactual confidence, so the fault semantics must be decoupled from the operating-condition attributes.

This is realised concretely as follows: the operating-condition attributes S and the fault semantics F are obtained by sampling from the original fault sample X, and a counterfactual sample X̂ is generated by combining them. We require the counterfactual sample X̂ to be close to the original fault sample but far from the counterfactual sample X′ generated by combining the condition attribute S with the fault semantics F′ of other fault types. By computing the contrastive loss between the two counterfactual samples, the situation in which the generator uses only the condition attributes while ignoring the fault semantics can be effectively avoided; the generator is forced to use the fault semantics as an aid when generating counterfactual samples, the sample difference before and after the intervention is maximised, and the fault semantics F can thereby be disentangled from the operating-condition attributes S.
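A minimal sketch of this contrastive idea follows; the softmax-over-distances form of L_F is an assumption consistent with the dist/exp description above, and the vectors are toy data.

```python
# Hedged sketch of the contrastive fault-semantic loss L_F: the generated
# counterfactual x_hat should lie close (in Euclidean distance) to the
# original sample x, and far from the negative counterfactual x_prime
# built from other fault semantics F'.
import numpy as np

def dist(a, b):
    return np.linalg.norm(a - b)

def contrastive_lf(x_hat, x, x_prime):
    pos = np.exp(-dist(x_hat, x))
    neg = np.exp(-dist(x_hat, x_prime))
    return -np.log(pos / (pos + neg))

x       = np.array([1.0, 0.0])
x_hat   = np.array([0.9, 0.1])   # near the original: small loss
x_prime = np.array([5.0, 5.0])   # far-away negative
loss_good = contrastive_lf(x_hat, x, x_prime)
loss_bad  = contrastive_lf(x_prime, x, x_prime)  # generator ignored F
print(loss_good < loss_bad)  # -> True
```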
The third loss function L_d:

L_d = E[ ‖Dec(X̂) − F‖₁ ]

wherein ‖·‖₁ denotes the l₁ norm, i.e. the sum of the absolute values of the elements of a vector, also called the "sparse rule operator"; here the reconstruction penalty is used as the constraint; E denotes the mathematical expectation, and Dec(·) denotes the semantic embedding decoder.

This loss function is obtained by introducing a semantic embedding decoder Dec, which reconstructs the fault semantics F from the input sample features, thereby ensuring that the generator produces a distribution closer to the original data during feature synthesis. Because the original data are fed directly into the semantic embedding decoder during the training stage, information about the original fault sample features and fault semantics is learned in the process of continually reconstructing the fault semantics. At this point a feedback module and the semantic embedding decoder Dec are introduced to jointly address the feature-synthesis problem of the generator: the latent embedding of the semantic embedding decoder serves as the input of the feedback module, the feedback module transforms this latent embedding of Dec, and the output of the feedback module is added to the input of the generator, further helping the generator achieve improved feature synthesis. Putting these formulas into the CNN and VAE for training completes the group decoupling of the operating-condition attribute S and the fault semantics F.
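A toy sketch of the l₁ reconstruction penalty behind L_d follows; the linear `dec` map is a hypothetical stand-in for the semantic embedding decoder, not the patent's network.

```python
# Sketch of the third loss L_d: an l1 (sum-of-absolute-values)
# reconstruction penalty between the fault semantics F and the semantics
# Dec(x_hat) recovered from the synthesised features.
import numpy as np

def l1_loss(a, b):
    return np.sum(np.abs(a - b))   # the l1 "sparse rule operator"

def dec(x_hat, W):
    """Toy semantic embedding decoder: a fixed linear map."""
    return W @ x_hat

W = np.eye(3)                      # identity decoder for illustration
F = np.array([1.0, 0.0, 1.0])
x_hat = np.array([1.0, 0.0, 1.0])  # perfectly synthesised features
print(l1_loss(dec(x_hat, W), F))   # -> 0.0
```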
Let a value v = (S, F) be taken from the data space X; let ε denote the index set of the fault semantics F and ε̄ that of the condition attribute S, so that Z_ε and Z_ε̄ denote the spaces of the fault semantics F and the operating-condition attributes S respectively. Let g: Z → X denote the injective mapping from the feature space Z to the data space X, where g corresponds to sampling from P_θ(X|S,F) and is a continuous injective function with continuous inverse g⁻¹. P_θ(X|S,F) is implemented in our model with a deterministic mapping, which is injective.

If the counterfactual generating function is a transformation T′ affecting only the variables indexed by ε, then the endomorphism g ∘ T′ ∘ g⁻¹ is essentially separated from the subset ε of endogenous variables; this follows, for any x ∈ X, from the equivariance theory of decoupled representations.

From group theory, the two major classes of fault semantics and operating-condition attributes are decoupled from the collected samples. As shown in Fig. 2, x denotes a sample, s_i the operating-condition attribute, and f_n the fault semantics of each fault. Each orbit represents one fault type, and the elements on each orbit represent different samples. The proposed method realises cross-orbit action by replacing the fault semantics.
Theorem 2 (group decoupling and confidence): the counterfactual sample X_F[S(x)] is confident if and only if X_F[S(x)] is essentially group-decoupled with respect to the subset ε. Decoupling means separating differentiated, semantically meaningful factors from the data.

If the transformation T′ is essentially group-decoupled, then so is g ∘ T′ ∘ g⁻¹, and therefore the counterfactual map must be faithful. Conversely, assume a faithful counterfactual sample X_F[S(x)]. The counterfactual sample X_F[S(x)] can be decomposed as:

X_F[S(x)] = (g ∘ T′ ∘ g⁻¹)(x)

wherein ∘ denotes function composition and T′ denotes the counterfactual generating transformation; S = S(x) keeps the operating-condition attribute unchanged while the fault semantic value F is converted into f. Now, for any x ∈ X, X_f[S(x)] can be decomposed similarly:

X_f[S(x)] = (g ∘ T′ ∘ g⁻¹)(x)

Since T′ is a transformation that affects only the variables in ε (i.e. F), it can be shown that the confident counterfactual transformation X_f[S(x)] essentially acts only on ε. With this sufficient condition, the method only needs to realise the group decoupling of S and F: keeping the operating-condition attribute S unchanged and changing the fault semantics F then generates counterfactual samples close to the original data distribution, i.e. counterfactual samples satisfying counterfactual confidence.
3. CNN convolutional neural network, VAE variational auto-encoder, GAN generative adversarial network
1、CNN
CNN is a typical deep feed-forward artificial neural network inspired by biological perception mechanisms. In biology, the main body of a neuron is the cell body, a kind of nerve cell; the neurons in the human brain are interconnected in an intricate manner, forming the neural network. A CNN is typically composed of convolutional layers, pooling layers, fully connected layers and the like. Its essence is to construct multiple perceptrons capable of extracting features from the input data: the input is convolved and pooled layer by layer, the topological features hidden in the data are extracted step by step, and as the network structure deepens, the extracted features become progressively more abstract, finally yielding a representation of the input that is invariant to translation, rotation and scaling. Sub-sampling makes full use of properties such as locality contained in the data, reduces the data dimensionality, optimises the network structure, and guarantees a certain degree of shift invariance, making CNNs very suitable for processing and learning from massive data.
As shown in Fig. 3, the CNN has powerful signal processing and analysis capabilities, but when extracting features a two-dimensional time-frequency diagram is input rather than the one-dimensional signal, because with one-dimensional data the converted images are relatively similar and hard to distinguish, the training time required by the CNN is long, and the accuracy of classification and identification is limited. The time-frequency distribution provides joint time-domain and frequency-domain information, better highlights the signal characteristics, and benefits the training and recognition of the CNN.
2、VAE
A variational auto-encoder is a generative model based on variational Bayesian inference that can use low-dimensional feature vectors to learn interpretable low-dimensional representations contained in the raw data. As shown in Fig. 4, the variational auto-encoder as a whole splits into two neural networks: an Encoder and a Decoder. The Encoder maximises the lower bound of the marginal likelihood of the observed data by continually iterating and updating the variational parameters, approximates the posterior probability of the unobservable variable, and outputs the probability distribution of the latent variable. The Decoder restores an approximate probability distribution of the original data from the latent-variable distribution output by the Encoder. The sampling process in the hidden layer of the variational auto-encoder acts similarly to Dropout and regularisation, so the whole training process of the model is less prone to overfitting; moreover, compared with traditional feature-extraction models, the variational auto-encoder constrains the low-dimensional feature vector to a standard normal distribution, making it better suited to problems with a small number of samples.
3. GAN
GAN is short for generative adversarial network. As shown in fig. 5, a GAN is a generative model that learns by implicitly measuring a similarity between the model distribution and the actual data distribution; its purpose is to estimate the distribution or density of the actual data, learn the patterns of the actual data, and generate new data from the learned knowledge. The network structure of a GAN consists of a generation network and a discrimination network. The generator G accepts a random variable and generates counterfactual sample data, with the aim of making the generated data match the actual data distribution. The discriminator judges the authenticity of the real data versus the generated counterfactual sample data, and its output is generally a probability value. At the same time, the output of the discriminator also drives the generator, which amounts to training the generator so that it produces better counterfactual samples. When the discriminator can no longer tell whether its input comes from the real data or the counterfactual sample data, the model has reached its optimal state, and the discriminator's output probability is 1/2.
the GAN concept comes from the two-player zero and game in game theory, and the generator G and the arbiter D can be seen as two players in the game. In the model training process, the generator and the discriminator can update own parameters respectively to minimize loss, and a Nash equilibrium state is finally achieved through continuous iterative optimization, so that the model is optimal. The objective function is defined as:
where $\min_G \max_D$ means that D is maximized first and G is then minimized; $x \sim p_{data}(x)$ means that x follows the statistical probability density function $p_{data}$ of the real data, i.e., x belongs to the real data; $z \sim p_z(z)$ means that z follows the coding statistical probability density function $p_z(z)$, i.e., z is a random number sampled from the coding statistical distribution; $G(\cdot)$ denotes a function of the generator and $D(\cdot)$ a function of the discriminator.
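As a numeric sanity check on the objective (an illustrative sketch; `gan_value` is our own helper name), at the Nash equilibrium the discriminator outputs 1/2 on both real and generated data, giving V(D, G) = -2 log 2:

```python
import numpy as np

def gan_value(d_real, d_fake):
    """GAN value function V(D, G): the expectation of log D(x) on real
    data plus the expectation of log(1 - D(G(z))) on generated data."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# Equilibrium: D(.) = 1/2 everywhere, so V = log(1/2) + log(1/2).
v = gan_value(np.full(4, 0.5), np.full(4, 0.5))
print(np.isclose(v, -2.0 * np.log(2.0)))  # True
```

A discriminator that separates the two sets well (e.g. scores 0.9 on real and 0.1 on fake) raises V above the equilibrium value, which is exactly what D maximizes.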
In generating new fault data, in order to obtain counterfactual samples closer to the original data distribution, a discriminator D(x, f) is trained whose role is to output a real value representing how strongly the sample features are judged genuine given the fault semantics F. During the game between the generator and the discriminator, the generator's ability to produce counterfactual samples and the discriminator's ability to separate real samples from counterfactual samples both improve gradually, until the discriminator finally takes the counterfactual samples for real data, thereby yielding high-quality counterfactual samples that approximate the original data distribution. We use the LSGAN loss as the optimization function:

$$L_{LSGAN} = \frac{1}{2}\,\mathbb{E}_{x \sim p_{data}(x)}\big[(D(x,f)-1)^2\big] + \frac{1}{2}\,\mathbb{E}_{s \sim p_s(s|x)}\big[D\big(G(s,\Phi(f)),f\big)^2\big]$$
where $p_{data}(x)$ denotes the statistical probability density function of the raw data, $p_s(s|x)$ denotes the distribution of the working condition attribute extracted from the raw data, $\Phi(\cdot)$ denotes a linear mapping function, $G(\cdot)$ denotes a function of the generator, $D(\cdot)$ denotes a function of the discriminator, s denotes the working condition attribute, and f denotes the fault semantics.
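The least-squares form of the LSGAN loss can be sketched numerically as follows (a generic LSGAN sketch under our own names, omitting the conditioning on s and f for brevity):

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    """LSGAN discriminator loss: push scores on real samples toward 1
    and scores on generated (counterfactual) samples toward 0."""
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    """LSGAN generator loss: push the discriminator's score on
    generated samples toward 1."""
    return 0.5 * np.mean((d_fake - 1.0) ** 2)

# A perfectly fooled discriminator scores fake samples at 1,
# so the generator loss vanishes.
print(lsgan_g_loss(np.ones(8)))               # 0.0
print(lsgan_d_loss(np.ones(8), np.zeros(8)))  # 0.0
```

Unlike the log-loss of the original GAN objective, these squared terms keep a useful gradient even for samples the discriminator classifies confidently, which is the usual motivation for LSGAN.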
4. New fault data generation process
Overall description: the process of generating the counterfactual samples is shown in figs. 6 and 8; the entire generation network is composed of the CNN, the encoder, the generator, the discriminator, the semantic embedding decoder (Dec), and the feedback module. The time-domain signal is converted into a two-dimensional time-frequency diagram by wavelet transformation; the diagram is input into the CNN network to extract features, and the extracted features are input into the encoder to obtain the working condition attribute S. Meanwhile, the original signal is processed by the fault semantic extraction module to obtain the fault semantics F. The working condition attribute S and the fault semantics F of the fault type are then input into the generator together to obtain a counterfactual sample, and the counterfactual sample and the fault semantics are input into the discriminator; the performance of the generator gradually improves during its game with the discriminator. Note that the feedback module is not trained in the first cycle; in the second cycle (the dotted part in the figure), the feedback module takes the latent embedding of the semantic embedding decoder as input, and the working condition attribute S and the fault semantics F of the first cycle are input into the generator again to help generate counterfactual samples conforming to the original data distribution.
The process of generating counterfactual fault data is shown in fig. 7: a working condition attribute S is sampled from a normal distribution; this prior working condition attribute S and the fault semantics F are input into the trained generator to produce a counterfactual sample; the counterfactual sample is input into the semantic embedding decoder trained on the original data, whose hidden layer is taken as the input of the feedback module; finally, the working condition attribute S, the fault semantics F, and the output of the feedback module are input into the generator together to generate the counterfactual sample. This generation process differs from the training process in that the second cycle adds the information of the semantic embedding decoder and the feedback module trained on the various fault types of the original data; the working condition attribute S at this point can be regarded as extracted from the original data, and combining it with the "counterfactually confident" fault semantics F of other fault types yields counterfactual samples conforming to the original data distribution.
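The sampling-and-generation flow above can be outlined in code (a toy sketch: `fault_semantics` follows the thresholding idea of claim 7, but the choice lambda = alpha * max|signal| and the stand-in `generate_counterfactual` are our own assumptions; the real generator is a trained neural network):

```python
import numpy as np

rng = np.random.default_rng(0)

def fault_semantics(signal, alpha=0.5):
    """Binarize a vibration signal into fault semantics F: points above
    the threshold lambda become 1, the rest 0. The threshold rule
    lambda = alpha * max|signal| is an assumption for illustration."""
    lam = alpha * np.max(np.abs(signal))
    return (signal > lam).astype(int)

def generate_counterfactual(S, F, weight=0.1):
    """Stand-in generator combining the working condition attribute S
    with borrowed fault semantics F of another fault type."""
    return S + weight * F

S = rng.standard_normal(16)                   # condition attribute ~ N(0, 1)
F = fault_semantics(rng.standard_normal(16))  # semantics of another fault type
x_cf = generate_counterfactual(S, F)          # counterfactual sample
print(x_cf.shape)  # (16,)
```

The point of the sketch is the separation of roles: S carries the mechanical operating conditions, F carries the fault signature, and the generator fuses the two into a sample of a fault type never observed under those conditions.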
Conclusion: in the proposed counterfactual sample generation method, the factual fault semantics of the original sample are replaced while the working condition attribute S extracted from the original data is retained; combined with the counterfactual fault semantics F, a higher-quality balanced counterfactual data set can be generated, which solves the imbalance of the original data and effectively improves the accuracy of fault diagnosis. Just as we reconstruct the appearance of dinosaurs from their fossils rather than inventing it out of thin air, the proposed method uses the information about mechanical conditions contained in the raw data, such as load and rotational speed, whereas in the traditional method the generator uses random noise as the sample attribute. The generated confident counterfactual samples greatly reduce the time and economic cost of physical experiments, solve the imbalance and "out-of-distribution" problems of the original data, and effectively improve the accuracy of fault diagnosis.
The foregoing is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto; any equivalent substitution or modification made by a person skilled in the art, within the scope disclosed herein, according to the technical scheme and inventive concept of the present invention shall be covered by the scope of the present invention.
Claims (8)
1. The method for generating the anti-fact confidence data based on the fault dictionary is characterized by comprising the following steps of:
s1, collecting an original fault sample X of a fault machine;
s2, processing an original fault sample by using CNN and VAE to generate a fault semantic F and a working condition attribute S;
s3, simultaneously inputting the fault semantics F and the working condition attribute S into a generator and generating a counterfactual sample
S4, inputting the counterfactual sample $\hat{X}$ into the discriminator and repeatedly training the model until the discriminator cannot distinguish the original fault sample X from the counterfactual sample $\hat{X}$;
S5, adding the counterfactually confident sample to the original fault samples to enlarge the capacity of the original fault sample set.
2. The method for generating the anti-fact confidence data based on the fault dictionary according to claim 1, wherein the generated counterfactual sample $\hat{X}$ is counterfactually confident if and only if the working condition attribute S and the fault semantics F are group-decoupled; when the discriminator performs group decoupling, the overall loss function $L_{tot}$ is used to constrain the target, the overall loss function $L_{tot}$ being as follows:

$$L_{tot} = L_S + L_F + L_d$$
3. The method for generating the anti-facts confidence data based on the fault dictionary according to claim 2, wherein the working condition attribute loss function $L_S$ is expressed as:

$$L_S = \sum_{i=1}^{M}\Big(\mathbb{E}_{Q_\varphi(S|x^{(i)})}\big[\log P_\theta(x^{(i)}\mid S,F)\big] - \beta\, D_{KL}\big(Q_\varphi(S|x^{(i)})\,\big\|\,P(Z)\big)\Big)$$
where β is a weight factor, M denotes the number of original fault samples X, and $D_{KL}$ denotes the KL divergence between the prior random noise P(Z) and the posterior working condition attribute $Q_\varphi(S|x^{(i)})$ encoded from the original fault sample; $P_\theta(x^{(i)}\mid S,F)$ denotes the probability of the sample data and $Q_\varphi(S|x^{(i)})$ the posterior distribution of the working condition attribute, both implemented with the deep Gaussian family.
4. The method for generating the anti-facts confidence data based on the fault dictionary according to claim 3, wherein the fault semantic loss function $L_F$ is expressed as:
where dist denotes the Euclidean distance and exp denotes the exponential function with base e; x' denotes a counterfactual sample generated by combining the working condition attribute S with the fault semantics F' of another fault type.
5. The method for generating the anti-facts confidence data based on the fault dictionary according to claim 4, wherein the third loss function $L_d$ is expressed as:
where $\|\cdot\|_1$ denotes the $l_1$ norm, i.e., the sum of the absolute values of the elements of a vector, also called the "sparse rule operator"; here the reconstruction loss is used as a constraint; $\mathbb{E}$ denotes the mathematical expectation, and $Dec(\cdot)$ denotes a function of the semantic embedding decoder.
6. The method for generating the anti-facts confidence data based on the fault dictionary according to any one of claims 1 to 5, wherein a value v = (S, F) is taken from the data space $\mathcal{V}$, with $\mathcal{E}$ denoting the space of the fault semantics F and $\mathcal{S}$ denoting the space of the working condition attribute S, so that $\mathcal{E}$ and $\mathcal{S}$ respectively represent the spaces of the fault semantics F and the working condition attributes S; g: $\mathcal{V} \to \mathcal{X}$ denotes the injective mapping from the data space $\mathcal{V}$ to the feature space $\mathcal{X}$, where g corresponds to sampling from $P_\theta(X\mid S,F)$ and has a continuous inverse $g^{-1}$ that is a continuous injective function;
the counterfactual sample $\hat{X}$ is confident if and only if the counterfactual sample $\hat{X}$ is group-decoupled in essence with respect to the subset $\mathcal{E}$; the counterfactual sample $\hat{X}$ can be decomposed as:
where $\circ$ denotes the function composition, T' denotes the inverse generating function, and $g^{-1}$ is the continuous inverse of g;
the working condition attribute S is kept unchanged and the fault semantics value is changed from F to F'; then any $\hat{X}$ can be decomposed as:
7. The method for generating the anti-facts confidence data based on the fault dictionary according to claim 6, wherein the fault semantics F are the vibration characteristics, namely the vibration frequency and vibration amplitude, of the original fault samples:
$F=(a_1, a_2, \ldots, a_k, \ldots, a_R)$

where $a_i$ is a semantic element of F, with 1 ≤ i ≤ R;
the plurality of fault semantics are F to form a fault semantic set W:
W={F 1 ,F 2 ,…,F i ,…,F N }
where N is the number of fault samples;
the R samples of the original fault samples are selected to constitute a fault vibration signal set g:
g=(v 1 ,v 2 ,…,v k ,…,v R )
wherein R is selected to be substantially greater than one period of the vibration signal, wherein v k Represents the kth data point;
a threshold λ is defined for the fault vibration signal set g; if the k-th value $v_k$ of the fault vibration signal set g is greater than λ, the value of $a_k$ is set to 1, otherwise the value of $a_k$ is set to 0, calculated as follows:

$$a_k = \begin{cases} 1, & v_k > \lambda \\ 0, & v_k \le \lambda \end{cases}$$
wherein λ is calculated as follows:
where α is an empirically determined hyper-parameter, chosen so as to preserve the characteristics of the fault vibration signal.
8. The method for generating the anti-facts confidence data based on the fault dictionary according to claim 7, wherein the working condition attribute S includes the rotation speed, load, sampling frequency, sampling environment, and sampling model in the original fault sample;
the condition attribute needs to be extracted through CNN and VAE: the original fault sample is subjected to wavelet transformation to obtain a two-dimensional time-frequency matrix, and the two-dimensional time-frequency matrix is input into a CNN network to perform feature extraction; using VAE, the distribution characteristics of the input data are then learned by its encoder encoding and reconstruction using the encoded latent vector Y as a posterior distribution of the original failure samples:
where U is a probability density function, $\sigma^2$ denotes the variance of the distribution, μ denotes the expectation of the distribution, and eps denotes a matrix, having the same dimensions as the expectation μ of the original fault sample, that satisfies a standard normal distribution.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310221540.1A CN116108755A (en) | 2023-03-09 | 2023-03-09 | Anti-fact confidence data generation method based on fault dictionary |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116108755A true CN116108755A (en) | 2023-05-12 |
Family
ID=86262351
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310221540.1A Pending CN116108755A (en) | 2023-03-09 | 2023-03-09 | Anti-fact confidence data generation method based on fault dictionary |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116108755A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116759042A (en) * | 2023-08-22 | 2023-09-15 | 之江实验室 | System and method for generating anti-facts medical data based on annular consistency |
CN117520905A (en) * | 2024-01-03 | 2024-02-06 | 合肥工业大学 | Anti-fact fault data generation method based on causal intervention |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||