CN116108755A - Anti-fact confidence data generation method based on fault dictionary - Google Patents

Anti-fact confidence data generation method based on fault dictionary Download PDF

Info

Publication number
CN116108755A
CN116108755A CN202310221540.1A CN202310221540A CN116108755A CN 116108755 A CN116108755 A CN 116108755A CN 202310221540 A CN202310221540 A CN 202310221540A CN 116108755 A CN116108755 A CN 116108755A
Authority
CN
China
Prior art keywords
fault
sample
counterfactual
original
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310221540.1A
Other languages
Chinese (zh)
Inventor
丁煦
汪俊龙
王正成
胡立靖
胡冬成
王辉
吴昊
翟华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changhui Automobile Steering System Huangshan Co ltd
Hefei University of Technology
Original Assignee
Changhui Automobile Steering System Huangshan Co ltd
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changhui Automobile Steering System Huangshan Co ltd, Hefei University of Technology filed Critical Changhui Automobile Steering System Huangshan Co ltd
Priority to CN202310221540.1A priority Critical patent/CN116108755A/en
Publication of CN116108755A publication Critical patent/CN116108755A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00Details relating to CAD techniques
    • G06F2111/04Constraint-based CAD
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00Details relating to CAD techniques
    • G06F2111/08Probabilistic or stochastic CAD

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Geometry (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)

Abstract

The invention relates to the technical field of new generation information, in particular to a fault dictionary-based anti-fact confidence data generation method, which comprises the following steps: collecting an original fault sample X of a fault machine; performing decoding processing on the original fault sample by using CNN and VAE to generate fault semantics F and working condition attributes S; simultaneously inputting fault semantics F and working condition attributes S into a generator and generating a counterfactual sample
Figure DDA0004116888260000011
Will be a counterfactual sample
Figure DDA0004116888260000012
Inputting into the discriminator, training the model until the discriminator cannot distinguish the original fault sample from the counterfactual sample
Figure DDA0004116888260000013
Is a distinction between (a); then inputting the counterfactual confidence sample into the original fault sample to enlarge the capacity of the original fault sample; according to the invention, through the generated confidence counterfactual sample, the time and economic cost brought by an actual experiment are greatly reduced, the problems of unbalance and 'out-of-distribution' of the original data can be solved, and the accuracy of fault diagnosis is effectively improved.

Description

Anti-fact confidence data generation method based on fault dictionary
Technical Field
The invention relates to the technical field of new generation information, in particular to a fault dictionary-based anti-fact confidence data generation method.
Background
For mechanical devices, which are affected by their own material properties, manufacturing industry, external environment, etc., many mechanical failures occur during the working process, which in turn leads to the device terminating operation, resulting in the corresponding workflow also being forced to be interrupted. In order to avoid the influence on the normal operation of the mechanical equipment caused by the mechanical failure, it is necessary to diagnose the health of the machine in advance in the non-working time.
With the progress of technology, more and more deep learning methods are applied to the field of mechanical fault diagnosis. But among the common problems are: most of deep learning fault diagnosis methods are based on the condition that a training set is balanced, namely, various types of fault data are balanced, and acquired experimental data are required to statistically satisfy the assumption of independent identical distribution; in the case of unbalanced training sets, however, there is a large error in the diagnosis result. Meanwhile, due to the constraint of experiment time and economic cost, the collected data are often limited to a certain specific working condition, such as rotating speed, loading torque and the like, and the exhaustive combination of different working conditions and faults cannot be reflected through multiple experiments, so that the phenomenon of 'out-of-distribution' of the data in deep learning is caused, and the phenomenon is represented in that a deep learning model often selects experimental working condition characterization, but not fault mechanism characterization is used as a basis for judging fault types. In actual industrial production, a single fault type is relatively easy to judge; the composite fault type is concurrent and coupled with various fault types, so that the difficulty of identifying the fault type is increased, further, fewer fault samples for determining the fault type are caused, the non-uniformity of sample distribution and the phenomenon of data out of distribution are caused, and the stability and the accuracy of mechanical composite fault diagnosis are severely limited, so that the problem needs to be solved.
Disclosure of Invention
In order to avoid and overcome the technical problems in the prior art, the invention provides a method for generating anti-fact confidence data based on a fault dictionary. The invention can solve the problem of unbalance of the original data and effectively improve the accuracy of fault diagnosis through the generated confidence counterfactual sample.
In order to achieve the above purpose, the present invention provides the following technical solutions:
a method for generating anti-fact confidence data based on a fault dictionary comprises the following steps:
s1, collecting an original fault sample X of a fault machine;
s2, processing an original fault sample by using CNN and VAE to generate a fault semantic F and a working condition attribute S;
s3, simultaneously inputting the fault semantics F and the working condition attribute S into a generator and generating a counterfactual sample
Figure BDA0004116888230000021
S4, comparing the counterfactual sample
Figure BDA0004116888230000022
Inputting into a discriminator, and training the model until the discriminator cannot distinguish between the original fault sample and the counterfactual sample>
Figure BDA0004116888230000023
Is a distinction between (a);
s5, inputting the sample with the anti-fact confidence into the original fault sample to enlarge the capacity of the original fault sample.
As a further scheme of the invention: if and only if the condition attribute S is group decoupled from the fault semantics F, a generated counterfactual sample
Figure BDA0004116888230000024
Is a counterfactual confidence that the overall loss function L is used when the discriminator performs group decoupling tot Constraint on target, overall loss function L tot The following are provided:
Figure BDA0004116888230000025
wherein L is S Is a working condition attribute loss function; l (L) F Is a fault semantic loss function; l (L) d Is a third loss function; v and ρ are trade-off parameters; θ and
Figure BDA0004116888230000026
training models for CNN and VAE; omega is a training model in the decoder.
As still further aspects of the invention: working condition attribute loss function L S The expression is:
Figure BDA0004116888230000027
wherein beta is a weight factor, M represents the original fault patternThe number of X, D KL Representing a priori random noise P (Z) and posterior condition attributes from original fault sample encoding
Figure BDA0004116888230000028
KL divergence between; p (P) θ (x (i) S, F) represents the probability of sample data, +.>
Figure BDA0004116888230000031
Probability of posterior distribution representing operating condition attribute, and P θ (x (i) S, F) and +.>
Figure BDA0004116888230000032
Using a deep gaussian family implementation.
As still further aspects of the invention: fault semantic loss function L F The expression is:
Figure BDA0004116888230000033
wherein dist represents the Euclidean distance, exp represents an exponential function with e as the base; x 'represents a counterfactual sample generated by combining the operating condition attribute S and other fault type fault semantics F'.
As still further aspects of the invention: third loss function L d The expression is:
Figure BDA0004116888230000034
wherein I 1 Representation l 1 Norms, l 1 The norm refers to the sum of the absolute values of the individual elements in the vector, also called "sparse rule operator", where the reconstruction penalty is used for constraints;
Figure BDA0004116888230000035
representing mathematical expectations, dec (·) represents a function about the semantic embedded decoder.
As still further of the present inventionThe scheme is as follows: from the data space
Figure BDA0004116888230000036
The value v= (S, F) in (a) is represented by epsilon representing the fault semantics F,/-, for example>
Figure BDA0004116888230000037
Expressed as a condition attribute S, therefore +.>
Figure BDA0004116888230000038
And->
Figure BDA0004116888230000039
Space representing fault semantics F and operating condition attributes S, respectively, using
Figure BDA00041168882300000310
Representing slave data space->
Figure BDA00041168882300000317
To the feature space->
Figure BDA00041168882300000318
Is used for the endophytic mapping; wherein g corresponds to P θ Sampling of (X|S, F) and is a continuous inverse function g -1 Is a continuous unishot function of (2);
if and only if a counterfactual sample
Figure BDA00041168882300000311
In essence group decoupling with respect to subset epsilon, the counterfactual sample +.>
Figure BDA00041168882300000312
Is confidence; counterfactual sample->
Figure BDA00041168882300000313
Can be decomposed into:
Figure BDA00041168882300000314
wherein the method comprises the steps of
Figure BDA00041168882300000319
Representing the function composition, T' represents the inverse generating function, g -1 Is a continuous inverse function of g;
the working condition attribute S is kept unchanged, the fault semantic F value is converted into F, and then for any
Figure BDA00041168882300000315
Can be decomposed into:
Figure BDA00041168882300000316
as still further aspects of the invention: the fault semantics F are the vibration characteristics, namely the vibration frequency and the vibration amplitude,
F=(a 1 ,a 2 ,…,a k ,…,a R )
wherein a is i Is a semantic element in F, and i is more than or equal to 1 and less than or equal to R;
the plurality of fault semantics are F to form a fault semantic set W:
W={F 1 ,F 2 ,…,F i ,…,F N }
where N is the number of fault samples;
the R samples of the original fault samples are selected to constitute a fault vibration signal set g:
g=(v 1 ,v 2 ,…,v k ,…,v R )
wherein R is selected to be substantially greater than one period of the vibration signal, wherein v k Represents the kth data point;
defining the threshold value of the fault vibration signal set g as lambda, if the dimension value v of the fault vibration signal set g k Greater than lambda, then a k The value of (2) is set to 1, otherwise alpha k The value set to 0 is calculated as follows:
Figure BDA0004116888230000041
wherein λ is calculated as follows:
Figure BDA0004116888230000042
where α is an empirically determined hyper-parameter, α is chosen to preserve the characteristics of the fault vibration signal.
As still further aspects of the invention: the working condition attribute S comprises the rotating speed, the load, the sampling frequency, the sampling environment and the sampling model in an original fault sample;
the condition attribute needs to be extracted through CNN and VAE: the original fault sample is subjected to wavelet transformation to obtain a two-dimensional time-frequency matrix, and the two-dimensional time-frequency matrix is input into a CNN network to perform feature extraction; using VAE, the distribution characteristics of the input data are then learned by its encoder encoding and reconstruction using the encoded latent vector Y as a posterior distribution of the original failure samples:
Figure BDA0004116888230000043
Figure BDA0004116888230000051
where U is a probability density function, σ 2 Representing the variance of the distribution, μ representing the expectation of the distribution, eps representing a matrix of normal distribution meeting the criteria having the same dimensions as the original failure sample expectation μ.
Compared with the prior art, the invention has the beneficial effects that:
1. according to the invention, the working condition attribute S extracted from the original data is utilized to replace the original sample fault semantics of 'facts', and the counter fact sample balance data set with higher quality can be generated by combining the counter fact fault semantics F, so that the problem of unbalance of the original data is solved, and the accuracy rate of fault diagnosis is effectively improved.
2. According to the invention, CNN, VAE, GAN neural networks are organically combined together, the characteristics are automatically extracted by utilizing the strong performance of the neural networks, and the data are efficiently analyzed and processed, so that the problems of high professional threshold and low efficiency in the traditional fault diagnosis field are effectively improved.
3. The invention designs a method for generating a pseudo sample by counterfactual, which is different from the traditional method for generating the pseudo sample by using random noise, learns the information of an original signal by a semantic embedding decoder and a feedback module, provides auxiliary information when generating the pseudo sample, and can generate the pseudo sample which is more similar to the original data distribution and has diversity.
Drawings
FIG. 1 is a schematic diagram of the new text generation process of the present invention.
Fig. 2 is a graph of the inverse facts generating group theory track of the present invention.
Fig. 3 is a schematic diagram of the basic structure of the CNN of the present invention.
Fig. 4 is a schematic structural view of the VAE of the present invention.
Fig. 5 is a schematic diagram of a GAN structural model according to the present invention.
Fig. 6 is a schematic diagram of a process for generating a counterfactual sample according to the present invention.
Fig. 7 is a flowchart of the generation of a counterfactual sample according to the present invention.
Fig. 8 is a schematic diagram of a model framework of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
1. Data generation-fault data generated based on thought of fault dictionary is proposed:
a. fault dictionary
First, a dictionary is an instrument book that provides phonological and meaning interpretations for words. The combination of the fault and the dictionary is that the fault diagnosis of the machine can also form a look-up system similar to the dictionary. The concept of word separation and word combination is used to expand the words in the fault dictionary, and the words are used as the basis to judge the faults of the machine. According to the idea of breaking up and combining the Chinese characters, in general, the Chinese characters can be broken up into two parts. As shown in fig. 1, "故" can be disassembled into "古" and "〩" and "障" can be disassembled into "阝" and "章". These disassembled parts can be combined with other corresponding parts to form new Chinese characters. For example, "古" and "亻" can form Chinese characters "估", and "章" and "木" can form "樟". However, there may be accidents, for example, the characters formed by combining "古" and "广" are not Chinese characters, that is, the generated character is meaningless, that is, the newly generated fault data is out of distribution, so that the generated data is required to be in a real data subsection by using the anti-fact confidence theory (second part) to generate useful fault data.
b. Disassembly of the word
It has been mentioned above that words can typically be broken down into two parts, as can the fault data. Under the general condition, the fault data cannot be decomposed by manual analysis alone, and the fault data needs to be decomposed by CNN and VAE, so that the fault data can be decomposed into fault semantics and working condition attributes in the following decomposition modes:
fault semantics: we can use the vibration characteristics (vibration frequency and vibration amplitude) of the original fault sample as the fault semantics F:
F=(a 1 ,a 2 ,…,a k ,…,a R )
wherein a is i Is a semantic element in F, and i is more than or equal to 1 and less than or equal to R;
the plurality of fault semantics are F to form a fault semantic set W:
W={F 1 ,F 2 ,…,F i ,…,f N }
where N is the number of failure samples.
The R samples of the original fault samples are selected to constitute a fault vibration signal set g:
g=(v 1 ,v 2 ,…,v k ,…,v R )
wherein R is selected to be substantially greater than one period of the vibration signal, wherein v k Representing the kth data point.
Defining the threshold value of the fault vibration signal set g as lambda, if the dimension value v of the fault vibration signal set g k Greater than lambda, then a k The value of (2) is set to 1, otherwise alpha k The value set to 0 is calculated as follows:
Figure BDA0004116888230000071
wherein λ is calculated as follows:
Figure BDA0004116888230000072
where α is an empirically determined hyper-parameter, α is chosen to preserve the characteristics of the fault vibration signal.
Operating condition attribute: in general, the original fault samples include characteristic attributes of faults, such as rotation speed, load, sampling frequency, sampling environment, sampling model, and the like, which are referred to herein as operating mode attributes. The condition attribute needs to be extracted through CNN and VAE: the original fault sample is subjected to wavelet transformation to obtain a two-dimensional time-frequency matrix, and the two-dimensional time-frequency matrix is input into a CNN network to perform feature extraction; using VAE, the distribution characteristics of the input data are then learned by its encoder encoding and reconstruction using the encoded latent vector Y as a posterior distribution of the original failure samples:
Figure BDA0004116888230000073
Figure BDA0004116888230000074
where U is a probability density function, σ 2 The variance of the distribution is represented, mu represents the expected distribution, eps represents a matrix which has the same dimension as the expected mu of the original fault sample and meets the standard normal distribution, and the condition attribute is designed so that the acquired U meets the isotropic Gaussian distribution, and the condition attribute of group decoupling can be completely decoupled from the fault semantics.
2. Anti-fact confidence—the assurance of the trustworthiness of the generated data by anti-fact confidence is proposed:
for the counterfactual confidence, given X ε X (X is the original failure sample, X is the original failure sample space), if the counterfactual sample is generated
Figure BDA0004116888230000081
The anti-facts sample generation method is anti-facts trusted.
Counterfactual sample for a class of fault samples X
Figure BDA0004116888230000082
Can be expressed as +.>
Figure BDA0004116888230000083
In the application of the counterfactual reasoning, the "facts" given for one sample are s=s (X), f=f (X) (representing the fault semantics and condition attributes when x=x for a fault sample), and the "counterfactual" assumption is f+.f (X). In reality, if a machine fails under a certain condition, we want to know that the vibration characteristics after it has failed at that time are completely impossible, but by means of the belief of the counterfactual that we want to know what the data it failed under other conditions if it did not occur under that condition, we can generate a pseudo sample that approximates the original failure data X distribution. And reserving the working condition attribute of the original fault sample, and generating the counterfactual samples of other fault types by replacing the fault semantics of the original fault sample with the fault semantics of the other fault types. The data generated in this case isThe generation based on the operating condition properties of the original fault samples, rather than by random noise, is part of the samples within the original fault samples. Because our framework is based on counterfact confidence-centric, some theorem is required to ensure that the generated fault data is trusted.
Theorem 1: the sample X generated by the counterfactual is if and only if the working condition attribute S and the fault semantic F are group decoupled F [S(x)]Is of opposite sense.
When the working condition attribute S and the fault semantic F are not decoupled by the group, the generated fault data is fuzzy, unclear and difficult to classify. Therefore, in order to realize the anti-fact confidence, the condition attribute S and the fault semantic F need to be subjected to group decoupling, and a total loss function L is taken tot Constraining the target, i.e.
Figure BDA0004116888230000084
Wherein L is S Is a working condition attribute loss function; l (L) F Is a fault semantic loss function; l (L) d Is a third loss function; v and ρ are trade-off parameters; θ and
Figure BDA0004116888230000085
training models for CNN and VAE; omega is a training model in the decoder. />
First, for the working condition attribute loss function L S The expression is:
Figure BDA0004116888230000091
wherein D is KL Representing a priori random noise P (X) and posterior condition attributes from original fault sample encoding
Figure BDA0004116888230000092
KL divergence (KL divergence, which may also be referred to as relative entropy, may measure the difference between two data distributions, when the two data distributions are identical,their relative entropy is zero, and as the difference between the two data distributions increases, so too does their relative entropy, so relative entropy can be used to compare similarity). P (P) θ (x (i) S, F) represents the probability of sample data; />
Figure BDA0004116888230000093
Probability of posterior distribution of the condition attribute; p (P) θ (x (i) S, F) and +.>
Figure BDA0004116888230000094
Using a deep gaussian family implementation. Beta is a weight factor. />
Figure BDA0004116888230000095
Representing logP θ (x (i) S, F) about +.>
Figure BDA0004116888230000096
Where M represents the number of training sessions. By imposing a strict constraint on S overall to follow the endogenous prior P (S), the operating condition attribute S is not affected by the failure semantics F, i.e. S is disentangled from F.
Fault semantic loss function L F The expression is:
Figure BDA0004116888230000097
where dist denotes the Euclidean distance and exp denotes the exponential function underlying e.
Over parameterized model P during data generation θ (X|S, F) may ignore fault semantics and generate a counterfactual sample using only the condition attributes, and the GAN model shows that a large number of photo-level realism images may be generated using only the condition attributes. To prevent the generator in the GAN network from using only the worker Kuang Shuxing S, but ignoring the failure semantics F generates a counterfactual sample. We believe that this generation is because the information in the fault semantics F may be contained entirely in the operating condition attributes S, which are not completely decoupledThe counterfactual confidence is not satisfied and therefore it is necessary to decouple the fault semantics in the operating condition attributes.
The method is concretely realized as follows: sampling and obtaining working condition attributes from original fault sample X
Figure BDA0004116888230000098
And fault semantics
Figure BDA0004116888230000101
Generating a counterfactual sample by combining the acquired working condition attribute S and the fault semantics F>
Figure BDA0004116888230000102
We require that the counterfactual sample be closer to the original fault sample, but far from the counterfactual sample X 'generated by the combination of the operating condition attribute S and other fault type fault semantics F'. By calculating the contrast loss of the two anti-facts samples, the situation that the generator only uses the working condition attribute to generate the anti-facts samples but ignores the fault semantics can be effectively avoided, the generator is forced to generate the anti-facts samples by using the fault semantics to assist, the sample difference before and after the intervention is maximized, and therefore the fault semantics F can be disentangled from the working condition attribute S.
Third loss function L d
Figure BDA0004116888230000103
Wherein I 1 Representation l 1 Norms, l 1 The norm refers to the sum of the absolute values of the individual elements in the vector, also called "sparse rule operator", where the reconstruction penalty is used for constraints;
Figure BDA0004116888230000104
representing mathematical expectations, dec (·) represents a function about the semantic embedded decoder.
The loss function is obtained by introducing a semantic embedded decoder Dec, and the fault semantics F are reconstructed from the input sample features, thereby ensuring that the generator performs feature synthesisWeather can generate a distribution closer to the original data
Figure BDA0004116888230000105
Because the original data is directly input into the semantic embedded decoder in the training stage, in the process of continuously reconstructing fault semantics, information about original fault sample characteristics and fault semantics can be learned, at the moment, a Feedback module feed back and a semantic embedded decoder Dec are introduced to jointly solve the problem of characteristic synthesis of the generator, potential embedding of the semantic embedded decoder is used as input of the Feedback module, the Feedback module transforms the potential embedding of the Dec, and output of the Feedback module is added into the input of the generator, so that the generator can be further helped to realize improved characteristic synthesis. The formulas are put into CNN and VAE for training, so that the group decoupling of the working condition attribute S and the fault semantic F can be completed.
Set slave data space
Figure BDA0004116888230000106
The value v= (S, F) in (a) is represented by epsilon representing the fault semantics F,/-, for example>
Figure BDA0004116888230000107
Expressed as a condition attribute S, therefore +.>
Figure BDA0004116888230000108
And->
Figure BDA0004116888230000109
Space representing fault semantics F and operating condition attributes S, respectively, using +.>
Figure BDA00041168882300001010
Representing slave data space->
Figure BDA00041168882300001011
To the feature space->
Figure BDA00041168882300001012
Is described. Wherein g corresponds to P θ Samples of (X|S, F) which are continuous inverse functions g -1 Is a continuous unishot function of (2). P (P) θ (X|S, F) is implemented in our model using deterministic mappings, which are unirradiative.
If the inverse fact generating function is a transform T', affecting only the variables indexed by ε, then self homomorphism
Figure BDA0004116888230000111
Essentially separated from the subset epsilon of endogenous variables, as known from the Equivariant theory of decoupling characterization, for any
Figure BDA0004116888230000112
From the group theory, two major classes of fault semantics and operating condition attributes are decoupled from the collected samples, as shown in figure 2,
Figure BDA0004116888230000113
representing samples, s i Representing the property of the working condition, f n Representing the fault semantics of each fault. Each track represents a type of fault, where the elements on each track represent different samples. The proposed method can implement cross-orbits action by replacing fault semantics.
Theorem 2: group decoupling and confidence: if and only if the counterfactual sample X F [S(x)]Counter fact sample X when essentially group decoupled relative to subset ε F [S(x)]Is confidence. Decoupling is the separation of some differentiated, semantically related vectors from data.
If the transformation T' is essentially group-decoupled, it is
Figure BDA0004116888230000114
And therefore the inverse fact map must be fidelity. Conversely, assume a fidelity counterfactual sample X F [S(x)]. Counterfactual sample X F [S(x)]Can be decomposed into:
Figure BDA0004116888230000115
wherein the method comprises the steps of
Figure BDA0004116888230000118
And (3) representing the function composition, T' representing the inverse generating function, enabling S to keep the S=S (x) working condition attribute unchanged, and converting the fault semantic F value into F. Now for any +.>
Figure BDA0004116888230000116
X f [s(x)]Can be similarly decomposed into:
Figure BDA0004116888230000117
since T' is a transformation that affects only the variables in ε (i.e., F), the believable counterfactual transformation X can be demonstrated f [s(X)]Is essentially epsilon independent. The method only needs to realize the decoupling of the S and the F groups, and can generate the anti-fact sample close to the original data distribution, namely the anti-fact sample meeting the anti-fact confidence by changing the fault semantic F under the condition that the working condition attribute S is unchanged by utilizing the sufficient condition.
3. CNN convolutional neural network, VAE variational self-encoder, GAN generation countermeasure network
1、CNN
CNN is a typical deep feed-forward artificial neural network inspired by a biological sensing mechanism, in biology, the main substance in neurons is a cell body, which is a kind of nerve cell, and neurons in the brain of a human are connected with each other in an intricate manner, namely the neural network. CNNs are typically composed of convolutional, pooled, fully connected layers, and the like. The essence of the method is to construct a plurality of perceptrons capable of extracting the characteristics of the input data, to convolve and pool the input data layer by layer through the perceptrons, to extract the topological structure characteristics hidden in the data step by step, to abstract the extracted characteristics gradually as the network structure layer is deep, and to finally obtain the characteristic representation of the input data with unchanged translation, rotation and scaling. The sub-sampling fully utilizes the characteristics of locality and the like contained in the data, reduces the data dimension, optimizes the network structure, and can ensure displacement invariance to a certain extent, thereby being very suitable for processing and learning of mass data.
As shown in fig. 3, CNN has a powerful signal processing and analyzing function, but when features are extracted, a two-dimensional time-frequency diagram is input instead of one-dimensional one, because when one-dimensional data is used for processing, converted images are relatively close and difficult to distinguish, the training time required for training by using CNN is long, and the accuracy of classification and identification is limited. The time-frequency distribution can provide the joint distribution information of the time domain and the frequency domain, can better highlight the signal characterization, and is beneficial to the training and the recognition of CNN.
2、VAE
A variational self-encoder is a generative structural model based on variational Bayesian inference that is capable of utilizing low-dimensional feature vectors to learn interpretable low-dimensional feature representations contained in raw data. As shown in fig. 4, the variation is split from the encoder into two neural networks as a whole: an Encoder and a Decode. The Encoder maximizes the lower bound of the edge likelihood function of the observed data by continual iteration and updating of the variation parameters, approximates the posterior probability of the unobservable variable, and outputs the probability distribution of the hidden variable. The Decoder restores the approximate probability distribution of the original data based on the hidden variable probability distribution output by the Encoder. The variational self-encoder has a hidden layer sampling process similar to Dropout and regularization, so that the whole training process of the model is not easy to generate an overfitting problem, and compared with a traditional feature extraction model, the variational self-encoder limits low-dimensional feature vectors to a standard normal distribution, and is more suitable for solving the problem of small sample number.
3、GAN
The chinese name generation type countermeasure network of GAN, as shown in fig. 5, is a generation model in which GAN is learned by implicitly calculating a certain similarity between model distribution and actual data distribution, and the purpose is to estimate distribution or density of actual data, learn patterns of actual data, and generate new data according to learned knowledge. The network structure of GAN is composed of a generation network and a discrimination network, and its structure is as follows. The generator G accepts the random variables and generates the counterfactual sample data with the aim of making the generated data identical to the actual data distribution. The discriminator is mainly used for judging the authenticity of the real data and the generated anti-reality sample data, and the output value of the discriminator is generally a probability value. At the same time, the output of the arbiter also affects the generator, which is equivalent to training the generator so that it can generate better counterfactual samples. When the discriminator cannot judge whether the input data come from the real data or the inverse facts sample data, the model reaches the optimal state at the moment, and the output probability value of the discriminator is 1/2.
In generating new fault data, in order to be able to obtain counterfactual samples that are closer to the original data distribution, a discriminator is trained, the role of which is to output a real value that represents the magnitude of the sample feature considered to be a true value given the fault semantics F. During the game of the generator and the discriminator, the ability of the generator to generate the counterfactual samples and the discriminator to discriminate between the real samples and the counterfactual samples is gradually increased, eventually causing the discriminator to consider the counterfactual samples as real data, thereby generating high quality counterfactual samples that approximate the original data distribution. We use LSGAN loss as an optimization function:
the GAN concept comes from the two-player zero and game in game theory, and the generator G and the arbiter D can be seen as two players in the game. In the model training process, the generator and the discriminator can update own parameters respectively to minimize loss, and a Nash equilibrium state is finally achieved through continuous iterative optimization, so that the model is optimal. The objective function is defined as:
Figure BDA0004116888230000131
wherein the method comprises the steps of
Figure BDA0004116888230000141
Meaning that D is maximized first and G is then minimized. X-p data (x) Statistical distribution probability density function representing x conforming to real dataNumber p data I.e. x belongs to the real data. z-p z(z) Statistical distribution probability density function p representing z-coincidence coding z(z) I.e. z is a random number sampled from the encoded statistical distribution. G (-) represents a function about the generator and D (-) represents a function about the discriminator.
In generating new fault data, in order to be able to obtain counterfactual samples that are closer to the original data distribution, a discriminator D (x, y) is trained, the role of which is to output a real value that represents the magnitude of the sample feature considered to be a true value given the fault semantics F. During the game of the generator and the discriminator, the ability of the generator to generate the counterfactual samples and the discriminator to discriminate between the real samples and the counterfactual samples is gradually increased, eventually causing the discriminator to consider the counterfactual samples as real data, thereby generating high quality counterfactual samples that approximate the original data distribution. We use LSGAN loss as an optimization function:
Figure BDA0004116888230000142
Figure BDA0004116888230000143
wherein p is data (x) Representing a statistically distributed probability density function, p, of raw data s (s|x) represents the operating condition attribute extracted from the raw data, Φ (·) represents the linear mapping function, G (·) represents the function with respect to the generator, D (·) represents the function with respect to the discriminator, s represents the operating condition attribute, f represents the fault semantics
4. New fault data generation process
Overall description the process of generating the counterfactual samples is shown in fig. 6 and 8, the entire generation network being composed of CNN, encoder, generator, discriminator, dec, feedback. The time domain signal is converted into a two-dimensional time-frequency diagram after wavelet transformation, the two-dimensional time-frequency diagram is input into a CNN network to extract characteristics, and the extracted characteristics are input into an encoder to obtain the working condition attribute S. And meanwhile, processing the original signal through a fault semantic extraction module to obtain a fault semantic F. And then inputting the working condition attribute S and the fault semantics F of the fault type into a generator together to obtain a counterfactual sample, inputting the counterfactual sample and the fault semantics into a discriminator, gradually improving the performance of the generator in the game process of the generator and the discriminator, taking care that a feedback module is not trained in the first cycle, and in the second cycle (the dotted line part in the figure), the feedback module uses the potential embedding of the semantic embedding decoder as input, and inputting the working condition attribute S and the fault semantics F of the first cycle into the generator again to help generate the counterfactual sample conforming to the original data distribution.
The process of generating the anti-fact fault data is shown in fig. 7, wherein the process is that the normal distribution is sampled to be used as a working condition attribute S, the prior working condition attribute S and the fault semantic F are input into a trained generator to generate an anti-fact sample, the anti-fact sample is input into a semantic embedded decoder trained by original data, a hidden layer of the semantic embedded decoder is taken as an input of a feedback module, and finally the working condition attribute S, the fault semantic F and the output of the feedback module are input into the generator together to generate the anti-fact sample. The generation process of the counterfactual sample is different from that of the counterfactual sample, the information of the semantic embedded decoder and the feedback module trained by various fault types of the original data is added in the second cycle, the working condition attribute S at the moment can be considered to be extracted from the original data, and the counterfactual sample conforming to the original data distribution can be generated by combining the 'counterfactual confidence' fault semantics F of other fault types.
Conclusion: according to the method for generating the counter facts sample, the original sample fault semantics of the facts are replaced by the working condition attribute S extracted from the original data, and the counter facts sample balance data set with higher quality can be generated by combining the counter facts fault semantics F, so that the problem of unbalance of the original data is solved, and the accuracy rate of fault diagnosis is effectively improved. Just as we imagine that dinosaur looks based on their fossils rather than kneading by sky, the proposed method uses more information about mechanical conditions, such as load, rotational speed, etc., in the raw data than in the traditional method where the generator uses random noise as a sample attribute. The generated confidence counterfactual sample greatly reduces time and economic cost brought by an actual experiment, solves the problems of unbalance and 'out-of-distribution' of original data, and effectively improves the accuracy of fault diagnosis.
The foregoing is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art, who is within the scope of the present invention, should make equivalent substitutions or modifications according to the technical scheme of the present invention and the inventive concept thereof, and should be covered by the scope of the present invention.

Claims (8)

1. The method for generating the anti-fact confidence data based on the fault dictionary is characterized by comprising the following steps of:
s1, collecting an original fault sample X of a fault machine;
s2, processing an original fault sample by using CNN and VAE to generate a fault semantic F and a working condition attribute S;
s3, simultaneously inputting the fault semantics F and the working condition attribute S into a generator and generating a counterfactual sample
Figure FDA0004116888220000011
S4, comparing the counterfactual sample
Figure FDA0004116888220000012
Inputting into discriminator, repeating training model until discriminator can not distinguish original failure sample from counterfactual sample +.>
Figure FDA0004116888220000013
Is a distinction between (a);
s5, inputting the sample with the anti-fact confidence into the original fault sample to enlarge the capacity of the original fault sample.
2. The fault dictionary based anti-facts confidence data of claim 1The generation method is characterized in that if and only if the working condition attribute S and the fault semantic F are group decoupling, a generated counterfactual sample is generated
Figure FDA0004116888220000014
Is a counterfactual confidence that the overall loss function L is used when the discriminator performs group decoupling tot Constraint on target, overall loss function L tot The following are provided:
Figure FDA0004116888220000015
wherein L is S Is a working condition attribute loss function; l (L) F Is a fault semantic loss function; l (L) d Is a third loss function; v and ρ are trade-off parameters; θ and
Figure FDA0004116888220000016
training models for CNN and VAE; omega is a training model in the decoder.
3. The method for generating anti-facts confidence data based on fault dictionary as claimed in claim 2, wherein the condition attribute loss function L S The expression is:
Figure FDA0004116888220000017
wherein beta is a weight factor, M represents the number of original fault samples X, D KL Representing a priori random noise P (Z) and posterior condition attributes from original fault sample encoding
Figure FDA0004116888220000018
KL divergence between; p (P) θ (x (i) S, F) represents the probability of sample data, +.>
Figure FDA0004116888220000021
Probability of posterior distribution representing operating condition attribute, and P θ (x (i) S, F) and +.>
Figure FDA0004116888220000022
Using a deep gaussian family implementation.
4. The method for generating anti-facts confidence data based on fault dictionary according to claim 3, wherein the fault semantic loss function L F The expression is:
Figure FDA0004116888220000023
wherein dist represents the Euclidean distance, exp represents an exponential function with e as the base; x 'represents a counterfactual sample generated by combining the operating condition attribute S and other fault type fault semantics F'.
5. The method for generating anti-facts confidence data based on a fault dictionary according to claim 4, wherein the third loss function L d The expression is:
Figure FDA0004116888220000024
wherein I 1 Representation l 1 Norms, l 1 The norm refers to the sum of the absolute values of the individual elements in the vector, also called "sparse rule operator", where the reconstruction penalty is used for constraints;
Figure FDA00041168882200000217
representing mathematical expectations, dec (·) represents a function about the semantic embedded decoder.
6. The fault dictionary-based anti-facts confidence data generation method according to any one of claims 1 to 5, whereinFrom the data space
Figure FDA0004116888220000025
The value v= (S, F) in (a) is represented by epsilon representing the fault semantics F,/-, for example>
Figure FDA0004116888220000026
Expressed as a condition attribute S, therefore +.>
Figure FDA0004116888220000027
And->
Figure FDA0004116888220000028
The space respectively representing the fault semantics F and the working condition attributes S uses g: the%>
Figure FDA0004116888220000029
Representing slave data space->
Figure FDA00041168882200000210
To the feature space->
Figure FDA00041168882200000211
Is used for the endophytic mapping; wherein g corresponds to P θ Sampling of (X|S, F) and is a continuous inverse function g -1 Is a continuous unishot function of (2);
if and only if a counterfactual sample
Figure FDA00041168882200000212
In essence group decoupling with respect to subset epsilon, the counterfactual sample +.>
Figure FDA00041168882200000218
Is confidence; counterfactual sample->
Figure FDA00041168882200000213
Can be decomposed into:
Figure FDA00041168882200000214
wherein the method comprises the steps of
Figure FDA00041168882200000215
Representing the function composition, T' represents the inverse generating function, g -1 Is a continuous inverse function of g;
the working condition attribute S is kept unchanged, the fault semantic F value is converted into F, and then for any
Figure FDA00041168882200000216
Can be decomposed into:
Figure FDA0004116888220000031
7. the method for generating inverse facts confidence data based on a fault dictionary according to claim 6, wherein the fault semantics F are vibration characteristics of original fault samples, namely vibration frequency and vibration amplitude,
F=(a 1 ,a 2 ,…,a k ,…,a R )
wherein a is i Is a semantic element in F, and i is more than or equal to 1 and less than or equal to R;
the plurality of fault semantics are F to form a fault semantic set W:
W={F 1 ,F 2 ,…,F i ,…,F N }
where N is the number of fault samples;
the R samples of the original fault samples are selected to constitute a fault vibration signal set g:
g=(v 1 ,v 2 ,…,v k ,…,v R )
wherein R is selected to be substantially greater than one period of the vibration signal, wherein v k Represents the kth data point;
defining fault vibration messagesThe threshold value of the number set g is lambda, if the dimension value v of the fault vibration signal set g k Greater than lambda, then a k The value of (2) is set to 1, otherwise alpha k The value set to 0 is calculated as follows:
Figure FDA0004116888220000032
wherein λ is calculated as follows:
Figure FDA0004116888220000033
where α is an empirically determined hyper-parameter, α is chosen to preserve the characteristics of the fault vibration signal.
8. The method for generating the anti-facts confidence data based on the fault dictionary according to claim 7, wherein the working condition attribute S includes a rotation speed, a load, a sampling frequency, a sampling environment and a sampling model in an original fault sample;
the condition attribute needs to be extracted through CNN and VAE: the original fault sample is subjected to wavelet transformation to obtain a two-dimensional time-frequency matrix, and the two-dimensional time-frequency matrix is input into a CNN network to perform feature extraction; using VAE, the distribution characteristics of the input data are then learned by its encoder encoding and reconstruction using the encoded latent vector Y as a posterior distribution of the original failure samples:
Figure FDA0004116888220000041
Figure FDA0004116888220000042
where U is a probability density function, σ 2 Representing the variance of the distribution, μ representing the expected distribution, eps representing a satisfaction index having the same dimensions as the original failure sample expected μA quasi-normally distributed matrix.
CN202310221540.1A 2023-03-09 2023-03-09 Anti-fact confidence data generation method based on fault dictionary Pending CN116108755A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310221540.1A CN116108755A (en) 2023-03-09 2023-03-09 Anti-fact confidence data generation method based on fault dictionary

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310221540.1A CN116108755A (en) 2023-03-09 2023-03-09 Anti-fact confidence data generation method based on fault dictionary

Publications (1)

Publication Number Publication Date
CN116108755A true CN116108755A (en) 2023-05-12

Family

ID=86262351

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310221540.1A Pending CN116108755A (en) 2023-03-09 2023-03-09 Anti-fact confidence data generation method based on fault dictionary

Country Status (1)

Country Link
CN (1) CN116108755A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116759042A (en) * 2023-08-22 2023-09-15 之江实验室 System and method for generating anti-facts medical data based on annular consistency
CN117520905A (en) * 2024-01-03 2024-02-06 合肥工业大学 Anti-fact fault data generation method based on causal intervention

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116759042A (en) * 2023-08-22 2023-09-15 之江实验室 System and method for generating anti-facts medical data based on annular consistency
CN116759042B (en) * 2023-08-22 2023-12-22 之江实验室 System and method for generating anti-facts medical data based on annular consistency
CN117520905A (en) * 2024-01-03 2024-02-06 合肥工业大学 Anti-fact fault data generation method based on causal intervention
CN117520905B (en) * 2024-01-03 2024-03-22 合肥工业大学 Anti-fact fault data generation method based on causal intervention

Similar Documents

Publication Publication Date Title
CN107516110B (en) Medical question-answer semantic clustering method based on integrated convolutional coding
CN116108755A (en) Anti-fact confidence data generation method based on fault dictionary
CN109003678B (en) Method and system for generating simulated text medical record
CN109635774B (en) Face synthesis method based on generation of confrontation network
Li et al. A unified framework incorporating predictive generative denoising autoencoder and deep Coral network for rolling bearing fault diagnosis with unbalanced data
CN111127146B (en) Information recommendation method and system based on convolutional neural network and noise reduction self-encoder
CN109389171B (en) Medical image classification method based on multi-granularity convolution noise reduction automatic encoder technology
CN112765352A (en) Graph convolution neural network text classification method based on self-attention mechanism
CN113033309A (en) Fault diagnosis method based on signal downsampling and one-dimensional convolution neural network
CN110991190A (en) Document theme enhanced self-attention network, text emotion prediction system and method
CN112883756A (en) Generation method of age-transformed face image and generation countermeasure network model
Puscasiu et al. Automated image captioning
Qiang et al. deep variational autoencoder for modeling functional brain networks and ADHD identification
Elarabawy et al. Direct inversion: Optimization-free text-driven real image editing with diffusion models
Chen et al. Learning multiscale consistency for self-supervised electron microscopy instance segmentation
Gao et al. REPRESENTATION LEARNING OF KNOWLEDGE GRAPHS USING CONVOLUTIONAL NEURAL NETWORKS.
Chen et al. Co-attention fusion based deep neural network for Chinese medical answer selection
CN112818124A (en) Entity relationship extraction method based on attention neural network
Maheshwari et al. Autoencoder: Issues, challenges and future prospect
Parviainen Deep bottleneck classifiers in supervised dimension reduction
Okur et al. Pretrained neural models for turkish text classification
CN116070025A (en) Interpretable recommendation method based on joint score prediction and reason generation
Liu et al. A dual-branch balance saliency model based on discriminative feature for fabric defect detection
Bi et al. K-means clustering optimizing deep stacked sparse autoencoder
Ye et al. An improved semi-supervised variational autoencoder with gate mechanism for text classification

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination