CN115859142A - Small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network
- Publication number
- CN115859142A (application CN202211233344.8A)
- Authority
- CN
- China
- Prior art keywords
- convolution
- data
- signal
- transformer
- sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
A small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network relates to the field of fault diagnosis for rolling bearings and other rotating equipment, and addresses the difficulty of achieving accurate fault diagnosis when operating data are scarce. First, signal data of a rolling bearing under actual operating conditions are acquired and standardized. Second, a generator and a discriminator with a convolution and transformer cross structure are constructed: the transformer layers extract the global time-domain features of the time-series signal, and the convolution layers then extract its local time-domain features. In addition, position encodings are embedded into the time-series signal so that the model can learn the positional information of the signal. Finally, high-quality time-series signal samples are generated to expand the original training samples, thereby improving fault diagnosis accuracy under small-sample conditions.
Description
Technical Field
The invention relates to the field of fault diagnosis for rotating equipment such as rolling bearings, and in particular to a small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network.
Background
In recent years, deep neural networks have been successfully applied to rolling bearing fault diagnosis by virtue of their strong feature extraction capability. These methods assume that a large amount of valid data is available for training the fault diagnosis model. In real engineering settings, however, safety constraints on rolling bearing equipment and complex, variable working conditions mean that the data acquisition system can often record only a small amount of operating data, which greatly degrades fault diagnosis performance. It is therefore necessary to design a fault diagnosis method that remains effective when operating data are scarce.
Researchers have proposed various methods to deal with the limited-data problem in fault diagnosis. Data sampling is one of them: the classes are balanced by undersampling the classes with many samples and oversampling the classes with few samples, which is a common way to handle class imbalance. Although data sampling works well when data are limited and the classes are unbalanced, it can only reuse the information in the existing data and cannot effectively model the original data distribution, so it cannot expand the data enough to meet the demand of intelligent fault diagnosis methods for large amounts of data. Transfer learning addresses the cross-domain problem by transferring knowledge acquired in a source domain to a target domain. Methods based on transfer learning typically use model pre-training and fine-tuning to solve fault diagnosis under limited data. However, the greatest limitation of this approach is that it does not fundamentally solve the data scarcity problem; in addition, pre-training the original model still requires a large number of samples.
With the development of generative models, addressing sample scarcity through data generation has received a great deal of attention. Among these models, the generative adversarial network (GAN) is a mainstream generative model in the field of artificial intelligence. A GAN can generate data whose distribution is similar to that of the original data, and it was originally applied in image processing. By virtue of its powerful data generation capability, GAN has been successfully applied to rolling bearing fault diagnosis. Yang et al. developed a fusion diagnostic model, CGAN-2D-CNN, which converts vibration signals into two-dimensional grayscale images and uses CGAN and 2D-CNN to expand and classify the image data for small-sample bearing fault diagnosis. Liang et al. extracted time-frequency image features from the one-dimensional raw time-domain signal by wavelet transform and generated a large number of time-frequency image samples using GAN. However, converting a one-dimensional time-series signal into a two-dimensional image does not represent the vibration information carried by the signal well, so the generated samples are of poor quality and the final fault diagnosis performance suffers. As GANs for time-series signal generation have matured, methods that use GAN to expand rolling bearing vibration signals directly have also progressed rapidly. Guo et al. proposed a fault diagnosis framework called the multi-label one-dimensional generative adversarial network (ML 1D-GAN), which can directly generate one-dimensional vibration signal data. Sonal Dixit et al. proposed a novel one-dimensional conditional auxiliary classifier generative adversarial network fault diagnosis model to directly generate better bearing signal samples.
Zhang et al. developed a small-sample intelligent fault diagnosis method based on a multi-module gradient penalty generative adversarial network (MGPGAN) to generate mechanical fault signals with high similarity.
However, the above solutions all have certain problems. 1) A GAN built on fully connected layers has insufficient feature extraction capability, and its parameter count becomes too large when processing long bearing signal sequences. 2) A GAN built on one-dimensional convolution has strong local feature extraction capability but seriously lacks global feature extraction capability, so it cannot model long-sequence signals effectively. 3) The GANs described above do not take into account the relative or absolute position information of the original vibration signal sequence when generating bearing signal samples, which lowers the quality of the generated signals.
Disclosure of Invention
The invention aims to solve the loss of diagnostic accuracy caused by scarce rolling bearing data, and provides a small-sample rolling bearing fault diagnosis method based on a Convolutional Transformer Generative Adversarial Network (CoT-GAN). To enable the model to better extract the global and local features of the vibration signal, a generator and a discriminator with a transformer and convolution cross structure are designed. The transformer is good at processing long-sequence signals and has strong global feature extraction capability, so it can model vibration signals globally. Furthermore, adding a position encoding to the vibration signal sequence enables the model to learn the relative and absolute position information of the signal, thereby preserving its inherent vibration characteristics. On this basis, convolution layers further strengthen the model's ability to learn the local features of the signal. The method starts from the characteristics of vibration signals, fully considers their time-series nature, combines the respective advantages of the transformer and convolution, models the bearing vibration signal both locally and globally, and makes full use of the position information the signal carries. Finally, sufficient vibration signal samples are generated and the fault diagnosis performance is effectively improved.
To achieve this purpose, the invention adopts the following technical solution:
A small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network, characterized by comprising the following steps:
(1) Historical operating data of the rolling bearing are acquired and standardized, and the standardized signal samples are divided into training samples and test samples.
(2) A generative adversarial network with a convolution and transformer cross structure (CoT-GAN) is constructed. The generator maps random noise to a generated signal whose distribution is similar to that of the real signal; the discriminator performs real/fake discrimination and class discrimination on the generated and real signals; the generator and the discriminator are trained alternately as a zero-sum game to improve model performance until a Nash equilibrium is reached, and signal samples are finally generated. The generated signal samples are added to the original training samples to form an enhanced data set, which is used to train a fault classifier;
(3) The fault classifier trained in step (2) is used to identify and classify faults in the test samples, completing the final fault diagnosis task.
A small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network, characterized in that the specific process of step (1) is as follows:
1) Obtain historical data $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{n \times m}$ of the rolling bearing under actual operating conditions, where $n$ is the number of samples and $m$ is the sample dimension, i.e. the number of data points collected per sample. Calculate the mean $\bar{X}$ and standard deviation $\sigma$ of the historical data $X$, and normalize $X$ to obtain $\tilde{X} = [\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_n]$:
$\tilde{x}_i = \dfrac{x_i - \bar{X}}{\sigma}$ (1)
where $i = 1, 2, \ldots, n$;
2) Divide the normalized data $\tilde{X}$ into a training sample set $X_{train} \in \mathbb{R}^{p \times m}$ and a test sample set $X_{test} \in \mathbb{R}^{q \times m}$, where $p + q = n$;
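As a rough illustration of the standardization in step 1), the sketch below z-scores a small set of samples in plain Python. It assumes a single global mean and a population standard deviation computed over all data points; the text does not state whether the statistics are global or per sample, and the function name `standardize` is hypothetical.

```python
import statistics

def standardize(samples):
    """Z-score every data point with one global mean and standard deviation.

    Assumption: statistics are taken over all data points of all samples,
    and the population (not sample) standard deviation is used.
    """
    flat = [v for sample in samples for v in sample]
    mean = statistics.fmean(flat)
    std = statistics.pstdev(flat)
    return [[(v - mean) / std for v in sample] for sample in samples]

# Toy data: n = 3 samples, m = 4 data points each.
X = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [9.0, 10.0, 11.0, 12.0]]
X_norm = standardize(X)
```

After this step the pooled data have zero mean and unit variance, which is the property equation (1) is meant to provide.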
A small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network, characterized in that signals are generated using the generative adversarial network with a convolution and transformer cross structure, and the specific process of step (2) is as follows:
1) Set random noise $z$ and embed the corresponding fault class label $c$ into it to obtain random noise $Z = [z, c]$ containing the fault class label. Specifically,
first, random noise $z = [z_1, z_2, \ldots, z_k] \in \mathbb{R}^{k \times l}$ is drawn from a standard normal distribution (mean 0, variance 1), where $k$ is the number of random noise vectors and $l$ is the dimension of each;
second, the corresponding fault class label $c_i$ is embedded into the random noise $z = [z_1, z_2, \ldots, z_k]$ to obtain random noise $Z$ containing the fault class label, where $i \in \{1, 2, 3, 4\}$;
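The construction of the labelled noise $Z = [z, c]$ can be sketched as follows. This is only one plausible reading: it assumes the label is embedded by concatenating a one-hot vector to each noise vector, and it uses 0-based class indices where the text uses 1 to 4. The helper name `make_labeled_noise` is hypothetical.

```python
import random

def make_labeled_noise(k, l, label, num_classes=4, seed=0):
    """Draw k standard-normal noise vectors of dimension l and append a one-hot label.

    Assumption: "embedding" the label means concatenation, giving Z = [z, c].
    """
    rng = random.Random(seed)
    one_hot = [1.0 if j == label else 0.0 for j in range(num_classes)]
    return [[rng.gauss(0.0, 1.0) for _ in range(l)] + one_hot for _ in range(k)]

# k = 8 noise vectors of dimension l = 16 for the third fault class (index 2).
Z = make_labeled_noise(k=8, l=16, label=2)
```

A learned label embedding would be an equally valid choice; the one-hot form is used here only because it needs no trainable parameters.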
2) So that the discriminator can process the signal later produced by the generator, and so that the transformer module can process the input vector, the input random noise $Z$ is reshaped and converted into patches of a fixed size. Specifically,
first, the dimension of the input random noise is changed to a fixed value $L$, which makes it easier for the discriminator to process the generated signal later;
second, the random noise $Z = [Z_1, Z_2, \ldots, Z_k]$ is converted into several fixed-size patches by one-dimensional convolution embedding. Specifically,
the random noise is partitioned into $N$ patches of dimension $M$, where $M$ is the patch size, $N = L/M$ is the number of patches, and $j \in \{1, 2, \ldots\}$.
To reduce the number of parameters and the amount of computation, the embedding module is built from a one-dimensional convolution, exploiting the weight sharing and good local feature extraction of convolutional neural networks. The kernel size of the one-dimensional convolution is set to $M \times 1$ and the stride to $M$, so that the kernel processes the random noise in a non-overlapping manner and finally yields $N$ patches of dimension $M$. Specifically,
using a learned embedding matrix $E$, each patch is projected by convolution into a vector of the model dimension $D_{model}$. The one-dimensional convolution operation is:
$u_j = \sum_{i \in M_j} v_i * k_{ij} + b_j$ (2)
where $v_i$ and $u_j$ are the input of the $i$-th channel and the output of the $j$-th channel, respectively, $k$ is the convolution kernel, $b$ is the bias, $*$ denotes the convolution operation, and $M_j$ is the set of input channels used to compute the output of the $j$-th channel;
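To make the non-overlapping embedding concrete, here is a minimal sketch of a one-dimensional convolution whose kernel length and stride both equal the patch size $M$. It assumes a single input channel and, as is usual in deep learning, computes a cross-correlation (no kernel flip). The function name and the toy kernel values are illustrative only.

```python
def patch_embed(signal, kernels, biases, patch_size):
    """Non-overlapping 1-D convolution: kernel length == stride == patch_size.

    signal  : list of L floats (single input channel, an assumption)
    kernels : D_model kernels, each a list of patch_size weights
    biases  : D_model bias terms
    returns : N = L / patch_size embedded patches, each of length D_model
    """
    if len(signal) % patch_size != 0:
        raise ValueError("signal length must be a multiple of patch_size")
    patches = [signal[s:s + patch_size] for s in range(0, len(signal), patch_size)]
    return [
        [sum(w * x for w, x in zip(kernel, patch)) + b
         for kernel, b in zip(kernels, biases)]
        for patch in patches
    ]

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]            # L = 8
kernels = [[1.0, 0.0, 0.0, 0.0], [0.25, 0.25, 0.25, 0.25]]  # D_model = 2, M = 4
biases = [0.0, 1.0]
embedded = patch_embed(signal, kernels, biases, patch_size=4)
# embedded == [[1.0, 3.5], [5.0, 7.5]]  (N = 2 patches, each of dimension D_model = 2)
```

Because the stride equals the kernel length, every input point contributes to exactly one patch, and the same kernels are shared across all patches.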
a position encoding is then added to each fixed-size patch so that the generated signal carries position information closer to that of the real signal, which improves the quality of the generated samples. Specifically,
a position information matrix $E_{pos}$ of dimension $D_{model}$ is encoded and added to the embedded patches $Z_p$, giving the patches with position information:
$T_Z = Z_p + E_{pos}$ (3)
finally, the patch sequence carrying position information, $T_Z = [T_{Z,1}, T_{Z,2}, \ldots, T_{Z,k}]$, is fed into the transformer module and passes in turn through the generator with the convolution and transformer cross structure to form the generated signal samples $S_{fake}$, where $l$ is the number of generated samples. Specifically,
the patches carrying position information are fed to the transformer module to extract the global features of the input. Specifically,
the multi-head attention mechanism inside the transformer module dynamically captures the feature information of the input vector, so the generator can capture global feature information to a large extent. Self-attention updates each component of a sequence by aggregating global context information from the complete input sequence. It can be expressed as:
$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\dfrac{QK^{T}}{\sqrt{d_k}}\right)V$ (4)
where $d_k$ is the dimension of the key vectors into which the signal is projected, and $Q$, $K$ and $V$ are the matrices of query, key and value vectors, respectively.
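The scaled dot-product self-attention described above can be written out in a few lines of plain Python. This is a didactic sketch on nested lists rather than an efficient implementation; the helper names `matmul` and `attention` are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply an (n x d) matrix by a (d x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, returning the output and the attention weights."""
    d_k = len(K[0])
    K_t = [list(col) for col in zip(*K)]          # transpose of K
    scores = matmul(Q, K_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, V), weights

Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 1.0], [1.0, 1.0]]   # identical keys, so every query attends uniformly
V = [[1.0, 2.0], [3.0, 4.0]]
out, weights = attention(Q, K, V)
```

With identical keys each row of `weights` is uniform, so every output row is the average of the value vectors; this shows how each position aggregates information from the whole sequence.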
Multi-head attention combines several self-attention operations and can capture multiple complex relationships between different elements of a sequence. Given $h$ self-attention heads, multi-head attention maps a given input vector into three groups of vectors, each containing $h$ vectors of dimension $D/h$. The vectors from the different inputs are then packed into the matrices $\{Q_i\}_{i=1}^{h}$, $\{K_i\}_{i=1}^{h}$ and $\{V_i\}_{i=1}^{h}$. The multi-head attention mechanism can therefore be expressed as:
$\mathrm{MultiHead}(Q', K', V') = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\,W^{O}$, with $\mathrm{head}_i = \mathrm{Attention}(Q_i, K_i, V_i)$ (5)
where $Q'$, $K'$ and $V'$ are the concatenations of $\{Q_i\}_{i=1}^{h}$, $\{K_i\}_{i=1}^{h}$ and $\{V_i\}_{i=1}^{h}$, respectively, and $W^{O}$ is a linear projection matrix;
the transformer module applies layer normalization (LN) before the multi-head attention operation, and then uses a residual connection to strengthen the information flow and achieve higher performance. Specifically:
$x' = x + \mathrm{MultiHead}(\mathrm{LN}(x))$ (6)
where $x$ is the input vector of the transformer module;
after these steps, the final output of the transformer module is produced by a multilayer perceptron (MLP):
$y = x' + \mathrm{MLP}(\mathrm{LN}(x'))$ (7)
after processing by the transformer module, the output is fed into a deconvolution layer to obtain its local features. The feature vector output by the deconvolution layer is then fed again into the next module of the transformer and deconvolution cross structure. The generator comprises 4 transformer and deconvolution cross-structure modules in total; after the input vector passes through the last deconvolution layer, a generated sample with the same dimension as the real signal is produced.
The signal $S_{fake}$ produced by the generator and the real signal $S_{real}$ are mixed and fed to the discriminator. Specifically,
first, the generated signal and the real signal input to the discriminator are converted into several fixed-size patches by one-dimensional convolution embedding, in a manner similar to step 2) above: a one-dimensional convolutional neural network processes the discriminator input in a non-overlapping manner to obtain fixed-size patches.
Second, a corresponding position encoding is added to each patch so that the discriminator pays more attention to the relative and absolute position information of the signal when learning its features, which in turn helps the generator produce better signals.
Then, the patches carrying position information are fed to the subsequent transformer module and pass in turn through the discriminator with the convolution and transformer cross structure. Specifically,
the vector output by the transformer module is fed into a convolution layer to extract the local features of the input vector. The output vector of the convolution layer is then passed on to the next transformer module to obtain global features. The input vector is processed in this network in a way similar to the generator, and the discriminator comprises 4 convolution and transformer cross-structure modules in total.
Finally, the feature vector output by the last convolution layer is reshaped into vectors of size $1 \times 1024$, on which binary (real/fake) discrimination and multi-class discrimination are performed using the Sigmoid and Softmax activation functions, respectively.
Each $1 \times 1024$ output vector is passed through a binary fully connected layer and a multi-class fully connected layer, giving output vectors of dimension 1 and of dimension equal to the number of fault classes, respectively. The two output vectors are fed to the Sigmoid and Softmax activation functions to perform real/fake discrimination and class discrimination. The Sigmoid activation function is:
$\mathrm{Sigmoid}(x) = \dfrac{1}{1 + e^{-x}}$ (8)
where $x$ is the input to the Sigmoid activation function.
The Softmax activation function is:
$\mathrm{Softmax}(z)_k = \dfrac{e^{z_k}}{\sum_{i=1}^{K} e^{z_i}}$ (9)
where $z$ is the input vector, $z_k$ and $z_i$ are its $k$-th and $i$-th components, and $K$ is the number of classes.
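The two output heads of the discriminator can be illustrated with the activation functions alone. In the sketch below the logits are made-up numbers standing in for the outputs of the binary and multi-class fully connected layers.

```python
import math

def sigmoid(x):
    """Map a single logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(z):
    """Map a vector of K logits to K class probabilities that sum to 1."""
    m = max(z)                                  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits from the two fully connected heads.
real_prob = sigmoid(0.0)                        # real/fake head, output dimension 1
class_probs = softmax([2.0, 1.0, 0.5, -1.0])    # class head, K = 4 fault classes
predicted_class = class_probs.index(max(class_probs))
```

Here `real_prob` is the probability that the input is a real signal, and `predicted_class` is the index of the most probable fault class.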
3) Finally, the generated signal samples are added to the original training samples to form an enhanced data set, which is used to train the fault classifier.
A small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network, characterized in that, in step (2), the specific calculation process is as follows:
1) The generator and the discriminator are trained alternately as a zero-sum game until a Nash equilibrium is reached. The objective function of CoT-GAN is expressed as:
$L_S = \mathbb{E}_{s \sim P_{data}}[\log D(s)] + \mathbb{E}_{\tilde{s} \sim P_g}[\log(1 - D(\tilde{s}))]$
$L_C = \mathbb{E}[\log P(Y = y \mid S_{real})] + \mathbb{E}[\log P(Y = y \mid S_{fake})]$
where $P_{data}$ is the real data distribution, $P_g$ is the distribution of the generated samples, $D(s)$ is the probability that $s$ comes from the real data, and $D(\tilde{s})$ is the probability that a generated sample $\tilde{s}$ comes from the real data. $\mathbb{E}_{s \sim P_{data}}$ denotes the expectation over the real data distribution, and $\mathbb{E}_{\tilde{s} \sim P_g}$ denotes the expectation over the data synthesized from noise. $P(Y = y \mid S_{real})$ is the conditional probability distribution over the class labels. The optimization of the generator and the discriminator is a two-player minimax problem, which can be formalized as:
$\min_G \max_D V(D, G) = \mathbb{E}_{s \sim P_{data}}[\log D(s)] + \mathbb{E}_{\tilde{s} \sim P_g}[\log(1 - D(\tilde{s}))]$
2) Training the fault classifier by using the enhanced data set so that the fault classifier can have better generalization capability, wherein an objective function of the fault classifier is represented as follows:
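The classifier objective is not legible in this text. A common choice for a multi-class fault classifier, assumed here purely for illustration, is the average cross-entropy over the enhanced data set. The sketch below builds that set by concatenating real and generated samples and evaluates the loss for a classifier that predicts uniform probabilities; all names and numbers are hypothetical.

```python
import math

def build_enhanced_set(real_x, real_y, gen_x, gen_y):
    """Append generated samples and their labels to the original training set."""
    return real_x + gen_x, real_y + gen_y

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class (assumed objective)."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

# One real sample plus two generated samples (toy values).
real_x, real_y = [[0.1, 0.2]], [0]
gen_x, gen_y = [[0.3, 0.1], [0.0, 0.4]], [0, 1]
train_x, train_y = build_enhanced_set(real_x, real_y, gen_x, gen_y)

# An untrained 4-class classifier that outputs uniform probabilities.
uniform = [[0.25, 0.25, 0.25, 0.25] for _ in train_y]
loss = cross_entropy(uniform, train_y)   # equals log(4) for uniform predictions
```

Training lowers this loss from its uniform-prediction value of $\log 4$ as the classifier learns to separate the four health conditions.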
The CoT-GAN network structure is as follows. CoT-GAN consists of a generator and a discriminator with a convolution and transformer cross structure. It can model both the global and the local features of the vibration signal, and it takes full account of the relative and absolute position information contained in the signal in order to generate sufficient signal data. The generator consists of $L$ deconvolution and transformer cross modules. Its input consists of random noise and the existing fault class labels; the data points are converted into patches by one-dimensional convolution embedding, position information is embedded, and the result is fed into the $L$ network layers of the transformer and deconvolution cross structure, which finally output a generated signal with the same dimension as the real signal. The discriminator consists of $L$ convolution and transformer cross modules. Its input consists of the generated signal and the real signal; the input signal is converted into patches by one-dimensional convolution with embedded position information, the patches are fed into the $L$ network layers of the transformer and convolution cross structure, and the output layer of the discriminator gives the real/fake (binary) probability and the multi-class probabilities. The layer-wise computation is:
$h_G^{l} = f_{G,l}(h_G^{l-1})$, $h_D^{l} = f_{D,l}(h_D^{l-1})$, $l = 1, 2, \ldots, L$
where $h_G^{l-1}$ is the output vector of the $(l-1)$-th transformer module in the generator and $h_G^{l}$ is the output vector of the $l$-th. $f_{G,l}(\cdot)$ denotes the $l$-th transformer module and deconvolution operation in the generator; when $l = 1$, the input $h_G^{0}$ is the fixed-size patch sequence carrying position encoding information, and when $l = L$, $h_G^{L}$ is the output vector of the generator. Likewise, $h_D^{l-1}$ and $h_D^{l}$ are the output vectors of the $(l-1)$-th and $l$-th transformer modules in the discriminator, and $f_{D,l}(\cdot)$ denotes the $l$-th transformer module and convolution operation in the discriminator; when $l = 1$, the input $h_D^{0}$ is the fixed-size patch sequence carrying position encoding information, and when $l = L$, $h_D^{L}$ is the output vector of the discriminator. More specifically, the generator consisting of $L$ transformer and deconvolution cross modules can be expressed as:
$h_G^{L} = f_{G,L}(f_{G,L-1}(\cdots f_{G,1}(h_G^{0})))$
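The stacked structure amounts to composing $L$ stage functions in order. The sketch below shows only that composition: `toy_stage` is a placeholder that doubles the sequence length and stands in for a real transformer plus deconvolution stage, and the factor of 2 and the starting length of 64 are assumptions chosen so that four stages reach 1024 points.

```python
from functools import reduce

def compose(stages):
    """Return a function that applies the stages in order: f_L(...f_2(f_1(h0)))."""
    def run(h0):
        return reduce(lambda h, stage: stage(h), stages, h0)
    return run

def toy_stage(h):
    # Placeholder for "transformer module + deconvolution": the real stage would
    # mix global context and then upsample; here each value is simply repeated.
    return [v for v in h for _ in range(2)]

generator = compose([toy_stage] * 4)      # L = 4 cross modules
h0 = [0.0] * 64                           # hypothetical position-encoded input
generated = generator(h0)                 # length 64 * 2**4 = 1024
```

The discriminator follows the same pattern with convolution in place of deconvolution, so its stages would shorten the sequence instead of lengthening it.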
Advantageous Effects
The invention designs a generative adversarial network with a transformer and convolution cross structure. It exploits the respective strengths of the transformer and convolution, using transformer layers and convolution layers to extract the global and local features of the time-series signal so that the model can fully capture the time-domain characteristics of the vibration. In addition, position encoding is embedded into the vibration signal so that the model can learn the relative and absolute position information of the signal, which strengthens the inherent time-series characteristics of the generated signal; sufficient signal data are finally generated, and fault diagnosis performance under small-sample conditions is effectively improved. The method takes the characteristics of time-series signals into account during sample generation and models them both globally and locally. It offers strong feature representation, is well targeted, and achieves high diagnostic accuracy, which is of great significance for rolling bearing fault diagnosis.
Drawings
FIG. 1 is a flow chart of the CoT-GAN method of the present invention;
FIG. 2 is a schematic diagram of a generator;
FIG. 3 is a schematic diagram of the discriminator;
FIG. 4 is a schematic view of the Case Western Reserve University (CWRU) bearing test stand;
FIG. 5 illustrates the results of the present invention generated for a CWRU bearing dataset;
FIG. 6 is a graph showing the effect of varying the number of training samples on the diagnostic performance of a model;
FIG. 7 shows the diagnostic effect of the model for 1 training sample;
FIG. 8 shows the diagnostic effect of the model for 2 training samples;
FIG. 9 shows the diagnostic effect of the model for 4 training samples;
FIG. 10 shows the diagnostic effect of the model for 8 training samples;
FIG. 11 shows the diagnostic effect of the model for 16 training samples;
FIG. 12 shows the diagnostic effect of the model for 32 training samples;
Detailed Description
To address the shortcomings of the prior art, the invention provides a small-sample rolling bearing fault diagnosis method based on a convolutional transformer generative adversarial network. The method generates time-series signal samples to expand the original training sample set, thereby improving rolling bearing fault diagnosis accuracy under small-sample conditions.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art based on the embodiments of the present invention without inventive step, are within the scope of the present invention.
Referring to FIG. 1, the invention provides a small-sample rolling bearing fault diagnosis method based on a Convolutional Transformer Generative Adversarial Network (CoT-GAN), which addresses the difficulty of achieving accurate fault diagnosis when operating data are scarce. First, signal data of a rolling bearing under actual operating conditions are acquired and standardized. Next, a generator and a discriminator with a convolution and transformer cross structure are constructed, as shown in FIG. 2 and FIG. 3, respectively, so that both the local and the global time-domain features of the time-series signal can be extracted. In addition, position encodings are embedded into the time-series signal so that the model can learn the positional information of the signal. Finally, sufficient time-series signal samples are generated to improve fault diagnosis accuracy under small-sample conditions.
The public bearing data set of Case Western Reserve University (CWRU) is widely used to verify fault diagnosis performance. FIG. 4 shows the CWRU bearing test stand, which consists of a 2 hp motor, a torque sensor, a dynamometer and other control devices. Single-point faults with diameters of 0.007, 0.014 and 0.021 inches were seeded on the inner race, outer race and rolling elements of the bearing by electro-discharge machining. An accelerometer collects vibration signals at loads of 0 to 3 horsepower. Vibration signals were acquired at sampling frequencies of 12 kHz and 48 kHz using a 16-channel DAT recorder. In this experiment, vibration data collected from the drive-end bearing with a fault diameter of 0.021 inches, a load of 0 hp and a sampling frequency of 12 kHz are used for analysis. Four bearing health conditions are selected for classification: healthy, outer-race fault, inner-race fault and ball fault. Each class contains 100 samples, and each sample contains 1024 data points. For training, 1 to 32 samples are randomly drawn from each class, and the remaining samples are used as test data.
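The per-class sampling protocol described above can be sketched as follows. The code works on label indices only; the seed, the helper name `split_per_class` and the choice of 8 training samples per class are illustrative.

```python
import random

def split_per_class(labels, n_train, seed=0):
    """Pick n_train random sample indices per class for training; the rest are test."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        train_idx += indices[:n_train]
        test_idx += indices[n_train:]
    return train_idx, test_idx

# 4 health conditions x 100 samples each, as in the experiment.
labels = [c for c in range(4) for _ in range(100)]
train_idx, test_idx = split_per_class(labels, n_train=8)
```

With 8 training samples per class this yields 32 training and 368 test samples; repeating the call with `n_train` set to 1, 2, 4, 16 or 32 reproduces the other settings shown in FIG. 7 to FIG. 12.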
For the hyperparameters, CoT-GAN is optimized with the Adam optimizer. To make training more stable, a label smoothing strategy is adopted in which the real label is set to 0.9 and the fake label to 0.1. The batch size is 4, the learning rate of the discriminator is 0.0003, the learning rate of the generator is 0.0005, and the model is trained for 1000 iterations in total.
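The settings above can be collected in a small configuration, and the effect of label smoothing can be shown on a binary cross-entropy loss. The use of binary cross-entropy for the real/fake head is an assumption, and the dictionary keys and function names are hypothetical.

```python
import math

CONFIG = {
    "optimizer": "Adam",
    "batch_size": 4,
    "lr_discriminator": 3e-4,
    "lr_generator": 5e-4,
    "iterations": 1000,
    "real_label": 0.9,   # smoothed target for real samples
    "fake_label": 0.1,   # smoothed target for generated samples
}

def bce(p, target, eps=1e-12):
    """Binary cross-entropy between a predicted probability and a soft target."""
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake, cfg=CONFIG):
    """Average real/fake loss over a batch using the smoothed labels."""
    losses = [bce(p, cfg["real_label"]) for p in d_real]
    losses += [bce(p, cfg["fake_label"]) for p in d_fake]
    return sum(losses) / len(losses)

loss_matched = discriminator_loss([0.9], [0.1])         # outputs equal to the soft targets
loss_overconfident = discriminator_loss([0.99], [0.01])
```

With smoothed targets the loss is smallest when the discriminator outputs 0.9 and 0.1 rather than values close to 1 and 0, which discourages overconfident predictions and is one reason the strategy tends to stabilize training.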
Based on the above description, according to the invention, the specific process is implemented as follows:
1) Standardize the experimental data $X = [x_1, x_2, \ldots, x_{100}]$, where each sample $x_i \in \mathbb{R}^{1 \times 1024}$: compute the mean and the standard deviation of X, and standardize X according to formula (1) to obtain the standardized data;
3) Draw random noise $z = [z_1, z_2, \ldots, z_k]$ from a standard normal distribution (mean 0, variance 1), and embed the corresponding fault category label $c_i$, where $i \in \{1, 2, 3, 4\}$, into the random noise z to obtain random noise Z containing the fault category label;
4) According to formula (2), input the random noise Z containing the fault category label into a one-dimensional convolution embedding module, which converts the sequence of data points into fixed-size patches $Z_p = [Z_1, Z_2, \ldots, Z_K]$;
5) According to formula (3), add position information to the patches to obtain the patch sequence $T_Z = [T_{Z,1}, T_{Z,2}, \ldots, T_{Z,K}]$ carrying position information, and send it into the network layers of the interleaved Transformer-convolution structure;
6) Obtain the output vector of the Transformer according to formulas (5), (6) and (7), and send it into the convolution layer that follows the Transformer to obtain the output vector of the Transformer-convolution cross module; this operation can be expressed by formula (10);
7) According to formula (12), obtain the final output vector of the generator, i.e. the generated signal samples;
8) Input the generated signals and the training samples into the discriminator to train the discriminator;
9) As in step 4), convert the generated signal and the real signal into fixed-size patches by one-dimensional convolution embedding;
10) As in step 5), add position information codes to the fixed-size patches and input them into the network layers of the interleaved Transformer-convolution structure;
11) Obtain the output vector of the last Transformer-convolution cross module of the discriminator according to formulas (15) and (17), and reshape it;
12) Send the final output vector into a binary-classification fully connected layer and a multi-class fully connected layer, and apply the activation functions of formulas (8) and (9) to the outputs of these layers to obtain the probability that the input is real data and the probabilities of the fault categories;
13) Train the generator and the discriminator alternately until a Nash equilibrium state is eventually reached, and generate signal samples. The generated signals are plotted in Fig. 5, where the upper part shows the original signal and the lower part shows the generated signal;
14) Add the generated signals to the original training samples to obtain an enhanced data set, where H denotes the total number of samples;
15) Use the enhanced data set to train the fault classifier, and use the test data set to perform fault diagnosis. The diagnostic results obtained with the enhanced data set and with the original data set are shown in Table 1. In the first column of Table 1, the 4 indicates that there are four classes in total, and the number it is multiplied by is the number of training samples in each class. As Table 1 shows, training the fault classifier with the enhanced data yields a far better diagnostic result than using only the original small-sample data set, and the result improves further as the number of samples used to train CoT-GAN and the number of generated samples increase. Fig. 6 illustrates how the diagnostic performance of the model varies with the number of training samples, with the number of generated samples for each category fixed at 10. As Fig. 6 shows, as the number of training samples increases, CoT-GAN can effectively generate synthetic samples for training the classifier, thereby improving the fault diagnosis accuracy under small-sample conditions. To further show the classification accuracy of each fault class for different numbers of training samples, confusion matrices are used to present the per-class classification results. As shown in Figs. 7-12, the classification result for each category also improves markedly as the number of training samples increases.
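The data-side steps of the process above (standardization in step 1, label-conditioned noise in step 3 and data-set enhancement in step 14) can be sketched in plain Python. This is a rough illustration under stated assumptions: formula (1) is not reproduced in this text, so an ordinary z-score computed per sample is assumed; the label is appended to the noise as a one-hot vector, which is only one possible reading of "embedding"; and all function names are hypothetical.

```python
import random
import statistics


def standardize(signal):
    """Assumed form of formula (1): subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)  # population standard deviation
    if std == 0.0:
        raise ValueError("constant signal cannot be standardized")
    return [(v - mean) / std for v in signal]


def conditioned_noise(k, label, num_classes=4, rng=None):
    """Draw k standard-normal values and append a one-hot fault label (assumption)."""
    rng = rng or random.Random()
    z = [rng.gauss(0.0, 1.0) for _ in range(k)]
    one_hot = [1.0 if c == label else 0.0 for c in range(num_classes)]
    return z + one_hot


def enhance(real_samples, generated_samples):
    """Step 14: add the generated signals to the original training samples."""
    return list(real_samples) + list(generated_samples)


rng = random.Random(0)
raw = [rng.uniform(-1.0, 1.0) for _ in range(1024)]  # stand-in for one vibration sample
x_std = standardize(raw)
Z = conditioned_noise(k=100, label=2, rng=rng)       # k=100 is an arbitrary choice
enhanced = enhance([x_std], [[0.0] * 1024])          # placeholder "generated" sample
```

Here class labels are indexed 0 to 3 for convenience, whereas the text numbers them 1 to 4.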
In summary, the method can effectively diagnose faults under small-sample conditions, and is therefore of great benefit to the fault diagnosis of rolling bearings with small samples.
Table 1. Comparison of diagnostic accuracy (%) using the enhanced and original data sets
Claims (4)
1. A small-sample rolling bearing fault diagnosis method based on a convolutional Transformer generative adversarial network, characterized by comprising the following steps:
(1) First, acquire historical operating data of the rolling bearing, standardize the data, and divide the standardized signal samples into training samples and test samples;
(2) Construct a generative adversarial network with an interleaved convolution-Transformer structure; use the generator to transform random noise into generated signals whose distribution is similar to that of the real signals; use the discriminator to perform real/fake discrimination and category discrimination on the generated and real signals; train the generator and the discriminator alternately in a zero-sum game so as to improve model performance until a Nash equilibrium state is reached, and finally generate signal samples; add the generated signal samples to the original training samples to form an enhanced data set for training a fault classifier;
(3) Use the fault classifier trained in step (2) to identify and classify faults in the test samples, completing the final fault diagnosis task.
2. The small-sample rolling bearing fault diagnosis method based on a convolutional Transformer generative adversarial network as claimed in claim 1, characterized in that the specific steps of step (1) are as follows:
1) Obtain historical data X of the rolling bearing under actual operating conditions, where n denotes the number of samples and m denotes the sample dimension (which also represents the total number of collected samples); compute the mean and the standard deviation σ of the historical data X, and standardize X to obtain the standardized data,
where i = 1, 2, ..., n;
3. The small-sample rolling bearing fault diagnosis method based on a convolutional Transformer generative adversarial network as claimed in claim 1, characterized in that in step (2), the signal samples are generated using the generative adversarial network with the interleaved convolution-Transformer structure, and the specific steps are as follows:
1) Draw random noise z from a standard normal distribution with mean 0 and variance 1, and embed the corresponding fault class label c into the random noise to obtain random noise Z = [z, c] containing the fault class label;
2) Convert the input signal into a plurality of fixed-size patches by one-dimensional convolution embedding, and embed position coding information into each patch;
3) Construct a generative adversarial network with an interleaved convolution-Transformer structure, and extract the global and local features of the signal using the Transformer layers and the convolution layers, respectively; send the random noise patch sequence carrying position information into the generator with the interleaved Transformer-convolution structure to generate signal samples; apply the patch operation to the generated signal and the real signal, embed position information, then mix them and send them to the discriminator with the interleaved Transformer-convolution structure for learning; at the end of the discriminator, output a binary prediction label and a multi-class prediction label using the Sigmoid and Softmax activation functions, so that real/fake discrimination and category discrimination are performed by comparison with the real labels;
4) Train the generator and the discriminator alternately in a zero-sum game until a Nash equilibrium state is reached, and finally generate signal samples;
5) Add the generated signal samples to the original training samples to form an enhanced data set for training the fault classifier.
4. The small-sample rolling bearing fault diagnosis method based on a convolutional Transformer generative adversarial network as claimed in claim 3, characterized in that in step (2), the specific calculation process is as follows:
1) Perform a convolution operation on the input signal with a one-dimensional convolution kernel sliding without overlap, so that the input signal is divided into a plurality of fixed-size patches, and embed in each patch a position code that is learnable during model training; the one-dimensional convolution formula and the position coding formula are as follows:
where $v_i$ and $u_j$ are the input of the i-th channel and the output of the j-th channel, respectively; k is the convolution kernel, b is the bias, and * denotes the convolution operation; $M_j$ is the set of channels used to compute the output of the j-th channel;
where $U_p$ denotes the different patches, E denotes a learnable embedding matrix, $E_{pos}$ denotes a learnable position information matrix, and $T_U$ denotes the final patch sequence combined with the position codes;
2) The generator and the discriminator are trained alternately in a zero-sum game until Nash equilibrium is reached; the objective function of CoT-GAN is expressed as follows:
where $P_{data}$ is the real data distribution, $P_g$ is the data distribution of the generated samples, D(s) denotes the probability that s comes from the real data, and the corresponding discriminator output for a generated sample denotes the probability that it comes from noise data; the two expectations are taken over the real data distribution and over the data synthesized from noise, respectively; $P(Y = y \mid S_{real})$ denotes the conditional probability distribution over the class labels; the optimization of the generator and the discriminator is a two-player minimax problem, formalized as the following equation:
3) Train a fault classifier with the enhanced data set; the objective function of the fault classifier is expressed as follows:
where x denotes an input sample of the fault classifier, y denotes the data label output by the classifier, $P_{data}$ and $P_g$ denote the data distributions of the real and generated samples, respectively, and $P(Y = y \mid x)$ denotes the conditional probability distribution over the class labels.
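The patch operation recited in claim 4, step 1) can be illustrated with a small sketch in plain Python. It is not the claimed formula, which is not reproduced in this text: the sketch assumes a single input channel, a kernel length equal to the stride (so windows do not overlap), the sliding dot product that deep-learning libraries call convolution, and a position code added element-wise to each patch embedding. The function names, the patch size of 16 and the embedding width of 8 are hypothetical.

```python
import random


def conv1d_patch_embed(signal, kernels, biases, patch_size):
    """Split a 1-D signal into non-overlapping patches and embed each one.

    Each kernel has length patch_size and the stride equals patch_size,
    so every window is used exactly once. One kernel yields one output channel.
    """
    if len(signal) % patch_size != 0:
        raise ValueError("signal length must be a multiple of patch_size")
    patches = []
    for start in range(0, len(signal), patch_size):
        window = signal[start:start + patch_size]
        patches.append([
            sum(w * x for w, x in zip(kernel, window)) + bias
            for kernel, bias in zip(kernels, biases)
        ])
    return patches


def add_position_code(patches, pos_codes):
    """Add one position vector (learnable in a real model) to each patch embedding."""
    return [
        [p + e for p, e in zip(patch, code)]
        for patch, code in zip(patches, pos_codes)
    ]


rng = random.Random(0)
PATCH_SIZE, EMBED_DIM = 16, 8                      # hypothetical sizes
signal = [rng.gauss(0.0, 1.0) for _ in range(1024)]
kernels = [[rng.gauss(0.0, 0.1) for _ in range(PATCH_SIZE)] for _ in range(EMBED_DIM)]
biases = [0.0] * EMBED_DIM
num_patches = len(signal) // PATCH_SIZE
pos_codes = [[rng.gauss(0.0, 0.1) for _ in range(EMBED_DIM)] for _ in range(num_patches)]

tokens = add_position_code(
    conv1d_patch_embed(signal, kernels, biases, PATCH_SIZE), pos_codes
)
```

In a trained model the kernels, biases and position codes would be parameters updated by the optimizer; here they are random placeholders.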
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211233344.8A CN115859142A (en) | 2022-10-10 | 2022-10-10 | Small sample rolling bearing fault diagnosis method based on convolution transformer generation countermeasure network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115859142A (en) | 2023-03-28 |
Family
ID=85661365
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211233344.8A Pending CN115859142A (en) | 2022-10-10 | 2022-10-10 | Small sample rolling bearing fault diagnosis method based on convolution transformer generation countermeasure network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115859142A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117076935A (en) * | 2023-10-16 | 2023-11-17 | 武汉理工大学 | Digital twin-assisted mechanical fault data lightweight generation method and system |
CN117076935B (en) * | 2023-10-16 | 2024-02-06 | 武汉理工大学 | Digital twin-assisted mechanical fault data lightweight generation method and system |
CN117152548A (en) * | 2023-11-01 | 2023-12-01 | 山东理工大学 | Method and system for identifying working conditions of actually measured electric diagram of oil pumping well |
CN117152548B (en) * | 2023-11-01 | 2024-01-30 | 山东理工大学 | Method and system for identifying working conditions of actually measured electric diagram of oil pumping well |
CN117743947A (en) * | 2024-02-20 | 2024-03-22 | 烟台哈尔滨工程大学研究院 | Intelligent cabin fault diagnosis method and medium under small sample |
CN117743947B (en) * | 2024-02-20 | 2024-04-30 | 烟台哈尔滨工程大学研究院 | Intelligent cabin fault diagnosis method and medium under small sample |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||