CN115962946A

CN115962946A - Bearing fault diagnosis method based on improved WGAN-GP and Alxnet

Info

Publication number: CN115962946A
Application number: CN202310093646.8A
Authority: CN
Inventors: 付文龙; 陈禹朋; 谭超; 张赟宁; 蒋晓辉
Original assignee: China Three Gorges University CTGU
Current assignee: China Three Gorges University CTGU
Priority date: 2023-01-18
Filing date: 2023-01-18
Publication date: 2023-04-14

Abstract

A bearing fault diagnosis method based on an improved WGAN-GP and Alxnet comprises: obtaining an original vibration signal of a bearing and converting it, by wavelet transformation, into time-frequency domain signals of different fault categories; dividing the time-frequency domain signals into a training set and a test set and constructing an unbalanced data set; constructing an improved WGAN-GP network in which a self-attention module and a DenseNet module are fused into the generator so that important global information is learned automatically; taking the training set of the unbalanced data set as the input of the improved WGAN-GP network and adding the fault data generated by the generator to the unbalanced data set for data expansion, finally obtaining a balanced data set; and training an Alxnet classifier on the balanced data set and evaluating its fault diagnosis performance. Because the unbalanced-data problem is addressed with data generated by the improved WGAN-GP, the fault diagnosis network can extract more effective features, and the method achieves good classification accuracy across different fault categories.

Description

Bearing fault diagnosis method based on improved WGAN-GP and Alxnet

Technical Field

The invention relates to the technical field of bearing fault diagnosis, in particular to a bearing fault diagnosis method based on improved WGAN-GP and Alxnet.

Background

In industrial production, rotating machinery is widely used in equipment such as compressors, fans, turbines, generators, gas turbines, aircraft engines and various electric motors. The safe operation of mechanical equipment is a core requirement of industrial production. The bearing is an essential part of mechanical devices and electric power systems, and its running state affects the safety and reliability of the whole equipment. However, when a rolling bearing operates under complex working conditions for a long time, it is prone to defects such as spalling, wear, fracture, indentation and scuffing, which often cause huge economic losses and injury to workers. Reliable detection and diagnosis of rolling bearings is therefore necessary.

With the continuous development of artificial intelligence technology, more and more methods apply deep learning to bearing fault diagnosis. However, most studies assume a balanced data set. In practice a bearing operates in the normal state for most of its life, so collecting high-quality samples in fault states is costly and difficult. As a result, normal samples and fault samples are often severely unbalanced in real scenarios, which poses a major obstacle to fault diagnosis.

Traditional approaches to data imbalance fall into two groups: sample resampling and improved diagnosis models. Resampling comprises over-sampling and under-sampling. Over-sampling copies minority-class samples in large numbers until their count matches that of the majority class, which makes the classifier prone to over-fitting. Under-sampling does the opposite: it reduces the number of majority-class samples until it roughly matches the minority class, which may discard important information and leave the information space distorted and incomplete. On the algorithm side, improved diagnosis models raise diagnostic accuracy by adjusting the sensitivity of the classifier, but the improvement is limited and the optimal weights are difficult to obtain.

Although the data enhancement methods proposed for data imbalance work well in specific situations, they do not qualitatively change data diversity. Their learning ability is weak, and once diagnosis accuracy reaches a certain level it is difficult to raise further, so their effect still needs to be improved.

Disclosure of Invention

The invention addresses the situation in which some fault categories are difficult to obtain in bearing fault diagnosis, so that the data set is unbalanced. It provides a bearing fault diagnosis method based on improved WGAN-GP and Alxnet. Because the data generated by the improved WGAN-GP is closer to the real data, the problem of unbalanced data is addressed, the fault diagnosis network extracts more effective features, and good classification accuracy is obtained under different fault categories.

The technical scheme adopted by the invention is as follows:

The bearing fault diagnosis method based on the improved WGAN-GP and the Alxnet comprises the following steps:

Step 1: acquiring an original vibration signal of a bearing by using vibration acquisition equipment, and converting the original vibration signal into time-frequency domain signals of different fault categories by using wavelet transformation;

Step 2: dividing the time-frequency domain signals obtained in step 1 into a training set and a test set according to the ratio of 8;

Step 3: constructing an improved WGAN-GP network, fusing a self-attention module and a DenseNet module in the generator to automatically learn important global information, and adding switchable normalization in the discriminator to enhance the generalization capability of the network;

Step 4: taking the training set of the unbalanced data set as the input of the improved WGAN-GP network, iterating the generator and the discriminator repeatedly until Nash equilibrium is reached, adding the fault data generated by the generator into the unbalanced data set for data expansion, and finally obtaining four balanced data sets;

Step 5: training an Alxnet classifier on the balanced data set and evaluating the fault diagnosis performance.

The step 1 comprises the following steps:

s1.1: mounting a vibration acceleration sensor on a bearing, and acquiring one-dimensional vibration signals of the bearing in different fault states through the vibration acceleration sensor to form a data set containing original vibration signals;

s1.2: wavelet transformation has a strong time-frequency feature extraction capability and yields a two-dimensional time-frequency image containing more useful information; the invention therefore preprocesses the original vibration signal with wavelet transformation and takes the resulting time-frequency domain signal as the input data of the discriminator; the formula of the wavelet transformation is as follows:

$$W(s,\mu)=\frac{1}{\sqrt{s}}\int_{-\infty}^{+\infty}x(t)\,\psi^{*}\!\left(\frac{t-\mu}{s}\right)\mathrm{d}t$$

wherein $x(t)$ is the input vibration signal; $s$ is the scale factor; $\mu$ is the translation factor; $t$ represents the time variable; $\psi(\cdot)$ is the wavelet basis function and $\psi^{*}(\cdot)$ its complex conjugate; the Morlet wavelet is selected as the basis function of the continuous wavelet transformation to process the original vibration signal data.
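As a minimal illustration of the wavelet preprocessing in step 1, the following Python sketch discretizes the continuous wavelet transformation with a complex Morlet basis function. The function names, the centre frequency `w0 = 6` and the synthetic 1 kHz test signal are assumptions made for illustration and are not specified by the method; the relation between scale and frequency used here is the usual approximation for the Morlet wavelet.

```python
import numpy as np

def morlet(t, w0=6.0):
    """Complex Morlet wavelet; w0 is an assumed centre frequency."""
    return np.pi ** -0.25 * np.exp(1j * w0 * t) * np.exp(-0.5 * t ** 2)

def cwt_morlet(x, scales, dt=1.0, w0=6.0):
    """Discretised W(s, mu) = 1/sqrt(s) * integral of x(t) psi*((t - mu)/s) dt."""
    t = np.arange(len(x)) * dt
    shifted = t[None, :] - t[:, None]            # shifted[j, k] = t_k - mu_j
    rows = []
    for s in scales:
        kernel = np.conj(morlet(shifted / s, w0))
        rows.append((kernel @ x) * dt / np.sqrt(s))
    return np.array(rows)                        # shape: (len(scales), len(x))

# Synthetic signal sampled at 12 kHz (illustrative data only).
fs = 12000.0
t = np.arange(512) / fs
x = np.cos(2 * np.pi * 1000.0 * t)

# Scale s and frequency f are related approximately by s = w0 / (2 * pi * f).
freqs = np.array([500.0, 1000.0, 2000.0])
scalogram = np.abs(cwt_morlet(x, 6.0 / (2 * np.pi * freqs), dt=1 / fs))
```

The magnitude array `scalogram` plays the role of the two-dimensional time-frequency image; in practice a denser grid of scales would be used before the image is resized for the network.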

In step 2, in order to verify the generalization performance of the Alxnet model under different unbalanced data sets, the number of training samples is limited and four comparative experimental data sets are set, with imbalance ratios of 1:1, 2:1, 5:1 and 10:1, corresponding to data sets A, B, C and D, respectively.
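The construction of an unbalanced data set can be sketched as follows, assuming that the imbalance ratio is the number of normal samples divided by the number of samples kept for each fault class. The helper name `make_unbalanced`, the random subsampling and the toy data are hypothetical and serve only as an illustration.

```python
import random

def make_unbalanced(samples_by_class, normal_label, ratio, seed=0):
    """Keep every normal sample and len(normal) // ratio samples per fault class."""
    rng = random.Random(seed)
    n_fault = len(samples_by_class[normal_label]) // ratio
    result = {}
    for label, samples in samples_by_class.items():
        if label == normal_label:
            result[label] = list(samples)
        else:
            # Random subsampling is an assumption; any selection rule could be used.
            result[label] = rng.sample(list(samples), n_fault)
    return result

# Toy data: class 0 is the normal state, classes 1 and 2 are fault states.
data = {c: list(range(100)) for c in range(3)}
dataset_c = make_unbalanced(data, normal_label=0, ratio=5)
```

With a ratio of 5, each fault class keeps one fifth as many samples as the normal class; the generator is later used to fill this gap.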

In step 3, the discriminator consists of four convolution layers and two fully connected layers; the generator consists of four deconvolution layers and two fully connected layers; step 3 comprises the following steps:

step S3.1: in the generator, Batch Normalization (BN) and a ReLU activation function are arranged after each deconvolution layer;

in the discriminator, because the improved WGAN-GP network applies the gradient penalty to each sample independently, BN is replaced by Switchable Normalization (SN), which increases the generalization capability of the model, and the activation function is changed to LeakyReLU; SN uses differentiable learning to determine an appropriate normalization operation for each normalization layer in the deep network, combines BN, Layer Normalization (LN) and Instance Normalization (IN) through six weights, and automatically finds an appropriate normalization method during training; the output of the SN operation may be defined as:

$$y_{sn}=w_{bn}\,y_{bn}+w_{Ln}\,y_{Ln}+w_{In}\,y_{In}$$

in the formula, $y_{bn}$, $y_{Ln}$ and $y_{In}$ respectively represent the normalized values of the input vector obtained through BN, LN and IN; $w_{bn}$, $w_{Ln}$ and $w_{In}$ represent the weights of the respective normalizations;
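The weighted combination above can be illustrated with a short NumPy sketch. It follows the simplified three-weight formula given in the text (a weighted sum of the three normalized outputs); obtaining the weights from a softmax over three learnable scores, the function names and the omission of scale and shift parameters are assumptions made for illustration.

```python
import numpy as np

def _normalize(x, axes, eps=1e-5):
    """Zero-mean, unit-variance normalization over the given axes."""
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def switchable_norm(x, logits):
    """x: (N, C, H, W); logits: three scores for BN, LN and IN (assumed learnable)."""
    w = np.exp(logits - np.max(logits))
    w = w / w.sum()                      # softmax so that the weights sum to 1
    y_bn = _normalize(x, (0, 2, 3))      # statistics per channel across the batch
    y_ln = _normalize(x, (1, 2, 3))      # statistics per sample across channels
    y_in = _normalize(x, (2, 3))         # statistics per sample and per channel
    return w[0] * y_bn + w[1] * y_ln + w[2] * y_in

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3, 8, 8))
y = switchable_norm(x, np.zeros(3))      # equal weights for BN, LN and IN
```

Only the BN branch depends on other samples in the batch, so shifting weight towards LN and IN reduces the cross-sample coupling that conflicts with a per-sample gradient penalty.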

step S3.2: in order to strengthen feature transfer in the generator and to reduce the number of parameters and the vanishing-gradient problem in the network, a DenseNet network is added after the first convolution layer of the generator; the DenseNet comprises dense blocks and transition blocks, wherein a dense block defines a dense connection between input and output: every layer is directly connected to all subsequent layers, which alleviates the vanishing-gradient problem, enhances feature propagation and greatly reduces the number of parameters; the DenseNet structure of the invention uses three dense blocks, and within one dense block the features of the i-th layer are represented by the following formula:

$$x_i=H_i([x_0,x_1,\ldots,x_{i-1}])$$

wherein $[x_0,x_1,\ldots,x_{i-1}]$ represents the input of the i-th layer, namely the concatenation of the feature maps from layer 0 to layer i-1; $H_i(\cdot)$ represents a composite function of three consecutive operations: BN, ReLU and the convolution operation Conv;
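The dense connection can be sketched in NumPy as follows. To keep the example short, the convolution inside $H_i$ is replaced by a 1×1 convolution with random weights; this simplification, the growth rate of 4 and the function names are assumptions for illustration only.

```python
import numpy as np

def h_layer(x, weight, eps=1e-5):
    """BN -> ReLU -> 1x1 Conv on x of shape (N, C, H, W); weight: (C_out, C)."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x = np.maximum((x - mean) / np.sqrt(var + eps), 0.0)
    return np.einsum("oc,nchw->nohw", weight, x)

def dense_block(x0, num_layers, growth_rate, rng):
    """Each layer receives the concatenation of all earlier feature maps."""
    features = [x0]
    for _ in range(num_layers):
        inp = np.concatenate(features, axis=1)          # [x_0, x_1, ..., x_{i-1}]
        w = rng.standard_normal((growth_rate, inp.shape[1])) * 0.1
        features.append(h_layer(inp, w))                # x_i = H_i([...])
    return np.concatenate(features, axis=1)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((2, 8, 4, 4))
out = dense_block(x0, num_layers=3, growth_rate=4, rng=rng)
```

Each layer adds only `growth_rate` channels while reusing every earlier feature map, which is why the spatial size must stay constant inside a dense block.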

step S3.3: in order to solve the problem that long-range interdependent features are difficult to capture, a self-attention mechanism is added to the generator and discriminator networks;

in the self-attention layer, a series of attention weight coefficients highlights the important information of the target object and suppresses irrelevant detail, so that global information is associated and both local and global relations are captured flexibly in a single step; the formula for the self-attention mechanism is as follows:

$$\mathrm{Attention}(Q,K,V)=\mathrm{Softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

in the formula, the input of the self-attention layer is convolved with three different weights to generate the three matrices Q (Query), K (Key) and V (Value); Q and K are first multiplied, and the product is divided by the scale $\sqrt{d_k}$, where $d_k$ is the dimension of the query and key vectors, to prevent the result from becoming too large; the Softmax operation normalizes the result into a probability distribution, which is multiplied by the matrix V to obtain the final output of the self-attention mechanism;
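A minimal NumPy sketch of scaled dot-product self-attention is given below. A feature map is assumed to be flattened so that each row of `x` is one spatial position, and the three convolutions are represented by plain weight matrices; these choices and all names are illustrative assumptions.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)      # subtract the max for stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """x: (positions, channels). Returns Softmax(Q K^T / sqrt(d_k)) V and the weights."""
    q, k, v = x @ wq, x @ wk, x @ wv
    d_k = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d_k), axis=-1)
    return attn @ v, attn

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))                 # e.g. a flattened 4x4 map, 8 channels
wq, wk, wv = (rng.standard_normal((8, 4)) for _ in range(3))
out, attn = self_attention(x, wq, wk, wv)
```

Every row of `attn` is a probability distribution over all positions, so each output position can draw on information from the whole feature map in one step.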

step S3.4: a double time scale update rule (TTUR) with descending random gradient is applied to an improved WGAN-GP network, different learning rates are set for a discriminator and a generator, and the problems of instability and too long training time in the network training process are solved.

Step 3 mainly constructs the structure of the generative adversarial network.

Step 4 below covers the training of the generative adversarial network and the construction of the balanced data set.

In step 4, after the improved WGAN-GP network is constructed, the generator and the discriminator are trained alternately for n rounds; training stops when Nash equilibrium is reached, and the generator network is then extracted to generate samples of the different fault types in turn to supplement the unbalanced data set.

In step 4, the specific training step of the improved WGAN-GP network includes:

s4.1: in the generative adversarial network, the Wasserstein distance is introduced as a new measure of the distance between distributions, and a gradient penalty term is added to the original loss function so that the weights of the discriminator satisfy the 1-Lipschitz constraint, namely for two images $x_1$ and $x_2$ the absolute value of the difference between the discriminator outputs must be less than or equal to the absolute value of the average pixel-by-pixel difference;

the generator is fixed, the data generated by the generator and the real data are input into the discriminator, the obtained discrimination result is compared with the real result, and the network parameters are modified through the discriminator loss function $L_D$, so that the discriminator gives low scores to generated data and high scores to real data; the gradient penalty $L_{GP}$ and the discriminator loss function $L_D$ are as follows:

$$L_{GP}=\lambda\,\mathbb{E}_{\hat{x}\sim P_{\hat{x}}}\left[\left(\left\|\nabla_{\hat{x}}D(\hat{x}\mid C)\right\|_{\rho}-1\right)^{2}\right]$$

$$L_{D}=\mathbb{E}_{Z\sim P_{z}}\left[D(G(Z\mid C))\right]-\mathbb{E}_{X\sim P_{r}}\left[D(X\mid C)\right]+L_{GP}$$

in the formula, $\|\cdot\|_{\rho}$ represents the ρ norm; $Z$ represents the noise vector input to the generator; $C$ is the category label corresponding to each image; $G(Z\mid C)$ represents the samples generated by the generator; $P_z$ is the probability distribution of the generated samples; $\mathbb{E}_{Z\sim P_z}$ represents the expected value over the distribution of the generated samples; $X$ represents a real sample; $P_r$ represents the probability distribution of the real samples; $\mathbb{E}_{X\sim P_r}$ represents the expected value over the distribution of the real samples; $D(X\mid C)$ represents the output value of a real sample passing through the discriminator; $\lambda$ represents the regularization coefficient; $\hat{x}$ is obtained by random interpolation sampling on the line between a real sample $x$ and a generated sample $G_z$, calculated as

$$\hat{x}=\varepsilon x+(1-\varepsilon)G_z$$

in which the parameter $\varepsilon$ obeys the uniform distribution on [0,1]; $\nabla_{\hat{x}}D(\hat{x}\mid C)$ is the gradient of the discriminator output with respect to the interpolation $\hat{x}$;
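The discriminator loss with gradient penalty can be illustrated with a toy example in which the discriminator is a linear function, so that its gradient with respect to the input is known in closed form. The linear critic, the use of the 2-norm, the coefficient `lam = 10` and all names are assumptions chosen for illustration; a real implementation would obtain the gradient by automatic differentiation.

```python
import numpy as np

def critic(x, w):
    """Toy linear critic D(x) = w . x, so grad_x D(x) = w for every x."""
    return x @ w

def discriminator_loss(real, fake, w, lam=10.0, seed=0):
    rng = np.random.default_rng(seed)
    eps = rng.uniform(0.0, 1.0, size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake          # random interpolation
    grads = np.broadcast_to(w, x_hat.shape)          # analytic gradient at each x_hat
    gp = lam * np.mean((np.linalg.norm(grads, axis=1) - 1.0) ** 2)
    loss = critic(fake, w).mean() - critic(real, w).mean() + gp
    return loss, gp

real = np.ones((4, 3))
fake = np.zeros((4, 3))
loss_unit, gp_unit = discriminator_loss(real, fake, np.array([1.0, 0.0, 0.0]))
loss_steep, gp_steep = discriminator_loss(real, fake, np.array([2.0, 0.0, 0.0]))
```

A critic whose gradient norm equals 1 incurs no penalty, whereas the steeper critic separates the two sets more strongly but is penalized for violating the 1-Lipschitz constraint.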

s4.2: a class label is introduced into the generator as an additional input layer and combined with the noise input of the generator to form a joint hidden representation, which prompts the generator to generate, under conditional supervision, samples with the features of a specific class; the $L_1$ function is then added to the generator loss function $L_G$ to capture the low-frequency characteristics of the image, so that the generated pictures are clearer and more realistic; the discriminator is fixed and the generator network parameters are modified through the generator loss function $L_G$, so that the probability distribution of the generated data is as close as possible to the real data distribution and the discriminator D cannot distinguish real samples from generated ones, which achieves the purpose of supplementing the data set; the $L_1$ loss function and the generator loss function $L_G$ are as follows:

$$L_1=\frac{1}{n}\sum_{i=1}^{n}\left|y_i-G(Z_i\mid C_i)\right|$$

$$L_G=-\mathbb{E}_{Z\sim P_z}\left[D(G(Z\mid C))\right]+\lambda L_1$$

in the formula, $y$ represents a real data picture; $Z$ is the noise vector input to the generator; $C$ is the category label corresponding to each noise vector; $n$ is the batch size of the input samples; $\mathbb{E}_{Z\sim P_z}$ represents the expected value over the distribution of the generated samples; $\lambda$ is the $L_1$ loss hyperparameter, which is set to 100 here.
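The generator objective can be sketched as follows, assuming that the absolute difference in the $L_1$ term is averaged over the pixels of each image before it is averaged over the batch; that reduction, the function names and the toy inputs are assumptions, while the weight of 100 is the value stated in the text.

```python
import numpy as np

def l1_loss(y, g):
    """Per-image mean absolute difference, averaged over the batch of n images."""
    per_image = np.abs(y - g).reshape(len(y), -1).mean(axis=1)
    return per_image.mean()

def generator_loss(scores_fake, y, g, lam=100.0):
    """L_G = -E[D(G(Z|C))] + lam * L1, with lam = 100 as stated in the text."""
    return -np.mean(scores_fake) + lam * l1_loss(y, g)

# Toy batch of two 3x4x4 images and the critic scores of the generated images.
y = np.ones((2, 3, 4, 4))
g = np.zeros_like(y)
scores = np.array([0.5, 1.5])
loss_g = generator_loss(scores, y, g)
```

With a weight of 100 the pixel-wise term dominates the adversarial term in this toy case, which pulls generated images towards the real ones at coarse scale while the discriminator term sharpens detail.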

In step 5, fault diagnosis is performed by a purpose-designed Alxnet network. The Alxnet mainly comprises five convolutional layers and three fully connected layers, and a Dropout layer is added to the network, which alleviates the vanishing-gradient problem that arises when the network has too many layers, improves the richness of the features and reduces the loss of information.

The step 5 comprises the following steps:

S5.1: taking the expanded balanced data set as input, and extracting deep features through the convolutional layers, the ReLU activation function and the maximum pooling layers;

S5.2: optimizing the network weights with the Adam optimizer according to the loss function of the Alxnet classifier, so as to improve the diagnosis effect as far as possible; the Alxnet classifier adopts a cross-entropy loss function, whose formula is as follows:

$$L=-\sum_{i}y^{(i)}\log\hat{y}^{(i)}$$

in the formula, $y^{(i)}$ is the real label of the sample and $\hat{y}^{(i)}$ is the predicted output value for the sample;

S5.3: inputting the test set into the trained Alxnet model, and outputting the fault diagnosis result.
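The cross-entropy loss used to train the classifier can be illustrated as below. One-hot labels over 10 classes, a softmax output layer and averaging over the batch are assumptions made for this sketch; the text itself only names the loss.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)         # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels, num_classes=10):
    """Mean over the batch of -sum(y * log(y_hat)) with one-hot labels y."""
    y = np.eye(num_classes)[labels]
    y_hat = softmax(logits)
    return -np.mean(np.sum(y * np.log(y_hat + 1e-12), axis=1))

# An untrained classifier that outputs equal scores for all 10 classes.
logits = np.zeros((4, 10))
labels = np.array([0, 1, 2, 3])
loss = cross_entropy(logits, labels)
```

For uniform predictions over 10 classes the loss equals log(10); it approaches 0 as the predicted probability of the true class approaches 1.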

The invention relates to a bearing fault diagnosis method based on improved WGAN-GP and Alxnet, which has the following technical effects:

1) The invention combines the improved WGAN-GP data enhancement method with the Alxnet diagnostic network and prompts the generator to generate, under conditional supervision, samples with the features of a specific class in order to balance the data set; the balanced data set achieves a good diagnostic effect in the Alxnet network.

2) In the construction of the WGAN-GP-based network model, a self-attention mechanism and dense convolution blocks are fused into the generator, so that the generator network can fully extract features and generate realistic fault samples; switchable normalization is used to normalize the weights of the discriminator, so that the problems of instability and overfitting in the training process are fundamentally solved.

3) As the improved WGAN-GP network becomes deeper, the TTUR strategy is applied to the network model; this improves the stability of the network, further shortens its convergence time compared with the original model, and allows high-quality samples to be generated more quickly.

Drawings

FIG. 1 is an overall flow diagram of the method of the present invention.

Fig. 2 is a schematic diagram of the structure of the generator of the present invention.

Fig. 3 is a schematic structural diagram of the discriminator according to the present invention.

Fig. 4 is a schematic diagram of the structure of the classifier of the present invention.

FIG. 5 compares images generated by the invention using the training sets of the four data sets with the original images.

FIG. 6 (a) is a schematic diagram of the distribution of the generated images and the original images on the balanced data set after t-SNE dimension reduction and visualization;

FIG. 6 (b) is a schematic diagram of the distribution of the generated images and the original images on the unbalanced data set after t-SNE dimension reduction and visualization.

FIG. 7 (a) is a graph of classifier training loss and test accuracy against the number of iterations on the balanced data set;

FIG. 7 (b) is a graph of classifier training loss and test accuracy against the number of iterations on the data set with an imbalance ratio of 2:1;

FIG. 7 (c) is a graph of classifier training loss and test accuracy against the number of iterations on the data set with an imbalance ratio of 5:1;

FIG. 7 (d) is a graph of classifier training loss and test accuracy against the number of iterations on the data set with an imbalance ratio of 10:1.

FIG. 8 (a) is the confusion matrix obtained after the test set is input into the classification model on the balanced data set;

FIG. 8 (b) is the confusion matrix obtained after the test set is input into the classification model on the data set with an imbalance ratio of 2:1;

FIG. 8 (c) is the confusion matrix obtained after the test set is input into the classification model on the data set with an imbalance ratio of 5:1;

FIG. 8 (d) is the confusion matrix obtained after the test set is input into the classification model on the data set with an imbalance ratio of 10:1.

Detailed Description

The invention discloses a motor bearing fault diagnosis method that fuses an improved Wasserstein generative adversarial network with gradient penalty (WGAN-GP) and Alxnet. First, vibration signals in different states are obtained from a sensor, time-frequency signals are obtained through wavelet transformation, and data sets with different imbalance ratios are created according to actual conditions; the data set is then input into the WGAN-GP network, the network is trained until Nash equilibrium is reached, and the generator produces data with specific labels to supplement the unbalanced data; finally, the balanced data is fed into the fault diagnosis network for feature extraction and fault classification, realizing feature learning for the different fault categories.

The following detailed description of the embodiments of the invention is made with reference to the accompanying drawings:

As shown in fig. 1, the overall flow of the invention comprises data preprocessing, the generation model and the fault diagnosis model. Data preprocessing comprises three parts: acquiring the original vibration signal, processing the original signal by wavelet transformation, and dividing the result into a training set and a test set in proportion. The generation model mainly uses the adversarial mechanism of the improved WGAN-GP and takes the minority-class samples as network input to generate data that supplements the data set. The fault diagnosis model trains the Alxnet network with the balanced samples as the training set and then feeds the test set into the trained network to carry out fault diagnosis.

In the present invention, the Case Western Reserve University (CWRU) bearing test data set is analyzed. The purpose is to diagnose bearings with different degrees of failure. The experimental equipment comprises a 2 hp motor, a torque sensor decoder, a power tester and an electronic controller. The rotating speed of the bearing is 1750 r/min and the sampling frequency is 12 kHz; with a motor load of 1.5 kW, data for the normal condition (NC), inner ring fault (IR), rolling element fault (B) and outer ring fault (OR) of the bearing are selected, with fault sizes of 0.007 mm, 0.014 mm and 0.021 mm. Bearing fault samples for 10 different states are thus obtained; they are labeled NC, B007, B014, B021, IR007, IR014, IR021, OR007, OR014 and OR021 by fault type and fault size, and the corresponding sample labels are numbered 0 to 9 in sequence.
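The labelling scheme can be written down directly. In the sketch below the list order reproduces the label numbers 0 to 9 given above, while the helper `parse_label` and its return format are hypothetical conveniences added for illustration.

```python
# State names in the order given in the text; the list index is the class label.
LABELS = ["NC", "B007", "B014", "B021", "IR007", "IR014", "IR021",
          "OR007", "OR014", "OR021"]
LABEL_TO_ID = {name: i for i, name in enumerate(LABELS)}

def parse_label(name):
    """Split a state name such as 'IR014' into (fault type, fault size)."""
    if name == "NC":
        return ("NC", None)                      # normal condition has no fault size
    prefix = name.rstrip("0123456789")
    return (prefix, int(name[len(prefix):]) / 1000)

fault_type, fault_size = parse_label("IR014")
```

The fault size is returned in the same unit as the three sizes quoted in the text.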

In the present invention, the data set of the drive-end bearing housing under the 0 hp load scenario is employed. Vibration signals were collected at 12000 samples per second and segmented at 300-point intervals, and 1000 samples were taken for each state. In order to verify the generalization performance of the model on data sets with different imbalance ratios, the invention sets up four data sets with imbalance ratios of 1:1, 2:1, 5:1 and 10:1, corresponding to data sets A, B, C and D; the imbalance ratio is defined as the ratio of the number of normal samples to the number of samples of each fault type, so that at an imbalance ratio of 5:1, for example, there are five times as many normal samples as samples of each fault type; finally, a training set and a test set are constructed according to the proportion of 8. More data set details are shown in table 1.
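The segmentation of a long vibration record into samples can be sketched as follows. The 300-point interval is taken from the text, whereas the window length of 512 points, the function name `segment` and the synthetic signal are assumptions, since the length of each sample is not stated.

```python
import numpy as np

def segment(signal, window, stride=300, n_samples=1000):
    """Cut n_samples windows whose start points are `stride` points apart."""
    starts = np.arange(n_samples) * stride
    if starts[-1] + window > len(signal):
        raise ValueError("signal too short for the requested segmentation")
    return np.stack([signal[s:s + window] for s in starts])

# Synthetic record standing in for one vibration signal (illustrative only).
signal = np.arange(5000, dtype=float)
samples = segment(signal, window=512, stride=300, n_samples=10)
```

When the window is longer than the stride, consecutive samples overlap, which is a common way to obtain many samples from a recording of limited length.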

TABLE 1 Rolling bearing data sets at different imbalance ratios

From table 1, the number of training-set and test-set samples of each fault category in each of the four data sets can be obtained; the minority-class samples are then used to train the improved generative adversarial network, which generates new samples. The generated minority-class samples are combined with the majority-class samples to form a balanced data set, which is input into the fault diagnosis classifier for training; after training, the test set is input into the classifier and the fault diagnosis result is output.

As shown in fig. 2, the generator model of the invention mainly comprises four deconvolution layers and two fully connected layers. The convolutional structure allows features in the image to be extracted better, and batch normalization is added after each deconvolution layer and fully connected layer; batch normalization normalizes the outputs of the feature layers jointly, which accelerates network training and improves training stability. A DenseNet layer composed of three dense blocks is added after the first deconvolution; within a dense block the feature maps of all layers have the same size, and each dense block is composed of BN, ReLU and Conv (convolution), which enhances feature propagation, greatly reduces the number of parameters in the network and alleviates the vanishing-gradient problem. The self-attention mechanism is added to the generator so that the generation model can efficiently capture the global dependencies within the features while generating an image, which improves the quality and sharpness of the generated image, makes its texture details more distinct, and improves its visual quality in terms of texture, brightness and sharpness.

The generator executes the following flow: the batch size is set to 64, the input noise dimension to 100, the convolution kernel size to 4×4, the stride to 2 and the dimension of the input category label to 10; the noise vector and the category label are fused into a four-dimensional vector and input into the generator network, the vector is reshaped into a 512×8×8 feature map through the two fully connected layers, and the spatial size is increased through 5 deconvolution layers; deep feature extraction is carried out by the three dense blocks, and after one self-attention operation the generator outputs an image of size 3×128×128. The numbers of convolution kernels are 256, 128, 64 and 3, respectively, and the numbers of fully connected output nodes are 512 and 32768, respectively.
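The growth of the feature map through the generator can be checked with the standard output-size formula for a transposed convolution. The sketch below assumes a padding of 1 and one deconvolution layer per listed kernel count (256, 128, 64 and 3); both are assumptions, since the text gives only the kernel size and the stride.

```python
def deconv_out(size, kernel=4, stride=2, padding=1):
    """Output size of a transposed convolution without output padding."""
    return (size - 1) * stride - 2 * padding + kernel

def generator_shapes(start=(512, 8, 8), channels=(256, 128, 64, 3)):
    """Trace (channels, height, width) from the reshaped vector to the image."""
    shapes = [start]
    size = start[1]
    for c in channels:
        size = deconv_out(size)
        shapes.append((c, size, size))
    return shapes

shapes = generator_shapes()
fc_nodes = 512 * 8 * 8      # number of output nodes of the second fully connected layer
```

Under these assumptions each layer doubles the spatial size (8, 16, 32, 64, 128), and the second fully connected layer needs 512 × 8 × 8 = 32768 output nodes, which matches the figure given in the text.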

As shown in fig. 3, the discriminator must decide whether the input data comes from a real picture or a generated picture. It mainly comprises four convolution layers and two fully connected layers, with LeakyReLU as the activation function. Because the gradient penalty in the improved WGAN-GP network is applied to each sample independently, whereas batch normalization introduces dependencies between the samples of the same batch, the invention replaces batch normalization with switchable normalization, which resolves the problems of batch normalization and at the same time gives the model a stronger generalization capability.

The discriminator executes the following flow: real data and data generated by the generator are input into the discriminator together; the convolution kernel size is 4×4, the stride is 2 and the dimension of the input is 3×128×128; the input is reduced in dimension through four convolution layers, the self-attention module constrains the details of the current synthetic image using long-range details, and finally the two fully connected layers output a probability value that discriminates whether the image is real or fake. The numbers of convolution kernels are 64, 128, 256 and 512, respectively, and the numbers of fully connected output nodes are 512 and 1, respectively.

As shown in fig. 4, the classifier model of the invention is mainly composed of five convolutional layers and three fully connected layers. A ReLU activation function follows each convolutional layer, and a Dropout layer is added to the fully connected part; the Dropout layer randomly removes some neurons in the fully connected layers to prevent over-fitting.

The execution flow of the classifier: the expanded training set is used as the input of the classifier, with an input picture size of 128×128×3; a convolution layer with an 11×11 kernel and a stride of 4 outputs a feature map of size 31×31×48, which the maximum pooling layer reduces to 15×15×48; a convolution layer with a 5×5 kernel and a stride of 1 followed by a maximum pooling layer gives an output of size 7×7×128; features are then extracted further by convolution layers with 3×3 kernels and a stride of 1, and a maximum pooling layer halves the spatial size to 3×3×128; finally, three fully connected layers are connected and the diagnosis classification result is output.
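The spatial sizes quoted for the classifier can be reproduced with the usual convolution and pooling size formula. The paddings (2, 2 and 1) and the 3×3 pooling window with a stride of 2 used below are assumptions chosen because they yield the stated sizes; the text specifies only the kernel sizes and strides of the convolution layers.

```python
# (layer type, kernel, stride, padding); paddings and pooling windows are assumed.
LAYERS = [
    ("conv", 11, 4, 2), ("pool", 3, 2, 0),
    ("conv", 5, 1, 2), ("pool", 3, 2, 0),
    ("conv", 3, 1, 1), ("conv", 3, 1, 1), ("conv", 3, 1, 1), ("pool", 3, 2, 0),
]

def trace_sizes(size=128):
    """Return the spatial size after each layer for a square input."""
    sizes = []
    for _, kernel, stride, padding in LAYERS:
        size = (size + 2 * padding - kernel) // stride + 1
        sizes.append(size)
    return sizes

sizes = trace_sizes(128)
```

Starting from a 128×128 input, the trace gives 31, 15, 15, 7, 7, 7, 7 and 3, in agreement with the 31×31, 15×15, 7×7 and 3×3 feature maps described above.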

Fig. 5 compares the original images with the images produced by the generator at different balance ratios: after the improved WGAN-GP has been trained with the training set of each data set, 5 generated samples of different fault types are randomly selected for comparison with the original images. When the balance ratio is 1. When the balance ratio is 5. The data generated by the proposed method is thus similar to the original samples: the model can learn the distribution characteristics of the data and is able to expand the original data set. The data enhancement method of the invention can therefore be used to address the problem of data imbalance.

FIGS. 6 (a) to 6 (b) compare, by t-SNE visualization, the distributions of the generated pictures and the original pictures on the balanced data set and the unbalanced data set, respectively. t-SNE is used to reduce the dimension of the extracted data features, and the features of the 10 types of generated data and original data are visualized. For each fault type, the original data is drawn as points in 10 different colors and the generated data as star-shaped markers in 10 different colors.

It can be seen that, both on the balanced data set shown in fig. 6 (a) and on the unbalanced data set shown in fig. 6 (b), within the same fault type the features of most generated samples agree closely with those of the original data and cluster well, apart from a few classes whose features are scattered. This verifies that the proposed data enhancement method can, through adversarial training, generate synthetic data with a probability distribution similar to that of the original data, and can thereby supplement the unbalanced data set. At the same time, the features of the different fault types in the generated data are clearly separated in the figure, which shows that the data generated by the data expansion method can be used for bearing fault classification tests.

Figs. 7 (a)-7 (d) show the classification results of the classifier on the data sets with the 4 imbalance ratios; the curves give the training-set loss and the test-set classification accuracy. The model training results are satisfactory on every data set. The accuracy curves in figs. 7 (a)-7 (d) oscillate at first because the network is still in the learning phase and has not yet captured the characteristics of the original samples; as the number of iterations increases, the test-set classification accuracy reaches a high level within a short time and finally converges, while the training-set loss decreases smoothly and finally stabilizes near 0.

The results show that the model has a good diagnosis effect and can be used for fault diagnosis on unbalanced data sets. Meanwhile, it can be observed that the fluctuation of the accuracy and loss curves gradually increases as the imbalance rate of the data set increases. The reason is that when the data are extremely unbalanced, the original data are limited and fewer useful features are extracted, so the images generated by the model represent details poorly, which affects the stability of the model. As can be seen from the four figures for the different imbalance rates, the method of the present invention can guarantee highly accurate diagnosis results with low loss.

FIGS. 8(a) to 8(d) show the confusion matrices obtained by the classification model of the present invention on the test sets of the four data sets with different balance ratios. For convenience of analysis, the classification confusion matrix of the first calculation result of each data set is used for quantitative description. In the figures, the abscissa of the confusion matrix is the predicted fault type and the ordinate is the actual fault type; the entry in the i-th row and j-th column represents the proportion of samples in the i-th state that are classified as the j-th state. Thus, the diagonal values represent the proportion of correct classifications for each state, while the off-diagonal values represent the proportion of errors in which one state is classified as another.
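The row-normalized reading of the confusion matrix described above can be sketched as follows. This is a minimal illustration only: the helper name `normalized_confusion` and the toy label lists are hypothetical and are not data from the experiments of the invention.

```python
def normalized_confusion(actual, predicted, num_classes):
    """Return a row-normalized confusion matrix.

    Entry [i][j] is the proportion of samples whose actual state is i
    that were classified as state j.
    """
    counts = [[0] * num_classes for _ in range(num_classes)]
    for a, p in zip(actual, predicted):
        counts[a][p] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Keep an all-zero row when a class has no samples.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical toy labels for three states (not data from the invention).
actual = [0, 0, 0, 0, 1, 1, 2, 2]
predicted = [0, 0, 0, 1, 1, 1, 2, 0]
cm = normalized_confusion(actual, predicted, 3)
```

In this toy case the diagonal entry `cm[0][0]` is the share of state-0 samples classified correctly, and `cm[0][1]` is the share of state-0 samples misclassified as state 1.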

As can be seen from the four different data sets in FIGS. 8(a) to 8(d), the diagnosis accuracy of most classes reaches 100% under the different balance ratios, and the fault diagnosis accuracy of all classes is above 98% when the balance ratio is 1, 2. It is found that, as the imbalance rate of the data set increases, bearing rolling element faults and inner ring faults become more prone to misclassification, because the distributions of the inner ring fault data and the rolling element fault data are more complex and more difficult to capture. In general, however, the method of the invention achieves high accuracy across the different fault categories of the four data sets.

Claims

1. The bearing fault diagnosis method based on the improved WGAN-GP and the Alxnet is characterized by comprising the following steps of:

step 1: acquiring an original vibration signal of a bearing, and converting the original vibration signal into time-frequency domain signals of different fault categories by using wavelet transformation;

step 2: dividing the time-frequency domain signals acquired in the step 1 into a training set and a test set, and constructing an unbalanced data set;

step 3: constructing an improved WGAN-GP network, fusing a self-attention module and a DenseNet module in a generator to automatically learn important global information, and adding switchable normalization in a discriminator;

step 4: taking the training set of the unbalanced data set as the input of the improved WGAN-GP network, iterating the generator and the discriminator repeatedly until Nash equilibrium is reached, adding the fault data generated by the generator into the unbalanced data set for data expansion, and finally obtaining a balanced data set;
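The data expansion in step 4 can be illustrated by counting how many generated samples each class needs. The sketch below assumes that "balanced" means every class is brought up to the size of the largest class; the helper name `samples_to_generate` and the class counts are hypothetical.

```python
def samples_to_generate(class_counts):
    """Number of synthetic samples needed per class to match the largest class."""
    target = max(class_counts.values())
    return {label: target - n for label, n in class_counts.items()}


# Hypothetical training-set sizes (not the counts used by the invention).
class_counts = {"normal": 200, "inner_ring": 40, "rolling_element": 40}
needed = samples_to_generate(class_counts)
```

Under this assumption the generator would be asked for `needed[label]` samples of each fault class before the classifier is trained.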

2. The bearing fault diagnosis method based on improved WGAN-GP and Alxnet according to claim 1, characterized in that: the step 1 comprises the following steps:

S1.2: preprocessing the original vibration signal by wavelet transformation, and taking the time-frequency domain signal after the wavelet transformation as input data of the discriminator, wherein the formula of the wavelet transformation is as follows:

is a wavelet basis function.
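For reference, one common form of the continuous wavelet transform is given below. It is offered as an assumption about the general shape of the transformation, not as the formula of the invention; the symbols $x(t)$ (vibration signal), $a$ (scale), $b$ (translation) and $\psi$ (wavelet basis function, with $\psi^{*}$ its complex conjugate) are chosen here for illustration.

```latex
W_x(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} x(t)\, \psi^{*}\!\left(\frac{t - b}{a}\right) \mathrm{d}t, \qquad a > 0
```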

3. The bearing fault diagnosis method based on improved WGAN-GP and Alxnet according to claim 1, characterized in that: in the step 2, in order to verify the generalization performance of the Alxnet model under different unbalanced data sets, the size of the training set is limited, and four comparative experimental data sets are set, with imbalance rates of 1, 2, 1, 5 and 1, respectively, corresponding to data sets A, B, C and D.

4. The bearing fault diagnosis method based on improved WGAN-GP and Alxnet according to claim 1, characterized in that: in the step 3, the discriminator consists of four convolution layers and two fully connected layers; the generator consists of four deconvolution layers and two fully connected layers; the step 3 comprises the following steps:

step S3.1: in the discriminator, because the improved WGAN-GP network applies the gradient penalty to each sample independently, Batch Normalization (BN) is replaced with Switchable Normalization (SN), which increases the generalization capability of the model;

at the same time, the activation function is changed to LeakyReLU; wherein SN uses differentiable learning to determine an appropriate normalization operation for each normalization layer in the deep network; SN automatically finds an appropriate normalization method during training by combining BN, Layer Normalization (LN), and Instance Normalization (IN) with six weights, and the output of the SN operation can be defined as:

$y_{sn} = w_{bn}\, y_{bn} + w_{ln}\, y_{ln} + w_{in}\, y_{in}$
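The weighted combination in the formula above can be sketched as follows. This is a minimal illustration under stated assumptions: the three weights are obtained from a softmax over learnable logits (a common choice for switchable normalization, not stated in the text), the three inputs are taken as already normalized by BN, LN and IN, and the six-weight variant with separate weights for means and variances is not modeled. The names `softmax` and `switchable_combine` are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw importance logits into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def switchable_combine(y_bn, y_ln, y_in, logits):
    """Compute y_sn = w_bn * y_bn + w_ln * y_ln + w_in * y_in element-wise."""
    w_bn, w_ln, w_in = softmax(logits)
    return [w_bn * b + w_ln * l + w_in * i for b, l, i in zip(y_bn, y_ln, y_in)]


# Hypothetical already-normalized activations for one feature vector.
y_bn = [1.0, -1.0]
y_ln = [0.5, -0.5]
y_in = [0.0, 0.0]
# Equal logits give equal weights of 1/3 to each normalizer.
y_sn = switchable_combine(y_bn, y_ln, y_in, [0.0, 0.0, 0.0])
```

During training the logits would be updated by backpropagation, so that the layer shifts weight toward whichever normalizer suits it best.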

step S3.2: adding a DenseNet network after the first convolution layer of the generator; wherein DenseNet consists of two parts, dense blocks and transition blocks; the DenseNet structure uses three dense blocks, and within one dense block, the features of the i-th layer are given by the following formula:

$x_i = H_i([x_0, x_1, \ldots, x_{i-1}])$
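The dense connectivity expressed by the formula above, in which each layer receives the concatenation of all earlier feature maps, can be sketched with plain lists. The function `dense_block` and the two toy layers are hypothetical stand-ins for the learned composite functions $H_i$; no convolution is performed here.

```python
def dense_block(x0, layers):
    """Apply x_i = H_i([x_0, ..., x_{i-1}]) for each layer H_i in turn."""
    features = [x0]
    for h in layers:
        # Concatenate every earlier feature vector before applying H_i.
        concatenated = [v for f in features for v in f]
        features.append(h(concatenated))
    return features


# Hypothetical stand-ins for H_1 and H_2.
layers = [
    lambda c: [sum(c)],
    lambda c: [max(c), min(c)],
]
feats = dense_block([1.0, 2.0], layers)
```

Here the second layer sees both the input `[1.0, 2.0]` and the output of the first layer, which is the feature reuse that distinguishes a dense block from a plain layer stack.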

step S3.3: adding a self-attention mechanism to the generator and discriminator networks;

the formula for the self-attention mechanism is as follows:

in the formula, the input vector of the self-attention layer is convolved with three different weights to generate three matrices, Q (Query), K (Key) and V (Value); first, the dot product of Q and K is computed, and in order to prevent the result from becoming too large, the product is divided by a scale
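The self-attention computation described above can be sketched as scaled dot-product attention, $\mathrm{softmax}(QK^{T}/\sqrt{d_k})\,V$. The scale $\sqrt{d_k}$ (the square root of the key dimension) is an assumption based on the usual formulation, since the scale is not legible in the text. Q, K and V are supplied directly as small matrices rather than produced by convolutions, and all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def self_attention(q, k, v):
    """Return (output, attention weights) for scaled dot-product attention."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    # Divide by sqrt(d_k) so the dot products do not grow too large.
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    weights = softmax_rows(scaled)
    return matmul(weights, v), weights


# Hypothetical 2x2 inputs for illustration.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out, weights = self_attention(q, k, v)
```

Each output row is a weighted average of the rows of V, with more weight given to positions whose keys match the query.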

5. The bearing fault diagnosis method based on improved WGAN-GP and Alxnet according to claim 1, characterized in that: in the step 4, after the improved WGAN-GP network is constructed, the generator and the discriminator are alternately trained n times, the training is stopped when Nash equilibrium is reached, and the generator network is then extracted to generate samples of the different fault categories in turn to supplement the unbalanced data set.

6. The bearing fault diagnosis method based on improved WGAN-GP and Alxnet according to claim 5, characterized in that: in the step 4, the training of the improved WGAN-GP network comprises the following steps:

S4.1: introducing the Wasserstein distance, a new distribution distance metric, into the generative adversarial network, and adding a gradient penalty term to the original loss function so that the weights of the discriminator satisfy the 1-Lipschitz constraint, namely, for two images $x_1$, $x_2$, the absolute value of the difference between the discriminator outputs must be less than or equal to the absolute value of their average pixel-by-pixel difference;

fixing the generator, inputting the data generated by the generator and the real data into the discriminator, comparing the obtained result with the real result, and using the discriminator loss function $L_D$ to modify the network parameters; the gradient penalty and the discriminator loss function $L_D$ are as follows:

in the formula, $\lVert \cdot \rVert_{\rho}$ represents the ρ-norm; Z represents the noise vector input to the generator; C is the category label corresponding to each image; G(Z|C) represents the samples generated by the generator; $P_z$ is the probability distribution of the generated samples; $E_{z \sim P_z}$ represents the expected value over the generated sample distribution; X represents a real sample; $P_r$ represents the probability distribution of the real samples; $E_{x \sim P_r}$ represents the expected value over the real sample distribution; D(X|C) represents the output value of the real sample passing through the discriminator; λ represents the regularization term coefficient; the interpolated sample $\hat{x}$ is obtained by random interpolation sampling on the line connecting the real sample x and the generated sample $G_z$, calculated as $\hat{x} = \varepsilon x + (1 - \varepsilon) G_z$, in which the parameter ε is uniformly distributed on [0, 1]; $\nabla_{\hat{x}} D(\hat{x})$ is the gradient of the discriminator output with respect to the interpolated sample;
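A commonly used discriminator loss that is consistent with the definitions above is $L_D = E_{z \sim P_z}[D(G(Z|C))] - E_{x \sim P_r}[D(X|C)] + \lambda\, E[(\lVert \nabla D \rVert - 1)^2]$, with the gradient taken at the interpolated sample; whether this matches the exact formula of the invention is an assumption. The sketch below evaluates such a loss for a toy critic $D(x) = \tfrac{1}{2}\sum_i w_i x_i^2$, whose gradient is known in closed form, so no automatic differentiation is needed. The class condition C is omitted, the 2-norm is used, and the default `lam=10.0` and all function names are hypothetical choices.

```python
import math
import random


def interpolate(x_real, x_fake, eps):
    """Random point on the line between a real and a generated sample."""
    return [eps * r + (1.0 - eps) * f for r, f in zip(x_real, x_fake)]


def critic(w, x):
    """Toy critic D(x) = 0.5 * sum(w_i * x_i^2), used only for illustration."""
    return 0.5 * sum(wi * xi * xi for wi, xi in zip(w, x))


def critic_gradient(w, x):
    """Closed-form gradient of the toy critic with respect to its input."""
    return [wi * xi for wi, xi in zip(w, x)]


def critic_loss(w, real_batch, fake_batch, lam=10.0, rng=random):
    """Mean D(fake) - mean D(real) + lam * mean (||grad D(x_hat)||_2 - 1)^2."""
    n = len(real_batch)
    d_real = sum(critic(w, x) for x in real_batch) / n
    d_fake = sum(critic(w, x) for x in fake_batch) / n
    penalty = 0.0
    for x_real, x_fake in zip(real_batch, fake_batch):
        eps = rng.random()  # uniform sample from [0, 1)
        x_hat = interpolate(x_real, x_fake, eps)
        grad = critic_gradient(w, x_hat)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        penalty += (grad_norm - 1.0) ** 2
    return d_fake - d_real + lam * penalty / n
```

The penalty term is zero only when the gradient norm at the interpolated points equals 1, which is how the loss pushes the discriminator toward the 1-Lipschitz condition.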

S4.2: introducing the class label as an additional input layer of the generator, and combining it with the noise input of the generator to form a joint hidden-layer representation, so that the generator is conditionally supervised to generate samples with the characteristics of a specific class; then adding the $L_1$ function to the generator loss function $L_G$ to capture the low-frequency characteristics of the image, so that the generated pictures are clearer and more realistic; and fixing the discriminator, and modifying the generator network parameters through the generator loss function $L_G$;

the $L_1$ loss function and the generator loss function $L_G$ are as follows:

in the formula, $E_{z \sim P_z}$ represents the expected value over the generated sample distribution; λ is the hyperparameter of the $L_1$ loss.
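One way to combine an adversarial term with an $L_1$ term is sketched below. It assumes the generator loss has the form $L_G = -E[D(G(Z|C))] + \lambda L_1$, where $L_1$ is the mean absolute pixel difference between paired real and generated samples; this form, the default `lam=1.0`, and the function names are assumptions for illustration and are not taken from the formula of the invention.

```python
def l1_loss(real, generated):
    """Mean absolute pixel difference between one real and one generated sample."""
    return sum(abs(r - g) for r, g in zip(real, generated)) / len(real)


def generator_loss(critic_scores_fake, real_batch, fake_batch, lam=1.0):
    """Adversarial term plus lam times the batch-averaged L1 term."""
    # The generator wants high critic scores on its samples, hence the minus sign.
    adversarial = -sum(critic_scores_fake) / len(critic_scores_fake)
    l1 = sum(l1_loss(r, f) for r, f in zip(real_batch, fake_batch)) / len(real_batch)
    return adversarial + lam * l1


# Hypothetical critic scores and flattened two-pixel images.
example = generator_loss([2.0, 4.0], [[1.0, 2.0]], [[0.0, 4.0]], lam=2.0)
```

A larger `lam` pulls the generated images closer to the real ones pixel by pixel, at the cost of giving the adversarial term less influence.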

7. The bearing fault diagnosis method based on improved WGAN-GP and Alxnet according to claim 1, characterized in that: in the step 5, fault diagnosis is performed by a designed Alxnet network, wherein the Alxnet mainly comprises five convolutional layers and three fully connected layers, and a Dropout layer is added to the network.

8. The bearing fault diagnosis method based on improved WGAN-GP and Alxnet according to claim 1, characterized in that: the step 5 comprises the following steps:

S5.1: taking the expanded balanced data set as input, and extracting deep features through the convolutional layers, the ReLU activation function and the max pooling layers;

in the formula, $y^{(i)}$ is the true label of the sample; $\hat{y}^{(i)}$ is the predicted output value for the sample;