CN112699722A - Method for fault diagnosis based on rotating machinery

Info

Publication number: CN112699722A
Application number: CN202011118967.1A
Authority: CN
Inventors: 谢怡宁; 曹丰; 何勇军
Original assignee: Harbin University of Science and Technology
Current assignee: Harbin University of Science and Technology
Priority date: 2020-10-19
Filing date: 2020-10-19
Publication date: 2021-04-23

Abstract

The present invention relates to a fault diagnosis method for rotating machinery. Existing fault diagnosis software cannot diagnose rotating machinery automatically, its accuracy is low, and traditional methods diagnose equipment faults inefficiently. To solve these problems, a fault diagnosis method that fuses a CNN and a GMM is provided. In the training stage, a large amount of vibration data is used to train a CNN that extracts features from vibration signals, and a GMM is then trained for each fault type. In the fault diagnosis stage, features are extracted from a new input signal and classified against the GMMs to obtain the diagnosis. Experiments show that the method achieves higher accuracy than traditional methods in fault diagnosis of rotating machinery. The method is applied to the field of fault diagnosis of rotating machinery.

Description

Method for fault diagnosis based on rotating machinery

Technical Field

The invention belongs to the field of fault diagnosis, and particularly relates to a fault diagnosis method fusing a Convolutional Neural Network (CNN) and a Gaussian Mixture Model (GMM).

Background

With the progress of information technology, modern industrial systems are becoming increasingly complex. The mechanical equipment associated with control systems is growing larger and more complicated, and faults occur easily during operation. If a fault is not handled in time, it can seriously threaten the safety of operators and cause economic losses. Rotating machinery is an important component of mechanical equipment, is widely used in gas engines, wind turbines, aircraft engines and other equipment, and plays a key role in production. Because rotating machinery runs continuously, faults such as rotor misalignment, gear looseness and bearing wear occur easily. These faults cause excessive vibration, alter the overall structure of the system, reduce production efficiency, and can even lead to safety accidents. Fault diagnosis is therefore of great importance for ensuring the safety and reliability of rotating machinery.

The bearing is a core component of rotating machinery, and a bearing failure can prevent the whole device from operating normally. New methods are continually being explored to improve bearing fault diagnosis. Current fault diagnosis methods mainly comprise methods based on Bayesian inference, methods based on neural networks, and methods based on deep learning. Bayesian inference methods rely on the assumption of conditional independence between attributes, which often does not hold in practice, so they have limitations in actual bearing fault diagnosis. Artificial neural networks are widely used for pattern recognition, but the models depend heavily on the training samples and their network structure is difficult to determine, so further research is needed to improve their bearing fault diagnosis accuracy. Deep learning can adaptively learn the fault features of the equipment, which avoids the drawbacks of constructing fault features manually, and deep-learning-based mechanical fault diagnosis has broad application prospects as the related theory develops. To address the difficulty that traditional methods and machine learning methods have in accurately representing complex data features, the invention fuses a CNN and a GMM into a new fault diagnosis framework. Compared with other fault diagnosis methods, the method provided by the invention improves fault identification accuracy and is therefore of practical significance.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides a novel fault diagnosis framework that fuses a convolutional neural network (CNN) and a Gaussian mixture model (GMM).

The technical scheme adopted by the invention for solving the technical problem mainly comprises the following steps:

Step one: train the feature extraction model. A convolutional neural network is trained on vibration data signals to obtain a deep learning model capable of extracting features from vibration data signals.

Step two: generate feature vectors. The vibration data signals are input into the trained model for feature extraction, generating feature vectors that describe the original vibration signals and are used to build the fault mode library.

Step three: establish the fault mode library. The spatial distribution of the extracted feature vectors is fitted with Gaussian mixture models, giving one Gaussian mixture model for each fault mode and thereby establishing the fault mode library.

Step four: fault diagnosis. In the test stage, the test sample is input into the CNN model for feature extraction, and the extracted features are input into each of the GMMs. The likelihood value p that the feature vector belongs to each model is calculated, and the fault type of the model with the maximum p is taken as the diagnosis result.

Drawings

FIG. 1 is a general CNN-GMM fault diagnosis flow diagram;

FIG. 2 is a diagram of the CNN network structure;

FIG. 3 is a schematic diagram of the GMM-UBM method training;

FIG. 4 is a time domain diagram of a vibration signal.

Detailed Description of the Invention

The invention is further described below with reference to the accompanying drawings.

As shown in FIG. 1, the fault diagnosis method for rotating machinery is implemented as follows.

Step one: train the CNN for feature extraction. The number of network layers, the size and number of convolution kernels, the convolution stride, the pooling size, the dropout parameter, and the activation functions are determined by parameter tuning. The structure of the CNN is shown in FIG. 2. The activation function of each convolutional layer is ReLU; the first three pooling layers are max pooling layers, and the fourth is an average pooling layer used for feature dimension reduction. The final output is produced by a fully connected layer with a softmax activation function. The network is trained with the Adam optimizer and the categorical cross-entropy loss function (categorical_crossentropy). The network layers and model parameter settings are shown in the figure.

The network takes a 6000 × 1 matrix as its input vector. The first convolutional layer defines 16 filters and outputs a 2997 × 16 matrix; each column of the output matrix holds the feature map produced by one filter, and each feature map contains 2997 values. The output of the first convolutional layer is fed into the second convolutional layer, which again defines 16 filters and follows the same logic as the first layer; its output is a 1499 × 16 matrix. To reduce the complexity of the output features and prevent overfitting, a max pooling layer with a pooling size of 2 is added, so the output matrix is half the size of the input matrix: a 749 × 16 matrix. The data then passes through two convolutional layers, after which the output is a 188 × 64 matrix.

The data is then input into a pooling layer of size 2, which halves the matrix to 94 × 64. Two further convolutional layers process the data and produce a 24 × 256 matrix, and another pooling layer of size 2 halves it to 12 × 256. Finally, two more convolutional layers produce a 12 × 512 matrix, and the last max pooling layer, again of size 2, halves it to 6 × 512. An average pooling layer is then added to further avoid overfitting; it averages each feature map over the remaining positions, giving a 1 × 512 matrix. This output is fed into a Dropout layer with a parameter of 0.3, which randomly sets 30% of its inputs to zero during training. The output of this layer has the same shape as that of the previous layer, a 1 × 512 matrix. The last layer is a fully connected layer with a softmax activation function, which reduces the 512-dimensional vector to a 10-dimensional vector representing the probability of each of the 10 classes.
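The layer-by-layer matrix sizes stated above can be checked with a small shape calculation. The following is a minimal sketch only: the kernel sizes, strides, padding modes and the filter count of the first layer in each convolution pair are not given in the text and are assumed here, chosen so that the computed shapes match the stated ones.

```python
import math

def conv1d_len(n, kernel, stride, padding):
    """Output length of a 1-D convolution with 'valid' or 'same' padding."""
    if padding == "same":
        return math.ceil(n / stride)
    return (n - kernel) // stride + 1  # 'valid': no padding

def pool1d_len(n, size):
    """Output length of a non-overlapping 1-D pooling layer."""
    return n // size

# ("conv", filters, kernel, stride, padding), ("pool", size) or ("global_avg",).
# Kernel sizes, strides and padding modes are ASSUMED, not taken from the patent.
LAYERS = [
    ("conv", 16, 8, 2, "valid"),
    ("conv", 16, 3, 2, "same"),
    ("pool", 2),
    ("conv", 64, 3, 2, "same"),
    ("conv", 64, 3, 2, "same"),
    ("pool", 2),
    ("conv", 256, 3, 2, "same"),
    ("conv", 256, 3, 2, "same"),
    ("pool", 2),
    ("conv", 512, 3, 1, "same"),
    ("conv", 512, 3, 1, "same"),
    ("pool", 2),
    ("global_avg",),
]

def trace_shapes(length, channels=1):
    """Return the (length, channels) shape after every layer in LAYERS."""
    shapes = []
    for layer in LAYERS:
        kind = layer[0]
        if kind == "conv":
            _, filters, kernel, stride, padding = layer
            length = conv1d_len(length, kernel, stride, padding)
            channels = filters
        elif kind == "pool":
            length = pool1d_len(length, layer[1])
        else:  # global average pooling collapses the time axis
            length = 1
        shapes.append((length, channels))
    return shapes

shapes = trace_shapes(6000)
for layer, shape in zip(LAYERS, shapes):
    print(layer[0], shape)
```

Under these assumptions the halving steps follow from stride-2 convolutions with 'same' padding and size-2 pooling; other kernel and stride combinations could produce the same shapes.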

Step two: use the trained CNN model to extract features from the vibration signals, obtaining a large number of feature vectors. These feature vectors are then used to train the GMMs and establish the fault mode library.

Step three: feature extraction by the CNN yields N 512-dimensional feature vectors from the vibration data signals. Using the ability of the GMM to fit the spatial distribution of feature vectors, the extracted feature vectors of the 10 classes are modeled to obtain GMMs for the 10 fault types. A GMM describes the feature distribution as a linear combination of K Gaussian functions, and its probability density function is given by equation (1):

$$p(x \mid \lambda) = \sum_{k=1}^{K} \omega_k \, N(x \mid \mu_k, \Sigma_k) \qquad (1)$$

where x is an L-dimensional feature vector, ω_k is the weight of the k-th Gaussian density function, N(x | μ_k, Σ_k) is the Gaussian distribution function, μ_k is the mean vector, and Σ_k is the covariance matrix. The weights ω_k of the Gaussian components sum to 1, that is:

$$\sum_{k=1}^{K} \omega_k = 1 \qquad (2)$$

In recognition tasks, Σ_k is usually set to a diagonal matrix for computational convenience:

$$\Sigma_k = \mathrm{diag}(\sigma_{k1}^2, \sigma_{k2}^2, \ldots, \sigma_{kL}^2) \qquad (3)$$

Thus, a complete GMM is determined by three groups of parameters, the mean vectors μ_k, the covariance matrices Σ_k and the weights ω_k, and can be expressed as:

$$\lambda = \{\omega_k, \mu_k, \Sigma_k \mid k = 1, 2, \ldots, K\} \qquad (4)$$
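To make the GMM density concrete, the following minimal sketch evaluates the log of the mixture density for a diagonal covariance matrix. The function names and the toy parameter values are illustrative assumptions, and a two-dimensional vector stands in for the 512-dimensional CNN feature.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian whose covariance is diag(var)."""
    total = 0.0
    for xi, mi, vi in zip(x, mean, var):
        total += math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi
    return -0.5 * total

def gmm_log_pdf(x, weights, means, variances):
    """Log of sum_k w_k * N(x | mu_k, Sigma_k), computed with log-sum-exp."""
    terms = [math.log(w) + log_gauss_diag(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Toy two-component model in two dimensions (values are illustrative only).
weights = [0.4, 0.6]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [0.5, 2.0]]
assert abs(sum(weights) - 1.0) < 1e-12  # mixture weights must sum to 1
print(gmm_log_pdf([2.5, 3.5], weights, means, variances))
```

Working in the log domain is a design choice made here for numerical stability; in 512 dimensions the raw density values would underflow.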

Next, the values of the model parameters are determined. The three groups of GMM parameters are usually learned under the maximum likelihood criterion, typically with the expectation-maximization (EM) algorithm. EM is a two-stage iterative algorithm consisting of an expectation step (E-step) and a maximization step (M-step). The E-step uses the current parameters and the incomplete observation vectors to estimate the complete data and obtain its expected likelihood; the M-step then re-estimates the parameters so as to maximize this likelihood. The two steps are repeated until convergence, which yields the trained model parameters.

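Given a vibration feature sequence X = (X₁, X₂, ..., X_N) of length N, the GMM parameters are found under the maximum likelihood criterion so that the likelihood of the training features under the GMM is maximized. The likelihood of a GMM can be expressed as:

$$p(X \mid \lambda) = \prod_{n=1}^{N} p(X_n \mid \lambda)$$

First, the model is initialized to obtain an initial model λ₀; the EM algorithm is then applied iteratively to the parameters of λ₀ until convergence. With a diagonal covariance matrix, the iterative formulas for estimating the GMM parameters with the EM algorithm take the standard form below, where X_n² denotes the element-wise square of X_n.

Weight ω_m of the m-th Gaussian component:

$$\omega_m = \frac{1}{N} \sum_{n=1}^{N} p(m \mid X_n, \lambda)$$

Mean μ_m of the m-th Gaussian component:

$$\mu_m = \frac{\sum_{n=1}^{N} p(m \mid X_n, \lambda) \, X_n}{\sum_{n=1}^{N} p(m \mid X_n, \lambda)}$$

Variance σ_m² of the m-th Gaussian component:

$$\sigma_m^2 = \frac{\sum_{n=1}^{N} p(m \mid X_n, \lambda) \, X_n^2}{\sum_{n=1}^{N} p(m \mid X_n, \lambda)} - \mu_m^2$$

The posterior probability of the m-th Gaussian component, calculated in the E-step, is:

$$p(m \mid X_n, \lambda) = \frac{\omega_m \, N(X_n \mid \mu_m, \Sigma_m)}{\sum_{k=1}^{K} \omega_k \, N(X_n \mid \mu_k, \Sigma_k)}$$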
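One EM iteration for a diagonal-covariance GMM can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the function names, the variance floor, the fixed iteration count and the one-dimensional toy data are all assumptions.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian whose covariance is diag(var)."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (a - m) ** 2 / v
                      for a, m, v in zip(x, mean, var))

def responsibilities(data, weights, means, variances):
    """E-step: posterior of each component per sample, plus the log-likelihood."""
    resp, loglik = [], 0.0
    for x in data:
        logs = [math.log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)
        probs = [math.exp(l - top) for l in logs]
        total = sum(probs)
        resp.append([p / total for p in probs])
        loglik += top + math.log(total)
    return resp, loglik

def m_step(data, resp, var_floor=1e-6):
    """M-step: re-estimate weights, means and diagonal variances."""
    n, dim, k_count = len(data), len(data[0]), len(resp[0])
    weights, means, variances = [], [], []
    for k in range(k_count):
        n_k = max(sum(r[k] for r in resp), 1e-12)  # guard against empty components
        mean = [sum(r[k] * x[d] for r, x in zip(resp, data)) / n_k
                for d in range(dim)]
        second = [sum(r[k] * x[d] ** 2 for r, x in zip(resp, data)) / n_k
                  for d in range(dim)]
        weights.append(n_k / n)
        means.append(mean)
        # var_floor (an assumed safeguard) keeps variances strictly positive
        variances.append([max(s - m * m, var_floor) for s, m in zip(second, mean)])
    return weights, means, variances

def fit_gmm(data, weights, means, variances, iterations=20):
    """Alternate E-step and M-step; history holds the log-likelihood per iteration."""
    history = []
    for _ in range(iterations):
        resp, loglik = responsibilities(data, weights, means, variances)
        history.append(loglik)
        weights, means, variances = m_step(data, resp)
    return weights, means, variances, history

# Toy 1-D data with two clusters; the initial model plays the role of lambda_0.
data = [[-2.1], [-1.9], [-2.0], [2.0], [2.1], [1.9]]
weights, means, variances, history = fit_gmm(
    data, [0.5, 0.5], [[-1.0], [1.0]], [[1.0], [1.0]])
print(weights, means, variances)
```

A practical implementation would usually stop when the log-likelihood gain falls below a tolerance rather than after a fixed number of iterations.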

In addition, a universal background model (UBM) is introduced. With the UBM, the GMM of each fault mode can be trained by maximum a posteriori (MAP) adaptation using a small amount of target fault training data and a large amount of non-target fault training data. Model training with the UBM is shown in FIG. 3. The GMM-UBM model differs from the original GMM model in that it is pre-trained on vibration data from non-target fault modes, which reduces the time needed to tune and train the parameters on the vibration data of the target fault mode.

During GMM-UBM training, the UBM is fitted to the feature distribution of the non-target fault training data. The target fault training data is scattered around some of the Gaussian components of the UBM, and the MAP algorithm shifts each Gaussian component of the UBM toward the target fault data. The specific calculation is as follows:

First, given target fault training data X = (X₁, X₂, ..., X_N), the posterior probability of each X_n under the m-th Gaussian component of the UBM is calculated:

$$\Pr(m \mid X_n) = \frac{\omega_m \, N(X_n \mid \mu_m, \Sigma_m)}{\sum_{k=1}^{K} \omega_k \, N(X_n \mid \mu_k, \Sigma_k)}$$

The statistics for the weight, mean and variance parameters are then computed, following the standard form of MAP adaptation:

$$n_m = \sum_{n=1}^{N} \Pr(m \mid X_n), \qquad E_m(X) = \frac{1}{n_m} \sum_{n=1}^{N} \Pr(m \mid X_n) \, X_n, \qquad E_m(X^2) = \frac{1}{n_m} \sum_{n=1}^{N} \Pr(m \mid X_n) \, X_n^2$$

Finally, the parameters of the UBM are corrected according to these results.

The corrected weight is:

$$\hat{\omega}_m = \left[ \alpha_m^w \, \frac{n_m}{N} + (1 - \alpha_m^w) \, \omega_m \right] \gamma$$

The corrected mean is:

$$\hat{\mu}_m = \alpha_m^m \, E_m(X) + (1 - \alpha_m^m) \, \mu_m$$

The corrected variance is:

$$\hat{\sigma}_m^2 = \alpha_m^v \, E_m(X^2) + (1 - \alpha_m^v)(\sigma_m^2 + \mu_m^2) - \hat{\mu}_m^2$$

In these formulas, α_m^w, α_m^m and α_m^v are the correction factors for the weight, mean and variance respectively, and the normalization factor γ ensures that the corrected weights sum to 1. GMM-UBM is suited to situations where the amount of target fault data is insufficient, and still allows a satisfactory GMM to be trained. A UBM is first obtained by pre-training on non-target fault data, and the pre-trained UBM is then fine-tuned into the target GMM by the adaptation algorithm. This reduces both the amount of target fault data required and the training time.
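The MAP correction of a UBM toward target fault data can be sketched as follows. This is a minimal sketch under stated assumptions: a single correction factor n_m / (n_m + r) is used for the weight, mean and variance, the relevance factor r = 16 is an assumed value that does not appear in the text, and the data is a one-dimensional toy example.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian whose covariance is diag(var)."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (a - m) ** 2 / v
                      for a, m, v in zip(x, mean, var))

def map_adapt(data, weights, means, variances, relevance=16.0):
    """Shift UBM parameters toward the target data (relevance is ASSUMED)."""
    n, dim, k_count = len(data), len(data[0]), len(weights)
    # Posterior probability of each UBM component for each target sample.
    resp = []
    for x in data:
        logs = [math.log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)
        probs = [math.exp(l - top) for l in logs]
        total = sum(probs)
        resp.append([p / total for p in probs])
    new_w, new_m, new_v = [], [], []
    for k in range(k_count):
        n_k = sum(r[k] for r in resp)
        if n_k <= 0.0:  # component saw no target data: keep the UBM parameters
            new_w.append(weights[k])
            new_m.append(list(means[k]))
            new_v.append(list(variances[k]))
            continue
        alpha = n_k / (n_k + relevance)  # data-dependent correction factor
        ex = [sum(r[k] * x[d] for r, x in zip(resp, data)) / n_k
              for d in range(dim)]
        ex2 = [sum(r[k] * x[d] ** 2 for r, x in zip(resp, data)) / n_k
               for d in range(dim)]
        mean = [alpha * e + (1.0 - alpha) * m for e, m in zip(ex, means[k])]
        var = [alpha * e2 + (1.0 - alpha) * (v + m * m) - mu * mu
               for e2, v, m, mu in zip(ex2, variances[k], means[k], mean)]
        new_w.append(alpha * n_k / n + (1.0 - alpha) * weights[k])
        new_m.append(mean)
        new_v.append(var)
    gamma = 1.0 / sum(new_w)  # normalization so the corrected weights sum to 1
    return [w * gamma for w in new_w], new_m, new_v

# Toy UBM with two components, and a few target samples near 3.0.
ubm_w, ubm_m, ubm_v = [0.5, 0.5], [[-1.0], [1.0]], [[1.0], [1.0]]
target = [[2.9], [3.0], [3.1], [3.2], [2.8]]
adapted_w, adapted_m, adapted_v = map_adapt(target, ubm_w, ubm_m, ubm_v)
print(adapted_w, adapted_m, adapted_v)
```

With only five target samples the component nearest the data moves part of the way toward it, which illustrates why the approach suits small target data sets: components that see little data stay close to the UBM.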

The corrected model obtained through the above process is the GMM-UBM model. In the identification stage, the likelihoods of the feature vector to be identified under the target model and under the UBM are calculated and their ratio is taken. Under the log scoring criterion, the score of the vibration signal to be identified is the difference between the two log-likelihoods:

$$S = \log p(X_n \mid \lambda_T) - \log p(X_n \mid \lambda_U)$$

where X_n is the feature vector of the signal to be identified, λ_T and λ_U denote the target model and the UBM respectively, and S is the final recognition likelihood score.
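The log-likelihood-ratio score can be sketched as follows, assuming diagonal-covariance GMMs stored as (weights, means, variances) tuples. The function names and the toy models are illustrative assumptions.

```python
import math

def gmm_log_pdf(x, model):
    """log p(x | model) for model = (weights, means, variances), diagonal covariance."""
    terms = []
    for w, mean, var in zip(*model):
        quad = sum(math.log(2.0 * math.pi * v) + (a - m) ** 2 / v
                   for a, m, v in zip(x, mean, var))
        terms.append(math.log(w) - 0.5 * quad)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def llr_score(x, target_model, ubm):
    """S = log p(x | lambda_T) - log p(x | lambda_U)."""
    return gmm_log_pdf(x, target_model) - gmm_log_pdf(x, ubm)

# Toy models: the target model has one component moved toward 2.5.
ubm = ([0.5, 0.5], [[-1.0], [1.0]], [[1.0], [1.0]])
target = ([0.5, 0.5], [[-1.0], [2.5]], [[1.0], [0.5]])
print(llr_score([2.6], target, ubm))   # positive when the target model explains x better
print(llr_score([-4.0], target, ubm))
```

A positive score indicates that the adapted target model explains the feature vector better than the background model.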

Step four: the vibration signal shown in FIG. 4 is input into the trained CNN model for feature extraction, and the extracted feature vector is input into each of the GMMs. The likelihood value p that the feature vector belongs to each model is calculated, and the fault type of the model with the maximum p is taken as the diagnosis result.
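The maximum-likelihood decision over the fault mode library can be sketched as follows. The fault labels, the toy model parameters and the two-dimensional feature are hypothetical; in the method described here the feature would be the 512-dimensional CNN output and the library would hold one GMM per fault type.

```python
import math

def gmm_log_pdf(x, model):
    """log p(x | model) for model = (weights, means, variances), diagonal covariance."""
    terms = []
    for w, mean, var in zip(*model):
        quad = sum(math.log(2.0 * math.pi * v) + (a - m) ** 2 / v
                   for a, m, v in zip(x, mean, var))
        terms.append(math.log(w) - 0.5 * quad)
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def diagnose(feature, library):
    """Return the label whose GMM gives the highest log-likelihood, and all scores."""
    scores = {label: gmm_log_pdf(feature, model) for label, model in library.items()}
    return max(scores, key=scores.get), scores

# Hypothetical fault mode library with one GMM per fault type.
library = {
    "normal": ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    "inner_race_fault": ([0.5, 0.5], [[4.0, 0.0], [5.0, 1.0]],
                         [[1.0, 1.0], [1.0, 1.0]]),
    "outer_race_fault": ([1.0], [[0.0, 5.0]], [[1.0, 1.0]]),
}

label, scores = diagnose([4.4, 0.3], library)
print(label, scores)
```

Comparing log-likelihoods gives the same decision as comparing the likelihood values p, because the logarithm is monotonic.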

Without departing from the spirit and substance of the invention, those skilled in the art can make corresponding modifications according to the invention, but such corresponding modifications are intended to fall within the scope of the appended claims.

Claims

1. The invention is carried out by the following steps: firstly, training a deep learning model; secondly, feature extraction; thirdly, establishing a fault mode library; fourthly, diagnosing faults.

In the stage of establishing the fault mode library, the method claimed by the invention is as follows:

Using the ability of the GMM to fit the spatial distribution of feature vectors, the extracted feature vectors are modeled to obtain GMMs for multiple fault types. After the number of Gaussian components of the mixture model is determined, initial values are given for the parameters of each model, and the expectation-maximization algorithm is then applied. First, the expectation step is performed: the posterior probability of the latent variable is calculated from the initial values or from the parameter values obtained in the previous iteration, giving the current estimate of the latent variable. Then the maximization step is performed: new parameter values are obtained by maximizing the likelihood function. Finally, the GMM of each fault mode is obtained through training. In addition, a universal background model (UBM) is added. With it, the GMM of each fault mode can be trained by maximum a posteriori estimation using a small amount of target fault training data and a large amount of non-target fault training data. The GMM-UBM model differs from the original GMM model in that it is pre-trained on vibration data from non-target fault modes, which reduces the time needed to tune and train the parameters on the vibration data of the target fault mode.

2. In the fault diagnosis stage, the method claimed by the invention is as follows:

The invention provides a new fault diagnosis framework that fuses the CNN method and the GMM method. The vibration signal is input into the CNN-GMM framework, and its fault type is determined against the trained fault mode library.