CN116010900A

CN116010900A - Multi-scale feature fusion gearbox fault diagnosis method based on self-attention mechanism

Info

Publication number: CN116010900A
Application number: CN202310019070.0A
Authority: CN
Inventors: 陶洪峰; 史浩进; 沈凌志; 黄远
Original assignee: Jiangnan University
Current assignee: Jiangnan University
Priority date: 2023-01-06
Filing date: 2023-01-06
Publication date: 2023-04-25

Abstract

The invention discloses a multi-scale feature fusion gearbox fault diagnosis method based on a self-attention mechanism, which relates to the technical field of fault diagnosis and comprises the following steps: obtaining fixed-length diagnosis samples from the one-dimensional original vibration signal of the gearbox by using a random window; constructing a multi-scale feature fusion fault diagnosis model based on a self-attention mechanism, with a Softmax function as the classifier; training the model by back propagation using an Adam optimizer with dynamic clipping; and storing the trained fault diagnosis model for online diagnosis. In the invention, the low-frequency features and the local time domain features of the original gearbox vibration signal are extracted separately by convolution kernels of different scales; an improved self-attention mechanism is introduced to construct a multi-scale feature fusion network that replaces the traditional concatenation method, further mining the internal relations among the time-frequency features of the vibration signal to improve diagnosis performance; meanwhile, batch normalization is introduced to reduce internal covariate shift, so that intelligent and efficient end-to-end fault diagnosis is realized.

Description

Multi-scale feature fusion gearbox fault diagnosis method based on self-attention mechanism

Technical Field

The invention relates to the technical field of fault diagnosis, in particular to a multi-scale feature fusion gearbox fault diagnosis method based on a self-attention mechanism.

Background

The gearbox is a common component in rotating machinery and is widely used in aerospace, industrial production, wind power generation and other fields because of its large transmission ratio, strong load-bearing capacity and compact structure. Because the working environment is harsh and the rotating speed and load change frequently, various faults occur easily, which affect the normal operation of equipment and may even cause safety accidents. A fault diagnosis method for the gearbox is therefore of great value in improving equipment reliability and reducing accidents.

The vibration signal of the gearbox carries a large amount of information reflecting its fault category, and the key to vibration-based fault diagnosis is acquiring the fault-sensitive information in the signal. However, the signal changes caused by a fault are easily submerged by complex background noise and interference signals, and their multi-scale character in time makes feature extraction difficult. In addition, under variable working conditions the distribution of fault-sensitive information in the original vibration signal differs from that under the working conditions of the training set, so that different fault features alias with each other and the degree of nonlinearity of the signal changes, which makes diagnosis under variable working conditions difficult. Therefore, how to mine the fault-sensitive features of multi-level, nonlinear and non-stationary vibration signals is a primary problem in gearbox fault diagnosis.

In recent years, deep learning algorithms have often been used to automatically obtain higher-level abstract features from large-scale data, and convolutional neural networks have attracted wide attention in prediction and classification because of their strong feature extraction capability. For the multi-level character of vibration signals, using convolution kernels of different scales is an effective feature extraction method. However, the traditional concatenation method only combines the multi-scale features mechanically, which causes information redundancy, increases the difficulty of network training and reduces diagnosis accuracy. More effective feature fusion strategies are therefore needed.

Disclosure of Invention

In view of the above problems and technical requirements, the inventors provide a multi-scale feature fusion gearbox fault diagnosis method based on a self-attention mechanism. The technical scheme of the invention is as follows:

a multi-scale feature fusion gearbox fault diagnosis method based on a self-attention mechanism comprises the following steps:

step one: obtaining vibration signals of the gear box in different fault modes, wherein the vibration signals are one-dimensional time sequence data;

step two: dividing the collected samples into a training set, a verification set and a test set according to a preset proportion, wherein the method comprises the following steps: intercepting one-dimensional time sequence data by utilizing sliding windows with the same size to obtain samples, and setting each sample label according to the actual fault type; dividing the obtained sample into a training set, a verification set and a test set according to a preset proportion;

step three: constructing a multi-scale feature fusion fault diagnosis model based on a self-attention mechanism;

the multi-scale feature fusion fault diagnosis model based on the self-attention mechanism comprises a low-frequency feature extraction path and a local time domain feature extraction path which are parallel, a feature fusion network and a classification network;

the low-frequency feature extraction path and the local time domain feature extraction path each comprise convolution layers, pooling layers and batch normalization layers; the input of each path is the original vibration signal of the gearbox, each path comprises a three-layer convolution structure, and the two paths output feature vectors of the same size; the low-frequency feature extraction path adopts large convolution kernels, and the local time domain feature extraction path adopts small convolution kernels; the feature fusion network comprises a multi-head self-attention (MHSA) module, an improved convolutional block attention module (CBAM) and a batch normalization layer; the classification network comprises a pooling layer, a fully connected layer and a Softmax classifier, wherein average pooling is used to reduce the feature dimension and prevent overfitting, and one fully connected layer together with the Softmax classifier is used to classify fault features; the activation function of the CBAM module is Sigmoid, and the remaining activation functions of the model are ReLU;

step four: inputting the training set into the constructed fault diagnosis model for training;

during training, labeled samples from the training set are first input into the fault diagnosis model to obtain the prediction output, the cross entropy loss with respect to the real labels is then calculated, and the network parameters are optimized by back propagation using an Adam optimizer until the training loss stabilizes below a set value or the set number of iterations is reached;

step five: intercepting the vibration signal sample to be detected at a given length, and inputting it into the fault diagnosis model trained in step four to obtain a fault diagnosis result.
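The stopping rule in step four (train until the loss stabilizes below a set value or the iteration count is reached) can be sketched as follows. This is only one possible reading: the function name `should_stop`, the `patience` window used to interpret "stabilizes", and the threshold value 0.05 are assumptions made for illustration; the limit of 3000 iterations is the value given in the embodiment.

```python
def should_stop(loss_history, loss_threshold, patience, max_iterations):
    """Return True when training should end.

    Training ends when the iteration budget is used up, or when the last
    `patience` losses are all below `loss_threshold` (an assumed reading of
    "the training loss stabilizes below a set value").
    """
    if len(loss_history) >= max_iterations:
        return True
    if len(loss_history) < patience:
        return False
    return all(loss < loss_threshold for loss in loss_history[-patience:])


# Toy loss curve that decays geometrically; it stands in for a real training loop.
losses = []
while not should_stop(losses, loss_threshold=0.05, patience=5, max_iterations=3000):
    losses.append(0.9 ** len(losses))

stopped_at = len(losses)
```

In a real training loop the appended value would be the cross entropy loss of the current batch or epoch.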

The further technical scheme is that in the second step:

the size of the sliding window is 1024, and the interception mode is random interception; the ratio of the training set, the validation set and the test set is 6:3:1, and the sample labels are set to 0, 1.
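A minimal sketch of the sampling described here, assuming the signal is held as a plain Python list: windows of length 1024 are cut at random start positions and the samples are then split 6:3:1. The function names, the fixed random seeds and the synthetic signal are illustrative assumptions, not part of the method as disclosed.

```python
import random


def random_window_samples(signal, label, num_samples, window=1024, rng=None):
    """Cut `num_samples` fixed-length windows at random start positions."""
    rng = rng or random.Random(0)
    last_start = len(signal) - window
    if last_start < 0:
        raise ValueError("signal is shorter than the window")
    samples = []
    for _ in range(num_samples):
        start = rng.randint(0, last_start)  # randint is inclusive on both ends
        samples.append((signal[start:start + window], label))
    return samples


def split_6_3_1(samples, rng=None):
    """Shuffle, then split into training, validation and test sets at 6:3:1."""
    rng = rng or random.Random(0)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_val = len(shuffled) * 3 // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Synthetic stand-in for one recorded vibration signal (not real data).
generator = random.Random(1)
signal = [generator.uniform(-1.0, 1.0) for _ in range(20000)]

samples = random_window_samples(signal, label=0, num_samples=400)
train, val, test = split_6_3_1(samples)
```

With 400 samples per fault type, the split yields 240, 120 and 40 samples, which matches the per-class counts given in the embodiment.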

The further technical scheme is that in the third step:

the low-frequency feature extraction path comprises a first convolution layer, a second convolution layer, a first pooling layer, a third convolution layer and an adaptive pooling layer I which are sequentially connected, and the local time domain feature extraction path comprises a fourth convolution layer, a second pooling layer, a fifth convolution layer, a third pooling layer, a sixth convolution layer and a fourth pooling layer which are sequentially connected; because the gearbox vibration signal input is one-dimensional, each convolution layer is a one-dimensional convolution; the first to fourth pooling layers adopt maximum pooling; the adaptive pooling layer I performs average pooling on its input according to a given output dimension and is used for dimension trimming; after each convolution layer and after the MHSA module, a batch normalization layer is introduced to adjust covariate shift and thereby improve training performance; for samples $x_i$ input in batches, the batch normalization layer performs the following operations:

$$\mu=\frac{1}{m}\sum_{i=1}^{m}x_i,\qquad \sigma^{2}=\frac{1}{m}\sum_{i=1}^{m}(x_i-\mu)^{2}$$

$$\hat{x}_i=\gamma\,\frac{x_i-\mu}{\sqrt{\sigma^{2}+\varepsilon}}+\beta$$

wherein $\mu$ and $\sigma^{2}$ are the mean and variance of the batch, $m$ is the total number of input samples, $\varepsilon$ is a constant, $\gamma$ and $\beta$ are learnable parameters, and $\hat{x}_i$ is the output.
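The batch normalization operation described above can be illustrated with a short sketch on a batch of scalars. The default values for `gamma`, `beta` and `eps` are assumptions chosen for the example; the text only states that ε is a constant and that γ and β are learned.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a mini-batch of scalars, then scale by gamma and shift by beta."""
    m = len(batch)
    mu = sum(batch) / m
    var = sum((x - mu) ** 2 for x in batch) / m  # biased (population) variance
    return [gamma * (x - mu) / math.sqrt(var + eps) + beta for x in batch]


normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
```

With γ = 1 and β = 0 the output has zero mean and a variance very close to one; in a network the same computation is applied per channel.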

The further technical scheme is that in the third step:

the MHSA module uses projections under different parameters to learn similarity-based fusion features of the vibration signal; the MHSA module has three inputs of the same dimension, namely a query matrix Q, a key matrix K and a value matrix V; the low-frequency feature vector output by the low-frequency feature extraction path is taken as both the Q input and the K input and serves as the main feature; the local time domain feature vector output by the local time domain feature extraction path is taken as the V input and serves as the auxiliary feature; the main feature and the auxiliary feature each undergo multiple linear projections for attention calculation, and the calculation formulas are as follows:

$$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

$$\mathrm{head}_i=\mathrm{Attention}\left(QW_i^{Q},\,KW_i^{K},\,VW_i^{V}\right)$$

$$\mathrm{MHSA}(Q,K,V)=\mathrm{Concat}(\mathrm{head}_1,\dots,\mathrm{head}_h)\,W^{O}$$

wherein $d_k$ is the dimension of the query vector, $h$ is the number of heads, $W^{O}$ is the output projection matrix, and $QW_i^{Q}$, $KW_i^{K}$ and $VW_i^{V}$ are the corresponding vectors after projection;
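To make the roles of the main and auxiliary features concrete, the following sketch computes scaled dot-product attention with the low-frequency features as Q and K and the local time domain features as V. It is a simplification under stated assumptions: a single head, no learned projection matrices, and toy 3×2 feature matrices invented for the example.

```python
import math


def softmax(row):
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an n x d matrix by a d x m matrix (lists of rows)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def attention(q, k, v):
    """Scaled dot-product attention; returns the fused output and the weights."""
    d_k = len(q[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


low_freq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # main feature: used as Q and K
local = [[0.5, 0.1], [0.2, 0.9], [0.3, 0.3]]     # auxiliary feature: used as V
fused, weights = attention(low_freq, low_freq, local)
```

Each row of `weights` is computed only from the low-frequency features, so the similarity structure of the main feature decides how the local time domain features are mixed.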

the CBAM module injects attention maps along two independent dimensions of the feature vector, channel and space; in channel attention, the spatial dimension of the input feature map is compressed; in spatial attention, average pooling and maximum pooling are performed along the channel dimension and the results are concatenated to describe the feature information; by assigning the respective attention weights, the input feature vector undergoes adaptive feature refinement, and fault-sensitive information is further learned; the input of the CBAM module is the output of the MHSA module after batch normalization, and the mathematical model of the CBAM module is as follows:

$$M_c(F)=\sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F))+\mathrm{MLP}(\mathrm{MaxPool}(F))\big)$$

$$M_s(F)=\sigma\big(f[\mathrm{AvgPool}(F);\,\mathrm{MaxPool}(F)]\big)$$

$$F'=M_c(F)\otimes F,\qquad F''=M_s(F')\otimes F'$$

wherein $F$ is the feature vector output by the MHSA module, $M_c$ and $M_s$ are the channel attention and the spatial attention respectively, $F'$ and $F''$ are the features refined by the channel attention and then by the spatial attention, $\otimes$ denotes element-wise multiplication, MLP denotes a multi-layer perceptron, AvgPool and MaxPool denote average pooling and maximum pooling respectively, $f$ is a convolution operation, and $\sigma$ is the Sigmoid activation function.

The method further comprises the following steps:

because one-dimensional convolution paths are used for feature extraction, the spatial attention is improved: the spatial attention of the improved 1D-CBAM module takes the same form as the original channel attention, with parallel fully connected layers replacing the convolution operation; compared with the local connections of a convolution kernel, the fully connected structure has a global receptive field and can assign attention weights over the whole feature; meanwhile, in order to prevent the gradient attenuation caused by the Sigmoid function, a residual connection is introduced between the output of the MHSA module and the main feature input; the mathematical model of the improved 1D-CBAM module is updated as follows:

$$M_s(F)=\sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F))+\mathrm{MLP}(\mathrm{MaxPool}(F))\big)$$

the feature fusion network first uses the MHSA module to preliminarily fuse the input multi-scale features and capture the joint correspondence of fault information, with a residual connection between the output of the MHSA module and the main feature input to emphasize the role of the low-frequency features; next, batch normalization adjusts the covariate shift to improve the training performance of the model; finally, the CBAM module highlights the fault-sensitive part of the fused features, and its output has the same dimension as its input, which improves the information expression capability.
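The improved 1D-CBAM can be sketched on a small channels × length feature map as follows. Several details are assumptions made for illustration and are not fixed by the text: the channel attention is applied before the spatial attention, each MLP has one hidden ReLU layer, the two pooled descriptors share one MLP, and the weights are random toy values rather than trained parameters.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(vec, weights):
    """Fully connected layer without bias: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def mlp(vec, w1, w2):
    hidden = [max(0.0, h) for h in linear(vec, w1)]  # ReLU
    return linear(hidden, w2)


def attention_weights(avg_vec, max_vec, w1, w2):
    """sigma(MLP(avg) + MLP(max)), the form shared by both attention branches."""
    a = mlp(avg_vec, w1, w2)
    b = mlp(max_vec, w1, w2)
    return [sigmoid(x + y) for x, y in zip(a, b)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def cbam_1d(feature, channel_mlp, spatial_mlp):
    # Channel attention: pool over the length axis, one weight per channel.
    avg_c = [sum(row) / len(row) for row in feature]
    max_c = [max(row) for row in feature]
    m_c = attention_weights(avg_c, max_c, *channel_mlp)
    refined = [[m * x for x in row] for m, row in zip(m_c, feature)]
    # Spatial attention: pool over the channel axis, one weight per position.
    columns = list(zip(*refined))
    avg_s = [sum(col) / len(col) for col in columns]
    max_s = [max(col) for col in columns]
    m_s = attention_weights(avg_s, max_s, *spatial_mlp)
    return [[m * x for m, x in zip(m_s, row)] for row in refined]


rng = random.Random(0)
C, L, H = 4, 6, 2  # channels, length, hidden width (toy sizes)
feature = random_matrix(C, L, rng)
channel_mlp = (random_matrix(H, C, rng), random_matrix(C, H, rng))
spatial_mlp = (random_matrix(H, L, rng), random_matrix(L, H, rng))
refined = cbam_1d(feature, channel_mlp, spatial_mlp)
```

The output keeps the C × L shape of the input, which is the same-dimension property the text relies on, and every element is scaled by weights between 0 and 1.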

The further technical scheme is that in the third step:

the classification network comprises a Flatten layer, an adaptive pooling layer II, a fully connected layer and a Softmax classifier which are connected in sequence; the adaptive pooling layer II reduces the dimension of the fused features and prevents overfitting; the Softmax classifier is a supervised learning classifier whose output is a one-dimensional vector, and the value at each position of the vector corresponds to the probability of a fault type; assuming that the training set contains N samples divided into C categories, the predicted output of the i-th sample is denoted $y_i$ ($y_i\in\{1,2,\dots,C\}$), and the probability that the input sample $x_i$ belongs to class c is denoted $P(y_i=c\mid x_i)$; the output value $g_{w,b}(x_i)$ at each position of the Softmax function is expressed as:

$$g_{w,b}(x_i)=\begin{bmatrix}P(y_i=1\mid x_i)\\ \vdots\\ P(y_i=C\mid x_i)\end{bmatrix}=\frac{1}{\sum_{c=1}^{C}e^{w^{c}x_i+b^{c}}}\begin{bmatrix}e^{w^{1}x_i+b^{1}}\\ \vdots\\ e^{w^{C}x_i+b^{C}}\end{bmatrix}$$

wherein $w^{c}$ and $b^{c}$ are the parameters of each fault type; the final classification result of the Softmax classifier is the fault type corresponding to the position with the highest probability value.
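As a small worked example of the classifier described above, the sketch below applies one fully connected layer followed by Softmax and returns the position with the highest probability. The three-class weights, biases and input features are toy values chosen for illustration only.

```python
import math


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """One fully connected layer followed by Softmax; returns (class, probabilities)."""
    logits = [sum(w * x for w, x in zip(w_c, features)) + b_c
              for w_c, b_c in zip(weights, biases)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # one row of parameters per class
biases = [0.0, 0.0, 0.0]
predicted, probs = classify([0.2, 1.5], weights, biases)
```

Here the logits are 0.2, 1.5 and -1.7, so the second class (index 1) receives the highest probability.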

The further technical scheme is that in the fourth step:

the cross entropy loss function is used together with the Softmax classifier to measure the difference between the predicted category and the actual fault category, and the internal parameters of the model are updated by back propagation with the aim of minimizing the loss function; the mathematical expression of the cross entropy loss function is as follows:

$$L=-\frac{1}{n}\sum_{i=1}^{n}d_i\log y_i$$

wherein $n$ is the number of samples; $d_i$ and $y_i$ are the true value and the predicted value of the i-th sample respectively;
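The cross entropy loss can be illustrated with integer class labels, where the one-hot true value selects the predicted probability of the correct class. Averaging over the samples is an assumption of this sketch; the probabilities are toy values.

```python
import math


def cross_entropy(true_labels, predicted_probs):
    """Mean negative log-probability assigned to the true class."""
    n = len(true_labels)
    return -sum(math.log(p[d]) for d, p in zip(true_labels, predicted_probs)) / n


labels = [0, 2]
probs = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
loss = cross_entropy(labels, probs)
```

The loss falls to zero only when the model assigns probability one to the correct class of every sample, which is why minimizing it drives the Softmax output toward the true labels.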

the Adam optimizer is a first-order optimization algorithm that can replace the traditional stochastic gradient descent process; it accelerates network convergence and helps prevent the model from falling into a local optimum. The iterative process of the Adam optimizer is as follows:

$$m_t=\beta_1 m_{t-1}+(1-\beta_1)\,g_t$$

$$n_t=\beta_2 n_{t-1}+(1-\beta_2)\,g_t^{2}$$

$$\hat{m}_t=\frac{m_t}{1-\beta_1^{t}},\qquad \hat{n}_t=\frac{n_t}{1-\beta_2^{t}}$$

$$\theta_t=\theta_{t-1}-\frac{\eta}{\sqrt{\hat{n}_t}+\varepsilon}\,\hat{m}_t$$

wherein $m_t$ and $n_t$ are the first-order and second-order moment estimates of the gradient $g_t$ of the objective function, $t$ denotes the current iteration batch and $t-1$ the previous batch; $\beta_1$ and $\beta_2$ are the exponential decay rates of the moment estimates; $\hat{m}_t$ and $\hat{n}_t$ are the bias corrections of $m_t$ and $n_t$; $\theta$ denotes the model parameters, $\eta$ the learning rate, and $\varepsilon$ takes $10^{-8}$;

Because the Adam optimizer can produce extreme learning rates in the later stage of training, which affects model convergence, the learning rate is dynamically clipped: upper and lower bounds on the learning rate are given to stabilize convergence in the later stage.
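The text does not give the exact clipping rule, so the following is only a sketch of one plausible form: the effective per-parameter step size η/(√n̂ + ε) is clamped between a lower and an upper bound before the update. The bounds 0.001 and 0.01 and the learning rate 0.01 are the values from the embodiment, ε is the stated 10⁻⁸, and β₁ = 0.9, β₂ = 0.999 are the usual Adam defaults, assumed here because the text does not state them. The function name and the toy objective are hypothetical.

```python
import math


def adam_clipped_step(theta, grad, m, n, t, lr=0.01, beta1=0.9, beta2=0.999,
                      eps=1e-8, lower=0.001, upper=0.01):
    """One Adam update on a scalar parameter with a bounded effective step size."""
    m = beta1 * m + (1 - beta1) * grad
    n = beta2 * n + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    n_hat = n / (1 - beta2 ** t)  # bias-corrected second moment
    step_size = lr / (math.sqrt(n_hat) + eps)
    step_size = min(max(step_size, lower), upper)  # clipping (assumed form)
    return theta - step_size * m_hat, m, n


theta, m, n = 5.0, 0.0, 0.0
for t in range(1, 3001):
    grad = 2.0 * theta  # gradient of the toy objective theta**2
    theta, m, n = adam_clipped_step(theta, grad, m, n, t)
```

Without the clamp, the step size grows without limit as the second-moment estimate shrinks near a minimum; the bounds keep late-stage updates within a fixed range.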

The beneficial technical effects of the invention are as follows:

1) Through multi-scale feature extraction and an effective feature fusion strategy, the method extracts fault-sensitive information well from the original vibration signal of the gearbox for fault diagnosis;

2) Feature fusion is performed by an embedded improved self-attention mechanism, which solves the problem that directly concatenating (Concat) features in conventional multi-scale models produces redundant feature information and degrades classification performance; the input feature dimension is identical to the fused feature dimension, which improves the expression capability of the feature information; the fusion strategy, which takes frequency domain features as the main features and local time domain features as auxiliary features, adapts effectively to gearbox operating environments with changing working conditions, giving the model a degree of cross-domain diagnosis performance;

3) The constructed fault diagnosis model does not require the input to be preprocessed by traditional signal processing techniques; it automatically acquires low-level fault features, highlights fault-sensitive information and directly outputs the fault category of the gearbox, realizing intelligent end-to-end fault diagnosis.

Drawings

FIG. 1 is a flow chart of the gearbox fault diagnosis method provided by the present application.

FIG. 2 is a schematic diagram of the multi-scale feature extraction provided by the present application.

FIG. 3 is a schematic diagram of the multi-scale feature fusion network based on a self-attention mechanism provided by the present application.

FIG. 4 is a block diagram of the multi-scale feature fusion fault diagnosis model based on a self-attention mechanism provided by the present application.

Detailed Description

The following describes the embodiments of the present invention further with reference to the drawings.

The application provides a gearbox fault diagnosis method based on multi-scale feature fusion (MSC-MHSA-CBAM) of a self-attention mechanism, as shown in figure 1, and the specific implementation mode of the method comprises the following steps:

step one: vibration signals of the gear box under different fault modes are obtained through the signal acquisition equipment.

Specifically, an accelerometer collects the vibration signal on the side of the gearbox over a period of time, and the signal is transferred to a computer through a data cable for storage, giving signal samples in the form of one-dimensional time series data; the fault mode of the gearbox at the time is recorded. The experimental platform consists of a driving motor, a controller, a planetary gearbox, a parallel gearbox and a brake. The motor is a 3-phase, 3 HP motor powered by three-phase alternating current (230 V, 60/50 Hz). Four planetary gear fault modes and four bearing fault modes are prefabricated on the planetary gearbox. The gear faults are tooth surface wear, missing tooth, root crack and broken tooth. The bearing faults are ball fault, inner ring fault, outer ring fault and a combined fault of these three. Together with the normal state, 9 kinds of vibration signals are collected in total.

Step two: dividing the acquired original vibration signal samples into a training set, a verification set and a test set.

Specifically, the one-dimensional vibration data are intercepted with a random sliding window of length 1024; for each fault type, 240 samples are selected for the training set, 120 for the verification set and 40 for the test set, and the sample set is shuffled before being input to the model.

Step three: constructing the MSC-MHSA-CBAM fault diagnosis model.

The model is built with the PyTorch deep learning framework. The MSC-MHSA-CBAM model comprises three parts: a multi-scale feature extraction network, a feature fusion network and a final classification network. The multi-scale feature extraction network comprises a low-frequency feature extraction path and a local time domain feature extraction path; following the principle shown in FIG. 2, convolution paths of different scales are constructed to extract the periodic low-frequency features and the local time domain detail features of the original vibration signal, preliminarily extracting fault features at different levels of the signal; a feature fusion network is then constructed with the self-attention mechanism, as shown in FIG. 3, to capture fault-sensitive information; finally, the predicted label is output through a pooled, fully connected classification network. The overall structure of the MSC-MHSA-CBAM model is shown in FIG. 4.

Step four: inputting the training set samples into the MSC-MHSA-CBAM fault diagnosis model in batches for training, and adjusting the model parameters according to the diagnosis performance on the verification set.

In this embodiment, the batch size is set to 64, the initial learning rate of the Adam optimizer and the upper limit of dynamic clipping are both 0.01, the lower limit is 0.001, and the maximum number of training iterations is 3000. The internal parameters are updated by minimizing the cross entropy loss function until the training loss stabilizes below the set value or the maximum number of iterations is reached. The batch size, learning rate and network parameters were tuned through experience and repeated experiments, the final model was saved, and the parameter settings are shown in Table 1.

Table 1 MSC-MHSA-CBAM model parameters

Step five: inputting the verification set into a trained gearbox fault diagnosis model to perform online fault diagnosis, obtaining a diagnosis result, and verifying the fault diagnosis performance of the model.

What has been described above is only a preferred embodiment of the present application, and the present invention is not limited to the above examples. It is to be understood that other modifications and variations which may be directly derived or contemplated by those skilled in the art without departing from the spirit and concepts of the present invention are deemed to be included within the scope of the present invention.

Claims

1. The multi-scale feature fusion gearbox fault diagnosis method based on the self-attention mechanism is characterized by comprising the following steps of:

step two: intercepting the one-dimensional time sequence data by utilizing a sliding window with the same size to obtain samples, and setting each sample label according to the actual fault type; dividing the obtained sample into a training set, a verification set and a test set according to a preset proportion;

the multi-scale feature fusion fault diagnosis model based on the self-attention mechanism comprises a low-frequency feature extraction path and a local time domain feature extraction path which are parallel, a feature fusion network and a classification network;

the low-frequency feature extraction path and the local time domain feature extraction path each comprise convolution layers, pooling layers and batch normalization layers; the input of each path is the original vibration signal of the gearbox, and the two paths output feature vectors of the same size; the low-frequency feature extraction path adopts large convolution kernels, and the local time domain feature extraction path adopts small convolution kernels; the feature fusion network comprises an MHSA module, an improved CBAM module and a batch normalization layer; the classification network comprises a pooling layer, a fully connected layer and a Softmax classifier, wherein average pooling is used to reduce the feature dimension, and one fully connected layer together with the Softmax classifier is used to classify fault features; the activation function of the CBAM module is Sigmoid, and the remaining activation functions of the model are ReLU;

during training, labeled samples from the training set are first input into the fault diagnosis model to obtain a prediction output, the cross entropy loss with respect to the real labels is then calculated, and the network parameters are optimized by back propagation using an Adam optimizer until the training loss stabilizes below a set value or the set number of iterations is reached;

2. The self-attention mechanism based multi-scale feature fusion gearbox fault diagnosis method of claim 1, wherein in step two:

the size of the sliding window is 1024, and the interception mode is random interception; the ratio of the training set, the verification set and the test set is 6:3:1, and the sample labels are set to 0, 1.

3. The self-attention mechanism based multiscale feature fusion gearbox fault diagnosis method of claim 1, wherein in step three:

the low-frequency feature extraction path comprises a first convolution layer, a second convolution layer, a first pooling layer, a third convolution layer and an adaptive pooling layer I which are sequentially connected, and the local time domain feature extraction path comprises a fourth convolution layer, a second pooling layer, a fifth convolution layer, a third pooling layer, a sixth convolution layer and a fourth pooling layer which are sequentially connected; because the gearbox vibration signal input is one-dimensional, each convolution layer is a one-dimensional convolution; the first to fourth pooling layers adopt maximum pooling; the adaptive pooling layer I performs average pooling on its input according to a given output dimension and is used for dimension trimming; after each convolution layer and after the MHSA module, a batch normalization layer is introduced to adjust covariate shift and thereby improve training performance; for samples $x_i$ input in batches, the batch normalization layer performs the following operations:

$$\mu=\frac{1}{m}\sum_{i=1}^{m}x_i,\qquad \sigma^{2}=\frac{1}{m}\sum_{i=1}^{m}(x_i-\mu)^{2}$$

$$\hat{x}_i=\gamma\,\frac{x_i-\mu}{\sqrt{\sigma^{2}+\varepsilon}}+\beta$$

wherein $\mu$ and $\sigma^{2}$ are the mean and variance of the batch, $m$ is the total number of input samples, $\varepsilon$ is a constant, $\gamma$ and $\beta$ are learnable parameters, and $\hat{x}_i$ is the output.

4. The self-attention mechanism based multiscale feature fusion gearbox fault diagnosis method of claim 1, wherein in step three:

the MHSA module uses projections under different parameters to learn similarity-based fusion features of the vibration signal; the MHSA module has three inputs of the same dimension, namely a query matrix Q, a key matrix K and a value matrix V; the low-frequency feature vector output by the low-frequency feature extraction path is taken as both the Q input and the K input and serves as the main feature; the local time domain feature vector output by the local time domain feature extraction path is taken as the V input and serves as the auxiliary feature; the main feature and the auxiliary feature each undergo multiple linear projections for attention calculation, and the calculation formulas are as follows:

$$\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$$

$$\mathrm{head}_i=\mathrm{Attention}\left(QW_i^{Q},\,KW_i^{K},\,VW_i^{V}\right)$$

$$\mathrm{MHSA}(Q,K,V)=\mathrm{Concat}(\mathrm{head}_1,\dots,\mathrm{head}_h)\,W^{O}$$

wherein $d_k$ is the dimension of the query vector, $h$ is the number of heads, $W^{O}$ is the output projection matrix, and $QW_i^{Q}$, $KW_i^{K}$ and $VW_i^{V}$ are the corresponding vectors after projection;

the CBAM module injects attention maps along two independent dimensions of the feature vector, channel and space; in channel attention, the spatial dimension of the input feature map is compressed; in spatial attention, average pooling and maximum pooling are performed along the channel dimension and the results are concatenated to describe the feature information; by assigning the respective attention weights, the input feature vector undergoes adaptive feature refinement, and fault-sensitive information is further learned; the input of the CBAM module is the output of the MHSA module after batch normalization, and the mathematical model of the CBAM module is as follows:

$$M_c(F)=\sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F))+\mathrm{MLP}(\mathrm{MaxPool}(F))\big)$$

$$M_s(F)=\sigma\big(f[\mathrm{AvgPool}(F);\,\mathrm{MaxPool}(F)]\big)$$

5. The self-attention mechanism based multiscale feature fusion gearbox fault diagnosis method of claim 4, further comprising:

the spatial attention is improved: the spatial attention of the improved 1D-CBAM module takes the same form as the original channel attention, with parallel fully connected layers replacing the convolution operation; meanwhile, a residual connection is introduced between the output of the MHSA module and the main feature input, and the mathematical model of the improved 1D-CBAM module is updated as follows:

$$M_s(F)=\sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F))+\mathrm{MLP}(\mathrm{MaxPool}(F))\big)$$

6. The self-attention mechanism based multiscale feature fusion gearbox fault diagnosis method of claim 1, wherein in step three:

the classification network comprises a Flatten layer, an adaptive pooling layer II, a fully connected layer and a Softmax classifier which are connected in sequence; the adaptive pooling layer II reduces the dimension of the fused features and prevents overfitting; the Softmax classifier is a supervised learning classifier whose output is a one-dimensional vector, and the value at each position of the vector corresponds to the probability of a fault type; assuming that the training set contains N samples divided into C categories, the predicted output of the i-th sample is denoted $y_i$ ($y_i\in\{1,2,\dots,C\}$), and the probability that the input sample $x_i$ belongs to class c is denoted $P(y_i=c\mid x_i)$; the output value $g_{w,b}(x_i)$ at each position of the Softmax function is expressed as:

$$g_{w,b}(x_i)=\begin{bmatrix}P(y_i=1\mid x_i)\\ \vdots\\ P(y_i=C\mid x_i)\end{bmatrix}=\frac{1}{\sum_{c=1}^{C}e^{w^{c}x_i+b^{c}}}\begin{bmatrix}e^{w^{1}x_i+b^{1}}\\ \vdots\\ e^{w^{C}x_i+b^{C}}\end{bmatrix}$$

wherein $w^{c}$ and $b^{c}$ are the parameters of each fault type; the final classification result of the Softmax classifier is the fault type corresponding to the position with the maximum probability value.

7. The self-attention mechanism based multiscale feature fusion gearbox fault diagnosis method of claim 1, wherein in step four:

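a cross entropy loss function is used together with the Softmax classifier to measure the difference between the predicted category and the actual fault category, and the internal parameters of the model are updated by back propagation with the aim of minimizing the loss function, wherein the mathematical expression of the cross entropy loss function is as follows:

$$L=-\frac{1}{n}\sum_{i=1}^{n}d_i\log y_i$$

wherein $n$ is the number of samples, and $d_i$ and $y_i$ are the true value and the predicted value of the i-th sample respectively;

the iterative process of the Adam optimizer is as follows:

$$m_t=\beta_1 m_{t-1}+(1-\beta_1)\,g_t$$

$$n_t=\beta_2 n_{t-1}+(1-\beta_2)\,g_t^{2}$$

$$\hat{m}_t=\frac{m_t}{1-\beta_1^{t}},\qquad \hat{n}_t=\frac{n_t}{1-\beta_2^{t}}$$

$$\theta_t=\theta_{t-1}-\frac{\eta}{\sqrt{\hat{n}_t}+\varepsilon}\,\hat{m}_t$$

wherein $m_t$ and $n_t$ are the first-order and second-order moment estimates of the gradient $g_t$ of the objective function, $\beta_1$ and $\beta_2$ are the exponential decay rates of the moment estimates, $\hat{m}_t$ and $\hat{n}_t$ are the bias corrections of $m_t$ and $n_t$, $\theta$ denotes the model parameters, $\eta$ the learning rate, and $\varepsilon$ takes $10^{-8}$;

the learning rate is dynamically clipped, with upper and lower bounds set on the learning rate to stabilize model convergence in the later stage.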