CN114970605A

CN114970605A - Multi-mode feature fusion neural network refrigeration equipment fault diagnosis method

Info

Publication number: CN114970605A
Application number: CN202210485072.4A
Authority: CN
Inventors: 陈志奎; 王峰; 彭宇辰; 王铁; 钟芳明; 李季; 任浩
Original assignee: Dalian Bingshan Guardian Automation Co ltd; Dalian University of Technology
Current assignee: Dalian Bingshan Guardian Automation Co ltd; Dalian University of Technology
Priority date: 2022-05-06
Filing date: 2022-05-06
Publication date: 2022-08-30

Abstract

The invention discloses a refrigeration equipment fault diagnosis method based on a multi-modal feature fusion neural network, which belongs to the technical field of computers and comprises the following steps: 1) data preprocessing; 2) feature extraction and fusion; 3) neural network training and optimization; 4) fault classification and diagnosis. The invention mainly addresses the problem of fault diagnosis with single-modal input: because single-modal input data carries limited information, existing machine learning methods have weak feature extraction capability and low accuracy, and easily misjudge faults that are difficult to distinguish. The invention therefore uses time-frequency analysis to convert the time-domain signal into an additional input modality and applies attention-based feature fusion, so that the complementary information between modalities is fully exploited and diagnosis accuracy is improved. Experimental verification shows that the method effectively realizes multi-modal feature extraction and fusion and solves the problem of low diagnosis accuracy with single-modal input.

Description

Multi-mode feature fusion neural network refrigeration equipment fault diagnosis method

Technical Field

The invention belongs to the technical field of computers and relates to a fault diagnosis method for refrigeration equipment based on a multi-modal feature fusion neural network.

Background

Refrigeration equipment is an important part of the machinery manufacturing industry. It is indispensable for developing advanced productivity and enhancing comprehensive national strength, and it plays an important role in fields such as industrial production, daily life and national defense. A failure of refrigeration equipment reduces productivity, raises maintenance costs and causes extensive downtime, and can even lead to irreversible losses. It is therefore imperative to develop a practical fault diagnosis method that ensures the stable operation of refrigeration equipment. At present, large-scale refrigeration equipment mainly uses screw compressors. During operation of a screw compressor, a rotor fault usually causes abnormal vibration of parts such as the rotating shaft, gears and bearings and changes the intensity distribution of the vibration signal, so analysis based on vibration signals is currently a commonly used fault diagnosis method for refrigeration equipment. The fault diagnosis task is a pattern recognition problem in the field of artificial intelligence.

Fault diagnosis methods based on vibration signal analysis can be roughly divided into two categories: methods based on signal processing and methods based on machine learning. Diagnosis methods based on signal processing rely on the professional and empirical knowledge of researchers and require manual feature extraction, so they place high demands on the researchers' expertise, and the quality of the extracted features is subjective and inconsistent. In addition, such methods are time-consuming, labor-intensive and not very accurate, and they can hardly meet the requirements of high-precision, large-scale automatic diagnosis of mechanical faults. Methods based on machine learning generally apply efficient time-frequency analysis methods such as the short-time Fourier transform, continuous wavelet transform and Hilbert-Huang transform to extract the fault features contained in vibration signals, and then classify the features with classical machine learning methods such as support vector machines and random forests, which reduces the dependence on expert knowledge. However, the diagnosis accuracy of such methods depends on the choice of analysis method and on whether the features accurately express the fault information, so these methods are considerably limited.

With the development of artificial intelligence technology, deep learning, a leading data-driven approach, has removed the dependence on manual feature engineering through its strong nonlinear representation capability and its ability to adaptively extract features from big data. Deep learning methods perform well under complex conditions such as strong environmental noise, weak early fault signatures and varied attributes, and are widely applied in the field of fault diagnosis. Many existing deep learning methods design novel multi-modal feature fusion schemes to make full use of the complementary information between modalities, so that the multi-modal system can still operate when the data of one modality is missing. A more robust diagnostic model can therefore be built by using data from multiple modalities.

Disclosure of Invention

Most existing bearing fault diagnosis methods for refrigeration equipment are based on single-modal vibration signals and make their judgment with a signal analysis method or a machine learning method, so their robustness is poor. To solve the problems of existing vibration-signal-based fault diagnosis methods and to take full advantage of deep learning, the invention provides a refrigeration equipment fault diagnosis method based on a multi-modal feature fusion neural network. The method replaces the existing single-modal input with a multi-modal input consisting of one-dimensional vibration signals and two-dimensional time-frequency spectrograms, and uses a convolutional neural network with strong feature extraction capability together with an LSTM network that effectively learns temporal dependencies, thereby addressing the limited information in single-modal data and the weak feature extraction capability of machine learning methods.

The technical scheme of the invention is as follows:

A refrigeration equipment fault diagnosis method based on a multi-modal feature fusion neural network adopts multi-modal input and comprises the following steps:

Step 1: acquiring bearing vibration signals of the screw compressor covering all states under the same load condition as original sample data, normalizing the data and then performing time-frequency analysis to obtain time-frequency spectrograms by continuous wavelet transform, and dividing a training set and a test set with the vibration signals and the time-frequency spectrograms as the data set;

Step 2: constructing a neural network, extracting features separately from the training data of the two modalities, namely the original one-dimensional vibration signal and the two-dimensional time-frequency spectrogram obtained in step 1, adding an attention module and performing feature fusion, applying a weighted transformation to the fused feature sequence, and then feeding it into an LSTM network and a fully connected layer to classify faults;

Step 3: inputting the training data and starting iterative training, optimizing the hyper-parameters until the convolutional neural network and the LSTM network converge; after model training is finished, fixing the parameters of each network layer, inputting the test set data into the model and evaluating whether the diagnosis accuracy meets the actual requirement; if it does not, continuing to optimize the network structure and hyper-parameters until the highest accuracy is reached, and saving the network model and hyper-parameters in the optimal state;

Step 4: collecting vibration signals of the screw compressor rotor bearing to be diagnosed, generating a time-frequency spectrogram in the same way as in the training process, and inputting the vibration signals and the time-frequency spectrogram into the saved network model to obtain the final diagnosis result.

The invention has the following beneficial effects: for the problem of diagnosing faults in the screw compressors used by large-scale refrigeration equipment, a refrigeration equipment fault diagnosis method based on a multi-modal feature fusion neural network is designed. Time-frequency spectrograms carrying richer information are generated from the sequential vibration signals by continuous wavelet transform, and multi-modal feature fusion fully exploits the large amount of complementary information between the modalities, which improves the robustness of the diagnosis model. An attention mechanism is introduced, and diagnosis accuracy is improved by combining the strong feature perception capability of the convolutional neural network with the LSTM network's ability to learn temporal dependencies.

Drawings

FIG. 1 is an overall step diagram of the present invention.

FIG. 2 is a diagram of a model framework of the present invention.

FIG. 3 is a block diagram of an attention module employed in the present invention.

Detailed Description

The following further describes a specific embodiment of the present invention with reference to the drawings and technical solutions.

FIG. 1 is an overall step diagram of the present invention. As shown in the figure, the working process of the refrigeration equipment fault diagnosis method based on the multi-modal feature fusion neural network comprises four stages: a data preprocessing stage; a convolutional neural network training and feature extraction stage; a multi-modal feature fusion stage; and a fault classification and diagnosis stage.

The method comprises the following specific steps:

Step 1, data preprocessing;

In the data preprocessing stage, the original data in the data set is processed into a form that can be input into the neural network, and a time-frequency spectrogram is generated by applying a continuous wavelet transform to the original time-domain vibration signal. Used as input, the preprocessed samples make feature extraction and feature perception easier for the convolutional neural network. The detailed steps are as follows:

(1) Bearing vibration signals of the screw compressor covering all states under the same load condition are collected as original sample data, and irrelevant information such as column names, serial numbers and dates is removed from the original sample data to obtain the initial sample data.

In this embodiment, training samples are generated with a sliding window of 2048 data points and a sliding step of 1000 points; 120 samples are collected for each fault condition, giving 1200 samples in total.
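The sliding-window sampling of this embodiment can be illustrated with a minimal sketch. The window size (2048), step (1000) and per-condition sample count (120) come from the text; the function name `sliding_windows` and the synthetic signal are assumptions made only for illustration.

```python
def sliding_windows(signal, window=2048, step=1000, max_samples=120):
    """Cut a 1-D signal into overlapping fixed-length segments."""
    samples = []
    start = 0
    # Stop when the window no longer fits or enough samples were collected.
    while start + window <= len(signal) and len(samples) < max_samples:
        samples.append(signal[start:start + window])
        start += step
    return samples


# Hypothetical recording of one fault condition (not real sensor data).
# 120 windows need 2048 + 119 * 1000 = 121048 points.
signal = [float(i % 50) for i in range(121048)]
segments = sliding_windows(signal)
```

Because the step is smaller than the window, consecutive samples overlap by 1048 points, which lets a single recording yield the 120 samples needed per fault condition.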

(2) Owing to the inherent mechanical characteristics of the equipment, values in the initial sample data are often too high or too low and can differ by up to two orders of magnitude. The initial sample data is therefore normalized to eliminate differences in scale, prevent exploding or vanishing gradients, and help the model learn and extract features better. The normalization method used for the initial sample data is shown in formula (1):

In formula (1): x_i represents the value of the i-th data point in the sample data sequence, n represents the number of sample points in each sample, max{·} takes the maximum value, and min{·} takes the minimum value.
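Formula (1) is not reproduced in this text. A minimal sketch follows, assuming it is the usual min-max normalization, which is consistent with the max{·} and min{·} operators defined above. The function name `min_max_normalize` and the constant-signal fallback are assumptions, not part of the original method.

```python
def min_max_normalize(sample):
    """Scale one sample to [0, 1]: (x_i - min{x}) / (max{x} - min{x})."""
    lo, hi = min(sample), max(sample)
    if hi == lo:
        # Assumed fallback: a constant sample has no range to scale by.
        return [0.0 for _ in sample]
    return [(x - lo) / (hi - lo) for x in sample]


normalized = min_max_normalize([2.0, 4.0, 6.0, 10.0])
# normalized == [0.0, 0.25, 0.5, 1.0]
```

Each sample is normalized on its own, so two signals whose amplitudes differ by orders of magnitude end up on the same scale.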

(3) A continuous wavelet transform is applied to the normalized data to generate a time-frequency spectrogram. The continuous wavelet transform (CWT) was developed from the short-time Fourier transform (STFT), but its starting point differs from the STFT's localization idea of introducing a window function and performing the Fourier transform segment by segment: its main idea is to replace the trigonometric basis used by the STFT with a decaying wavelet basis of finite length. The wavelet transform is frequency-adaptive: the size of the time window adjusts automatically to the frequency of the original signal, with a narrow time window for high-frequency components and a wide time window for low-frequency components, which yields a multi-resolution analysis. Thanks to this multi-resolution property, the continuous wavelet transform can both obtain frequency information and locate the time at which each frequency occurs. The transform is defined as shown in formula (2):

In formula (2): α and β are continuously varying control parameters. α is the scale parameter, which controls the dilation (scaling) of the wavelet and thereby changes the center frequency of the continuous wavelet transform; β is the time parameter, which controls the translation (shifting) of the wavelet basis along the time axis.
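Formula (2) is not reproduced in this text. As a hedged reconstruction, assuming the standard definition of the continuous wavelet transform with the scale parameter α and time parameter β described above, it can be written as follows. The signal x(t), the mother wavelet ψ and its complex conjugate ψ* are notational assumptions.

```latex
W_x(\alpha, \beta) = \frac{1}{\sqrt{\alpha}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - \beta}{\alpha}\right) \mathrm{d}t, \qquad \alpha > 0
```

A small α compresses the wavelet, so it probes high frequencies with a narrow time window. A large α stretches it, so it probes low frequencies with a wide window. This matches the multi-resolution behavior described in the text.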

In this embodiment, a total of 1200 time-frequency spectrograms are generated after the continuous wavelet transform.

(4) The samples are classified by fault category and labeled one by one to generate a label file; the fault samples of all categories are merged and randomly shuffled; the shuffled samples are then divided into a training set and a test set in a given proportion.

In this embodiment, the label values are 0, 1, 2, ..., 9, and a label file is generated. The samples of all 10 categories are merged and randomly shuffled. The shuffled samples are divided in a 0.75:0.25 ratio into a training set D and a test set T, which are used to train and test the model respectively.
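The labeling, shuffling and 0.75:0.25 split of this embodiment can be sketched as below. The helper name `shuffle_and_split`, the fixed seed and the placeholder sample identifiers are assumptions for illustration; the counts (10 categories, 120 samples each) follow the text.

```python
import random


def shuffle_and_split(samples, labels, train_ratio=0.75, seed=0):
    """Pair samples with labels, shuffle them, and split into train/test."""
    pairs = list(zip(samples, labels))
    random.Random(seed).shuffle(pairs)  # seeded for reproducibility (assumed)
    cut = int(len(pairs) * train_ratio)
    return pairs[:cut], pairs[cut:]


# Placeholder samples: 10 fault categories x 120 samples, labels 0..9.
samples = [f"sample_{i}" for i in range(1200)]
labels = [i // 120 for i in range(1200)]
train_set, test_set = shuffle_and_split(samples, labels)
```

With 1200 samples this gives 900 training pairs (set D) and 300 test pairs (set T). Keeping each sample paired with its label during the shuffle prevents the two from drifting apart.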

Step 2, feature extraction and feature fusion;

The feature extraction process uses a multilayer convolutional neural network composed of convolutional layers and pooling layers. A convolutional layer contains several feature maps made up of neurons; convolution kernels traverse the feature maps, and different kernels extract different features, so that through multiple layers of convolution the network extracts feature representations of the data from simple to complex. Depending on the dimensionality of the input data and the way the kernel traverses it, convolution is divided into one-dimensional convolution, suited to input consisting of short fixed-length segments, and two-dimensional convolution, suited to extracting features from images. Therefore, one-dimensional convolution is used to extract features from the original one-dimensional vibration signal, and a two-dimensional convolutional neural network is used to extract features from the two-dimensional time-frequency spectrogram.
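The difference between the two traversal patterns can be made concrete with a small sketch. It assumes the sliding dot product that deep learning frameworks call convolution, with no padding and stride 1. The function names and toy inputs are illustrative only and do not represent the network layers of the invention.

```python
def conv1d(signal, kernel):
    """Slide a kernel along one axis (suited to a vibration segment)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def conv2d(image, kernel):
    """Slide a kernel along two axes (suited to a time-frequency image)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


# A [-1, 1] kernel responds to changes between neighboring points.
diff_1d = conv1d([1, 2, 4, 7, 11], [-1, 1])
# A 2x2 kernel compares each pixel with its diagonal neighbor.
diff_2d = conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, -1]])
```

The one-dimensional kernel only ever sees neighbors in time. The two-dimensional kernel sees neighbors in both time and frequency, which is why the spectrogram branch can pick up patterns the raw-signal branch cannot.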

Existing multi-modal feature fusion methods are generally implemented with simple linear operations such as summation or concatenation. These operations can greatly increase the dimension of the feature vectors and easily lead to the curse of dimensionality, which increases the time complexity of the model. The invention introduces an attention mechanism to improve feature fusion: it filters out a large amount of task-irrelevant information, reduces the dimensionality of the fused features, makes full use of the complementary information between multi-modal features, learns more discriminative features by attending to the important parts of the input, and effectively strengthens the feature representation. The attention module works as follows:

(1) The given input feature A is fed into convolutional layers to generate two new feature maps B and C.

(2) B and C are reshaped so that their spatial dimensions are flattened into N = H × W positions. Matrix multiplication is then performed between the transpose of B and C, and the spatial attention map S is calculated as shown in equation (3), where S_ji indicates the influence of the i-th position on the j-th position.

(3) The feature A is fed into a convolutional layer to generate a new feature map D, which is reshaped in the same way. Matrix multiplication is performed between D and the transpose of S, and the result is reshaped to the original feature-map shape.

(4) The result of the previous step is multiplied by a scale parameter δ and summed with A to obtain the final output E, as shown in equation (4), where δ is initialized to 0 and its weight is gradually updated through learning.
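Equations (3) and (4) are not reproduced in this text. The sketch below assumes the common position-attention form. In it, S_ji is a softmax over positions i of the dot product between column i of B and column j of C, and E_j = δ · Σ_i S_ji · D_i + A_j. The convolutional layers that produce B, C and D are left out, and the maps are passed in already flattened to [channels][N]. All names are illustrative.

```python
import math


def column(matrix, i):
    """Return the feature vector at spatial position i."""
    return [row[i] for row in matrix]


def position_attention(A, B, C, D, delta=0.0):
    """A, B, C, D: feature maps flattened to [channels][N], N = H * W."""
    n = len(A[0])
    S = []  # S[j][i]: influence of position i on position j
    for j in range(n):
        scores = [sum(b * c for b, c in zip(column(B, i), column(C, j)))
                  for i in range(n)]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        S.append([e / total for e in exps])
    # E[ch][j] = delta * sum_i S[j][i] * D[ch][i] + A[ch][j]
    return [[delta * sum(S[j][i] * D[ch][i] for i in range(n)) + A[ch][j]
             for j in range(n)]
            for ch in range(len(A))]


A = [[1.0, 2.0]]
D = [[2.0, 4.0]]
zeros = [[0.0, 0.0]]
E_initial = position_attention(A, zeros, zeros, D, delta=0.0)
E_uniform = position_attention(A, zeros, zeros, D, delta=1.0)
```

With δ = 0 the module returns A unchanged, which is why initializing δ to 0 lets training start from the plain features and add attention gradually. In this sketch the output keeps the shape of A.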

Step 3, neural network training and optimization;

(1) Select an activation function and a loss function. Common activation functions include the rectified linear unit (ReLU), the hyperbolic tangent function (Tanh) and the sigmoid function (Sigmoid); common loss functions include the squared loss, exponential loss and cross-entropy loss. Since the fault diagnosis problem solved by the invention is a classification problem, the rectified linear unit is selected as the activation function, as shown in formula (5), and the cross-entropy loss is used as the loss function, as shown in formula (6):

In formula (5): σ(·) denotes the activation function and x denotes a piece of sample data. In formula (6): L(·) represents the loss function, y represents the true label of the sample, ŷ represents the predicted label output by the model, and N represents the total number of samples.
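Formulas (5) and (6) are not reproduced in this text. A minimal sketch follows, assuming the standard definitions: ReLU as σ(x) = max(0, x), and cross-entropy as the mean negative log-probability that the model assigns to the true class over N samples. The function names and the integer-label, probability-list input format are assumptions.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes the rest."""
    return max(0.0, x)


def cross_entropy(true_labels, predicted_probs):
    """Mean of -log(probability of the true class) over all samples."""
    n = len(true_labels)
    return -sum(math.log(probs[y])
                for y, probs in zip(true_labels, predicted_probs)) / n


# Two samples, two classes; the model is undecided on both.
loss = cross_entropy([0, 1], [[0.5, 0.5], [0.5, 0.5]])
```

The loss is 0 when the model puts probability 1 on every true class. It grows without bound as that probability approaches 0, so confident wrong predictions are penalized heavily.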

(2) Initialize the network hyper-parameters. The hyper-parameters of the convolutional neural network include the number of convolutional layers, the size and number of convolution kernels, the convolution stride, the number of fully connected layers and the number of units in each fully connected layer. The LSTM (long short-term memory) network is a neural network able to memorize both long-term and short-term information and is suited to extracting information from time series; its hyper-parameters include the number of hidden-layer nodes, the number of stacked LSTM layers and the hidden-layer bias value.

(3) Train the convolutional neural network. After the hyper-parameters are initialized, the training set samples are fed into the convolutional neural network model through the DataLoader of the PyTorch framework, and the parameters of the neurons in each layer of the model are continuously corrected by gradient descent and the chain rule of error back-propagation.

(4) Optimize the training hyper-parameters. The learning rate, batch size and number of iterations are important hyper-parameters that must be carefully selected and tuned. They are adjusted dynamically according to how the loss function changes during training; when the loss value and the accuracy oscillate around a certain value, the model can be considered close to a steady state, that is, sufficiently trained. The selection of the learning rate, batch size and number of iterations is detailed in the experimental verification section.

Step 4, fault classification and diagnosis;

The fused feature E of each segment of the time-series signal, extracted by the convolutional neural network in step 2, is first input into the LSTM network to obtain an overall tensor output that captures the long-term dependencies of the signal under test; this output is then fed into the fully connected layer trained and optimized in step 3. The fully connected (FC) layer is located at the end of the neural network and acts as a classifier by mapping the fused feature representation to the label space.
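The final mapping from a feature vector to the label space can be sketched as a single fully connected layer followed by a softmax. This is an illustrative assumption: the LSTM stage is omitted, the softmax is a common choice the text does not state, and the weights below are made up rather than trained.

```python
import math


def fully_connected(features, weights, biases):
    """One output score (logit) per label: weights @ features + biases."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Turn logits into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Return the most probable label index and all label probabilities."""
    probs = softmax(fully_connected(features, weights, biases))
    return max(range(len(probs)), key=probs.__getitem__), probs


# Hypothetical 2-feature input and 3-label layer (untrained weights).
label, probs = classify([1.0, 2.0],
                        [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
                        [0.0, 0.0, 0.0])
```

In the setting described here the layer would have ten outputs, one per fault label 0 to 9, and the index of the largest probability would be reported as the diagnosis.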

Experimental verification:

To verify the effectiveness of the proposed method on the fault diagnosis task, two commonly used public benchmark data sets are adopted: the CWRU (Case Western Reserve University) bearing fault data set and the MFPT (Society for Machinery Failure Prevention Technology) bearing fault data set. Precision, Recall and the F1 value are used as evaluation indices (F1 is a comprehensive index calculated from Precision and Recall). They are calculated as shown in formulas (7) to (10):

In formula (7), n_correct represents the number of results whose fault category is consistent with the prediction, and N represents the number of categories of the predicted faults. In formula (8), n_objecti represents the total number of data of the i-th fault type in the data set, and n_total represents the total number of data of all categories in the test set. The F1 value of each fault category in the test set is first computed separately, as shown in formula (9); the mean of the F1 values over all fault categories in the test set, MeanF1, is then taken as the overall F1 value of the classification task, as shown in formula (10).
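Formulas (7) to (10) are not reproduced in this text. The sketch below assumes the usual per-class definitions of Precision, Recall and F1, then averages F1 over the classes as described for MeanF1. The function names and toy labels are illustrative. The exact normalization in the original formulas may differ.

```python
def per_class_scores(y_true, y_pred, num_classes):
    """Return (precision, recall, f1) for each class index."""
    scores = []
    for k in range(num_classes):
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == k and p == k)
        predicted = sum(1 for p in y_pred if p == k)  # predicted as class k
        actual = sum(1 for t in y_true if t == k)     # truly class k
        precision = correct / predicted if predicted else 0.0
        recall = correct / actual if actual else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append((precision, recall, f1))
    return scores


def mean_f1(y_true, y_pred, num_classes):
    """Average the per-class F1 values (the MeanF1 described above)."""
    scores = per_class_scores(y_true, y_pred, num_classes)
    return sum(f1 for _, _, f1 in scores) / num_classes


# Toy example with two classes (not experimental data).
scores = per_class_scores([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
overall = mean_f1([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Averaging per-class F1 gives every fault category equal weight. A category that the model handles badly therefore lowers the overall score even if it has few samples.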

The CWRU data set is a bearing fault diagnosis data set published by the Bearing Data Center of Case Western Reserve University. It has become one of the most important benchmark data sets in bearing fault diagnosis research and is widely used by researchers to test the performance of proposed models. The data set contains four broad classes of data: normal data, fan-end fault data, drive-end fault data sampled at 12 kHz, and drive-end fault data sampled at 48 kHz. Within each class there are three fault locations: ball, inner ring and outer ring. Each fault location in turn has three fault sizes, 0.007 inches, 0.014 inches and 0.021 inches, which together with the fault-free normal condition gives ten fault labels in total. In this study, the 12 kHz drive-end data at a motor speed of 1730 r/min and a load of 3 hp is used.

TABLE 1 hyper-parameter experiment table

To give the network the best discriminative ability, the batch size, learning rate and number of iterations are compared experimentally to find the best hyper-parameter combination. The candidate batch sizes are 8, 16 and 32, and the candidate learning rates are 0.01, 0.001, 0.002 and 0.0001, giving 12 combinations. For each hyper-parameter setting, the neural network model is trained 10 times on the prepared training set and tested on the test set after each training run. The results of the 10 training and testing runs are averaged for each combination to obtain the average test accuracy. As shown in Table 1, the classification accuracy of the neural network is highest, at 98.3%, with a learning rate of 0.002, a batch size of 32 and 10 iterations. An Adam optimizer with a learning rate of 0.002 is therefore selected, with the batch size set to 32 and the number of iterations set to 10.
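The hyper-parameter search described above amounts to a small grid search. The sketch below enumerates the 12 combinations from the candidate values in the text and picks the one with the highest average accuracy. The function name `select_best` and the shape of the `results` mapping are assumptions. No accuracy values are included because Table 1 is not reproduced here.

```python
from itertools import product

batch_sizes = [8, 16, 32]
learning_rates = [0.01, 0.001, 0.002, 0.0001]

# Every (batch size, learning rate) pair: 3 x 4 = 12 combinations.
grid = list(product(batch_sizes, learning_rates))


def select_best(results):
    """results maps (batch_size, learning_rate) to a list of test accuracies.

    Returns the combination with the highest mean accuracy, together with
    the mean accuracy of every combination.
    """
    averaged = {combo: sum(accs) / len(accs) for combo, accs in results.items()}
    best = max(averaged, key=averaged.get)
    return best, averaged
```

In use, each combination in `grid` would be trained and tested repeatedly, the accuracies collected into `results`, and `select_best(results)` called once at the end.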

To evaluate the performance of the proposed diagnostic method, other statistical methods and deep learning methods were chosen for comparison: support vector machine (SVM), multilayer perceptron (MLP), convolutional neural network (CNN), stacked autoencoder (SAE), TCNN, WDCNN and sdiAE. The average accuracy of each comparison method is shown in Table 2.

TABLE 2 comparison of the results

It can be seen from the results that the proposed method outperforms these methods: its average prediction accuracy reaches 98.36%, higher than all the others. Compared with deep learning methods that use no time-frequency analysis, such as 1DCNN and TCNN, methods that use the fast Fourier transform, such as SAE and sdiAE, perform slightly better, and methods that take time-frequency spectrograms generated by continuous wavelet transform as input, such as 2DCNN and ResNet, perform better still. The proposed method uses both the original one-dimensional signal and the information-rich two-dimensional image, and thereby obtains the best feature extraction capability and diagnosis performance.

The experimental results show that, thanks to time-frequency analysis of the vibration signals and multi-modal feature fusion, the rotating machinery fault diagnosis method based on the multi-modal feature fusion neural network can effectively diagnose the fault category of the rotor of refrigeration equipment.

Claims

1. A fault diagnosis method for refrigeration equipment with a multi-mode feature fusion neural network is characterized by comprising the following steps:

step 1, preprocessing data;

(1.1) acquiring bearing vibration signals of the screw compressor containing all states under the same load condition as original sample data, and removing irrelevant information in the original sample data to obtain the initial sample data;

(1.2) carrying out normalization processing on the initial sample data, wherein the normalization method is shown as a formula (1):

in formula (1): x_i represents the value of the i-th data point in the sample data sequence, n represents the number of sample points in each sample, max{·} takes the maximum value, and min{·} takes the minimum value;

(1.3) carrying out continuous wavelet transform on the normalized data to generate a time-frequency spectrogram; the transformation is defined as shown in equation (2):

in formula (2): α and β are continuously varying control parameters; α is the scale parameter, which controls the dilation of the wavelet and changes the center frequency of the continuous wavelet transform, and β is the time parameter, which controls the translation of the wavelet basis along the time axis;

(1.4) classifying the samples by fault category, labeling the samples one by one, and generating a label file; merging the fault samples of all categories and randomly shuffling them; dividing the shuffled samples into a training set and a test set in a given proportion;

step 2, feature extraction and feature fusion;

the feature extraction process uses a multilayer convolutional neural network composed of convolutional layers and pooling layers, wherein a convolutional layer contains several feature maps made up of neurons, convolution kernels traverse the feature maps, different convolution kernels extract different features, and through multiple layers of convolution the neural network extracts feature representations of the data from simple to complex; depending on the dimensionality of the input data and the way the kernel traverses it, convolution is divided into one-dimensional convolution, suited to input consisting of short fixed-length segments, and two-dimensional convolution, suited to extracting features from images; therefore, one-dimensional convolution is used to extract features from the original one-dimensional vibration signal, and a two-dimensional convolutional neural network is used to extract features from the two-dimensional time-frequency spectrogram;

an attention mechanism is introduced to improve the feature fusion capability, and the specific working steps of an attention module are as follows:

(2.1) given an input feature A, inputting it into convolutional layers to generate two new feature maps B and C;

(2.2) reshaping B and C so that their spatial dimensions are flattened into N = H × W positions, performing matrix multiplication between the transpose of B and C, and calculating the spatial attention map S as shown in equation (3), wherein S_ji indicates the influence of the i-th position on the j-th position;

(2.3) inputting the feature A into a convolutional layer to generate a new feature map D and reshaping it in the same way; performing matrix multiplication between D and the transpose of S, and reshaping the result to the original feature-map shape;

(2.4) multiplying the result of the previous step by a scale parameter δ and summing it with A to obtain the final output E, as shown in equation (4), where δ is initialized to 0 and its weight is gradually updated through learning;

step 3, training and optimizing a neural network;

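(3.1) selecting an activation function and a loss function: a rectified linear unit is selected as the activation function, as shown in formula (5), and the cross-entropy loss is selected as the loss function, as shown in formula (6):

in formula (5): σ(·) denotes the activation function and x denotes a piece of sample data; in formula (6): L(·) represents the loss function, y represents the true label of the sample, ŷ represents the predicted label output by the model, and N represents the total number of samples;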

(3.2) initializing network hyper-parameters: the hyper-parameters of the convolutional neural network comprise the number of convolutional layers, the size and the number of convolutional kernels, convolution step length, the number of fully-connected layers and the number of units of each fully-connected layer; the hyper-parameters of the LSTM network comprise hidden layer node number, LSTM stacking layer number and hidden layer bias value;

(3.3) training the convolutional neural network: after the hyper-parameters are initialized, inputting the training set samples into the convolutional neural network model through the DataLoader of the PyTorch framework, and continuously correcting the parameters of the neurons in each layer of the model by gradient descent and the chain rule of error back-propagation;

(3.4) optimizing training hyper-parameters: including learning rate, batch size and iteration number;

step 4, fault classification and diagnosis

The fused feature E of each segment of the time-series signal, extracted by the convolutional neural network in step 2, is first input into the LSTM network to obtain an overall tensor output that captures the long-term dependencies of the signal under test, and this output is input into the fully connected layer trained and optimized in step 3; the fully connected layer is located at the end of the neural network and maps the fused feature representation to the label space.