Disclosure of Invention
The invention aims to provide a rolling bearing fault analysis method based on CNN and LSTM. The method makes full use of the spatial feature extraction capability of the CNN and the time-series feature learning capability of the LSTM to extract the relationship between the vibration images and their temporal dependence, and it classifies the vibration features of the rolling bearing and identifies faults through a fully connected layer and a softmax layer, thereby improving classification accuracy. In addition, compared with a traditional RNN, the LSTM mitigates the vanishing gradient problem and reduces the difficulty of model training.
According to an aspect of the present application, there is provided a CNN and LSTM-based bearing fault analysis method, including:
acquiring bearing vibration data, and processing the noisy vibration signal with a Butterworth filter.
The squared magnitude response of the Butterworth filter is:
$$|H(j\omega)|^2 = \frac{1}{1 + (\omega / \omega_c)^{2n}}$$
where $n$ is the order of the filter and $\omega_c$ is the cut-off frequency, taken as the frequency at which the amplitude drops to -3 dB. The default order of the filter is 2. Although a higher-order Butterworth filter achieves a sharper roll-off near the cut-off frequency, it can also cause serious signal distortion and affect the precision of the result. Experiments found that a filter of order 1 performs better, so a first-order Butterworth filter is used to process the noisy vibration data;
converting the description of the preprocessed signal from the time domain to the frequency domain by a Fast Fourier Transform (FFT);
The FFT computes the spectrum $X(k)$ of a discrete signal $x(n)$ of length $N$. The transform is:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1$$
The FFT algorithm reduces the amount of computation needed for the frequency-domain conversion and increases the conversion speed; its time complexity is $O(N \log_2 N)$.
The time-domain graph and the frequency-domain graph obtained after processing are taken as inputs and trained separately through a preset CNN network. The CNN performs the convolution operation on the data along the one-dimensional time axis, moving along the time t axis of the time-domain signal and the time-frequency graph to extract image features. The CNN performs feature extraction with 3 convolutional layers. A pooling layer follows each convolutional layer to reduce the dimension of the feature map; the max pooling operation reduces the complexity of the output and prevents overfitting while retaining the temporal features of the data;
wherein the max pooling layer outputs the maximum value within each pooling window, and w and d are the preset length and width of the max pooling filter.
The features learned by the CNN from the time-domain graph and the frequency-domain graph are fused through an add layer. Fusion through the add layer does not increase the dimensionality of the image features but increases the amount of information in each dimension, and it retains the temporal features. The fused features are taken as the input of the long short-term memory (LSTM) network layer;
The fused features obtained from the add layer are input into the LSTM network layers, which are connected in series, and the data containing the time sequence is taken as input to obtain spatio-temporal features. Each LSTM network layer contains 3 gates: a forget gate, an input gate and an output gate. The update formulas of the gates at each time step t are as follows:
The forget gate $f_t$ lets the LSTM network forget previously useless information:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
The input gate $i_t$ determines the input information of the LSTM network:
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$c_t' = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)$$
$$c_t = f_t * c_{t-1} + i_t * c_t'$$
The output gate $o_t$ determines the output of the neuron:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t * \tanh(c_t)$$
where $W_f$, $W_i$, $W_c$, $W_o$ are the weight matrices of the forget gate, the input gate, the candidate cell state and the output gate, and $b_f$, $b_i$, $b_c$, $b_o$ are the corresponding bias terms; $h_{t-1}$ is the state of the hidden layer at time t-1; $c_{t-1}$ is the cell state at the previous time step; $\sigma$ is the logistic function with output in (0, 1); $x_t$ is the input vector at time t; tanh is the activation function; and * denotes element-wise multiplication.
The acquired spatio-temporal features are taken as input and mapped to the sample label space in a fully connected layer, and a classification probability result is obtained through the subsequent softmax layer to identify and classify the rolling bearing faults.
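The softmax function is:
$$P(y = j \mid x) = \frac{e^{x^T w_j}}{\sum_{k=1}^{K} e^{x^T w_k}}$$
which denotes the probability that the sample vector $x$ belongs to class j when there are K linear functions $x^T w_k$, where $w_k$ is the weight vector of class k.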
Detailed Description
Fig. 1 is a schematic flow chart of a CNN and LSTM-based rolling bearing fault analysis method according to an embodiment of the present application. Referring to Fig. 1, the CNN and LSTM-based rolling bearing fault analysis method provided in an embodiment of the present application may include:
step S1: denoising the acquired vibration data with a Butterworth filter, and converting the preprocessed time-domain signals into frequency-domain signals with a Fast Fourier Transform (FFT);
step S2: learning the image features of the time-domain graph and the frequency-domain graph with a CNN network;
step S3: fusing the image features through an add layer;
step S4: inputting the fused image features obtained from the add layer into an LSTM network, and further learning the time-series features they contain through the LSTM network;
step S5: performing classification through a fully connected layer and a softmax function, and using the trained network to classify the faults of the test samples.
The invention aims to provide a rolling bearing fault analysis method based on CNN and LSTM. First, noise is filtered from the data set, and the vibration data are represented by a time-domain graph and a time-frequency graph so that features can be extracted more effectively. The vibration features of the time-frequency graph and the time-domain graph are extracted separately using the spatial feature extraction capability of the CNN, and the features are fused. The relationship between the vibration features and their temporal dependence is then extracted using the time-series feature learning capability of the LSTM, and the vibration features of the rolling bearing are classified and faults identified through the fully connected layer and the softmax layer, which improves classification accuracy.
The experimental data set used by the method comes from the Case Western Reserve University (CWRU) Bearing Data Center and is currently the most widely used rolling bearing fault data set internationally. The data set records actual test conditions for normal and faulty motor bearings, with faults implanted into the motor bearings by Electrical Discharge Machining (EDM). Faults ranging from 0.007 inches to 0.040 inches in diameter were introduced separately on the bearing inner race, the rolling elements and the bearing outer race. The faulty bearings were reinstalled into the test motor, and vibration data were recorded for motor loads of 0 to 3 horsepower (motor speeds of 1797 to 1720 RPM) at 12,000 samples/second and 48,000 samples/second.
S1: acquiring bearing vibration data, and processing the noisy vibration signal with a Butterworth filter.
The squared magnitude response of the Butterworth filter is:
$$|H(j\omega)|^2 = \frac{1}{1 + (\omega / \omega_c)^{2n}}$$
where $n$ is the order of the filter and $\omega_c$ is the cut-off frequency, taken as the frequency at which the amplitude drops to -3 dB. The default order of the filter is 2. Although a higher-order Butterworth filter achieves a sharper roll-off near the cut-off frequency, it can also cause serious signal distortion and affect the precision of the result. Experiments found that a filter of order 1 performs better, so a first-order Butterworth filter is used to process the noisy vibration data.
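By way of non-limiting illustration, the first-order filtering of step S1 could be sketched in plain Python as follows. The sketch assumes a low-pass response and a digital realization by the bilinear transform; the function names and the 2,000 Hz cutoff used in the usage note are hypothetical, since the description specifies neither the filter type nor the cutoff value.

```python
import math


def butterworth_gain(freq_hz, cutoff_hz, order=1):
    """Magnitude 1 / sqrt(1 + (f / fc)^(2n)) of an n-th order Butterworth low-pass."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** (2 * order))


def first_order_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Apply a first-order digital Butterworth low-pass filter.

    Assumption: the analog prototype H(s) = 1 / (1 + s / wc) is discretized
    with the bilinear transform and frequency pre-warping.
    """
    k = math.tan(math.pi * cutoff_hz / sample_rate_hz)  # pre-warped cutoff
    b0 = k / (k + 1.0)          # feed-forward coefficient (b0 == b1)
    a1 = (k - 1.0) / (k + 1.0)  # feedback coefficient
    filtered = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        # Difference equation: y[n] = b0 * (x[n] + x[n-1]) - a1 * y[n-1]
        y = b0 * (x + prev_x) - a1 * prev_y
        filtered.append(y)
        prev_x, prev_y = x, y
    return filtered
```

For example, `first_order_lowpass(signal, 2000.0, 12000.0)` would filter a record sampled at 12,000 samples/second, and `butterworth_gain(2000.0, 2000.0)` evaluates the gain at the cutoff, which is the -3 dB point.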
converting the description of the preprocessed signal from the time domain to the frequency domain by a Fast Fourier Transform (FFT);
The FFT computes the spectrum $X(k)$ of a discrete signal $x(n)$ of length $N$. The transform can be expressed as:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1$$
The FFT algorithm reduces the amount of computation needed for the frequency-domain conversion and increases the conversion speed; its time complexity is $O(N \log_2 N)$.
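As one possible illustration of the time-to-frequency conversion, the following sketch implements a recursive radix-2 FFT in plain Python and derives a one-sided magnitude spectrum from it. The function names are hypothetical, and the power-of-two length restriction is a property of this sketch rather than something the description requires.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def magnitude_spectrum(signal, sample_rate_hz):
    """Return (frequencies, magnitudes) for the lower half of the spectrum.

    Magnitudes are scaled by 2 / N, so a unit-amplitude sine that falls
    exactly on a bin reads about 1.0 (the DC bin is doubled by this scaling).
    """
    n = len(signal)
    spectrum = fft(signal)
    half = n // 2
    freqs = [k * sample_rate_hz / n for k in range(half)]
    mags = [abs(spectrum[k]) * 2.0 / n for k in range(half)]
    return freqs, mags
```

Splitting the signal into even and odd halves at each level is what yields the $O(N \log_2 N)$ cost, compared with $O(N^2)$ for direct evaluation of the transform sum.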
S2: the processed time-domain graph and frequency-domain graph are taken as inputs and trained separately through a preset CNN network. The CNN performs the convolution operation on the data along the one-dimensional time axis, moving along the time t axis of the time-domain signal and the time-frequency graph to extract image features. The CNN performs feature extraction with 3 convolutional layers. A pooling layer follows each convolutional layer to reduce the dimension of the feature map; the max pooling operation reduces the complexity of the output and prevents overfitting while retaining the temporal features of the data;
wherein the max pooling layer outputs the maximum value within each pooling window, and w and d are the preset length and width of the max pooling filter.
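The convolution and pooling of step S2 could be sketched, under simplifying assumptions, as follows. The sketch works on a single-channel one-dimensional sequence, uses a ReLU activation, and pools over non-overlapping windows; the activation, the kernel values, the pool size and all function names are hypothetical, as the description fixes only the number of convolutional layers and the use of max pooling.

```python
def conv1d(signal, kernel):
    """Valid 1-D cross-correlation along the time axis, followed by ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(signal) - width + 1):
        acc = sum(signal[start + j] * kernel[j] for j in range(width))
        out.append(max(0.0, acc))  # ReLU is an assumption of this sketch
    return out


def max_pool1d(features, pool_size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(features[i:i + pool_size])
            for i in range(0, len(features) - pool_size + 1, pool_size)]


def extract_features(signal, kernels, pool_size=2):
    """Apply one convolution + pooling block per kernel, in order.

    Passing three kernels mirrors the three convolutional layers, each
    followed by a pooling layer.
    """
    features = list(signal)
    for kernel in kernels:
        features = max_pool1d(conv1d(features, kernel), pool_size)
    return features
```

With a 32-sample input, three width-3 kernels and a pool size of 2, the sequence length shrinks as 32, 30, 15, 13, 6, 4, 2, which shows how pooling reduces the feature-map dimension while the ordering along the time axis is kept.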
S3: the features learned by the CNN from the time-domain graph and the frequency-domain graph are fused through an add layer. Fusion through the add layer does not increase the dimensionality of the image features but increases the amount of information in each dimension, and it retains the temporal features. The fused features are taken as the input of the long short-term memory (LSTM) network layer;
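The add-layer fusion of step S3 can be illustrated with a short sketch. It assumes each feature map is a list of time steps, each holding one value per channel, and that both branches produce maps of identical shape; the function names are hypothetical, and the concatenation variant is included only for contrast.

```python
def add_fusion(time_features, freq_features):
    """Element-wise sum of two feature maps shaped [time step][channel]."""
    if len(time_features) != len(freq_features):
        raise ValueError("feature maps must have the same number of time steps")
    fused = []
    for t_step, f_step in zip(time_features, freq_features):
        if len(t_step) != len(f_step):
            raise ValueError("feature maps must have the same number of channels")
        fused.append([a + b for a, b in zip(t_step, f_step)])
    return fused


def concat_fusion(time_features, freq_features):
    """Channel-wise concatenation, for contrast: it doubles the channel count."""
    return [t_step + f_step for t_step, f_step in zip(time_features, freq_features)]
```

Adding keeps the number of time steps and channels unchanged, so the order of the time axis is preserved for the LSTM, whereas concatenation would widen each time step.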
S4: the fused features obtained from the add layer are input into the LSTM network layers, which are connected in series, and the data containing the time sequence is taken as input to obtain spatio-temporal features. Each LSTM network layer contains 3 gates: a forget gate, an input gate and an output gate. The update formulas of the gates at each time step t are as follows:
The forget gate $f_t$ lets the LSTM network forget previously useless information:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
The input gate $i_t$ determines the input information of the LSTM network:
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$c_t' = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)$$
$$c_t = f_t * c_{t-1} + i_t * c_t'$$
The output gate $o_t$ determines the output of the neuron:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t * \tanh(c_t)$$
where $W_f$, $W_i$, $W_c$, $W_o$ are the weight matrices of the forget gate, the input gate, the candidate cell state and the output gate, and $b_f$, $b_i$, $b_c$, $b_o$ are the corresponding bias terms; $h_{t-1}$ is the state of the hidden layer at time t-1; $c_{t-1}$ is the cell state at the previous time step; $\sigma$ is the logistic function with output in (0, 1); $x_t$ is the input vector at time t; tanh is the activation function; and * denotes element-wise multiplication.
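The gate updates of step S4 could be sketched for a single time step as follows. The sketch assumes the input gate uses the same sigmoid form as the forget and output gates, and that each weight matrix acts on the concatenation of the previous hidden state and the current input; the function names and the `params` layout are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Return W·v + b, where W is given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps 'f', 'i', 'c', 'o' to (W, b) pairs."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(v) for v in affine(*params["f"], z)]      # forget gate
    i_t = [sigmoid(v) for v in affine(*params["i"], z)]      # input gate
    c_hat = [math.tanh(v) for v in affine(*params["c"], z)]  # candidate state
    o_t = [sigmoid(v) for v in affine(*params["o"], z)]      # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def run_lstm(sequence, params, hidden_size):
    """Feed a sequence of input vectors through one layer; return the final h."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, params)
    return h
```

Layers connected in series would pass the hidden states of one layer as the input sequence of the next; the sketch shows a single layer only.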
S5: the acquired spatio-temporal features are taken as input and mapped to the sample label space in a fully connected layer, and a classification probability result is obtained through the subsequent softmax layer to identify and classify the rolling bearing faults.
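The softmax function is:
$$P(y = j \mid x) = \frac{e^{x^T w_j}}{\sum_{k=1}^{K} e^{x^T w_k}}$$
which denotes the probability that the sample vector $x$ belongs to class j when there are K linear functions $x^T w_k$, where $w_k$ is the weight vector of class k.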
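The classification of step S5 could be sketched as a fully connected layer followed by softmax. The function names, the weight layout (one row of weights per class) and the bias terms are assumptions of this sketch; subtracting the maximum score before exponentiation is a common numerical safeguard that leaves the probabilities unchanged.

```python
import math


def softmax(scores):
    """Convert K class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Fully connected layer plus softmax.

    weights holds one row per class; returns (predicted class index,
    list of class probabilities).
    """
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    return probs.index(max(probs)), probs
```

The predicted fault class is the index with the highest probability, for example `label, probs = classify(h, weights, biases)` with `h` being the final LSTM hidden state.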
Fig. 2 is a schematic structural block diagram of a rolling bearing fault analysis method based on CNN and LSTM according to an embodiment of the present application.
As shown in Fig. 3, an embodiment of the present application also provides a computing device comprising a memory 320, a processor 310 and a computer program stored in the memory 320 and executable by the processor 310. The computer program is stored in a space 330 for program code in the memory 320 and, when executed by the processor 310, implements method steps 331 for performing any of the methods according to the present invention.
As shown in Fig. 4, an embodiment of the application also provides a computer-readable storage medium. The computer-readable storage medium comprises a storage unit for program code, the storage unit being provided with a program 331' for performing the steps of the method according to the invention, the program being executed by a processor.
An embodiment of the application also provides a computer program product containing instructions which, when run on a computer, cause the computer to carry out the steps of the method according to the invention.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed by a computer, the computer instructions cause the computer to perform, in whole or in part, the procedures or functions described in accordance with the embodiments of the application. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be implemented by a program, and the program may be stored in a computer-readable storage medium, where the storage medium is a non-transitory medium, such as a random access memory, a read-only memory, a flash memory, a hard disk, a solid state disk, a magnetic tape, a floppy disk, an optical disk, or any combination thereof.
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.