CN116186590A

CN116186590A - Bearing fault diagnosis method based on data layer feature fusion and convolutional neural network

Info

Publication number: CN116186590A
Application number: CN202310203934.4A
Authority: CN
Inventors: 张雄; 李嘉禄; 武文博; 董帆; 万书亭
Original assignee: North China Electric Power University
Current assignee: North China Electric Power University
Priority date: 2023-03-06
Filing date: 2023-03-06
Publication date: 2023-05-30

Abstract

The invention discloses a bearing fault diagnosis method based on data-layer feature fusion and a convolutional neural network (CNN), comprising the following steps: S1, performing variational mode decomposition (VMD) on the acquired original signal, screening the components by selecting those whose kurtosis value exceeds a threshold, and reconstructing the signal from the selected components, so as to discard interference information while retaining fault features as far as possible; S2, computing features of the reconstructed signal covering at least the time domain, the frequency domain, energy and stability, and fusing them into a multi-dimensional composite feature matrix; S3, reducing the dimension of the multi-dimensional composite feature matrix by kernel principal component analysis (KPCA), which simplifies the matrix and removes redundant information; S4, inputting the resulting low-dimensional matrix into a CNN model optimized with batch normalization (BN) layers for fault identification and classification. The method achieves a better classification effect and higher diagnostic accuracy.

Description

Bearing fault diagnosis method based on data layer feature fusion and convolutional neural network

Technical Field

The invention relates to a rolling bearing fault diagnosis technology, in particular to a bearing fault diagnosis method based on data layer feature fusion and convolutional neural network.

Background

Rolling bearings are critical components of machinery; when they fail, they can compromise the stability of machine operation and even create safety hazards. Effective fault identification and diagnosis of rolling bearings is therefore of great significance.

Past bearing research has relied mainly on traditional methods applied to the original fault signal, such as spectrum analysis, envelope spectrum analysis, and signal decomposition with modal decomposition algorithms. However, traditional methods depend on expert experience, and excessive manual intervention inevitably influences the diagnosis result. As bearing fault conditions grow more complex, traditional diagnosis methods can no longer meet the requirements of bearing fault diagnosis.

With the rise of intelligent technology, the acquisition rate and volume of bearing data have increased greatly, laying a good foundation for deep learning methods to enter the field of bearing fault diagnosis. As a widely used intelligent method, deep learning avoids manual intervention and reduces experience-induced errors when analyzing bearing data, and it is therefore increasingly applied to bearing fault diagnosis. For example:

Zhang Jiwang, in "Intelligent diagnosis method for early weak faults of rolling bearings based on VMD-CNN", combined variational mode decomposition (VMD) with a convolutional neural network (CNN) to diagnose early weak bearing faults.

Li Saiqi, in "EEMD-CNN-based rolling bearing fault diagnosis method", decomposed signals using an empirical mode decomposition (EMD) algorithm, selected useful components to reconstruct the signal, calculated a series of indexes, and then input them into a convolutional neural network for fault diagnosis.

Yao Fenglin, in "Research on rolling bearing fault diagnosis based on wavelet packet transform and ELM", denoised the signals with the wavelet packet transform, extracted time-domain feature values of the denoised signals, and input them into an extreme learning machine (ELM) for fault classification and diagnosis.

He Jiangjiang, in "Convolutional neural network rolling bearing fault diagnosis based on improved EEMD", combined a support vector machine (SVM) with the ensemble empirical mode decomposition (EEMD) algorithm to decompose the signal, selected decomposed components and decomposed them again, constructed an energy vector, and finally fed the energy vector into a convolutional neural network for verification.

Li Ziguo, in "Rolling bearing fault diagnosis based on parameter-optimized VMD and 1D-CNN", optimized the VMD parameters with Harris hawks optimization (HHO), decomposed the original vibration signal with the optimized VMD, selected the optimal modal component, and input it into a one-dimensional convolutional neural network (1D-CNN) model for fault diagnosis.

Although the above methods are effective to some extent for bearing fault diagnosis, they all analyze the data using characteristic indexes of the time domain or the frequency domain. Because bearing vibration data are non-stationary and nonlinear, time-domain and frequency-domain features may be disturbed by non-periodic transient impacts and harmonic components. Specifically, because time-domain indexes are sensitive to impacts, transient impact components in the signal may be misdiagnosed as bearing faults; and because harmonic signals resemble impact components in the spectrum, errors are unavoidable when frequency-domain indexes are used for diagnosis, which increases the diagnosis difficulty.

Disclosure of Invention

To address the limitations of feature matrices built from traditional time-domain and frequency-domain statistical indexes when processing complex nonlinear data, the invention provides a bearing fault diagnosis method based on data-layer feature fusion and a convolutional neural network, which classifies bearing faults effectively and stably.

In order to achieve the above purpose, the invention provides a bearing fault diagnosis method based on data layer feature fusion and convolutional neural network, comprising the following steps:

S1, performing VMD decomposition on the acquired original signal, screening the components by selecting those whose kurtosis value exceeds a threshold, and reconstructing the signal from the selected components, so as to discard interference information while retaining fault features as far as possible;

S2, computing features of the reconstructed signal covering at least the time domain, the frequency domain, energy and stability, and fusing them into a multi-dimensional composite feature matrix;

S3, reducing the dimension of the multi-dimensional composite feature matrix by kernel principal component analysis (KPCA), which simplifies the matrix and removes redundant information;

S4, inputting the resulting low-dimensional matrix into a CNN model optimized with batch normalization (BN) layers for fault identification and classification.

Preferably, the VMD decomposition described in step S1 includes the steps of:

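S11, decomposing the original signal into K intrinsic mode function (IMF) components, where the constrained variational expression is:

$$\min_{\{u_m\},\{\omega_m\}} \left\{ \sum_{m} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_m(t) \right] e^{-j\omega_m t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{m} u_m(t) = f(t)$$

where $u_m$ denotes the modal components, $\omega_m$ is the center frequency of each modal component, $m$ is the mode index, $j$ is the imaginary unit, $t$ is time, $\delta(t)$ is the Dirac function, $\partial_t$ denotes the partial derivative with respect to time $t$, and $f$ is the original signal;

S12, introducing a penalty factor $\alpha$ and constructing the augmented Lagrangian function, so as to obtain each IMF component, namely the optimal solution of the variational model:

$$L(\{u_m\},\{\omega_m\},\lambda) = \alpha \sum_{m} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_m(t) \right] e^{-j\omega_m t} \right\|_2^2 + \left\| f(t) - \sum_{m} u_m(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{m} u_m(t) \right\rangle$$

where $\lambda$ is the Lagrange multiplier and $\alpha$ is the quadratic penalty factor;

S13, transforming the Lagrangian function to the frequency domain and solving for its extremum to obtain the frequency-domain expressions of $u_m$ and $\omega_m$:

$$\hat{u}_m^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq m} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_m)^2}$$

$$\omega_m^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_m(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_m(\omega) \right|^2 d\omega}$$

where $\hat{u}_m(\omega)$ is the frequency-domain expression of $u_m$, $\hat{u}_m^{n+1}(\omega)$ is the modal component computed at the (n+1)-th iteration, $\hat{f}(\omega)$ is the Fourier transform of $f(t)$, $\hat{\lambda}(\omega)$ is the Fourier transform of $\lambda(t)$, and $\omega$ is the frequency;

S14, obtaining the optimal solution of the constrained variational model with the alternating direction method of multipliers (ADMM).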
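The two frequency-domain updates of step S13 can be illustrated with a minimal NumPy sketch. This is not the patent's implementation: the function names `update_mode` and `update_center_frequency`, the use of a one-sided spectrum with frequencies in Hz, and the value of `alpha` are all assumptions made for illustration, and a full VMD would loop these updates over all modes and also update the multiplier.

```python
import numpy as np

def update_mode(f_hat, other_modes_hat, lam_hat, omega, omega_m, alpha):
    """Wiener-filter style update of one mode spectrum (step S13).

    f_hat, lam_hat: spectra of the signal and of the Lagrange multiplier.
    other_modes_hat: sum of the spectra of all modes except mode m.
    omega: frequency axis; omega_m: current center frequency of mode m.
    """
    return (f_hat - other_modes_hat + lam_hat / 2.0) / (
        1.0 + 2.0 * alpha * (omega - omega_m) ** 2
    )

def update_center_frequency(u_hat, omega):
    """Center frequency as the power-weighted mean of the frequency axis."""
    power = np.abs(u_hat) ** 2
    return float(np.sum(omega * power) / np.sum(power))

# Toy check (assumed data): a 50 Hz tone sampled at 1 kHz, one mode,
# zero multiplier, and an initial center frequency that is slightly off.
fs, n = 1000, 1000
t = np.arange(n) / fs
signal = np.sin(2 * np.pi * 50 * t)
omega = np.fft.rfftfreq(n, d=1 / fs)   # non-negative frequencies in Hz
f_hat = np.fft.rfft(signal)

u_hat = update_mode(f_hat, np.zeros_like(f_hat), np.zeros_like(f_hat),
                    omega, omega_m=60.0, alpha=1e-3)
omega_new = update_center_frequency(u_hat, omega)
```

In this toy case the updated center frequency moves from the initial guess of 60 Hz to the tone frequency, which is the behavior the alternating updates rely on.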

Preferably, in S11, the number of modal components K = 5.

Preferably, in step S2, feature values are calculated for a plurality of time-frequency domain features, two energy values (short-time energy and the Teager energy operator) and two entropy values (approximate entropy and sample entropy), and the results are combined to construct the multi-dimensional composite feature matrix.

Preferably, the plurality of time-frequency domain features comprise kurtosis, root mean square, peak-to-peak value, skewness, margin factor, waveform factor, peak factor, pulse factor, center-of-gravity frequency, average frequency, frequency standard deviation and root-mean-square frequency;

the constructed multi-dimensional composite feature matrix is a 16-dimensional composite feature matrix (12 time-frequency domain features, 2 energy values and 2 entropy values).
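As an illustration of the time-domain and energy features listed above, the following NumPy sketch computes them for one signal segment. The formulas are common textbook definitions and are assumptions here, since the patent text does not give them; in particular, aggregating short-time energy and the Teager energy operator by their mean, and the frame length, are illustrative choices.

```python
import numpy as np

def time_domain_features(x):
    """Eight time-domain indicators using common textbook definitions."""
    x = np.asarray(x, dtype=float)
    mean, std = x.mean(), x.std()
    rms = np.sqrt(np.mean(x ** 2))
    abs_mean = np.mean(np.abs(x))
    peak = np.max(np.abs(x))
    return {
        "kurtosis": np.mean((x - mean) ** 4) / std ** 4,
        "rms": rms,
        "peak_to_peak": x.max() - x.min(),
        "skewness": np.mean((x - mean) ** 3) / std ** 3,
        "margin_factor": peak / np.mean(np.sqrt(np.abs(x))) ** 2,
        "waveform_factor": rms / abs_mean,
        "peak_factor": peak / rms,
        "pulse_factor": peak / abs_mean,
    }

def short_time_energy(x, frame=256):
    """Mean energy over non-overlapping frames (aggregation is assumed)."""
    x = np.asarray(x, dtype=float)
    n_frames = len(x) // frame
    frames = x[: n_frames * frame].reshape(n_frames, frame)
    return float(np.mean(np.sum(frames ** 2, axis=1)))

def teager_energy(x):
    """Mean of the Teager-Kaiser operator x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    psi = x[1:-1] ** 2 - x[:-2] * x[2:]
    return float(np.mean(psi))

# Assumed test signal: a unit-amplitude 50 Hz sine, 4000 samples at 1 kHz.
x = np.sin(2 * np.pi * 50 * np.arange(4000) / 1000)
feats = time_domain_features(x)
ste = short_time_energy(x, frame=200)
tke = teager_energy(x)
```

For a pure sine the kurtosis is 1.5 and the peak factor is the square root of 2, which gives a quick sanity check before applying the functions to bearing data.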

Preferably, the kernel principal component analysis method in step S3 specifically includes the following steps:

S31, mapping the matrix to be reduced to a high-dimensional space using a kernel function:

$$X = \begin{bmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nn} \end{bmatrix} \xrightarrow{\ \phi\ } \phi(X)$$

where $X$ is the matrix to be dimension-reduced, $x_{11}$, $x_{1n}$, $x_{n1}$ and $x_{nn}$ are elements of the matrix, and $\phi$ denotes the mapping from the low-dimensional space to the high-dimensional space;

S32, calculating the covariance matrix of the data in the high-dimensional space, and then calculating its eigenvalues and eigenvectors:

$$\phi(X)\,\phi(X)^{T} w_l = \lambda\, w_l$$

where $T$ denotes the transpose, $w_l$ is an eigenvector in the high-dimensional space, and $\lambda$ is the corresponding eigenvalue;

S33, expressing the projection vectors of the high-dimensional feature space as a linear combination of the high-dimensional sample points:

$$w_l = \sum_{i=1}^{n} \alpha_i\, \phi(x_i)$$

where $i$ denotes the i-th element and $\alpha_i$ is the corresponding combination coefficient;

S34, computing the kernel matrix:

$$K = \phi(X)^{T} \phi(X)$$

S35, centering the kernel matrix $K$ through the transformation:

$$K' = K - l_n K - K l_n + l_n K l_n \tag{10}$$

where $K'$ is the transformed matrix and $l_n$ is an n×n matrix whose entries are all 1/n;

S36, solving for the eigenvalues and eigenvectors of the transformed matrix $K'$, taking the eigenvectors corresponding to the d largest eigenvalues, and arranging them as column vectors to form a matrix, which is the dimension-reduced data set.
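Steps S34 to S36 can be sketched in a few lines of NumPy. The patent text does not name the kernel function, so the RBF kernel, the `gamma` value and the random 30×16 input below are assumptions for illustration only; the function name `kpca` is likewise hypothetical.

```python
import numpy as np

def kpca(X, d=2, gamma=1.0):
    """Kernel PCA sketch with an RBF kernel (the kernel choice is assumed).

    X: (n_samples, n_features) feature matrix.
    Returns the (n_samples, d) reduced data and the centered kernel matrix.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # S34: kernel matrix, K_ij = exp(-gamma * ||x_i - x_j||^2)
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * dist2)
    # S35: K' = K - l_n K - K l_n + l_n K l_n, l_n being n x n with entries 1/n
    l_n = np.full((n, n), 1.0 / n)
    K_c = K - l_n @ K - K @ l_n + l_n @ K @ l_n
    # S36: eigen-decomposition; keep eigenvectors of the d largest eigenvalues
    eigvals, eigvecs = np.linalg.eigh(K_c)
    order = np.argsort(eigvals)[::-1][:d]
    return eigvecs[:, order], K_c

# Assumed input: 30 samples with 16 features each, reduced to 2 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 16))
Z, K_c = kpca(X, d=2, gamma=0.05)
```

Many implementations additionally scale each retained eigenvector by the square root of its eigenvalue; the sketch follows the text of S36 and returns the eigenvectors unscaled.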

Preferably, the CNN model optimized with batch normalization layers in step S4 comprises, arranged in sequence, a first convolution layer, a first batch normalization layer, a first max-pooling layer, a second convolution layer, a second batch normalization layer, a second max-pooling layer, a third convolution layer, a third batch normalization layer, a fourth convolution layer, a fourth batch normalization layer, a fifth convolution layer, a fifth batch normalization layer, a third max-pooling layer, a flatten layer, a first fully connected layer and a second fully connected layer;

adding a batch normalization layer after each convolution layer accelerates training and convergence and helps prevent overfitting.

The invention has the following beneficial effects:

(1) Screening the components by VMD decomposition and the kurtosis criterion discards interference components;

(2) Constructing the feature matrix from the time domain, frequency domain, energy, entropy and other aspects analyzes fault signals more comprehensively and gives stronger feature extraction capability;

(3) Using kernel principal component analysis (KPCA) for dimension reduction extracts the important features, reduces the amount of data input to the neural network and relieves the computational load on the network;

(4) Optimizing the CNN model with batch normalization (BN) layers accelerates training and convergence, improves the stability of the model and helps prevent overfitting.

The technical scheme of the invention is further described in detail through the drawings and the embodiments.

Drawings

FIG. 1 is a flow chart of a method for diagnosing bearing faults based on data layer feature fusion and convolutional neural network of the present invention;

FIG. 2 is a time domain waveform diagram of various status signals of experimental example 1 of the present invention;

FIG. 3 is a diagram of the decomposed components of the bearing inner ring fault signal of experimental example 1 of the present invention;

FIG. 4 is a visual two-dimensional diagram of KPCA dimension reduction classification of experimental example 1 of the present invention;

FIG. 5 is a graph of training and testing accuracy of experimental example 1 of the present invention;

FIG. 6 is a graph of training and test loss for experimental example 1 of the present invention;

FIG. 7 is a confusion matrix diagram of experimental example 1 of the present invention;

FIG. 8 is a graph of the visual effect of the convolutional neural network of experimental example 1 of the present invention;

FIG. 9 is a time domain waveform diagram of various status signals of experimental example 2 of the present invention;

FIG. 10 is a diagram of the decomposed components of the bearing outer ring fault signal under load of experimental example 2 of the present invention;

FIG. 11 is a two-dimensional view of KPCA dimension reduction visualization of Experimental example 2 of the present invention;

FIG. 12 is a graph of training and testing accuracy for Experimental example 2 of the present invention;

FIG. 13 is a graph of training and test loss for experimental example 2 of the present invention;

FIG. 14 is a confusion matrix diagram of Experimental example 2 of the present invention;

FIG. 15 is a graph showing the visual effect of the convolutional neural network of experimental example 2 of the present invention.

Detailed Description

The present invention will be further described with reference to the accompanying drawings, and it should be noted that, while the present embodiment provides a detailed implementation and a specific operation process on the premise of the present technical solution, the protection scope of the present invention is not limited to the present embodiment.

The bearing fault diagnosis method based on the data layer feature fusion and the convolutional neural network comprises the following steps:

It should be noted that, compared with the conventional EMD method, VMD is an adaptive, variational signal processing method. Its main idea is to avoid mode aliasing by controlling the bandwidth, and to adaptively separate the signal into its frequency-domain components.

Preferably, the VMD decomposition described in step S1 includes the steps of:

In the constrained variational expression of S11, $u_m$ denotes the modal components, $\omega_m$ is the center frequency of each modal component, $m$ is the mode index, $j$ is the imaginary unit, $t$ is time, $\delta(t)$ is the Dirac function, $\partial_t$ denotes the partial derivative with respect to time $t$, and $f$ is the original signal.

In the frequency-domain expressions of S13, $\hat{u}_m(\omega)$ is the frequency-domain expression of $u_m$, $\hat{u}_m^{n+1}(\omega)$ is the modal component computed at the (n+1)-th iteration, $\hat{f}(\omega)$ is the Fourier transform of $f(t)$, $\hat{\lambda}(\omega)$ is the Fourier transform of $\lambda(t)$, and $\omega$ is the frequency.

Preferably, in S11, the number of modal components K = 5.

Among the parameters of the VMD algorithm, the number K of modal components strongly influences the decomposition result. If K is too large, the signal is over-decomposed, mode aliasing may occur, and decomposition accuracy suffers; if K is too small, the signal is under-decomposed, the center frequency of each mode converges slowly, and decomposition efficiency suffers. Before the experiments, this embodiment tested several values of K: with K = 4 there is no mode aliasing, but the center frequencies converge slowly and decomposition efficiency is low; with K = 6 mode aliasing occurs and convergence is slower; K = 5 gives the best result, with small fluctuation of the center frequency of each component, fast convergence and no mode aliasing. The original signal is therefore decomposed into 5 modal components.

The time-frequency domain feature values of a bearing vibration signal reflect the overall state of the bearing and are important indexes for characterizing the signal. Common time-frequency domain features include kurtosis, root mean square, peak-to-peak value, skewness, margin factor, waveform factor, peak factor, pulse factor, center-of-gravity frequency, average frequency, frequency standard deviation and root-mean-square frequency. However, some time-frequency domain features are correlated and redundant, and they may be disturbed by useless information such as transient impacts and harmonic components, so time-frequency domain indexes alone can hardly reflect the useful information of a sample during fault feature extraction, and other feature indexes need to be introduced as a supplement. Besides time-frequency domain features, energy methods and entropy methods are also commonly used in fault diagnosis. Because different fault signals contain different energy and differ in complexity and stability, signals can be distinguished by their energy and entropy values. The method calculates multiple kinds of feature values of the signal and constructs a multi-dimensional composite feature matrix, which improves feature extraction and reflects the effective information of the sample from multiple aspects. Therefore, in step S2, feature values are calculated for a plurality of time-frequency domain features, two energy values (short-time energy and the Teager energy operator) and two entropy values (approximate entropy and sample entropy), and the results are combined to construct the multi-dimensional composite feature matrix.
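The four frequency-domain features named above could be computed from the power spectrum roughly as follows. The exact definitions are not given in the text, so these are assumptions based on common usage; in particular, "average frequency" is taken here as the mean value of the power spectrum, which is one convention among several, and the function name is hypothetical.

```python
import numpy as np

def frequency_domain_features(x, fs):
    """Spectral indicators from the one-sided power spectrum (assumed forms)."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    total = power.sum()
    centroid = np.sum(freqs * power) / total
    return {
        # Power-weighted mean frequency
        "center_of_gravity_frequency": float(centroid),
        # Mean spectral power (one convention for "average frequency")
        "average_frequency": float(power.mean()),
        # Power-weighted spread around the center-of-gravity frequency
        "frequency_std": float(
            np.sqrt(np.sum((freqs - centroid) ** 2 * power) / total)
        ),
        # Power-weighted root-mean-square frequency
        "rms_frequency": float(np.sqrt(np.sum(freqs ** 2 * power) / total)),
    }

# Assumed test signal: a 50 Hz sine, 4000 samples at 1 kHz.
fs = 1000
x = np.sin(2 * np.pi * 50 * np.arange(4000) / fs)
spec_feats = frequency_domain_features(x, fs)
```

For a single tone the center-of-gravity and root-mean-square frequencies both equal the tone frequency and the frequency standard deviation is close to zero; a fault signal with several spectral peaks spreads these values apart.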

It should be noted that feature dimension reduction is a common data processing method; after dimension reduction the feature matrix is optimized and the data distribution can be displayed visually, which makes the data easier to inspect. Kernel principal component analysis (KPCA) is a nonlinear principal component analysis method whose main idea is to first map the nonlinear samples to a high-dimensional space and then apply linear dimension reduction in that space.

In S31, $\phi$ denotes the mapping from the low-dimensional space to the high-dimensional space.

In S33, $i$ denotes the i-th element.

S34, computing the kernel matrix. The matrix $K$ is calculated as follows:

$$K = \phi(X)^{T} \phi(X)$$

S35, centering the kernel matrix $K$ through the transformation:

$$K' = K - l_n K - K l_n + l_n K l_n \tag{10}$$

It should be noted that a conventional CNN consists of an input layer, convolution layers, pooling layers, fully connected layers and an output layer. The input layer receives the signals fed into the network, and the convolution layers perceive the input locally and extract important features. The pooling layers are generally used to reduce the dimension of the features extracted by the convolution layers and to make the model more robust to distortion. The fully connected layer converts the two-dimensional feature matrix produced by the preceding layers into a one-dimensional vector. The output layer follows the fully connected layer and classifies and outputs the features obtained from it.

CNNs have several advantages: they generalize well, often better than other methods, and are widely applied in many fields; they are fault tolerant and can handle complex information, including incomplete or distorted inputs; and they are highly adaptable, extract the key features of the data well and classify effectively.

However, as the complexity of the model increases, training becomes more difficult and overfitting may even occur, which seriously degrades the training result. This embodiment improves on the conventional convolutional neural network by adding a batch normalization (BN) layer after each convolution layer. The BN layers speed up training and convergence and help prevent overfitting. Repeated comparisons in the experiments show that combining the CNN with BN greatly improves the accuracy of the diagnosis result compared with a convolutional neural network alone.
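What a BN layer does to the activations of a convolution layer during training can be shown with a short NumPy sketch. This is a generic batch normalization forward pass, not code from the patent; the batch size, channel count and the function name `batch_norm` are assumptions for illustration.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization over the batch axis (axis 0).

    Each channel is shifted to zero mean and scaled to unit variance,
    then rescaled by the learnable parameters gamma and beta.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Assumed activations: a batch of 25 samples with 8 channels,
# deliberately offset and scaled to mimic unnormalized layer outputs.
rng = np.random.default_rng(0)
activations = rng.normal(loc=5.0, scale=3.0, size=(25, 8))
normalized = batch_norm(activations, gamma=np.ones(8), beta=np.zeros(8))
```

Keeping each channel near zero mean and unit variance stabilizes the input distribution of the next layer, which is the usual explanation for the faster convergence described above.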

Preferably, the CNN model optimized with batch normalization layers in step S4 comprises, arranged in sequence, a first convolution layer, a first batch normalization layer, a first max-pooling layer, a second convolution layer, a second batch normalization layer, a second max-pooling layer, a third convolution layer, a third batch normalization layer, a fourth convolution layer, a fourth batch normalization layer, a fifth convolution layer, a fifth batch normalization layer, a third max-pooling layer, a flatten layer, a first fully connected layer and a second fully connected layer; adding a batch normalization layer after each convolution layer accelerates training and convergence and helps prevent overfitting.

Table 1 Parameter settings of the convolutional neural network optimized with batch normalization layers


Experimental example 1: IMS bearing data experimental analysis

The data used in this experiment come from the IMS bearing data. The test rig consists of a motor running at 2000 RPM, four bearings and vibration sensors; the four bearings are each loaded by the motor, and the original data signals are acquired by the vibration sensors. The IMS bearing data comprise three experimental data sets; this experiment uses the inner ring fault data and rolling element fault data in data set 1, the outer ring fault data in data set 2, the normal data in data set 3 and part of the rolling element fault signal in data set 4 for verification.

Each of the above 4 kinds of data is divided into 30 groups of 4000 sampling points. 25 groups are used for training and the remaining 5 groups for testing. The time-domain waveforms of the bearing state signals are shown in FIG. 2.
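The grouping described above can be sketched as follows. The text does not say which 25 groups are used for training, so taking the first 25 in time order is an assumption, as are the function name `segment_and_split` and the synthetic record used in place of real bearing data.

```python
import numpy as np

def segment_and_split(signal, n_groups=30, group_len=4000, n_train=25):
    """Cut one record into equal groups and split them into train and test.

    Assumption: the first n_train groups are used for training and the
    remaining groups for testing.
    """
    signal = np.asarray(signal, dtype=float)
    needed = n_groups * group_len
    if signal.size < needed:
        raise ValueError(f"need at least {needed} samples, got {signal.size}")
    groups = signal[:needed].reshape(n_groups, group_len)
    return groups[:n_train], groups[n_train:]

# Synthetic stand-in for one state's vibration record.
record = np.arange(130_000)
train, test = segment_and_split(record)
```

Applying the same function to each of the 4 states gives 100 training groups and 20 test groups in total.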

Taking the bearing inner ring fault as an example, signal decomposition and reconstruction are carried out. FIG. 3 shows the time-domain plots of the 5 components after VMD decomposition. The kurtosis of each component is calculated, and the kurtosis values of all 5 components are greater than 3, so no reconstruction is needed and the signal is analyzed directly.
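The kurtosis screening of step S1 can be sketched in NumPy, using the threshold of 3 that this example applies. The components below are synthetic stand-ins (an impulse train and a sine) rather than real VMD modes, and the function names are hypothetical.

```python
import numpy as np

def kurtosis(x):
    """Fourth standardized moment (equals 3 for a Gaussian signal)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float(np.mean(centered ** 4) / np.mean(centered ** 2) ** 2)

def screen_and_reconstruct(components, threshold=3.0):
    """Keep components whose kurtosis exceeds the threshold and sum them."""
    components = np.asarray(components, dtype=float)
    values = np.array([kurtosis(c) for c in components])
    keep = values > threshold
    reconstructed = components[keep].sum(axis=0)
    return reconstructed, keep, values

# Synthetic components: an impulse train (high kurtosis, fault-like)
# and a pure sine (kurtosis 1.5, harmonic interference).
impulses = np.zeros(1000)
impulses[::100] = 1.0
sine = np.sin(2 * np.pi * 50 * np.arange(1000) / 1000)
reconstructed, keep, values = screen_and_reconstruct(np.vstack([impulses, sine]))
```

When every component passes the threshold, as in this experimental example, the reconstructed signal is simply the sum of all components.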

Multiple feature values of the reconstructed signal are calculated and a 16-dimensional composite feature matrix is constructed. KPCA dimension reduction is applied to the feature matrix to extract the key features of the signal again, giving a new two-dimensional feature matrix whose classification visualization is shown in FIG. 4.

The deep learning framework used in the experiment is TensorFlow, and the hardware configuration is an Intel Core i5-8265U CPU and an NVIDIA GeForce MX230 graphics card.

The resulting two-dimensional feature matrix is fed into the CNN model for training. The model uses the Adam optimizer, which adapts the learning rate automatically, and adds BN layers to the traditional CNN, which avoids overfitting and makes the result more accurate.

As can be seen from FIG. 5 and FIG. 6, the training accuracy has fully converged at about 30 iterations, reaching 100%; the training loss drops rapidly with iteration and fully converges at about 200 iterations, approaching 0. The test accuracy has fully converged at about 150 iterations, reaching 100%; the test loss decreases gradually with iteration and fully converges at about 300 iterations, approaching 0.

As the confusion matrix in FIG. 7 shows, the recognition accuracy of the model reaches 100% for each data type, so bearing faults are diagnosed accurately.
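A confusion matrix of this kind can be built with a few lines of NumPy. The labels below are hypothetical (4 classes with 5 test groups each, all predicted correctly) and only mirror the shape of the reported result; they are not the experimental data.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def per_class_accuracy(cm):
    """Fraction of each true class that was predicted correctly."""
    return np.diag(cm) / cm.sum(axis=1)

# Hypothetical labels: 4 bearing states, 5 test groups per state.
y_true = np.repeat(np.arange(4), 5)
y_pred = y_true.copy()
cm = confusion_matrix(y_true, y_pred, n_classes=4)
acc = per_class_accuracy(cm)
```

Any misclassified group would appear as an off-diagonal entry, so a purely diagonal matrix corresponds to 100% accuracy for every class.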

FIG. 8 uses the KPCA algorithm to visualize the data distribution after each of the three max-pooling layers and after the last fully connected layer. After the three stages of convolution and pooling layers, the features extracted by the model gradually cluster, and a very good classification effect is reached after the fully connected layer. Compared with the visualization of the feature matrix after dimension reduction, after training of the neural network the feature data of the same type are more tightly clustered and those of different types are more widely separated, so the classification effect after training is better. This also shows that the method works well for rolling bearing fault diagnosis.

Table 2 Comparison of different diagnostic methods

Diagnostic method	Training accuracy	Test accuracy
Method of this experimental example	100%	100%
LSTM	100%	96%
SimpleRNN	100%	69%
MLP	100%	47%
KNN	28.4%	25%

To demonstrate the superiority of the proposed method, it is compared with conventional classical diagnosis methods. In the comparison, the original data are preprocessed in the same way as in the proposed method and then analyzed with each diagnosis model. Each method was run 5 times and the results averaged; the comparison is shown in Table 2. The proposed method performs best and most stably, reaching 100% accuracy in every training and test run. The LSTM algorithm has an average training accuracy of 100%; individual test runs are slightly lower, but the average test accuracy still reaches 96.00%, so it classifies faults well, although its training time is long because a large amount of original data is analyzed directly. The SimpleRNN algorithm has high training accuracy but poor test accuracy, averaging 69%, so its diagnostic capability is worse than that of the proposed method. The MLP and KNN algorithms perform poorly, with low test accuracy, and can hardly meet the requirements of fault diagnosis.

Experimental example 2: experimental analysis of bearing data from the QPZZ-II test bench

The sampling frequency in this experiment is 12800 Hz. The bearings are 6205 deep groove ball bearings. Normal data, mixed inner-and-outer-ring fault data, rolling element fault data under load and outer ring fault data under load are selected for analysis; the fault sizes are set as shown in Table 3.

Table 3 Bearing fault sizes in experimental example 2

Fault type	Fault size
Inner ring fault	Cut depth 1.5 mm, width 0.2 mm
Outer ring fault	Depth 2 mm, width 0.2 mm
Rolling element fault	Cut depth 1 mm, width 0.4 mm

As in experimental example 1, the data of the 4 states are each divided into 30 groups of 4000 sampling points; 25 groups are used for training and the remaining 5 groups for testing.

Taking the outer ring fault under load as an example, signal decomposition and reconstruction are carried out. FIG. 10 shows the time-domain plots of the decomposed components. The kurtosis of each component is calculated, and the kurtosis values of all 5 components are greater than 3, so no reconstruction is needed and the original signal is analyzed directly.

Feature values are calculated, the composite feature matrix is constructed, and KPCA dimension reduction is performed to obtain a new two-dimensional feature matrix, as shown in FIG. 11.

The two-dimensional feature matrix is fed into the convolutional neural network for identification and diagnosis, giving the training and test accuracy curves shown in FIG. 12 and the loss curves shown in FIG. 13. As can be seen from FIG. 12 and FIG. 13, the training accuracy has fully converged at about 15 iterations, reaching 100%; the training loss drops rapidly with iteration and fully converges at about 200 iterations, approaching 0. The test accuracy has fully converged at about 200 iterations, reaching 100%; the test loss decreases gradually with iteration and fully converges at about 300 iterations, approaching 0.

The final result is analyzed by constructing a confusion matrix, as shown in FIG. 14. According to the confusion matrix, the recognition accuracy of the model reaches 100% for each data type, so bearing faults are diagnosed accurately. The data distribution after each of the three max-pooling layers and after the last fully connected layer is visualized with the KPCA algorithm, and the result is shown in FIG. 15. The figure shows that, after the three stages of convolution and pooling layers, the features extracted by the model gradually cluster, and a very good classification effect is reached after the fully connected layer. Compared with the KPCA dimension reduction plot, after training of the convolutional neural network the feature data of the same type are more tightly clustered and those of different types are more widely separated, so the classification effect after training is better. This also shows that the method works well and achieves very high accuracy in rolling bearing fault diagnosis.

Therefore, the bearing fault diagnosis method based on data-layer feature fusion and a convolutional neural network addresses the limitations of feature matrices built from traditional time-domain and frequency-domain statistical indexes when processing complex nonlinear data, and achieves a good effect and very high accuracy.

Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solution of the invention may be modified or equivalently replaced, and that such modifications or replacements do not depart from the spirit and scope of the technical solution of the invention.

Claims

1. A bearing fault diagnosis method based on data-layer feature fusion and a convolutional neural network, characterized by comprising the following steps:

2. The bearing fault diagnosis method based on data-layer feature fusion and a convolutional neural network according to claim 1, characterized in that the VMD decomposition described in step S1 includes the steps of:

where, in the frequency-domain expressions of S13, $\hat{u}_m(\omega)$ is the frequency-domain expression of $u_m$, $\hat{u}_m^{n+1}(\omega)$ is the modal component computed at the (n+1)-th iteration, $\hat{f}(\omega)$ is the Fourier transform of $f(t)$, $\hat{\lambda}(\omega)$ is the Fourier transform of $\lambda(t)$, and $\omega$ is the frequency;

3. The bearing fault diagnosis method based on data-layer feature fusion and a convolutional neural network according to claim 2, characterized in that in S11 the number of modal components K = 5.

4. The bearing fault diagnosis method based on data-layer feature fusion and a convolutional neural network according to claim 1, characterized in that in step S2 feature values are calculated for a plurality of time-frequency domain features, two energy values (short-time energy and the Teager energy operator) and two entropy values (approximate entropy and sample entropy), and the results are combined to construct the multi-dimensional composite feature matrix.

5. The bearing fault diagnosis method based on data-layer feature fusion and a convolutional neural network according to claim 4, characterized in that the plurality of time-frequency domain features comprise kurtosis, root mean square, peak-to-peak value, skewness, margin factor, waveform factor, peak factor, pulse factor, center-of-gravity frequency, average frequency, frequency standard deviation and root-mean-square frequency;

6. The bearing fault diagnosis method based on data-layer feature fusion and a convolutional neural network according to claim 1, characterized in that the kernel principal component analysis method in step S3 specifically includes the following steps:

where $i$ denotes the i-th element;

S34, computing the kernel matrix. The matrix $K$ is calculated as follows:

$$K = \phi(X)^{T} \phi(X)$$

S35, centering the kernel matrix $K$ through the transformation:

$$K' = K - l_n K - K l_n + l_n K l_n \tag{10}$$

where $K'$ is the transformed matrix and $l_n$ is an n×n matrix whose entries are all 1/n;

7. The bearing fault diagnosis method based on data-layer feature fusion and a convolutional neural network according to claim 1, characterized in that the CNN model optimized with batch normalization layers in step S4 comprises, arranged in sequence, a first convolution layer, a first batch normalization layer, a first max-pooling layer, a second convolution layer, a second batch normalization layer, a second max-pooling layer, a third convolution layer, a third batch normalization layer, a fourth convolution layer, a fourth batch normalization layer, a fifth convolution layer, a fifth batch normalization layer, a third max-pooling layer, a flatten layer, a first fully connected layer and a second fully connected layer;