
CN114429150A - Rolling bearing fault diagnosis method and system under variable working conditions based on improved depth subdomain adaptive network

Info

Publication number: CN114429150A
Application number: CN202111521473.2A
Authority: CN
Inventors: 康守强; 张春萌; 王玉静; 梁欣涛; 王庆岩; 兰朝凤
Original assignee: Harbin University of Science and Technology
Current assignee: Harbin University of Science and Technology
Priority date: 2021-12-30
Filing date: 2021-12-30
Publication date: 2022-05-03

Abstract

A rolling bearing fault diagnosis method and system under variable working conditions based on an improved depth subdomain adaptive network, relating to the technical field of rolling bearing fault diagnosis, are provided to solve the problem that existing fault diagnosis models achieve low accuracy on vibration data whose distributions differ greatly across working conditions. The technical points of the invention comprise: performing a short-time Fourier transform on the vibration data of the source domain and the target domain to obtain time-frequency spectrograms; introducing a channel attention mechanism and a first-layer wide convolution kernel to improve a residual network and extract deep features from the time-frequency spectrograms; and performing sub-domain adaptation on the source domain features and the target domain features using the local maximum mean difference, which reduces the distribution difference between corresponding sub-domains of the source domain and the target domain and realizes fault diagnosis of the rolling bearing under complex working conditions. The method realizes rolling bearing fault diagnosis under variable working conditions, generalizes across working conditions, and achieves higher accuracy. The invention can be widely applied to fault diagnosis of rolling bearings.

Description

Rolling bearing fault diagnosis method and system under variable working conditions based on improved depth subdomain adaptive network

Technical Field

The invention relates to the technical field of rolling bearing fault diagnosis, in particular to a rolling bearing fault diagnosis method and system under variable working conditions based on an improved depth subdomain adaptive network.

Background

Rolling bearings are among the most critical parts of rotating machinery and are widely used in industrial production and military facilities^[1]. Their working conditions are often variable and complex, and a change in working conditions also changes the vibration characteristics of the bearing^[2]. Solving the problem of rolling bearing fault diagnosis under variable working conditions is therefore of great significance for ensuring the healthy operation of rotating machinery^[3].

A common intelligent fault diagnosis method first extracts features from the bearing vibration signal and then diagnoses faults with a pattern recognition method^[4]. Time-frequency analysis methods such as Empirical Mode Decomposition (EMD), ensemble EMD, and Variational Mode Decomposition (VMD) are widely applied to feature extraction from rolling bearing vibration signals. Pattern recognition methods such as the support vector machine, the artificial neural network, and the random forest also perform well in rolling bearing fault diagnosis. In recent years, deep learning, which extracts features adaptively, has overcome the shortcomings of manual feature extraction and has been widely applied to rolling bearing fault diagnosis^[5]. Document [6] normalizes data obtained from multiple sensors, stores it as grayscale images, and builds a LeNet-5-based convolutional neural network to diagnose rolling bearing faults; document [7] introduces depthwise separable convolution and a multi-branch structure into Convolutional Neural Networks (CNN), compressing the model size while avoiding vanishing gradients and preserving diagnostic accuracy; document [8] proposes a deep convolutional neural network with a wide first-layer convolution kernel and applies it to rolling bearing fault diagnosis; document [9] proposes a deep convolutional neural network with multi-scale first-layer convolution kernels, which extracts multi-scale features from the raw bearing vibration signal using one-dimensional convolution kernels of different sizes and thereby diagnoses the bearing health state intelligently; document [10] introduces the Inception stacking idea on the basis of ResNet and proposes a deep residual hedging network, which improves bearing fault diagnosis accuracy while shortening the training time by 1/3.

Traditional deep learning methods generally adopt a CNN as the feedforward network model, but as the number of network layers increases, the CNN suffers from network degradation^[11]. The residual network ResNet constructs residual blocks through identity mapping, so that the network avoids degradation as its depth increases. However, ResNet does not consider that the network depends on each channel to a different degree, so some redundant information remains in the network during training. Attention mechanisms can direct computational resources toward the most informative parts of the deep features. Introducing an attention mechanism into ResNet should therefore, in theory, allow deep network features to be extracted better.

Under variable working conditions, the vibration characteristics of the bearing are more complex and changeable than under constant working conditions, and the training data of the model often differ greatly in distribution from the actual test data, so the established fault diagnosis model generalizes poorly^[12]. Moreover, labeled vibration signals are difficult to acquire for rolling bearings under some working conditions, so the data volume required to train a fault diagnosis model is hard to obtain.

Transfer learning is increasingly used in rolling bearing fault diagnosis because it can use existing knowledge to solve problems in different but related domains. Document [3] introduces Semi-Supervised Transfer Component Analysis (SSTCA) to align the marginal distributions of the source domain and the target domain, and trains an SVM to obtain a fault diagnosis model for variable working conditions; document [13] proposes an improved Joint Distribution Adaptation (JDA), which reduces the differences in both the marginal and the conditional distributions between the source domain and the target domain and completes bearing fault diagnosis under variable working conditions; document [14] proposes Balanced Distribution Adaptation (BDA) on the basis of JDA, which adaptively adjusts the relative importance of the marginal and conditional distribution differences and realizes cross-domain bearing fault diagnosis.

Compared with the above transfer learning methods, deep feature transfer learning can use deep networks to learn more transferable features for domain adaptation. In terms of deep feature transfer, document [15] combines a deep belief network with a mixed-kernel JDA, avoiding divergence of the data distributions and realizing multi-state identification of bearings under variable load; document [16] adds an adaptation layer to the ResNet model and uses the Maximum Mean Difference (MMD, also known as maximum mean discrepancy) as the domain adaptation loss, reducing the global distribution difference between the source domain and the target domain; document [17], addressing the shortcoming that MMD measures the distribution difference with first-order moments only and neglects the contribution of higher-order moments, proposes a polynomial-kernel-induced MMD (Polynomial Kernel MMD, PK-MMD), constructs a domain-shared residual network, and completes rolling bearing fault diagnosis; document [18] fine-tunes a pre-trained Deep Adaptation Network (DAN) with training data and uses multi-kernel MMD (MK-MMD) to adapt the global distributions of the source domain and the target domain, effectively identifying bearing faults in different running states; document [19] proposes a deep convolutional Wasserstein adversarial network model that uses the Wasserstein distance as the domain adaptation loss term, combined with a variance constraint, to diagnose bearing faults under variable working conditions. The above studies perform global domain adaptation on the sample distributions of the source domain and the target domain by means of deep feature transfer and realize rolling bearing fault diagnosis under different working conditions. However, global domain adaptation methods often ignore the relationships between sub-domains of different domains, which easily causes feature aliasing between sub-domains and loss of fine-grained information, resulting in a poor transfer effect^[20].

Disclosure of Invention

In view of the above problems, the invention provides a rolling bearing fault diagnosis method and system under variable working conditions based on an improved depth subdomain adaptive network, which are used for solving the problem that the fault diagnosis accuracy of the existing fault diagnosis model for vibration data with large distribution difference under different working conditions is not high.

According to one aspect of the invention, a fault diagnosis method for a rolling bearing under variable working conditions based on an improved depth subdomain adaptive network is provided, and the method comprises the following steps:

step one, acquiring rolling bearing vibration signal samples with known fault information under one working condition to construct a source domain data sample set, and acquiring rolling bearing vibration signal samples with unknown fault information under other working conditions to construct a target domain data sample set, wherein the target domain data sample set comprises a training sample set and a test sample set;

step two, performing a short-time Fourier transform on the rolling bearing vibration signals in the source domain data sample set and the target domain data sample set respectively, converting the one-dimensional time domain signals into two-dimensional time-frequency images, and obtaining a source domain image sample set and a target domain image sample set;

step three, constructing an improved residual network, introducing an attention mechanism to dynamically allocate resources among channels, extracting deep features of the source domain image sample set and the target domain image sample set, and obtaining soft labels for the unlabeled target domain samples;

step four, calculating the local maximum mean difference between the source domain and the target domain from the deep features of the source domain and the target domain, the real labels of the source domain, and the soft labels of the target domain, and taking the local maximum mean difference as a domain adaptive loss term that measures the distribution difference between sub-domains of the source domain and the target domain;

step five, optimizing the domain adaptive loss term together with the cross entropy loss term of the improved residual network as the objective function, and obtaining a fault diagnosis model of the rolling bearing under variable working conditions by training for a specified number of iterations;

and step six, inputting the target domain test sample set into the trained fault diagnosis model, comparing the fault diagnosis result of each test sample with its real label, and obtaining the accuracy of the fault diagnosis model on rolling bearing fault diagnosis so as to measure the performance of the fault diagnosis model.
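The conversion in step two, from a one-dimensional vibration signal to a two-dimensional time-frequency image, can be sketched as below. This is a minimal illustration rather than the procedure of the invention: the Hann window, the frame length of 256, the hop of 64, the 12 kHz sampling rate, and the synthetic test signal are all assumptions chosen for the example, and `stft_magnitude` is a hypothetical helper name.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=64):
    """Return a (freq_bins, frames) magnitude spectrogram from a Hann-windowed STFT."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([signal[k * hop:k * hop + frame_len] * window
                       for k in range(n_frames)])
    # rfft keeps the frame_len // 2 + 1 non-negative frequency bins.
    spectrum = np.fft.rfft(frames, axis=1)
    return np.abs(spectrum).T

# Assumed example: 1 s of a noisy 1 kHz tone sampled at 12 kHz.
fs = 12_000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1_000 * t) + 0.1 * np.random.default_rng(0).standard_normal(fs)
image = stft_magnitude(x)
peak_bin = int(image.mean(axis=1).argmax())  # frequency bin with the most energy
```

Each column of `image` is the spectrum of one short frame, so the array can be rendered or resized as an image before being fed to a convolutional network.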

Further, the improved residual network in step three is an improvement on the existing residual network, as follows. First, the convolution kernel size of the first convolution layer of the residual network is changed from 7 × 7 to 15 × 15. Second, the residual modules in the residual network model ResNet-50 are improved as follows: each channel performs 1 × 1 and 3 × 3 convolution operations on the input features to extract features preliminarily; the features of each channel are compressed and reduced in dimension by a global pooling layer to obtain global features representing the channel context information; the obtained global features are input into two fully connected layers, which model the correlation among the channels; a ReLU activation function between the two fully connected layers increases the nonlinearity; the resulting channel weights are normalized by a Softmax layer; and finally the original channel features are weighted by the normalized weights to obtain the output feature image.

Further, the weighted output feature image $V_c$ is:

$$V_c = a_c U_c$$

where $a_c$ represents the weight of the $c$-th channel; $U_c$ represents the feature image obtained after the 1 × 1 convolution layer and the 3 × 3 convolution layer; and $V_c$ represents the feature image output after weighting by the channel attention mechanism.

Further, in step four, the local maximum mean difference LMMD is used to map and align each sub-domain of the source domain with the corresponding sub-domain of the target domain, and is calculated as:

$$\hat{d}_{\mathcal{H}}(p,q) = \frac{1}{C}\sum_{c=1}^{C}\left\| \sum_{x_i^s \in D_s} w_i^{sc}\,\phi(x_i^s) - \sum_{x_j^t \in D_t} w_j^{tc}\,\phi(x_j^t) \right\|_{\mathcal{H}}^{2}$$

where $p$ and $q$ represent the distributions that the source domain and target domain samples follow, respectively; $\mathcal{H}$ represents a reproducing kernel Hilbert space endowed with a characteristic kernel; $c$ represents a category and $C$ represents the total number of categories; $x_i^s$ and $x_j^t$ represent the $i$-th sample of the source domain $D_s$ and the $j$-th sample of the target domain $D_t$, respectively; $w_i^{sc}$ and $w_j^{tc}$ represent the weights with which $x_i^s$ and $x_j^t$ belong to class $c$, respectively; and $\phi(\cdot)$ represents a feature mapping that maps the original sample data to $\mathcal{H}$.
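The local maximum mean difference can be evaluated numerically without computing the feature mapping explicitly, because the squared norm expands into kernel evaluations. The sketch below is an illustration under stated assumptions, not the implementation of the invention: it assumes a single Gaussian kernel with bandwidth `sigma`, and it assumes that each class weight is the label (or soft-label) entry of a sample divided by the sum of that class column over the domain. The function names and the toy data are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def class_weights(labels):
    """Scale each class column to sum to 1; a class absent from the batch gets weight 0."""
    labels = np.asarray(labels, dtype=float)
    totals = labels.sum(axis=0, keepdims=True)
    return np.divide(labels, totals, out=np.zeros_like(labels), where=totals > 0)

def lmmd(xs, ys, xt, yt_soft, sigma=1.0):
    """Class-averaged squared distance between weighted source/target mean embeddings."""
    ws, wt = class_weights(ys), class_weights(yt_soft)
    k_ss = gaussian_kernel(xs, xs, sigma)
    k_tt = gaussian_kernel(xt, xt, sigma)
    k_st = gaussian_kernel(xs, xt, sigma)
    n_classes = ws.shape[1]
    total = 0.0
    for c in range(n_classes):
        a, b = ws[:, c], wt[:, c]
        # ||sum_i a_i phi(xs_i) - sum_j b_j phi(xt_j)||^2 written with kernels only
        total += a @ k_ss @ a + b @ k_tt @ b - 2.0 * (a @ k_st @ b)
    return total / n_classes

# Toy data (assumed): 6 source features in 3 classes; the target is a shifted copy.
rng = np.random.default_rng(0)
xs = rng.standard_normal((6, 4))
ys = np.eye(3)[[0, 0, 1, 1, 2, 2]]       # one-hot source labels
same = lmmd(xs, ys, xs, ys)              # identical sub-domains
shifted = lmmd(xs, ys, xs + 3.0, ys)     # target moved away from the source
```

In training, `yt_soft` would be the predicted class probabilities of the unlabeled target samples rather than one-hot labels, so every target sample contributes to every class in proportion to its soft label.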

Further, in step five, the cross entropy loss term is used to judge the performance of the model on the source domain samples, and is calculated as:

$$L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{ic}\log(p_{ic})$$

where $c$ represents a category and $C$ represents the total number of categories; $N$ represents the total number of samples and $i$ represents the sample index; $y_{ic}$ is an indicator function that takes the value 1 if the real category of the $i$-th sample is $c$ and 0 otherwise; and $p_{ic}$ represents the predicted probability that the $i$-th sample belongs to class $c$.

Further, the fault information in step one comprises inner ring faults, rolling element faults, and outer ring faults with different damage diameters, or no fault; the working conditions are obtained from combinations of different loads and different motor speeds.

According to another aspect of the invention, a fault diagnosis system for a rolling bearing under variable working conditions based on an improved depth subdomain adaptive network is provided, and comprises:

the data acquisition module is configured to acquire rolling bearing vibration signal samples with known fault information under one working condition to construct a source domain data sample set, and to acquire rolling bearing vibration signal samples with unknown fault information under other working conditions to construct a target domain data sample set, wherein the target domain data sample set comprises a training sample set and a test sample set; the fault information comprises inner ring faults, rolling element faults, and outer ring faults with different damage diameters, or no fault; the working conditions are obtained from combinations of different loads and different motor speeds;

the preprocessing module is configured to respectively perform short-time Fourier transform on the rolling bearing vibration signals in the source domain data sample set and the target domain data sample set, convert one-dimensional time domain signals into two-dimensional time-frequency images and obtain a source domain image sample set and a target domain image sample set;

the feature extraction module is configured to construct an improved residual network, introduce an attention mechanism to dynamically allocate resources among channels, extract deep features of the source domain image sample set and the target domain image sample set, and obtain soft labels for the unlabeled target domain samples;

a distribution difference measurement module configured to calculate a local maximum mean difference between the source domain and the target domain using the source domain and the target domain deep features, the source domain real label, and the target domain soft label as a domain adaptive loss term, and measure a distribution difference between sub-domains of the source domain and the target domain;

the model training module is configured to optimize the domain adaptive loss term together with the cross entropy loss term of the improved residual network as the objective function, and to obtain a fault diagnosis model of the rolling bearing under variable working conditions by training for a specified number of iterations;

and the model testing module is configured to input the target domain test sample set into the trained fault diagnosis model, compare the fault diagnosis result of the test sample with the real label thereof, and obtain the accuracy of the fault diagnosis model on the fault diagnosis of the rolling bearing so as to measure the performance of the fault diagnosis model.

Further, the improved residual network in the feature extraction module is an improvement on the existing residual network, as follows. First, the convolution kernel size of the first convolution layer of the residual network is changed from 7 × 7 to 15 × 15. Second, the residual modules in the residual network model ResNet-50 are improved as follows: each channel performs 1 × 1 and 3 × 3 convolution operations on the input features to extract features preliminarily; the features of each channel are compressed and reduced in dimension by a global pooling layer to obtain global features representing the channel context information; the obtained global features are input into two fully connected layers, which model the correlation among the channels; a ReLU activation function between the two fully connected layers increases the nonlinearity; the resulting channel weights are normalized by a Softmax layer; and finally the original channel features are weighted by the normalized weights to obtain the output feature image. The weighted output feature image $V_c$ is:

$$V_c = a_c U_c$$

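Further, the local maximum mean difference LMMD in the distribution difference measurement module is used to map and align each sub-domain of the source domain with the corresponding sub-domain of the target domain, and is calculated as:

$$\hat{d}_{\mathcal{H}}(p,q) = \frac{1}{C}\sum_{c=1}^{C}\left\| \sum_{x_i^s \in D_s} w_i^{sc}\,\phi(x_i^s) - \sum_{x_j^t \in D_t} w_j^{tc}\,\phi(x_j^t) \right\|_{\mathcal{H}}^{2}$$

where $p$ and $q$ represent the distributions that the source domain and target domain samples follow, respectively; $\mathcal{H}$ represents a reproducing kernel Hilbert space endowed with a characteristic kernel; $c$ represents a category and $C$ represents the total number of categories; $x_i^s$ and $x_j^t$ represent the $i$-th sample of the source domain $D_s$ and the $j$-th sample of the target domain $D_t$, respectively; $w_i^{sc}$ and $w_j^{tc}$ represent the weights with which $x_i^s$ and $x_j^t$ belong to class $c$, respectively; and $\phi(\cdot)$ represents a feature mapping that maps the original sample data to $\mathcal{H}$.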

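Further, the cross entropy loss term in the model training module is used to judge the performance of the model on the source domain samples, and is calculated as:

$$L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{ic}\log(p_{ic})$$

where $c$ represents a category and $C$ represents the total number of categories; $N$ represents the total number of samples and $i$ represents the sample index; $y_{ic}$ takes the value 1 if the real category of the $i$-th sample is $c$ and 0 otherwise; and $p_{ic}$ represents the predicted probability that the $i$-th sample belongs to class $c$.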

According to another aspect of the present invention, there is provided a computer device comprising a memory, a processor and a computer program stored in the memory and executable by the processor, wherein the processor implements the method as described above when executing the computer program.

According to another aspect of the present invention, there is provided a computer-readable storage medium, being a non-volatile readable storage medium, having stored therein a computer program, which when executed by a processor, implements the method as described above.

According to yet another aspect of the invention, there is provided a computer program product comprising computer readable code which, when executed by a computer device, causes the computer device to perform the method described above.

The beneficial technical effects of the invention are as follows:

The method introduces a channel attention mechanism and a first-layer wide convolution kernel to improve the residual network, so that deep features more relevant to rolling bearings under different working conditions are extracted. The local maximum mean difference LMMD is used to measure the distribution difference between sub-domains of the source domain and the target domain, and both the global and the local distribution differences between the labeled source domain sample set and the unlabeled target domain sample set are reduced during model training; compared with the traditional approach of measuring only the global distribution difference between the source domain and the target domain with the maximum mean difference, LMMD aligns the global distribution while also attending to the local distributions. By using the domain-invariant features that the network extracts across different domains, the scarcity of labeled data for rolling bearings under some working conditions is addressed, and fault diagnosis under variable working conditions and generalization across working conditions are realized, with higher fault diagnosis accuracy than the other models compared.

Drawings

The present invention may be better understood by reference to the following description taken in conjunction with the accompanying drawings, which are incorporated in and form a part of this specification, and which are used to further illustrate preferred embodiments of the present invention and to explain the principles and advantages of the present invention.

FIG. 1 is a diagram of an original residual block structure;

FIG. 2 is a schematic diagram of the improved residual network structure in an embodiment of the present invention;

FIG. 3 is a schematic diagram of the improved residual module in an embodiment of the present invention;

FIG. 4 is a schematic diagram of the principle of domain adaptation and sub-domain adaptation in an embodiment of the present invention;

FIG. 5 is a block diagram of a fault diagnosis process in an embodiment of the invention;

FIG. 6 is a schematic view of a bearing data collection from a bearing test stand according to an embodiment of the present invention;

FIG. 7 is a diagram illustrating an example of a time-domain vibration signal interception scheme according to an embodiment of the present invention;

FIG. 8 is a comparison diagram of feature visualization before and after improvement of the residual network in an embodiment of the present invention; wherein graph (a) corresponds to the original residual network and graph (b) corresponds to the improved residual network;

FIG. 9 is a comparison diagram of domain adaptation feature visualization for the maximum mean difference MMD and the local maximum mean difference LMMD in an embodiment of the present invention; wherein graph (a) corresponds to the maximum mean difference MMD, and graph (b) corresponds to the local maximum mean difference LMMD;

FIG. 10 is a graph comparing the method of the present invention with classical deep transfer learning methods;

FIG. 11 is an exemplary illustration of a confusion matrix for fault diagnosis of rolling bearings in an embodiment of the present invention;

FIG. 12 is a diagram illustrating the results of a generalized comparative experiment for operating conditions in an embodiment of the present invention.

Detailed Description

In order that those skilled in the art will better understand the disclosure, exemplary embodiments or examples of the disclosure are described below with reference to the accompanying drawings. It is obvious that the described embodiments or examples are only some, but not all embodiments or examples of the invention. All other embodiments or examples obtained by a person of ordinary skill in the art based on the embodiments or examples of the present invention without any creative effort shall fall within the protection scope of the present invention.

Aiming at the problems that the vibration data of rolling bearings differ greatly in distribution across working conditions, that labeled vibration data are difficult to obtain under some working conditions, and that the accuracy of the diagnosis model is consequently low, the embodiment of the invention provides a rolling bearing fault diagnosis method under variable working conditions based on an improved depth subdomain adaptive network. By introducing a channel attention mechanism into the residual network, the attention paid to each convolution channel is allocated dynamically, useless information is suppressed, and more accurate deep fault feature information of the rolling bearing is mined. The Local Maximum Mean Difference (LMMD) replaces the traditional MMD as the index for measuring the distribution difference between sub-domains of the source domain and the target domain; the distributions of related sub-domains of the same category in the source domain and the target domain are aligned precisely and domain-invariant features are extracted, so that cross-domain bearing fault diagnosis is better realized. Finally, faults at different positions and of different severities are diagnosed for rolling bearings under different working conditions. The method of the present invention is described in detail below.

1. Residual network and improvements thereof

In a CNN, as the number of network layers increases, the network may suffer from vanishing gradients, exploding gradients, and network degradation. The residual network borrows the gating mechanism of the long short-term memory (LSTM) network, adds identity mappings between the network layers of the CNN, and superimposes the identity-mapped input on its nonlinear transformation, which alleviates these problems. The residual block is the basic unit of the residual network, and its structure is shown in FIG. 1.

The input of the residual block is $z$ and the output is $H(z)$. The residual is the difference between the output $H(z)$ and the identity mapping of the input $z$, that is:

$$f(z) = H(z) - z \tag{1}$$

The learning target of the residual network is the residual $f(z)$; during training, only the difference between the input and the output of the residual block needs to be learned, which reduces the learning difficulty compared with a traditional CNN. Through the identity mapping, the input $z$ is passed directly from the input end to the output end of the residual block, and gradients flow back along the same shortcut during back propagation, which preserves the integrity of the information in transmission.
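The relation between the residual and the block output can be made concrete with a small sketch. It is a simplified illustration, not the residual block of FIG. 1: a single fully connected layer with a ReLU stands in for the convolution stack, and the names `residual_block`, `weight`, and `bias` are hypothetical.

```python
import numpy as np

def residual_block(z, weight, bias):
    """Compute H(z) = f(z) + z, where f is one ReLU(linear) layer (a stand-in)."""
    f_z = np.maximum(0.0, z @ weight + bias)  # residual branch f(z)
    return f_z + z                            # identity shortcut adds the input back

z = np.array([[1.0, -2.0, 0.5]])

# With all-zero parameters the residual branch outputs 0, so the block is the identity.
identity_out = residual_block(z, np.zeros((3, 3)), np.zeros(3))

# With non-zero parameters the block output is the input plus a learned correction.
corrected_out = residual_block(z, np.eye(3), np.zeros(3))
residual = corrected_out - z                  # recovers f(z) = H(z) - z
```

The first call shows why depth is less harmful in a residual network: a block that has nothing useful to add only needs to drive its residual toward zero to pass its input through unchanged.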

A nonlinear activation function can enhance the generalization ability of the CNN and helps prevent vanishing gradients. The invention selects the Rectified Linear Unit (ReLU) as the activation function in the network model, calculated as:

$$g(x) = \max(0, x) \tag{2}$$

where $x$ is the input feature and $g(x)$ is the output obtained after the nonlinear mapping. The nonlinear feature representation capability provided by the ReLU function speeds up model convergence and keeps parameter configuration simple.

Because the traditional residual network model has a limited receptive field and does not model correlations across channels, it cannot extract features in a targeted way in practical tasks and cannot obtain strongly correlated deep features. For these reasons, the invention improves the residual network; the improved residual network structure is shown in FIG. 2.

To better extract deep features from the bearing time-frequency diagram, the convolution kernel size of the convolution layers in the residual network can be increased so as to enlarge the receptive field and extract features from a wider time-frequency interval. However, indiscriminately enlarging the convolution kernels sharply increases the parameter count and the computation of the model, so the method enlarges only the convolution kernel of the first convolution layer of the residual network, so that features related to the bearing time-frequency characteristics are better extracted in the shallow layers of the network. When selecting the kernel size, the side length of the convolution kernel is usually set to an odd number so that the convolution anchor point (the kernel center) is easy to locate. Taking these aspects together, the convolution kernel size of the first convolution layer of the original residual network is changed from 7 × 7 to 15 × 15. In addition, compared with the traditional residual network model ResNet-50, the invention improves the residual modules in the network model; the structure of the improved residual module is shown in FIG. 3.
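The cost of widening a convolution kernel can be checked with simple arithmetic, which also shows why only the first layer is enlarged. The sketch below assumes 3 input channels and 64 output channels without bias, as in the stock ResNet-50 first layer; the channel counts actually used for the time-frequency images may differ, and `conv_params` is a hypothetical helper.

```python
def conv_params(kernel, in_channels, out_channels, bias=False):
    """Number of learnable parameters of a square 2-D convolution layer."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if bias else 0)

# Assumed channel counts: 3 in, 64 out (stock ResNet-50 first layer).
p7 = conv_params(7, 3, 64)     # original 7 x 7 first layer
p15 = conv_params(15, 3, 64)   # widened 15 x 15 first layer
ratio = p15 / p7               # growth factor, equal to (15 / 7) ** 2
```

The parameter count grows with the square of the kernel side length, so applying the same widening to the many deeper layers, which have far more channels, would multiply the size of the whole model by a similar factor.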

In the improved residual block, each channel performs 1 × 1 and 3 × 3 convolution operations on the input features to extract features preliminarily. The features of each channel are then compressed and reduced in dimension by a global pooling layer to obtain global features representing the channel context information:

$$s_c = \frac{1}{H \times W}\sum_{i=1}^{H}\sum_{j=1}^{W} U_c(i,j) \tag{3}$$

where $U_c(i,j)$ is the feature obtained after the feature vector in the $c$-th channel passes through the 1 × 1 convolution layer and the 3 × 3 convolution layer; $H$ and $W$ represent the sizes of the spatial dimensions; and $s_c$ represents the global feature of the $c$-th channel, from which the channel weight is computed. The obtained global features are input into two fully connected layers, which model the correlation among the channels; a ReLU activation function between the two fully connected layers increases the nonlinearity so that the complex relationships between channels are fitted better; and the resulting channel weights are normalized by a Softmax layer. Finally, the original channel features are weighted by the normalized weights to form the output, as shown in formula (4):

$$V_c = a_c U_c \tag{4}$$

where $a_c$ represents the weight of the $c$-th channel; $U_c$ represents the feature image obtained after the 1 × 1 convolution layer and the 3 × 3 convolution layer; and $V_c$ represents the feature image output after weighting by the channel attention mechanism.
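The channel attention steps described above (global pooling, two fully connected layers with a ReLU between them, Softmax normalization, and channel-wise weighting) can be sketched as follows. This is an illustration under assumptions rather than the network of FIG. 3: the pooling is taken to be global average pooling, the fully connected layers have no bias, and the hidden width of 2 and the random weights `w1` and `w2` are arbitrary choices for the example.

```python
import numpy as np

def channel_attention(u, w1, w2):
    """u: (C, H, W) feature maps. Returns (v, a) with v_c = a_c * u_c."""
    s = u.mean(axis=(1, 2))                # global average pooling -> (C,)
    hidden = np.maximum(0.0, w1 @ s)       # first fully connected layer + ReLU
    logits = w2 @ hidden                   # second fully connected layer -> (C,)
    exp = np.exp(logits - logits.max())    # numerically stable Softmax
    a = exp / exp.sum()                    # normalized channel weights
    return u * a[:, None, None], a         # weight every channel map by its a_c

# Assumed shapes: 8 channels of 5 x 5 features, hidden width 2.
rng = np.random.default_rng(0)
u = rng.standard_normal((8, 5, 5))
w1 = rng.standard_normal((2, 8))
w2 = rng.standard_normal((8, 2))
v, a = channel_attention(u, w1, w2)
```

Because the weights come from a Softmax, they are positive and sum to 1 across channels, so raising the weight of one channel necessarily lowers the share of the others.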

The cross entropy loss term of the improved residual network is used to judge the performance of the model on the source domain samples, and is calculated as:

$$L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{ic}\log(p_{ic})$$

where $c$ represents a category and $C$ represents the total number of categories; $N$ represents the total number of samples and $i$ represents the sample index; $y_{ic}$ is an indicator function that takes the value 1 if the real category of the $i$-th sample is $c$ and 0 otherwise; and $p_{ic}$ represents the predicted probability that the $i$-th sample belongs to class $c$.
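A short numerical check of the cross entropy loss may help fix the roles of the indicator and the predicted probability. The sketch assumes the loss is averaged over the samples and uses the natural logarithm; the small constant `eps`, the function name, and the two example samples are assumptions made for the illustration.

```python
import numpy as np

def cross_entropy(y_onehot, probs, eps=1e-12):
    """Mean over samples of -sum_c y_ic * log(p_ic); eps guards against log(0)."""
    y_onehot = np.asarray(y_onehot, dtype=float)
    probs = np.asarray(probs, dtype=float)
    return float(-(y_onehot * np.log(probs + eps)).sum(axis=1).mean())

# Two assumed samples with true classes 0 and 1.
y = np.array([[1, 0, 0],
              [0, 1, 0]])
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
loss = cross_entropy(y, p)
expected = -(np.log(0.7) + np.log(0.8)) / 2   # only the true-class terms survive
```

Only the probability assigned to the true class of each sample enters the loss, so the loss falls toward zero as those probabilities approach 1.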

2. Subdomain adaptation method

In transfer learning, a source domain $D_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$ containing $n_s$ labeled samples and a target domain $D_t = \{x_j^t\}_{j=1}^{n_t}$ containing $n_t$ unlabeled samples are given, where $y_i^s$ is the one-hot label of the i-th source-domain sample $x_i^s$, $y_{im}^s = 1$ indicates that the corresponding sample belongs to the m-th class of the source domain, and $x_j^t$ is the j-th unlabeled target-domain sample. The sample sets of the source domain and the target domain generally obey two different but similar distributions.

Domain adaptation is one branch of transfer learning research. It addresses the case in which the source and target domains share the same feature space and category space but have different feature distributions. Conventional domain adaptation methods usually apply a global transformation to the source domain and the target domain so that the transformed feature distributions are as similar as possible, and extract domain-invariant features suitable for the whole domain. The maximum mean discrepancy (MMD) is a common index for measuring the global distribution difference between the source domain and the target domain and is widely used in transfer learning. MMD is defined in formula (5):

$$d_H(p, q) \triangleq \left\| E_p[\phi(X^s)] - E_q[\phi(X^t)] \right\|_H^2 \tag{5}$$

where $X^s$ and $X^t$ are the source-domain and target-domain samples; $p$ and $q$ are the distributions obeyed by the source-domain and target-domain samples; $H$ is a reproducing kernel Hilbert space (RKHS) endowed with a characteristic kernel; $\phi(\cdot)$ is the feature map that maps the original sample data to $H$; and $E_p[\phi(X^s)]$ is the mathematical expectation of the source-domain samples $X^s$, which obey the distribution $p$, after mapping to the RKHS.
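In practice the expectations in the RKHS are not computed explicitly; the squared norm is expanded into kernel evaluations between samples. The sketch below shows one such empirical estimate. The Gaussian kernel, its bandwidth `sigma`, and the toy two-dimensional features are assumptions for illustration and are not taken from the text.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel, one example of a characteristic kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xa, xb, sigma):
    """Average kernel value over all pairs drawn from xa and xb."""
    total = sum(gaussian_kernel(a, b, sigma) for a in xa for b in xb)
    return total / (len(xa) * len(xb))


def mmd_squared(source, target, sigma=1.0):
    """Biased empirical estimate of the squared MMD between two sample sets."""
    return (mean_kernel(source, source, sigma)
            + mean_kernel(target, target, sigma)
            - 2.0 * mean_kernel(source, target, sigma))


# Hypothetical 2-D features: one target set close to the source, one far away.
source = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3]]
near_target = [[0.1, 0.1], [0.2, 0.2], [0.0, 0.2]]
far_target = [[3.0, 3.0], [3.2, 3.1], [3.1, 2.9]]
mmd_near = mmd_squared(source, near_target)
mmd_far = mmd_squared(source, far_target)
```

The estimate is close to zero when the two sample sets overlap and grows as they move apart, which is why it can serve as a loss term that pulls the two feature distributions together.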

When only the global distribution of the features is aligned, some weakly related data may become indistinguishable after feature mapping. To solve this problem, the subdomains formed by the same category in the different domains can be aligned separately, so that the local distributions of the source-domain and target-domain features are matched together with the global distribution, and globally invariant features can be extracted. The principles of domain adaptation and subdomain adaptation are illustrated in fig. 4.

The invention uses the local maximum mean discrepancy (LMMD) in place of the conventional MMD, which aligns only the global distribution, and maps and aligns each subdomain of the source domain and the target domain separately. LMMD is calculated as shown in formula (6):

$$\hat{d}_H(p, q) = \frac{1}{C}\sum_{c=1}^{C}\left\| \sum_{x_i^s \in D_s} w_i^{sc}\,\phi(x_i^s) - \sum_{x_j^t \in D_t} w_j^{tc}\,\phi(x_j^t) \right\|_H^2 \tag{6}$$

where $x_i^s$ and $x_j^t$ are the i-th sample of the source domain and the j-th sample of the target domain, and $w_i^{sc}$ and $w_j^{tc}$ are the weights with which $x_i^s$ and $x_j^t$ belong to class $c$. Note that $\sum_{i=1}^{n_s} w_i^{sc}$ and $\sum_{j=1}^{n_t} w_j^{tc}$ are both equal to 1. The weight $w_i^c$ of a sample $x_i$ is calculated by formula (7):

$$w_i^c = \frac{y_{ic}}{\sum_{(x_j, y_j) \in D} y_{jc}} \tag{7}$$

where $y_{ic}$ is the c-th element of the label vector $y_i$ of the i-th sample.
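A minimal sketch of formulas (6) and (7) follows, with the squared RKHS norm expanded into weighted kernel sums. The Gaussian kernel, the bandwidth `sigma`, the handling of classes with no samples (weights set to zero) and the toy data are assumptions for illustration; they are not specified in the text.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def class_weights(labels):
    """Formula (7): normalise each label column so that it sums to 1."""
    num_classes = len(labels[0])
    totals = [sum(y[c] for y in labels) for c in range(num_classes)]
    return [[y[c] / totals[c] if totals[c] > 0 else 0.0
             for c in range(num_classes)] for y in labels]


def weighted_kernel_sum(xa, wa, xb, wb, c, sigma):
    """sum_i sum_j wa[i][c] * wb[j][c] * k(xa[i], xb[j]) for one class c."""
    return sum(wa[i][c] * wb[j][c] * gaussian_kernel(xa[i], xb[j], sigma)
               for i in range(len(xa)) for j in range(len(xb)))


def lmmd(xs, ys, xt, yt, sigma=1.0):
    """Formula (6): class-wise weighted discrepancy averaged over C classes."""
    ws, wt = class_weights(ys), class_weights(yt)
    num_classes = len(ys[0])
    total = 0.0
    for c in range(num_classes):
        total += (weighted_kernel_sum(xs, ws, xs, ws, c, sigma)
                  + weighted_kernel_sum(xt, wt, xt, wt, c, sigma)
                  - 2.0 * weighted_kernel_sum(xs, ws, xt, wt, c, sigma))
    return total / num_classes


# Two well-separated classes (hypothetical 2-D features).
xs = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0], [3.1, 3.0]]
ys = [[1, 0], [1, 0], [0, 1], [0, 1]]
aligned = lmmd(xs, ys, xs, ys)                       # subdomains coincide
swapped = lmmd(xs, ys, xs, [[0, 1], [0, 1], [1, 0], [1, 0]])  # classes swapped
```

In the `swapped` case the two domains have identical global distributions, so a global MMD would be zero, yet the class-wise term is large. This is the situation that subdomain alignment is meant to detect.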

Because the target domain contains no labeled samples, formula (7) cannot be applied directly to calculate the target-domain weights $w_j^{tc}$, and the labels predicted by the model for the target-domain data (hard labels) may be wrong. The network output $\hat{y}_i$, however, is a probability distribution that gives the probability of the network input $x_i$ belonging to each category. To reduce the errors that hard-label prediction may introduce when LMMD is calculated on the target domain, the probability predictions $\hat{y}_j^t$ (soft labels) of the target-domain samples are used in formulas (6) and (7) during model training.
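The difference between hard and soft labels in formula (7) can be seen in a short sketch. The logits below are hypothetical network outputs, and the helper names are chosen for this example only.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_weights(label_vectors):
    """Formula (7) applied to one-hot or soft label vectors."""
    num_classes = len(label_vectors[0])
    totals = [sum(y[c] for y in label_vectors) for c in range(num_classes)]
    return [[y[c] / totals[c] if totals[c] > 0 else 0.0
             for c in range(num_classes)] for y in label_vectors]


# Hypothetical logits for three target samples and two classes.
# The second sample is ambiguous: its two logits are almost equal.
target_logits = [[2.0, 0.0], [0.1, 0.0], [0.0, 3.0]]
soft_labels = [softmax(z) for z in target_logits]
hard_labels = [[1.0 if p == max(y) else 0.0 for p in y] for y in soft_labels]

soft_weights = class_weights(soft_labels)
hard_weights = class_weights(hard_labels)
```

With hard labels the ambiguous sample counts fully toward class 0 and not at all toward class 1; with soft labels its contribution is split between the two subdomains, so a wrong prediction does less damage to the alignment.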

3. Rolling bearing fault diagnosis method and process under variable working conditions

The overall flow of rolling bearing fault diagnosis under variable working conditions, based on the improved residual network and deep subdomain adaptation, is shown in fig. 5. The specific steps are as follows:

(1) construction of sample sets

Samples with known fault information under one working condition are selected to construct the source-domain sample set, and samples with unknown fault information under other working conditions are selected to construct the target-domain sample set; the target-domain sample set is further divided into a training sample set and a test sample set;

(2) data pre-processing

Performing short-time Fourier transform on the time domain vibration signals of the source domain and the target domain, converting the original one-dimensional time domain signal into a two-dimensional time-frequency image, and taking the two-dimensional time-frequency image as the input of a subsequent network model;

(3) model construction

The deep features of the source domain and the target domain, the true labels of the source domain and the soft labels of the target domain are used to calculate the local maximum mean discrepancy (LMMD) between the source domain and the target domain, which serves as the domain adaptation loss term measuring the distribution difference between the subdomains of the two domains. The LMMD loss and the cross-entropy loss of the improved residual network are optimized jointly as the objective function, and the fault diagnosis model of the rolling bearing under variable working conditions is established after training for the specified number of iterations.

(4) Fault diagnosis

The target-domain test sample set is input into the trained network model, and the fault diagnosis results for the test samples are compared with the true labels to obtain the fault diagnosis accuracy, which measures the performance of the model.
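Step (2) above converts each one-dimensional vibration sample into a two-dimensional time-frequency image. The following sketch shows one way to compute such a magnitude spectrogram with only the standard library. The Hann window, the window length of 64, the hop of 32 and the synthetic test tone are assumptions for illustration; the text does not state the transform parameters.

```python
import cmath
import math


def stft_magnitude(signal, window_length=64, hop=32):
    """Magnitude spectrogram: one row per frame, one column per frequency bin."""
    # Periodic Hann window (assumed window type).
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / window_length)
              for n in range(window_length)]
    num_bins = window_length // 2 + 1
    frames = []
    for start in range(0, len(signal) - window_length + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_length)]
        spectrum = []
        for k in range(num_bins):
            # Direct DFT of the windowed frame at frequency bin k.
            coeff = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / window_length)
                        for n in range(window_length))
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    return frames


# One 1024-point sample: a 1500 Hz tone sampled at 12 kHz (test signal only).
fs, tone = 12000, 1500.0
sample = [math.sin(2.0 * math.pi * tone * n / fs) for n in range(1024)]
spectrogram = stft_magnitude(sample)
```

With these settings each frequency bin spans 12000 / 64 = 187.5 Hz, so the test tone falls in bin 8 of every frame. A production pipeline would normally use an FFT-based routine instead of the direct DFT shown here.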

4. Application and analysis

The bearing data used in the experiment were collected on a bearing test bed which, as shown in fig. 6, consists of a motor, a load and a control circuit. The 6205-2RS deep groove ball bearing installed at the drive end of the motor is the research object; the vibration signal is acquired by a 16-channel data recorder from above the bearing seat at the drive end of the motor, at a sampling frequency of 12 kHz.

The bearing faults are pitting faults machined into the inner ring, the rolling element and the outer ring of the bearing by electrical discharge machining. The damage diameter at each fault position takes one of three values, 0.1778 mm, 0.3556 mm and 0.5334 mm, so that, together with the normal state, there are 10 bearing states in total. The vibration signal of a rolling bearing working in the normal state without fault is denoted N. For convenience of description, the fault position and fault severity of the rolling bearing are abbreviated, and the notation for the fault states in the experimental data is given in Table 1. Each sample has a length of 1024 points, and 100 samples are selected for each working state by random interception, as illustrated in fig. 7.
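The random interception of fixed-length samples can be sketched as follows. The function name, the seeded generator and the decision to allow overlapping segments are assumptions; the text specifies only the sample length of 1024 points and the count of 100 samples per state.

```python
import random


def random_segments(signal, sample_length=1024, num_samples=100, seed=None):
    """Cut `num_samples` segments of `sample_length` points at random offsets."""
    last_start = len(signal) - sample_length
    if last_start < 0:
        raise ValueError("signal is shorter than one sample")
    rng = random.Random(seed)
    # Segments may overlap, since start positions are drawn independently.
    starts = [rng.randint(0, last_start) for _ in range(num_samples)]
    return [signal[s:s + sample_length] for s in starts]


# Stand-in for one recorded vibration signal (10 s at 12 kHz, hypothetical).
recording = [0.0] * 120000
samples = random_segments(recording, seed=0)
```

Fixing the seed makes the sample set reproducible across repeated experiments.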

Table 1 Representation of the experimental data

The experimental data were acquired under four loads, 0 hp, 1 hp, 2 hp and 3 hp (1 hp ≈ 0.75 kW), and the motor speed varies between 1730 rpm and 1797 rpm depending on the load. The correspondence between the bearing working conditions and the load and speed is given in Table 2.

Table 2 Correspondence between bearing working conditions, load and rotating speed

A total of 8 transfer tasks were set in the experiment, and the composition of the data set used for each task is shown in Table 3. Taking task 6 as an example, 2000 samples with known fault information under working conditions B and D are used as source-domain samples, 2000 samples with unknown faults under working conditions A and C are used as target-domain training samples, and another 2000 samples with unknown faults under working conditions A and C are used as target-domain test samples.

Table 3 Data set composition of each task

The proposed method uses ReLU as the activation function. The total number of iterations (epochs) is set to 200, the learning rate lr to 0.01 and the domain adaptation weight coefficient param to 0.3. Each experiment is repeated 3 times and the average is taken as the final result. The CPU is an Intel Xeon W-2123 with 32 GB of memory, and the GPU is an NVIDIA GeForce GTX 1080 Ti.
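For orientation, one plausible reading of how the domain adaptation weight coefficient enters the objective is written out below. The additive form and the symbol names are assumptions: $L_{ce}$ stands for the cross-entropy loss on the source domain, $\hat{d}_H(p, q)$ for the LMMD of formula (6), and $\lambda$ for the coefficient called param in the text.

```latex
% Assumed joint objective: source-domain cross entropy plus the LMMD
% domain adaptation loss, weighted by lambda (param in the text).
L_{total} = L_{ce} + \lambda \, \hat{d}_H(p, q), \qquad \lambda = 0.3
```

Under this reading, a larger $\lambda$ puts more emphasis on aligning the subdomains and less on fitting the source-domain labels.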

First, to verify the classification effect of the deep features extracted by the improved residual network, the rolling bearing time-frequency spectrograms obtained by short-time Fourier transform are input into the original residual network and the improved residual network respectively for deep feature extraction, and the extracted deep features are reduced in dimension and visualized with the t-distributed stochastic neighbor embedding (t-SNE) algorithm. Taking working condition B as the source domain and working condition C as the target domain as an example, the feature visualizations before and after the residual network improvement are shown in fig. 8.

As can be seen from fig. 8, the visualized features extracted before and after the residual network improvement differ considerably. For the deep features extracted by the original residual network, feature samples of two kinds of fault damage are aliased in each of regions L1 to L5, and a large number of feature samples are aliased in region L6. For the deep features extracted by the improved residual network, only a small number of samples are aliased between categories, in region L1. It can be preliminarily concluded that the improved residual network extracts the deep features of the rolling bearing better, which benefits the subsequent fault diagnosis.

To further verify that the improved residual network extracts the deep features of the samples better, the features extracted before and after the improvement are classified with Softmax for fault diagnosis in the 8 transfer tasks; the results are shown in Table 4.

Table 4 Fault diagnosis accuracy before and after the residual network improvement

Comparing the fault diagnosis accuracy before and after the residual network improvement in Table 4, the accuracy obtained by classifying the deep features extracted by the improved residual network is clearly higher than that obtained with the original residual network. The result shows that adding the channel attention mechanism and the wide first-layer convolution kernel to the residual network improves the fault diagnosis accuracy of rolling bearings under variable working conditions.

Further, to verify that LMMD achieves better domain adaptation between the source domain and the target domain, LMMD and MMD are each used to perform domain adaptation on the deep features extracted by the network model, and the adapted features are processed with the t-SNE method. Taking working condition D as the source domain and working condition B as the target domain as an example, the feature visualizations after LMMD and classical MMD domain adaptation are shown in fig. 9.

As can be seen from fig. 9(a), severe aliasing of feature samples from several fault damage categories occurs in region L1, and a small number of feature samples are aliased in regions L2 to L6. These results show that MMD reduces the overall distribution difference between the source domain and the target domain but neglects the relationship between subdomains, which causes considerable aliasing between categories. In fig. 9(b), only a small number of samples are aliased, in region L1, so it can be preliminarily concluded that LMMD maps features between the source domain and the target domain better than the conventional domain adaptation index MMD, so that the features of the various sample categories in the target domain are clearly separated.

To further illustrate the advantage of LMMD in domain adaptation, features are extracted with the improved residual network, LMMD and MMD are used in turn as the domain adaptation loss term, and Softmax is used for fault diagnosis. The results for the 8 transfer tasks are shown in Table 5.

Table 5 Fault diagnosis accuracy before and after improvement of the domain adaptation method

As can be seen from Table 5, the average fault diagnosis accuracy over the 8 transfer tasks obtained with LMMD as the domain adaptation loss term is 4.18% higher than that obtained with MMD. Combining the fault diagnosis experiment with the feature visualization experiment, it can be concluded that LMMD, by reducing the distribution difference between the corresponding subdomains of the source domain and the target domain, accomplishes rolling bearing fault diagnosis under variable working conditions better.

To further demonstrate the effectiveness of the proposed method in fault diagnosis of rolling bearings under variable working conditions, 4 classical deep transfer learning methods, namely the Domain Adaptive Neural Network (DaNN), the Deep Adaptation Network (DAN), Deep CORAL and the Dynamic Adversarial Adaptation Network (DAAN), are selected for comparison experiments. The same sample sets are used and the tests are carried out on the same 8 transfer tasks; the comparison results are shown in fig. 10.

The experimental results in fig. 10 show that the average fault diagnosis accuracy of the proposed method reaches 96.99% over the 8 variable-working-condition transfer tasks, at least 5.75% higher than that of the 4 classical deep transfer learning methods DaNN, DAN, Deep CORAL and DAAN. To observe the effectiveness of the method in variable-working-condition fault diagnosis more intuitively, a multi-class confusion matrix is introduced to analyze the diagnosis results. Taking the result of task 4 as an example, the confusion matrix is shown in fig. 11.

As can be seen from fig. 11, only 3 of the 1000 target-domain test samples are misdiagnosed: 2 normal samples are misjudged as OR14 faults, and 1 B14 fault is misjudged as an OR14 fault. The diagnosis accuracy for all other fault types reaches 100%. Therefore, the improved residual network deep subdomain adaptation method provided by the invention can effectively solve the problem of fault diagnosis of rolling bearings under variable working conditions.
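The figures quoted above can be turned into an overall accuracy with a short sketch. It assumes 100 test samples per state, which the text does not state for the test set, and it lumps the seven states that are diagnosed without error under the placeholder label "other"; only the three misclassifications and the total of 1000 samples come from the text.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count (true, predicted) pairs; each key is one confusion-matrix cell."""
    return Counter(zip(true_labels, predicted_labels))


def accuracy(counts):
    """Share of samples on the diagonal of the confusion matrix."""
    correct = sum(n for (t, p), n in counts.items() if t == p)
    return correct / sum(counts.values())


# Assumption: 100 test samples per state; "other" stands for the seven
# states with no errors.
true_labels = ["N"] * 100 + ["B14"] * 100 + ["OR14"] * 100 + ["other"] * 700
predicted = (["N"] * 98 + ["OR14"] * 2      # 2 normal samples -> OR14
             + ["B14"] * 99 + ["OR14"] * 1  # 1 B14 sample -> OR14
             + ["OR14"] * 100 + ["other"] * 700)
cells = confusion_counts(true_labels, predicted)
overall = accuracy(cells)
```

With 3 errors in 1000 samples the overall accuracy for this task is 997 / 1000 = 99.7%, independent of how the samples are distributed over the states.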

To further verify that the method can cope with the complex working environment of rolling bearings, that is, that the trained model can accurately extract the empirical knowledge of different but related domains and learn domain-invariant features in a complex working-condition environment, the 4 groups of working-condition generalization experiments shown in Table 6 are designed. The data collected under the generalization working conditions are input into the model for fault diagnosis, and the results are compared with those of 5 classical deep transfer learning methods, DaNN, DAN, Deep CORAL, DAAN and MRAN, as shown in fig. 12.

Table 6 Data set composition of the working-condition generalization experiments

The experimental results in fig. 12 show that the proposed method achieves an average fault diagnosis accuracy of 85.16% in the 4 working-condition generalization transfer tasks, tasks 9 to 12. Compared with the 4 classical deep transfer learning methods DaNN, DAN, Deep CORAL and DAAN, the method gives a better fault diagnosis result in every task, and the average fault diagnosis accuracy is at least 3.43% higher. The method can therefore extract the domain-invariant features of the bearing under complex working conditions better and solve the problem of fault diagnosis of rolling bearings under generalization working conditions.

In summary, the invention introduces a channel attention mechanism and a wide first-layer convolution kernel to improve the residual network and extract deep features that are more strongly correlated across the working conditions of the rolling bearing; feature visualization and fault diagnosis accuracy both show that the improved residual network outperforms the original residual network. The distribution difference between the subdomains of the source domain and the target domain is measured with the local maximum mean discrepancy; compared with the conventional approach of measuring only the global distribution difference with the maximum mean discrepancy, LMMD attends to the local distributions while aligning the global distribution, and the experiments show that using it as the domain adaptation loss gives better fault diagnosis of rolling bearings under variable working conditions. The invention thus provides a new end-to-end deep transfer learning method that reduces both the global and the local distribution differences between the labeled source-domain sample set and the unlabeled target-domain sample set while extracting the deep features of the rolling bearing. By using the domain-invariant features between domains extracted by the network, it addresses the scarcity of labeled rolling bearing data under some working conditions and the fault diagnosis of rolling bearings under variable and generalization working conditions. The effectiveness of the method is verified on 12 transfer tasks by comparison with 4 deep transfer learning methods.

Another embodiment of the present invention provides a system for diagnosing a rolling bearing fault under variable conditions based on an improved depth sub-domain adaptive network, including:

the feature extraction module is configured to construct an improved residual network, introduce an attention mechanism to dynamically allocate resources among channels, extract deep features of the source-domain image sample set and the target-domain image sample set, and obtain soft labels for the unlabeled target-domain samples;

the model training module is configured to jointly optimize the domain adaptation loss term and the cross-entropy loss term of the improved residual network as the objective function, and to obtain the fault diagnosis model of the rolling bearing under variable working conditions through training for the specified number of iterations.

In this embodiment, optionally, the improved residual network in the feature extraction module is an improvement of the existing residual network, as follows. First, the convolution kernel size of the first convolutional layer of the residual network is changed from 7 × 7 to 15 × 15. Second, the residual modules in the residual network model ResNet-50 are improved as follows: 1 × 1 and 3 × 3 convolutions are applied to the input features of each channel to extract preliminary features; the features of each channel are compressed by a global pooling layer to obtain global features representing the channel context information; the global features are fed into two fully connected layers that model the correlation among all channels; a ReLU activation function between the two fully connected layers adds nonlinearity; the resulting channel weights are normalized by a Softmax layer; finally, the normalized weights are used to weight the original channel features to obtain the output feature image. The weighted output feature image $V_c$ is:

$$V_c = a_c U_c$$
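To give a feel for what widening the first-layer kernel from 7 × 7 to 15 × 15 implies, the sketch below compares the output size and the per-filter weight count of the two kernels. The 224 × 224 three-channel input, the stride of 2 and the padding values (3 for the 7 × 7 kernel, 7 for the 15 × 15 kernel) are assumptions chosen so that both layers yield the same output size; the text gives only the kernel sizes.

```python
def conv_output_size(input_size, kernel_size, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def kernel_weights(kernel_size, in_channels):
    """Number of weights in a single square filter (bias excluded)."""
    return kernel_size * kernel_size * in_channels


# Assumed values: 224 x 224 three-channel input and stride 2; padding 3 for
# the 7 x 7 kernel and padding 7 for the 15 x 15 kernel.
original_out = conv_output_size(224, kernel_size=7, stride=2, padding=3)
wide_out = conv_output_size(224, kernel_size=15, stride=2, padding=7)
original_weights = kernel_weights(7, in_channels=3)
wide_weights = kernel_weights(15, in_channels=3)
```

Under these assumptions the output size is unchanged, while each filter covers a larger region of the time-frequency image at the cost of more weights per filter.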

In this embodiment, optionally, the local maximum mean discrepancy (LMMD) in the distribution difference metric module is used to map and align each subdomain of the source domain and the target domain separately, and is calculated as:

$$\hat{d}_H(p, q) = \frac{1}{C}\sum_{c=1}^{C}\left\| \sum_{x_i^s \in D_s} w_i^{sc}\,\phi(x_i^s) - \sum_{x_j^t \in D_t} w_j^{tc}\,\phi(x_j^t) \right\|_H^2$$

where $x_i^s$ and $x_j^t$ are the i-th sample of the source domain $D_s$ and the j-th sample of the target domain $D_t$; $w_i^{sc}$ and $w_j^{tc}$ are the weights with which $x_i^s$ and $x_j^t$ belong to class $c$; and $\phi(\cdot)$ is the feature map that maps the original sample data to $H$.

The system provided by this embodiment may execute the method provided by the method embodiment, and the detailed process is described in the method embodiment and is not described herein again.

Embodiments of the present application further provide a computing device comprising a memory, a processor and a computer program stored in the memory and executable by the processor; the memory provides space for the program code, and the computer program, when executed by the processor, implements any of the steps of the method according to the present invention.

An embodiment of the application also provides a computer-readable storage medium comprising a storage unit for program code, the storage unit being provided with a program for performing the steps of the method according to the invention, the program being executed by a processor.

The embodiment of the application also provides a computer program product containing instructions which, when run on a computer, cause the computer to carry out the steps of the method according to the invention.

In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed by a computer, the procedures or functions described in the embodiments of the application are performed in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that incorporates one or more available media. The usable media may be magnetic, optical, or semiconductor media, among others.

While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.


Claims

1. A rolling bearing fault diagnosis method under variable working conditions based on an improved depth subdomain adaptive network is characterized by comprising the following steps:

2. The method for diagnosing the rolling bearing fault under the variable working condition based on the improved depth subdomain adaptive network as claimed in claim 1, wherein the improved residual network in the third step is an improvement of the existing residual network, the improvement being as follows: first, the convolution kernel size of the first convolutional layer of the residual network is changed from 7 × 7 to 15 × 15; second, the residual modules in the residual network model ResNet-50 are improved as follows: 1 × 1 and 3 × 3 convolutions are applied to the input features of each channel to extract preliminary features; the features of each channel are compressed by a global pooling layer to obtain global features representing the channel context information; the global features are fed into two fully connected layers that model the correlation among all channels; a ReLU activation function between the two fully connected layers adds nonlinearity; the resulting channel weights are normalized by a Softmax layer; and finally, the normalized weights are used to weight the original channel features to obtain the output feature image.

3. The method for diagnosing the rolling bearing fault under the variable working condition based on the improved depth subdomain adaptive network as claimed in claim 2, wherein the weighted output feature image $V_c$ is:

$$V_c = a_c U_c$$

where $a_c$ is the weight of the c-th channel; $U_c$ is the feature image obtained after the 1 × 1 and 3 × 3 convolutional layers; and $V_c$ is the feature image output after weighting by the channel attention mechanism.

4. The method for diagnosing the rolling bearing fault under the variable working condition based on the improved depth subdomain adaptive network according to claim 3, wherein the local maximum mean discrepancy (LMMD) in the fourth step is used to map and align each subdomain of the source domain and the target domain separately, and is calculated as:

$$\hat{d}_H(p, q) = \frac{1}{C}\sum_{c=1}^{C}\left\| \sum_{x_i^s \in D_s} w_i^{sc}\,\phi(x_i^s) - \sum_{x_j^t \in D_t} w_j^{tc}\,\phi(x_j^t) \right\|_H^2$$

where $p$ and $q$ are the distributions obeyed by the source-domain and target-domain samples; $H$ is a reproducing kernel Hilbert space with a characteristic kernel; $c$ denotes a category and $C$ the total number of categories; $x_i^s$ and $x_j^t$ are the i-th sample of the source domain and the j-th sample of the target domain; $w_i^{sc}$ and $w_j^{tc}$ are the weights with which $x_i^s$ and $x_j^t$ belong to class $c$; and $\phi(\cdot)$ is the feature map that maps the original sample data to $H$.

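5. The method for diagnosing the rolling bearing fault under the variable working condition based on the improved depth subdomain adaptive network as claimed in claim 4, wherein the cross-entropy loss term in the fifth step is used to measure the performance of the model on the source-domain samples, and is calculated as:

$$L_{ce} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{ic}\log(p_{ic})$$

where $N$ is the total number of samples and $i$ the sample index; $y_{ic}$ equals 1 if the true category of the i-th sample is $c$ and 0 otherwise; and $p_{ic}$ is the predicted probability that the i-th sample belongs to category $c$.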

6. The method for diagnosing the fault of the rolling bearing under the variable working condition based on the improved depth subdomain adaptive network as claimed in claim 5, wherein the fault information in step one comprises inner ring, rolling element or outer ring faults with different damage diameters, or no fault; and the working conditions are obtained from combinations of different loads and different motor rotating speeds.

7. A rolling bearing fault diagnosis system under variable working conditions based on an improved depth subdomain adaptive network is characterized by comprising:

8. The rolling bearing fault diagnosis system under variable working conditions based on an improved depth subdomain adaptive network according to claim 7, wherein the improved residual network in the feature extraction module is an improvement of the existing residual network, the improvement being as follows: first, the convolution kernel size of the first convolutional layer of the residual network is changed from 7 × 7 to 15 × 15; second, the residual modules in the residual network model ResNet-50 are improved as follows: 1 × 1 and 3 × 3 convolutions are applied to the input features of each channel to extract preliminary features; the features of each channel are compressed by a global pooling layer to obtain global features representing the channel context information; the global features are fed into two fully connected layers that model the correlation among all channels; a ReLU activation function between the two fully connected layers adds nonlinearity; the resulting channel weights are normalized by a Softmax layer; finally, the normalized weights are used to weight the original channel features to obtain the output feature image; the weighted output feature image $V_c$ is:

$$V_c = a_c U_c$$

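9. The system according to claim 8, wherein the local maximum mean discrepancy (LMMD) in the distribution difference metric module is used to map and align each subdomain of the source domain and the target domain separately, and is calculated as:

$$\hat{d}_H(p, q) = \frac{1}{C}\sum_{c=1}^{C}\left\| \sum_{x_i^s \in D_s} w_i^{sc}\,\phi(x_i^s) - \sum_{x_j^t \in D_t} w_j^{tc}\,\phi(x_j^t) \right\|_H^2$$

where $x_i^s$ and $x_j^t$ are the i-th sample of the source domain and the j-th sample of the target domain; $w_i^{sc}$ and $w_j^{tc}$ are the weights with which $x_i^s$ and $x_j^t$ belong to class $c$; and $\phi(\cdot)$ is the feature map that maps the original sample data to $H$.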

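10. The system according to claim 9, wherein the cross-entropy loss term in the model training module is used to measure the performance of the model on the source-domain samples, and is calculated as:

$$L_{ce} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{ic}\log(p_{ic})$$

where $N$ is the total number of samples and $i$ the sample index; $y_{ic}$ equals 1 if the true category of the i-th sample is $c$ and 0 otherwise; and $p_{ic}$ is the predicted probability that the i-th sample belongs to category $c$.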