CN117171907A

CN117171907A - Rolling bearing residual life prediction method and system

Info

Publication number: CN117171907A
Application number: CN202311112215.8A
Authority: CN
Inventors: 蒋全胜; 陆星驰; 沈晔湖; 吴石磊; 姚琴; 王报祥; 谢鸥; 朱其新
Original assignee: Suzhou University of Science and Technology
Current assignee: Suzhou University of Science and Technology
Priority date: 2023-08-31
Filing date: 2023-08-31
Publication date: 2023-12-05

Abstract

The invention relates to a method and a system for predicting the residual life of a rolling bearing. The method comprises the following steps: acquiring samples, wherein the samples comprise labeled source domain samples and unlabeled target domain samples; constructing a feature extractor, a domain adaptation module and an RUL predictor, wherein the feature extractor adopts an enhanced residual convolution network and a convolution attention module to obtain source domain features and target domain features; inputting the source domain features and the target domain features into the domain adaptation module to calculate the maximum mean discrepancy loss; inputting the source domain features into the RUL predictor to output a prediction error; training the network with the samples to obtain a rolling bearing residual life prediction model; and obtaining a residual life prediction value of the rolling bearing based on that model. The method extracts degradation features with the enhanced residual convolution network, screens them with the convolution attention module, and addresses the domain shift problem with the maximum mean discrepancy, thereby realizing RUL prediction under data distribution differences.

Description

Rolling bearing residual life prediction method and system

Technical Field

The invention relates to the technical field of bearing fault prediction and health management, in particular to a method and a system for predicting the residual life of a rolling bearing.

Background

Fault prediction and health management (PHM, Prognostics and Health Management) is an important means of improving the reliability and efficiency of mechanical equipment. Remaining useful life (RUL) prediction is an important component of PHM and can provide valuable guidance for operation and maintenance decisions, thereby guaranteeing the reliable and efficient operation of mechanical equipment. As key components of rotating machinery, rolling bearings are important to keeping a mechanical system operating properly. In industrial practice, bearings often fail under complex working conditions because of high speeds and heavy loads, which eventually causes the mechanical system to fail. According to statistics, half of the faults in rotating mechanical systems are caused by bearing faults. Predicting the RUL of rolling bearings and carrying out predictive maintenance according to effective prediction results is therefore of great significance for improving the reliability of mechanical equipment.

With the continuous development of sensor and deep learning technologies, data-driven RUL prediction methods have been widely studied. For example, long short-term memory (LSTM) networks have been used to construct bearing RUL prediction models and improve the ability to recognize the bearing state. A gated dual-attention unit (GDAU) neural network has been adopted to realize RUL prediction of rolling bearings. A convolutional long short-term memory (CNNLSTM) network, which combines convolution with long short-term memory modules, has been used to capture the spatio-temporal correlation of representative features and realize effective bearing RUL prediction. However, these methods work effectively only under the assumption that the training data (source domain) and the test data (target domain) come from the same data distribution. In actual operation, bearings are affected by factors such as different speeds and loads, and failure modes often change over the run to failure, so the degradation features in vibration signal data collected under different conditions differ in distribution. This is called domain shift. The above assumption therefore rarely holds in practical application scenarios.

To solve the above problems, a method is urgently needed that realizes RUL prediction for bearings under unknown or new conditions, i.e., that solves the domain shift problem. Transfer learning (TL) is currently the common approach to RUL prediction under domain shift. For example, a data-driven domain-adaptive prediction method (LSTM-DANN) has been proposed that uses LSTM networks and a domain-adversarial neural network (DANN), where the DANN is used for RUL prediction on the target domain. A temporal convolutional network with residual self-attention has been used to extract degradation features, with a cross-domain adaptation structure built on contrastive loss and multi-kernel maximum mean discrepancy, effectively realizing cross-domain RUL prediction. A Deep Feature Decomposition Transfer Learning Network (DFDTLN) has been used to extract domain-invariant features; the network decomposes the shared and private feature representations of each domain, and experimental results show better domain-invariant feature extraction and cross-domain RUL prediction performance. A metric adversarial domain adaptation (MADA) method for cross-domain RUL prediction has been proposed based on the local semantics of degradation features and the mutual information of target-specific data. Deep metric learning has been combined with transfer learning to solve the regression problem. A transfer multi-stage shrinkage attention temporal convolution network (TMSACT), which strengthens attention to degradation information and cross-domain invariant feature learning, has also been adopted to construct an RUL prediction method across working conditions.

The above work mainly investigates transfer RUL prediction under different working conditions on the same equipment. In actual application scenarios, however, operation data for some mechanical equipment are difficult to acquire and label, so subsequent operation and maintenance lack prior operation data, which increases the difficulty and cost of operation and maintenance. If prior data from other mechanical equipment can be used effectively to realize RUL prediction for equipment without prior data, operation and maintenance costs can be greatly reduced and working efficiency improved. In addition, because of the inherent characteristics of each test platform, vibration signal data collected from different platforms may differ greatly in distribution, which also makes RUL prediction difficult.

Disclosure of Invention

The invention provides a method and a system for predicting the residual life of a rolling bearing, which are used for solving the technical problems.

In order to solve the technical problems, the invention provides a method for predicting the residual life of a rolling bearing, which comprises the following steps:

acquiring samples, wherein the samples comprise labeled source domain samples and unlabeled target domain samples;

constructing an enhanced residual convolution domain adaptation network, wherein the enhanced residual convolution domain adaptation network comprises a feature extractor, a domain adaptation module and an RUL predictor; the feature extractor adopts an enhanced residual convolution network and a convolution attention module and is used for extracting degradation features and screening and enhancing the degradation features to obtain source domain features and target domain features respectively; the domain adaptation module comprises a maximum mean discrepancy, and the source domain features and the target domain features are input into the domain adaptation module to calculate a maximum mean discrepancy loss; the source domain features are input into the RUL predictor to output a prediction error;

training the enhanced residual convolution domain adaptation network by using the samples to obtain a rolling bearing residual life prediction model;

and obtaining a residual life prediction value of the rolling bearing based on the rolling bearing residual life prediction model.

Preferably, each convolution block of the enhanced residual convolution network is a residual structure.

Preferably, the residual structure connects the input before the two-layer convolution nonlinear transformation with the output after the transformation, and then performs an activation-pooling operation.

Preferably, a convolution-batch normalization (BN) convolution layer is added to each of the convolution blocks.

Preferably, the domain adaptation module further comprises a parallel fully connected structure, which is arranged between the feature extractor and the RUL predictor and between the feature extractor and the domain adaptation module.

Preferably, the maximum mean discrepancy is a multi-kernel maximum mean discrepancy.

Preferably, as the number of training iterations increases, the proportion of the multi-kernel maximum mean discrepancy in the total loss gradually decreases.

Preferably, the parameters of the rolling bearing residual life prediction model are updated by using an Adam optimizer until the loss approaches an expected value.

Preferably, after training to obtain the rolling bearing residual life prediction model, the method further comprises testing the rolling bearing residual life prediction model by using an unlabeled target domain sample.

The invention also provides a rolling bearing residual life prediction system, which comprises:

a sample acquisition unit for acquiring samples, wherein the samples comprise labeled source domain samples and unlabeled target domain samples;

a network construction unit for constructing an enhanced residual convolution domain adaptation network, wherein the enhanced residual convolution domain adaptation network comprises a feature extractor, a domain adaptation module and an RUL predictor; the feature extractor adopts the enhanced residual convolution network and a convolution attention module and is used for extracting degradation features and screening and enhancing the degradation features to obtain source domain features and target domain features respectively; the domain adaptation module comprises a multi-kernel maximum mean discrepancy, and the source domain features and the target domain features are input into the domain adaptation module to calculate a maximum mean discrepancy loss; the source domain features are input into the RUL predictor to output a prediction error;

a model training unit for training the enhanced residual convolution domain adaptation network by using the samples to obtain a rolling bearing residual life prediction model; and

a life prediction unit for obtaining a residual life prediction value of the rolling bearing based on the rolling bearing residual life prediction model.

Compared with the prior art, the method and the system for predicting the residual life of the rolling bearing have the following advantages:

1. The invention designs an Enhanced Residual Convolution Network (ERCN) to extract degradation features, uses a convolution attention module (CBAM) to screen the key information in those features, and combines maximum mean discrepancy (MMD) to address the domain shift problem, so as to realize RUL prediction under data distribution differences;

2. To address negative transfer during domain adaptation, the invention places a parallel fully connected structure between the feature extractor and both the RUL predictor and the domain adaptation module, so that the MMD loss does not affect the RUL predictor; it also applies a decreasing parameter mechanism to the MMD loss when calculating the loss, so as to gradually relieve the excessive influence of MMD on the network;

3. An Enhanced Residual Convolution Domain Adaptation Network (ERCDAN) is constructed from ERCN, CBAM and MMD to extract domain-invariant features from vibration signals, and an ERCDAN-based cross-equipment rolling bearing RUL prediction method is provided;

4. The invention performs two kinds of RUL prediction experiments on the public PHM2012 and XJTU-SY data sets: cross-device and cross-working-condition. According to the experimental results, the ERCDAN-based RUL prediction method can effectively realize RUL prediction for an unlabeled target domain and has good generalization performance.

Drawings

FIG. 1 is a flow chart of a convolution attention module in an embodiment of the present invention;

FIG. 2 is a system block diagram of an enhanced residual convolution domain adaptation network in accordance with an embodiment of the present invention;

FIG. 3 is a network block diagram of an enhanced residual convolution network in accordance with one embodiment of the present invention;

FIG. 4 is a diagram showing a comparison of RMSE, MAE, R2 when different backbone networks are employed;

FIG. 5 is a graph of RMSE, MAE, R2 versus penalty factor;

FIG. 6 is a graph comparing experimental results of network performance when certain module(s) and structure(s) are removed;

FIG. 7 is a graph comparing experimental results of RUL predictive performance of various methods in a structure ablation experiment;

FIG. 8 shows the vibration signals of multiple rolling bearings across devices;

FIG. 9 is a graph comparing experimental results of RUL predictive performance of a cross-device rolling bearing;

FIG. 10 is a graph showing the comparison of experimental results of various tasks under the same conditions;

FIG. 11 is a graph comparing experimental results of various tasks under different conditions.

Detailed Description

In order to describe the technical solution of the above application in more detail, the following specific examples are listed to demonstrate technical effects; it is emphasized that these examples are illustrative of the application and are not intended to limit the scope of the application.

First, the convolution attention module and the maximum mean discrepancy used in the present application are described in detail.

The Convolutional Block Attention Module (CBAM) is a hybrid attention mechanism that combines channel attention and spatial attention. For a given feature map, CBAM generates attention maps along the channel and spatial dimensions in sequence; each attention map is then multiplied with or added to the preceding input feature map for adaptive feature refinement, producing the final feature map. For RUL prediction, this supplements the key information of the degradation features when features are extracted from vibration signals and improves the quality of the extracted features, thereby strengthening the feature extraction capability of the network.

The general flow of CBAM is shown in FIG. 1(a). The feature map from the backbone network is defined as $X$, the 1D channel attention map of the channel attention mechanism in CBAM as $M_c$, and the 2D spatial attention map of the spatial attention mechanism as $M_s$. The overall flow is expressed as follows:

$$X' = M_c(X) \otimes X, \qquad X'' = M_s(X') \otimes X' \quad (1)$$

In formula (1), $\otimes$ denotes the element-level operation that applies an attention map to a feature map (stated here as element-level addition; the standard CBAM formulation uses element-wise multiplication), $X'$ is the output after the channel attention mechanism, and $X''$ is the final output after the entire CBAM.

FIG. 1(b) and FIG. 1(c) show the principles of the channel attention and spatial attention mechanisms, respectively. For the channel attention mechanism, the input features first undergo average pooling and maximum pooling, and the two resulting features are defined as $X^c_{avg}$ and $X^c_{max}$. These two features are then input to a shared network structured as convolution (Conv) - activation (ReLU) - convolution (Conv) to generate the final channel attention map $M_c$. For the spatial attention mechanism, the input features first undergo average pooling and maximum pooling along the channel axis, and the two resulting 2D feature maps are defined as $X^s_{avg}$ and $X^s_{max}$. The two are then concatenated and passed through a convolution operation and Sigmoid activation to generate the final 2D spatial attention map $M_s$. These steps are expressed as follows:

$$M_c(X) = \sigma\big(F^C(\mathrm{AvgPool}(X)) + F^C(\mathrm{MaxPool}(X))\big) = \sigma\big(F^C(X^c_{avg}) + F^C(X^c_{max})\big) \quad (2)$$

$$M_s(X) = \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(X);\ \mathrm{MaxPool}(X)])\big) = \sigma\big(f^{7\times 7}([X^s_{avg};\ X^s_{max}])\big) \quad (3)$$

In formulas (2) and (3), $\sigma$ represents the Sigmoid activation function, AvgPool(·) represents average pooling, MaxPool(·) represents maximum pooling, $F^C$ represents the shared network structured as convolution-ReLU-convolution in the channel attention mechanism, and $f^{7\times 7}$ represents a convolution operation with a kernel of size 7 × 7.
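The two attention steps of formulas (2) and (3) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: it assumes a one-dimensional feature map of shape (channels, length), random weights, a kernel of length 7 in place of the 7 × 7 convolution, a channel reduction ratio of 2, and element-wise multiplication when the attention maps are applied (as in the standard CBAM formulation). The names `channel_attention`, `spatial_attention` and `cbam` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x: (C, L). Returns one weight per channel, shape (C,)."""
    def shared(v):
        # Shared Conv-ReLU-Conv network; a 1x1 convolution on a pooled
        # vector is a matrix product.
        return w2 @ np.maximum(w1 @ v, 0.0)
    avg = x.mean(axis=1)   # average pooling over the length axis
    mx = x.max(axis=1)     # maximum pooling over the length axis
    return sigmoid(shared(avg) + shared(mx))

def spatial_attention(x, kernel):
    """x: (C, L); kernel: (2, K) with odd K. Returns one weight per position, shape (L,)."""
    k = kernel.shape[1]
    pad = k // 2
    # Pool along the channel axis, concatenate, then zero-pad for a 'same' convolution.
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])
    padded = np.pad(pooled, ((0, 0), (pad, pad)))
    out = np.array([np.sum(padded[:, i:i + k] * kernel) for i in range(x.shape[1])])
    return sigmoid(out)

def cbam(x, w1, w2, kernel):
    x1 = channel_attention(x, w1, w2)[:, None] * x     # X'  (assumed multiplication)
    x2 = spatial_attention(x1, kernel)[None, :] * x1   # X'' (assumed multiplication)
    return x2

rng = np.random.default_rng(0)
C, L = 8, 32
x = rng.normal(size=(C, L))
w1 = rng.normal(size=(C // 2, C))   # hypothetical reduction ratio of 2
w2 = rng.normal(size=(C, C // 2))
kernel = rng.normal(size=(2, 7))
refined = cbam(x, w1, w2, kernel)
print(refined.shape)  # (8, 32)
```

Because both attention maps lie in (0, 1), the refined features in this multiplicative sketch never exceed the input in magnitude; the attention only re-weights channels and positions.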

Maximum mean discrepancy (MMD) is one of the distance-metric loss functions most widely used in current domain adaptation methods. It is defined as the squared distance between different data distributions in a reproducing kernel Hilbert space (RKHS). Assuming that the data sets from the source and target domains follow the distributions $p$ and $q$, respectively, MMD is calculated as follows:

$$\mathrm{MMD}^2(p, q) = \left\| \mathbb{E}_{x_s \sim p}[\phi(x_s)] - \mathbb{E}_{x_t \sim q}[\phi(x_t)] \right\|_{\mathcal{H}}^2 \quad (4)$$

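In formula (4), $\mathcal{H}$ represents the reproducing kernel Hilbert space (RKHS), and $\phi(\cdot)$ is a nonlinear mapping from the original feature space to the RKHS. Based on kernel mean embedding in the RKHS, an empirical estimate of the squared MMD can be obtained from a kernel function as follows:

$$\mathrm{MMD}^2(X_s, X_t) = \frac{1}{n_s^2}\sum_{i=1}^{n_s}\sum_{j=1}^{n_s} k(x_i^s, x_j^s) + \frac{1}{n_t^2}\sum_{i=1}^{n_t}\sum_{j=1}^{n_t} k(x_i^t, x_j^t) - \frac{2}{n_s n_t}\sum_{i=1}^{n_s}\sum_{j=1}^{n_t} k(x_i^s, x_j^t) \quad (5)$$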

In formula (5), $k(\cdot,\cdot)$ is the characteristic kernel, and $n_s$ and $n_t$ represent the numbers of samples in the source domain and the target domain, respectively. To better represent the data distribution difference in a high-dimensional space and improve the feature representation capability of the model, the application adopts the multi-kernel MMD (MK-MMD) measure. The characteristic kernel associated with the feature mapping $\phi(\cdot)$, $k(x_s, x_t) = \phi(x_s)\cdot\phi(x_t)$, is defined as a weighted sum of $m$ kernels, as follows:

$$k(x_s, x_t) = \sum_{u=1}^{m} \beta_u\, k_u(x_s, x_t) \quad (6)$$

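In formula (6), $\beta_u$ represents the weight of the $u$-th kernel $k_u$. The MK-MMD distance-metric loss function between the data distributions of the source domain and the target domain is thus defined as:

$$L_{MMD}(X_s, X_t) = \frac{1}{n_s^2}\sum_{i=1}^{n_s}\sum_{j=1}^{n_s} k(x_i^s, x_j^s) + \frac{1}{n_t^2}\sum_{i=1}^{n_t}\sum_{j=1}^{n_t} k(x_i^t, x_j^t) - \frac{2}{n_s n_t}\sum_{i=1}^{n_s}\sum_{j=1}^{n_t} k(x_i^s, x_j^t) \quad (7)$$

where $k$ is the multi-kernel of formula (6).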
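The MK-MMD estimate can be illustrated with a short NumPy sketch. It is an illustration under assumptions rather than the configuration used in the application: the kernels $k_u$ are taken to be Gaussian kernels, the bandwidths (0.5, 1, 2, 4, 8) and equal weights $\beta_u = 1/m$ are arbitrary choices, and the function names `gaussian_kernel` and `mk_mmd` are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Pairwise Gaussian kernel exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth ** 2))

def mk_mmd(xs, xt, bandwidths=(0.5, 1.0, 2.0, 4.0, 8.0), betas=None):
    """Biased empirical estimate of the squared MMD with a weighted sum of kernels."""
    m = len(bandwidths)
    betas = np.full(m, 1.0 / m) if betas is None else np.asarray(betas, dtype=float)

    def k(a, b):
        # Multi-kernel: weighted sum of m Gaussian kernels (assumed kernel family).
        return sum(beta * gaussian_kernel(a, b, bw) for beta, bw in zip(betas, bandwidths))

    # .mean() over an (n, m) kernel matrix equals the 1/(n*m) double sum.
    return k(xs, xs).mean() + k(xt, xt).mean() - 2.0 * k(xs, xt).mean()

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(64, 16))
shifted = rng.normal(1.5, 1.0, size=(64, 16))   # target with a mean shift
same = rng.normal(0.0, 1.0, size=(64, 16))      # target from the source distribution
print(mk_mmd(source, shifted) > mk_mmd(source, same))
```

In this toy setting the estimate is larger for the shifted target than for a target drawn from the source distribution, which is the property the domain adaptation loss relies on.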

On this theoretical basis, the application provides a method for predicting the residual life of a rolling bearing, with the following objective.

Data collected under known conditions are defined as a source domain of $n_s$ labeled samples, $D_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$, and data collected under unknown conditions are defined as a target domain of $n_t$ unlabeled samples, $D_t = \{x_j^t\}_{j=1}^{n_t}$. $P_s(X_s)$ and $P_t(X_t)$ are the probability distributions of the source domain and target domain data in the feature space, respectively. Because the data are acquired under different conditions, their data and probability distributions also differ, i.e., $X_s \neq X_t$ and $P(X_s) \neq P(X_t)$. The method aims to transfer the supervision information of the labeled source domain to the unlabeled target domain to help complete RUL prediction for the target domain. Specifically, a mapping function $f$ is constructed by learning domain-invariant features between the source domain and the target domain, so that a value close to the real RUL can be predicted when testing on the target domain, i.e., $f(x^t) \approx y^t$.

In order to achieve the technical purpose, the method provided by the invention comprises the following steps:

In order to realize cross-equipment RUL prediction, the method is divided into two parts, as shown in FIG. 2. First, an Enhanced Residual Convolution Network (ERCN) is designed to improve the ability of the network model to extract degradation features from redundant one-dimensional vibration signals, and a feature extractor is constructed by combining it with CBAM. Then, a domain adaptation module is constructed according to this scheme. In some embodiments, after training to obtain the rolling bearing residual life prediction model, the method further comprises testing the model with unlabeled target domain samples.

Specifically, the success of convolutional neural networks (CNN) in image tasks, where translation invariance benefits feature extraction, suggests an approach to bearing RUL prediction based on one-dimensional vibration signals. However, as the requirements on prediction performance increase, traditional CNN-based network models are challenged. Furthermore, for the RUL prediction task, the collected vibration signal data are full life cycle data of the bearing from operation to failure and therefore have temporal dependence. It is thus necessary to improve the structure of the CNN so that it has stronger linear fitting capability and some capability for processing time-series signals, making it more suitable for vibration-signal-based RUL prediction.

To increase the fitting ability of a CNN model, one can generally widen the network by increasing the number of channels in each convolutional layer, or deepen it by increasing the number of layers. Both methods have limitations. Increasing the number of channels per convolutional layer can improve the linear fitting ability of the network model, but a small increase yields only limited performance improvement, while an increase of an order of magnitude improves performance effectively but greatly increases the computational burden. In addition, as the number of layers increases, the objective function tends to fall into local optima, and problems such as overfitting and vanishing gradients can occur, so that network degradation harms prediction performance.

In view of the two improvement directions and their limitations, the invention provides the following schemes: 1. To increase the number of channels, a linear superposition network is adopted: on the basis of the traditional CNN structure, a convolution-batch normalization (BN) convolution layer is added to each convolution block, and the two convolution-BN layers are used to emulate one convolution layer with a large number of channels, improving the linear fitting capacity of the network while reducing the computational load. 2. To counter the network performance degradation caused by deepening the network, the convolution block structured as convolution-BN-ReLU-MaxPool is changed into a residual structure: the input before the two-layer convolution nonlinear transformation is connected with the output after the transformation, and the activation-pooling operation is then performed. The Enhanced Residual Convolution Network (ERCN) designed according to these schemes is shown in FIG. 3.
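One reading of the convolution block described above (convolution-BN, convolution-BN, residual connection, ReLU, MaxPool) can be sketched in NumPy as follows. The sketch is illustrative only and rests on assumptions not stated in the text: a kernel size of 3 with zero padding, an unchanged channel count so that the input can be added directly, batch normalization with batch statistics and no learnable scale or shift, and a pooling size of 2. The names `conv1d_same`, `batch_norm`, `max_pool` and `ercn_block` are hypothetical.

```python
import numpy as np

def conv1d_same(x, w):
    """x: (N, C_in, L); w: (C_out, C_in, K) with odd K. Zero-padded 'same' convolution."""
    k = w.shape[2]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (0, 0), (pad, pad)))
    length = x.shape[2]
    # windows[n, c, l, i] = xp[n, c, l + i]
    windows = np.stack([xp[:, :, i:i + length] for i in range(k)], axis=-1)
    return np.einsum("nclk,ock->nol", windows, w)

def batch_norm(x, eps=1e-5):
    """Normalize each channel over the batch and length axes."""
    mean = x.mean(axis=(0, 2), keepdims=True)
    var = x.var(axis=(0, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def max_pool(x, size=2):
    """Non-overlapping max pooling along the length axis."""
    n, c, length = x.shape
    trimmed = x[:, :, : length - length % size]
    return trimmed.reshape(n, c, -1, size).max(axis=-1)

def ercn_block(x, w1, w2):
    out = batch_norm(conv1d_same(x, w1))     # first convolution-BN layer
    out = batch_norm(conv1d_same(out, w2))   # added second convolution-BN layer
    out = out + x                            # residual connection with the block input
    out = np.maximum(out, 0.0)               # ReLU activation
    return max_pool(out)                     # max pooling

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 64))              # (batch, channels, length)
w1 = rng.normal(size=(8, 8, 3)) * 0.1
w2 = rng.normal(size=(8, 8, 3)) * 0.1
y = ercn_block(x, w1, w2)
print(y.shape)  # (4, 8, 32)
```

Adding the block input back before the activation is what lets information from the unfiltered signal bypass the two convolution layers.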

Meanwhile, ERCN also takes into account the temporal information present in the vibration signal. The linear superposition network has succeeded in fault diagnosis tasks, but unlike fault diagnosis, the vibration signal used for RUL prediction has strong temporal dependence. After filtering by two convolution layers, some redundant information in the original signal is removed, but the temporal information in it may also be damaged, which may degrade network performance. Because the residual structure connects the filtered output with the original input, the temporal information in the original input can be supplemented, which strengthens the time-series signal processing capacity of the network model to some extent and makes the proposed ERCN more suitable for the RUL prediction task. The related experiments are given in the backbone network comparison of the ablation experiments of the application.

Further, considering that some redundant information may still be present in the original input, which may degrade the quality of the extracted degradation features, CBAM is used after ERCN to screen and enhance the degradation features. ERCN and CBAM together form the feature extractor of the present application.

In addition, when vibration signal data are collected across equipment, the inherent characteristics of the different platforms and factors such as different rotation speeds and loads interact, so the collected vibration signal data differ in distribution. To realize cross-device RUL prediction effectively, the domain shift problem caused by crossing domains must be solved; the performance of a network model without transfer learning is greatly affected by domain shift. To this end, the application reduces the distribution difference by calculating the MK-MMD between source domain and target domain features, thereby addressing domain shift and achieving unsupervised domain adaptation.

Although domain adaptation can cope with domain shift, the negative transfer that it may produce cannot be ignored. That is, when the two data distributions differ greatly (for example, when the degradation trends of the bearings differ greatly), relying excessively on source domain knowledge to guide RUL prediction for the target domain can easily misguide the target domain, so that the two data distributions are fitted incorrectly.

Therefore, the invention designs a negative transfer mitigation scheme: 1. A parallel fully connected structure (shown as FC1 in FIG. 2) is placed between the feature extractor and both the RUL predictor and the domain adaptation module, so that MK-MMD does not affect the RUL predictor when the loss is backpropagated. 2. In the early stage of network training, the MK-MMD between the source domain and the target domain is mainly used to reduce the distribution difference between the two domains and extract domain-invariant features. As the number of training iterations increases, the proportion of MK-MMD in the total loss is gradually reduced, weakening its influence on the network, so that feature learning and RUL prediction rely more on the feature extractor composed of ERCN and CBAM and on the RUL predictor, thereby mitigating negative transfer. Details of this design are set forth in the ablation experiments. The final domain adaptation loss and the decreasing penalty coefficient $\lambda_{decreasing}$ are calculated as follows:

$$L_{DA}(\theta_f, \theta_{DA}) = L_{MMD}\big(G_{fc1}(G_f(x_s)),\ G_{fc1}(G_f(x_t))\big) \cdot \lambda_{decreasing}$$

In formula (8), $\theta_f$ denotes the parameters of the feature extractor, $\theta_{DA}$ the parameters of the domain adaptation module, $G_{fc1}$ the fully connected network FC1, and epoch the number of training iterations.

In the constructed training model, data samples from the source and target domains are input to the feature extractor $G_f$ consisting of ERCN and CBAM. The source domain and target domain features generated by $G_f$ are then input into the domain adaptation module to calculate the MK-MMD loss and reduce the distribution difference, and the generated source domain features are input into the RUL predictor $G_y$ to calculate a prediction error. The prediction error loss function is as follows:

In formula (9), $\theta_y$ denotes the parameters of the RUL predictor, $G_y$ is the RUL predictor, $\hat{y}_i$ is the predicted RUL value, and $y_i$ is the real RUL value corresponding to sample $x_i$. Then, based on the parameters $\theta_f$, $\theta_y$ and $\theta_{DA}$ of the feature extractor, the RUL predictor and the domain adaptation module, respectively, and combining the above loss functions, the loss objective function of the proposed ERCDAN model is defined as:

$$L_{total}(\theta_f, \theta_y, \theta_{DA}) = L_y(\theta_f, \theta_y) + L_{DA}(\theta_f, \theta_{DA}) \quad (10)$$
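How the decreasing penalty coefficient enters the total loss of formula (10) can be shown with a small worked example. The exact form of $\lambda_{decreasing}$ is not given in this text, so the linear decay below is a hypothetical stand-in chosen only to show the mechanism; the loss values and the function names `decreasing_lambda` and `total_loss` are likewise illustrative assumptions.

```python
def decreasing_lambda(epoch, total_epochs, lambda_start=1.0):
    """Hypothetical schedule: linear decay from lambda_start (epoch 0) to 0 (last epoch)."""
    if total_epochs <= 1:
        return lambda_start
    return lambda_start * (1.0 - epoch / (total_epochs - 1))

def total_loss(prediction_loss, mmd_loss, epoch, total_epochs):
    """L_total = L_y + L_DA, with L_DA = L_MMD * lambda_decreasing."""
    domain_loss = mmd_loss * decreasing_lambda(epoch, total_epochs)
    return prediction_loss + domain_loss

# With fixed illustrative loss values, the MK-MMD share shrinks as training proceeds.
for epoch in (0, 50, 100):
    value = total_loss(prediction_loss=0.20, mmd_loss=0.50, epoch=epoch, total_epochs=101)
    print(epoch, round(value, 4))
```

Early in training the MK-MMD term dominates the total loss and drives feature alignment; by the last epoch only the prediction error remains, which is the behaviour the negative transfer mitigation scheme describes.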

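To optimize the loss objective function and obtain the best model, the final objective is to determine the parameters $\hat{\theta}_f$, $\hat{\theta}_y$ and $\hat{\theta}_{DA}$ such that they satisfy:

$$(\hat{\theta}_f, \hat{\theta}_y, \hat{\theta}_{DA}) = \arg\min_{\theta_f, \theta_y, \theta_{DA}} L_{total}(\theta_f, \theta_y, \theta_{DA}) \quad (11)$$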

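In some embodiments, the invention updates the model parameters with an Adam optimizer until the loss approaches the desired value. The update equations of $\theta_f$, $\theta_y$ and $\theta_{DA}$ are as follows:

$$\theta_f \leftarrow \theta_f - \eta\left(\frac{\partial L_y}{\partial \theta_f} + \frac{\partial L_{DA}}{\partial \theta_f}\right), \qquad \theta_y \leftarrow \theta_y - \eta\frac{\partial L_y}{\partial \theta_y}, \qquad \theta_{DA} \leftarrow \theta_{DA} - \eta\frac{\partial L_{DA}}{\partial \theta_{DA}} \quad (12)$$

In formula (12), $\eta$ is the learning rate. The training process of the ERCDAN model for cross-device RUL prediction is shown in Algorithm 1. Finally, the trained model is used for cross-device RUL prediction.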

Algorithm 1

In the following, the validity and generalization of the proposed RUL prediction method are verified through extensive experiments on the IEEE PHM2012 bearing data set (hereinafter PHM) and the XJTU-SY bearing data set (hereinafter XJTU). First, cross-device RUL prediction experiments between the two data sets are performed to verify the cross-device RUL prediction performance of the method. In addition, to further verify the effectiveness and generalization of the application, RUL prediction experiments between different bearings under the same working condition and under different working conditions are carried out on the two data sets as supplementary verification.

Data set 1: The PHM2012 data set was provided by the FEMTO-ST institute and contains run-to-failure monitoring data for 17 bearings under 3 different working conditions, collected on the PRONOSTIA platform, as shown in Table 1. The run-to-failure vibration signals were acquired by acceleration sensors fixed to the bearing outer ring. The sampling frequency was 25.6 kHz, a sample was recorded every 10 seconds, and each recording lasted 0.1 seconds, i.e., each sample contains 2560 data points. The test was stopped when the amplitude of the vibration signal exceeded 20 g.

TABLE 1

	Condition 1	Condition 2	Condition 3
Load (N)	4000	4200	5000
Speed (rpm)	1800	1650	1500
Bearings	Bearing1_1～1_7	Bearing2_1～2_7	Bearing3_1～3_3

Data set 2: The XJTU data set was obtained on an accelerated degradation test platform whose adjustable working conditions are mainly radial force and rotating speed; the experimental bearings are of type LDK UER204. The data set contains full life cycle vibration signals of 15 bearings under three different working conditions (shown in Table 2), acquired by acceleration sensors in the horizontal and vertical directions of the bearing. The sampling frequency was 25.6 kHz, a sample was recorded every 1 minute, and each recording lasted 1.28 seconds.

TABLE 2

	Condition 1	Condition 2	Condition 3
Load (kN)	12	11	10
Speed (rpm)	2100	2250	2400
Bearings	Bearing1_1～1_5	Bearing2_1～2_5	Bearing3_1～3_5
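The number of data points in one recording follows directly from the sampling frequency and the recording duration given for the two data sets. The short calculation below reproduces the 2560 points stated for PHM2012 and derives the corresponding figure for XJTU; the function name `points_per_sample` is hypothetical.

```python
def points_per_sample(sampling_rate_hz, duration_s):
    """Data points in one recording = sampling frequency x recording duration."""
    return round(sampling_rate_hz * duration_s)

phm_points = points_per_sample(25_600, 0.1)    # PHM2012: 0.1 s per recording
xjtu_points = points_per_sample(25_600, 1.28)  # XJTU: 1.28 s per recording
print(phm_points, xjtu_points)  # 2560 32768
```

The two data sets therefore differ in sample length as well as in operating conditions, which any cross-device pipeline has to reconcile before feeding samples to a shared feature extractor.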

The two data sets are used to design 33 cross-domain RUL prediction tasks, covering prediction between different bearings under the same working condition, between different bearings under different working conditions, and between different bearings on different equipment; the task details are shown in Table 3. The cross-equipment tasks are used for network model training and cross-equipment prediction performance testing to verify the effectiveness and generalization of the method, and the other tasks are used for supplementary verification of network model performance.

TABLE 3

Although the degradation features contained in the original vibration signal can be extracted directly, these features may contain redundant information, and the vibration signal may contain interference such as noise pollution, which can greatly increase the training difficulty and affect the life prediction result. Therefore, the invention applies the following data processing to the vibration signal data in the data sets: a) for the PHM2012 and XJTU data sets, samples are collected from the sample set of each selected experimental bearing and labeled with life percentages; b) Z-score standardization applies to numerical data and is unaffected by the order of magnitude of the data, since its purpose is precisely to remove the inconvenience that differing magnitudes cause in analysis. Therefore, each sample drawn from a sample set is Z-score standardized.

In formulas (12) and (13), x is the original sample, with x = {x₁, x₂, ..., x_N}; μ is the mean of the sample population data; σ is the standard deviation of the sample population data; and z is the normalized sample.
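The standardization and labeling steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the population standard deviation (division by N) is assumed from the phrase "standard deviation of the sample population data", and the linear life-percentage label running from 1 at the first sample to 0 at failure is a hypothetical choice, since the exact labeling rule is not given here. The function names `z_score` and `life_percentage_labels` are illustrative.

```python
import math


def z_score(sample):
    """Standardize one vibration sample to zero mean and unit standard deviation."""
    n = len(sample)
    mu = sum(sample) / n
    # Population standard deviation (divide by N); this is an assumption.
    sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / n)
    if sigma == 0.0:
        # A constant signal carries no variation; map it to all zeros.
        return [0.0 for _ in sample]
    return [(x - mu) / sigma for x in sample]


def life_percentage_labels(num_samples):
    """Hypothetical labeling: remaining-life fraction falling linearly from 1 to 0."""
    if num_samples == 1:
        return [0.0]
    return [1.0 - i / (num_samples - 1) for i in range(num_samples)]


if __name__ == "__main__":
    z = z_score([1.0, 2.0, 3.0, 4.0])
    print([round(v, 3) for v in z])
    print(life_percentage_labels(5))
```

Standardizing each sample separately keeps bearings with very different vibration amplitudes on a comparable scale before they enter the feature extractor.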

To evaluate the prediction performance of the method and quantify the prediction results, three evaluation metrics are adopted: root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination R2 (R-Square). RMSE and MAE are two common measures of prediction accuracy: RMSE reflects the degree of dispersion of the prediction errors, and MAE is the mean of the absolute errors between the predicted values and the observed (true) values. R2 reflects the proportion of the total variation of the dependent variable that can be explained by the independent variable through the regression relationship, and is also known as the goodness of fit; the closer R2 is to 1, the better the regression fit. The indices are calculated as follows:

In formulas (14), (15) and (16), ŷ_i is the predicted RUL, y_i is the true RUL, and ȳ is the average of the true RUL values.
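As a hedged illustration of the three metrics, the sketch below uses their standard textbook definitions, which are consistent with the symbols defined above (predicted RUL, true RUL, and the mean of the true RUL). The function names `rmse`, `mae` and `r2` are illustrative and do not come from the source.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted RUL values."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted RUL values."""
    n = len(y_true)
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_true) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    true_rul = [1.0, 0.5, 0.0]
    pred_rul = [0.9, 0.5, 0.1]
    print(rmse(true_rul, pred_rul), mae(true_rul, pred_rul), r2(true_rul, pred_rul))
```

Lower RMSE and MAE and an R2 closer to 1 indicate a better fit. R2 is undefined when all true values are identical, which does not occur for a run-to-failure RUL curve.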

The experiment uses the original one-dimensional vibration signals as the input of each network model; the signals are uniformly Z-score standardized and then fed into the network model for training. All models are trained with the Adam optimizer; the learning rate is set to 0.001 with a decreasing learning rate strategy; the batch size is set to 128 and the weight decay parameter to 3e-6; the maximum number of training iterations is set to 200. The structural parameters of the network model are shown in table 4.
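The training settings listed above can be collected in a small configuration object, sketched below. The optimizer, learning rate, batch size, weight decay and iteration limit are taken from the text; the exponential decay rule and its factor `lr_decay_gamma` are assumptions added only for illustration, because the text says the learning rate decreases without giving the schedule. `TrainConfig` and `learning_rate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    optimizer: str = "adam"
    base_lr: float = 1e-3        # initial learning rate from the text
    batch_size: int = 128
    weight_decay: float = 3e-6
    max_iterations: int = 200    # maximum number of training iterations
    lr_decay_gamma: float = 0.98  # assumed decay factor, not given in the text


def learning_rate(cfg, iteration):
    """Assumed exponential decay: the rate shrinks by a fixed factor each iteration."""
    return cfg.base_lr * cfg.lr_decay_gamma ** iteration


if __name__ == "__main__":
    cfg = TrainConfig()
    for it in (0, 50, 100, 199):
        print(it, learning_rate(cfg, it))
```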

TABLE 4

To verify the effectiveness of the ERCN in the present invention, experiments and performance analysis are performed on backbone networks of different structures within the framework of the proposed RUL prediction method, using all 33 cross-domain RUL prediction tasks and changing only the backbone network. The structural parameters of these backbone networks are shown in table 5. As can be seen from fig. 4 and table 5, Method 3 adds one convolution-BN layer to each convolution block to enhance the linear fitting capability of the network, yet its overall performance is worse than that of Method 1. This shows that the time-series signal processing capability of the network model is not enhanced and is even impaired, and that improving a CNN by a linear superposition structure alone is not suitable for RUL prediction. Method 2 adds a residual structure to the conventional CNN structure, but its performance is not improved compared with Method 1. Therefore, merely changing the conventional CNN structure into a residual structure or a linear superposition structure cannot effectively improve the performance of the network model and may instead degrade it.

The proposed ERCN converts Method 3 into a residual structure, after which the network performance improves greatly. This indicates that the residual structure compensates for the weak time-series signal processing capability of Method 3. The ERCN performs best among the four methods, which verifies its effectiveness: compared with the CNN, ResNet and linear superposition networks, it has better time-series signal processing capability and is better suited to the RUL prediction task.

TABLE 5

To alleviate negative migration during network training, the invention designs a decreasing penalty coefficient λ_decreasing and applies it to the MMD loss, so that the proportion of the MMD loss in the total loss decreases continuously during back propagation. To verify the validity of this design, the decreasing penalty coefficient λ_decreasing is replaced with other penalty coefficients, namely the fixed values 0.25, 0.5 and 1 and an increasing coefficient λ_increasing, and model training and performance studies are performed under the same conditions. As can be seen from fig. 5, network performance improves as the fixed penalty coefficient λ increases, which indicates that when MK-MMD has a smaller proportion its gain is correspondingly smaller, which is not conducive to domain adaptation. With the increasing penalty coefficient λ_increasing, performance is slightly worse than with λ=1. This also indicates that a small MK-MMD gain in the early stage is not conducive to domain adaptation, while the larger MK-MMD gain in the later stage aggravates the instability of the network and likewise hinders domain adaptation. When the penalty coefficient is set in decreasing form, the larger early MK-MMD gain helps to achieve domain adaptation, and the gradually reduced later gain helps to improve stability. The final model of the invention performs best on every index and is the most stable, which shows that the decreasing penalty coefficient λ_decreasing is effective.
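The weighting idea can be sketched as follows. The functional form of the decreasing coefficient is not stated in this passage, so the linear decay from 1 to 0 used here is purely an assumption for illustration, and `lambda_decreasing` and `total_loss` are hypothetical names. The sketch only shows how a decreasing coefficient shrinks the share of the MK-MMD term in the total loss as training proceeds.

```python
def lambda_decreasing(iteration, max_iterations, start=1.0, end=0.0):
    """Assumed linear decay of the penalty coefficient from `start` to `end`."""
    progress = min(max(iteration / max_iterations, 0.0), 1.0)
    return start + (end - start) * progress


def total_loss(rul_loss, mmd_loss, iteration, max_iterations):
    """Total loss = RUL prediction loss + decreasing coefficient * MK-MMD loss."""
    return rul_loss + lambda_decreasing(iteration, max_iterations) * mmd_loss


if __name__ == "__main__":
    # With fixed loss values, the MK-MMD contribution fades over training.
    for it in (0, 50, 100, 150, 200):
        print(it, lambda_decreasing(it, 200), total_loss(0.2, 0.4, it, 200))
```

Early in training the MK-MMD term dominates the alignment of source and target features; late in training the RUL regression loss dominates, which matches the stability argument made above.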

To illustrate the effectiveness of each module and structure in the present method, ablation experiments were performed on different modules and structures. The methods compared are composed as follows. Method 1: linear superposition convolutional network. Method 2: enhanced residual convolution network. Method 3: enhanced residual convolution network + parallel full connection + decreasing MK-MMD, without CBAM. Method 4: enhanced residual convolution network + CBAM, without parallel full connection; the decreasing MK-MMD shares the fully connected layer with the RUL predictor. Method 5: enhanced residual convolution network + CBAM, without parallel full connection; the decreasing MK-MMD and the RUL predictor do not share a fully connected layer, but there is no FC1. Proposed method: enhanced residual convolution network + CBAM + parallel full connection + decreasing MK-MMD. The ablation experiment uses the 33 cross-domain RUL prediction tasks, covering cross-equipment, same-working-condition cross-bearing and cross-working-condition tasks, for performance analysis, and the quantitative results are shown in table 6.

TABLE 6

Method	RMSE	MAE	R2
Method 1	0.169±0.069	0.135±0.058	0.599±0.351
Method 2	0.169±0.064	0.135±0.052	0.605±0.311
Method 3	0.129±0.072	0.102±0.062	0.738±0.370
Method 4	0.153±0.093	0.118±0.081	0.615±0.514
Method 5	0.133±0.077	0.104±0.068	0.717±0.434
Proposed	0.102±0.040	0.077±0.032	0.856±0.125

Table 6 and fig. 6 show the comparison results after certain modules and structures are removed from the present method. First, after the linear superposition convolutional network is improved to a residual structure (Method 2 versus Method 1), the overall average RMSE and MAE values remain the same, but the overall average R2 of the enhanced residual convolutional network is higher than that of the linear superposition convolutional network, and its stability is better. Second, after the attention mechanism is removed (Method 3), the overall performance of the network model drops because no attention mechanism supplements the key information of the degradation features. Third, since Method 4 lets the decreasing MK-MMD and the RUL predictor share the fully connected layer, the RUL predictor is affected by MK-MMD during back propagation, so the final RUL prediction is more easily misled by negative migration and the performance is poor. Finally, Method 5, in which MK-MMD and the RUL predictor do not share a fully connected layer, improves on Method 4 but is still worse than the proposed method. This shows that the parallel full connection structure can effectively weaken the influence of negative migration and enhance the robustness of RUL prediction.

FIG. 7 visualizes the RUL prediction performance of the methods in the structural ablation experiment, wherein (a) P1.1; (b) P1.3; (c) P2.4; (d) P2.5; (e) X1.1; (f) X1.4; (g) X2.2; (h) X2.3; (i) PX1; (j) PX4; (k) XP1; (l) XP4. It can be seen that the RUL curve predicted by the proposed method fits the real RUL line better. As can be seen from fig. 7 (b), (f) and (i), in Method 4 the decreasing MK-MMD shares the fully connected layer with the RUL predictor, so the RUL predictor is affected by MK-MMD during back propagation and the final RUL prediction is more easily misled by negative migration. As a result, the local trend of the predicted RUL curve runs opposite to the real RUL, and significant prediction errors can even occur, as shown in fig. 7 (j) and (l). Fig. 7 (i) and (l) show that Methods 4 and 5 both produce RUL curves whose trend is opposite to that of the real RUL, whereas the proposed method does not, which indicates the effectiveness of the parallel full connection designed in the method for alleviating negative migration.

To further measure the cross-device RUL prediction performance of the present method, the following methods are used for comparison in the experiments. Method 1: CNN; Method 2: the ERCN of the invention; Method 3: CNN-Attention; Method 4: CNN-LSTM. The attention mechanism in CNN-Attention is the same as in the present method. In addition, to further verify the effectiveness and generalization of the method, cross-bearing RUL prediction experiments under two scenarios (same working condition and cross working condition) were also performed on the two data sets as supplementary verification.

Tables 7 and 8 present the performance of all comparison methods and the present method on the 12 cross-device RUL prediction tasks. The quantitative results in these tables show that, except on tasks PX2, PX5 and XP4, the method achieves better cross-device prediction performance on the three indices RMSE, MAE and R2 than the three classical methods and the ERCN. Compared with CNN, in the two cross-device scenarios the overall average RMSE of the method is reduced by 26.1% and 42.6%, the overall average MAE is reduced by 28.1% and 47.9%, and the overall average R2 is improved by 37.1% and 27.3%. Although the method is not optimal on tasks PX2, PX5 and XP4, it still achieves clear improvements on the other tasks, in particular PX4, PX6, XP1 and XP2. Fig. 9 visualizes the performance of all comparison methods and the present method on the 12 cross-device RUL prediction tasks.

TABLE 7

TABLE 8

The performance visualization shows that, when facing the cross-equipment rolling bearing RUL prediction task, the method predicts the degradation trend and the RUL of the target domain bearing well. Combining fig. 8 and fig. 9, it can be observed that when the degradation trends of the bearings collected from the two devices are similar (the degree of domain offset is small), for example in tasks PX3 and XP3, the method does not improve greatly on the comparison methods, but its predicted RUL fluctuates less, its prediction result is more stable, and it fits a curve closer to the real RUL. When the degradation trends of the bearings differ greatly, however, the performance of the network model may degrade and different methods may show large performance differences, as in tasks PX4, PX6, XP1 and XP2. For task PX4 in particular, fig. 8 shows a large difference between the degradation trends of the two bearings, especially in the final stage of degradation of XJTU-Bearing1_4 up to failure; as a result the comparison methods fail to predict the degradation trend and RUL effectively, whereas the present method predicts a more accurate degradation trend and RUL. For tasks PX6, XP1 and XP2, the prediction results of the method are more stable and closer to the real RUL. Furthermore, for task PX2, fig. 8 (c) and fig. 9 (b) show that the performance of the network model degrades because the degradation trends of the two bearings differ greatly in the middle of operation; here the method does not obtain the best prediction effect, but its predicted RUL curve is similar to those of the other methods.

Combining the quantitative results with the performance visualization analysis, the performance of the comparison methods drops when the domain offset problem is encountered, while the present method still maintains good prediction performance. This shows that the method is more robust and can perform cross-equipment rolling bearing RUL prediction well.

Under different application scenarios, these methods may show different performance and different generalization ability. To further verify the effectiveness and generalization of the method, cross-bearing RUL prediction experiments under two scenarios on both data sets were used as supplementary verification. Fig. 10 and 11 show the quantitative performance of these methods on the 21 cross-bearing RUL prediction tasks under the two scenarios on both data sets. FIG. 10 shows tasks P1.1-1.5 and tasks X1.1-1.4 under the same working condition: (a) RMSE, (b) MAE, and (c) R2. FIG. 11 shows tasks P2.1-2.8 and tasks X2.1-2.4 under different working conditions: (a1 & a2) RMSE, (b1 & b2) MAE, (c1 & c2) R2.

From the quantitative results in fig. 10 and 11, it can be observed that the method obtains the best prediction result on all 21 cross-bearing prediction tasks except tasks X1.2 and P2.3, and that it maintains good accuracy even where the performance of the other methods drops. The method therefore performs well both in same-working-condition cross-bearing RUL prediction and in cross-working-condition RUL prediction, and has good generalization performance.

According to the analysis of the ablation and comparison experiments, when facing the RUL prediction task under cross-equipment conditions, the method effectively extracts the degradation features in the vibration signal by combining the designed ERCN with CBAM, and performs domain adaptation with MK-MMD, so that the supervision information of the source domain assists cross-equipment RUL prediction on the unlabeled target domain. When the degradation trends of the two bearings across equipment differ greatly, the method effectively extracts domain-invariant features by means of the designed negative migration alleviation scheme, consisting of parallel full connection and decreasing MK-MMD, avoids the source domain knowledge misleading the target domain, and finally achieves effective RUL prediction. In addition, the method not only shows better robustness on the cross-equipment tasks but also maintains good prediction accuracy on the same-working-condition cross-bearing and cross-working-condition tasks, which verifies its effectiveness and generalization.
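To make the MK-MMD term concrete, the following is a minimal pure-Python sketch of a multi-kernel MMD estimate between a batch of source features and a batch of target features. It is an illustration under assumptions rather than the patented computation: the kernels are assumed to be Gaussian, the bandwidths `(0.5, 1.0, 2.0, 4.0)` and their equal weighting are hypothetical, and the biased estimator of the squared MMD is used for simplicity. All function names are illustrative.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def multi_kernel(x, y, bandwidths):
    """Equally weighted mixture of Gaussian kernels (the weighting is an assumption)."""
    return sum(gaussian_kernel(x, y, b) for b in bandwidths) / len(bandwidths)


def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0, 4.0)):
    """Biased estimate of the squared MMD under the kernel mixture."""
    def mean_kernel(a_set, b_set):
        total = sum(multi_kernel(a, b, bandwidths) for a in a_set for b in b_set)
        return total / (len(a_set) * len(b_set))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


if __name__ == "__main__":
    src = [[0.0, 0.0], [1.0, 0.0]]
    near = [[0.1, 0.0], [1.1, 0.0]]
    far = [[5.0, 5.0], [6.0, 5.0]]
    # A closely matching target gives a small value, a shifted one a larger value.
    print(mk_mmd2(src, near), mk_mmd2(src, far))
```

The value is zero when the two feature sets coincide and grows as their distributions move apart, which is why minimizing it during training pulls the source and target features toward a shared, domain-invariant representation.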

In other embodiments, the present invention provides a rolling bearing remaining life prediction system comprising:

In summary, the method and system for predicting the residual life of a rolling bearing provided by the invention address the domain offset problem that exists under cross-equipment conditions and the negative migration problem that may arise when domain adaptation is used to solve it, by providing a cross-equipment rolling bearing RUL prediction method and system based on ERCDAN, which consists of the ERCN and negative-migration-alleviating domain adaptation. Degradation features are extracted from the vibration signal by the ERCN, and the key information of the degradation features is supplemented by CBAM. The negative-migration-alleviating domain adaptation module consists of a parallel full connection structure and a decreasing MK-MMD. The parallel full connection structure ensures that the RUL predictor is not affected by the MK-MMD loss during domain adaptation learning, which prevents the target domain from being misled by source domain information. The decreasing MK-MMD ensures that enough cross-domain information is available in the early training stage to achieve domain adaptation, while also ensuring that the stability of the network model and the effect of domain adaptation are not impaired as the number of training iterations increases, so that negative migration is finally alleviated. The ablation experiments show that the ERCN has better time-series signal processing capability within the framework of the method, and that the domain adaptation module consisting of parallel full connection and decreasing MK-MMD can effectively suppress negative migration. Finally, cross-device RUL prediction experiments between the PHM2012 and XJTU data sets are used to verify the performance of the method. The experimental results show that the method has good cross-equipment RUL prediction performance.

In addition, good prediction results are obtained on the same-working-condition cross-bearing and cross-working-condition prediction tasks on the two data sets, which further verifies the effectiveness and generalization of the method.

It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims

1. The method for predicting the residual life of the rolling bearing is characterized by comprising the following steps of:

2. The rolling bearing residual life prediction method according to claim 1, wherein each convolution block of the enhanced residual convolution network is a residual structure.

3. The rolling bearing residual life prediction method according to claim 2, wherein the residual structure connects the input before the two-layer convolution nonlinear transformation with the output after the transformation and then performs an activation-pooling operation.

4. The rolling bearing residual life prediction method according to claim 2, wherein a convolution-batch normalization (convolution-BN) layer is added to each of said convolution blocks.

5. The rolling bearing residual life prediction method according to claim 1, wherein the domain adaptation module further includes a parallel full connection structure disposed between the feature extractor and the RUL predictor and between the feature extractor and the domain adaptation module.

6. The method for predicting the remaining life of a rolling bearing according to claim 1, wherein the maximum mean difference is a multi-kernel maximum mean difference (MK-MMD).

7. The rolling bearing remaining life prediction method according to claim 6, wherein the proportion of the multi-kernel maximum mean difference in the total loss gradually decreases as the number of training iterations increases.

8. The method for predicting the remaining life of a rolling bearing according to claim 1, wherein the parameters of the rolling bearing remaining life prediction model are updated with an Adam optimizer until the loss approaches the expected value.

9. The method for predicting the residual life of a rolling bearing according to claim 1, wherein, after the rolling bearing residual life prediction model is obtained by training, the method further comprises testing the rolling bearing residual life prediction model with unlabeled target domain samples.

10. A rolling bearing remaining life prediction system, comprising: