CN117171713A

CN117171713A - Cross self-adaptive deep migration learning method and system based on bearing service life

Info

Publication number: CN117171713A
Application number: CN202310987241.9A
Authority: CN
Inventors: 程德俊; 李泽健; 张胜文; 方喜峰; 张春燕; 朱成顺; 张辉
Original assignee: Jiangsu University of Science and Technology
Current assignee: Jiangsu University of Science and Technology
Priority date: 2023-08-07
Filing date: 2023-08-07
Publication date: 2023-12-05

Abstract

The invention discloses a multi-feature cross self-adaptive deep transfer learning method and system for predicting the remaining useful life (RUL) of a bearing while accounting for the diversity of its decay process. The multidimensional time-domain features of the signal are reduced by PCA to obtain a comprehensive feature, a degradation index and its gradient are obtained using MMD, and the proposed self-adaptive gradient iterative division algorithm adaptively divides the full-life-cycle data of the rolling bearing into different degradation phases. A decay rate is defined according to the actual working condition of the bearing, and the training set and the test set are labeled accordingly. The data are then fed into a multi-feature cross migration network to predict the RUL of the incomplete target domain. The method trains on known, easily acquired data and achieves effective prediction for incomplete life-cycle data.

Description

Cross self-adaptive deep migration learning method and system based on bearing service life

Technical Field

The invention relates to the technical field of bearing life prediction, in particular to a cross self-adaptive deep migration learning method and system based on bearing life.

Background

Rolling bearings are key components of industrial rotating machinery and are widely used in all kinds of equipment. A bearing failure during production shuts the equipment down and causes serious economic loss, so potential anomalies and faults must be detected and identified as early as possible. Prediction of the remaining useful life (RUL) of the bearing is critical for avoiding dangerous situations before operation becomes risky. As key links in prognostics and health management, fault diagnosis and RUL prediction have been studied extensively in recent years. Fault diagnosis has achieved considerable success after years of research and application, whereas RUL prediction is still developing, and prediction across multiple working conditions remains a major problem. Transfer learning, which can handle differences between the feature distributions of different domains, has developed rapidly in recent years.

Domain adaptation (DA) maps features from different domains into a common feature space and aligns them there, which offers a new and effective way to predict remaining life across working conditions. Although RUL prediction has made great progress, the following problems remain:

(1) For cross-working-condition RUL prediction, most existing DA methods align features over the full life cycle of the bearing. However, bearings usually decay in multiple stages, and over long-term operation their decay modes differ considerably even in similar working environments. Existing methods therefore struggle to align features at the level of individual decay phases.

(2) Existing prediction methods grade bearing degradation by constructing a health index curve and then dividing it into stages with manually set thresholds. This ignores the differences between decay modes under different working conditions and cannot adapt automatically to the actual decay process of the bearing, so accurate multistage decay information cannot be obtained.

(3) Existing methods for calculating the remaining-life label take the health state at the first occurrence of a fault signal as 100% and define the RUL of the bearing from the remaining service time after that moment. This does not reflect how bearings are used in real industrial settings and neglects the decay degree and decay rate of the different decay stages, which reduces the accuracy of transfer learning for remaining-life prediction.

(4) Most existing transfer learning models for bearing RUL are trained on the assumption that full-life-cycle data exist for both the source domain and the target domain, whereas in practical engineering the full-life-cycle data of a bearing are difficult to acquire. Performing domain adaptation with complete source data and only partial target data from the early decay period mismatches data at similar decay levels across domains, which lowers the accuracy of cross-condition bearing RUL prediction.

Disclosure of Invention

Purpose of the invention: to overcome the shortcomings of the prior art, the invention provides a cross self-adaptive deep transfer learning method based on bearing service life that solves the technical problems described in the Background, and also provides a corresponding cross self-adaptive deep transfer learning system.

Technical scheme: the bearing-life-based cross self-adaptive deep transfer learning method of the invention comprises the following steps:

(1) Collecting the original vibration signals of the rolling bearing and obtaining multi-segment vibration signal data by a sliding window segmentation method;

(2) Preprocessing the original vibration signals obtained in step (1) and extracting features to obtain time-domain and frequency-domain feature parameters;

(3) Reducing the dimensionality of the acquired multidimensional time-domain features by PCA to obtain a comprehensive time-domain feature;

(4) Dividing the comprehensive time-domain feature of step (3) into a plurality of segments with a sliding window, and calculating the maximum mean discrepancy (MMD) between each segment and the first segment to obtain a degradation index and the corresponding gradient curve; the MMD measures the difference between two data distributions;

(5) Adaptively and automatically identifying the stage crossing points on the acquired gradient curve with the AGIP algorithm, thereby obtaining the different stages of the bearing;

(6) Dividing a training set and a test set according to the obtained stage division information and the feature parameters, and assigning labels according to the decay rate, wherein the training set consists of a labeled source domain and an unlabeled early target domain;

(7) Feeding the data of each stage in the training set into a time-series prediction network for training to obtain prediction models for the different stages;

(8) Feeding the divided test set into the time-series prediction network trained in step (7) for prediction, and outputting the final prediction of the remaining life of the rolling bearing.

Further, the method comprises the steps of:

step (1) specifically comprises: the original vibration signal data of the rolling bearing comprise horizontal and vertical vibration signal data, each collected by an acceleration sensor mounted on the bearing base to obtain full-life-cycle data of bearing operation; acquisition ends when the vibration amplitude of the bearing exceeds a certain threshold; the sliding window segmentation method divides the whole original vibration signal into equidistant segments with a certain step length l.

Further, the method comprises the steps of:

the calculation of the degradation index in step (4) comprises:

reducing the dimensionality of the multidimensional time-domain features obtained in step (2) to obtain the comprehensive time-domain feature, dividing it with a sliding window, and calculating the MMD between each window and the first window to obtain the bearing degradation index.
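By way of illustration only, the reduction of several time-domain features to one comprehensive feature may be sketched as a projection onto the first principal component. The sketch below is a minimal pure-Python version based on power iteration; the function name `first_principal_component`, the iteration count and the toy data are assumptions and are not taken from the disclosure.

```python
import math
from typing import List


def first_principal_component(rows: List[List[float]], iters: int = 200) -> List[float]:
    """Project centred samples onto the leading principal axis (power iteration)."""
    n, d = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("at least two samples are required")
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the features (d x d).
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeated multiplication converges to the dominant
    # eigenvector unless the start vector is orthogonal to it.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # One score per sample: the comprehensive feature.
    return [sum(c[j] * v[j] for j in range(d)) for c in centred]


# Toy feature matrix (rows = windows, columns = features), not bearing data.
toy_features = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
scores = first_principal_component(toy_features)
```

In practice a library PCA implementation would normally be preferred; the sketch only shows what "one comprehensive feature per window" means.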

Further, the method comprises the steps of:

in step (5), the AGIP algorithm comprises:

first, the mean of the first 5% of the gradient curve is calculated as the reference gradient threshold [g_0]; the gradient threshold of each stage is then automatically updated and searched to determine the first prediction time point (FPT) and the stage transition points (STP), thereby dividing the full-life-cycle data of the rolling bearing into multiple decay stages.

Further, the method comprises the steps of:

the dividing of the training set and the test set in step (6) comprises:

(61) Calculating the decay rate of each stage and using the differences in decay rate to assign decay labels to the different decay stages, comprising:

calculating the decay rate of each decay stage:

where V_n is the decay rate of the n-th decay stage; S_n is the degree of degradation of the n-th decay stage, the three decay stages being specified here as 100%-60%, 60%-10% and 10%-0 and referred to as the mild, slow and accelerated decay phases respectively; and T_n is the decay time of the corresponding decay stage;

then calculating the label:

where y_RULi is the label of the i-th sample, and the stage boundary points appearing in the formula are the cross-phase points derived from the AGIP algorithm;

(62) The three decay phases of the source domain and the early-phase data of the target domain are used as the training set, and the slow decay phase and the accelerated decay phase of the target domain are used as the test set.

Further, the method comprises the steps of:

in step (7), the data of each stage in the training set are fed into a time-series prediction network model comprising a feature extraction layer, a multi-feature cross migration layer and a network prediction layer. The feature extraction layer is a Transformer network structure: the degradation data of every stage in each source domain and the degradation data of the first degradation stage in the target domain are input into the Transformer; after passing through the input layer and the N-layer encoder and decoder, the features undergo a linear transformation and an activation function to give the output features. The multi-feature cross migration layer comprises a plurality of cross-attention layers in which the source domain serves as the query Q and the target domain serves as the key K and the value V. A new objective loss function is also proposed.

Further, the method comprises the steps of:

The objective loss function includes:

regression loss L_regression: L_regression minimizes the prediction error on the training data and adaptively adjusts the network training parameters so that the predicted RUL approaches the actual result; the invention selects the radial basis function (RBF) as the kernel function and calculates L_regression by the mean square error (MSE) as follows:

L_regression = (1/n) Σ_{i=1..n} (y_i − ŷ_i)²

where y_i and ŷ_i are the real RUL label and the RUL label predicted by the model respectively, and n is the batch size during training;

distribution difference loss L_MMD(D^S, D^T): L_MMD(D^S, D^T) represents the distribution difference of the high-dimensional features between the source domain and the target domain; MMD is a non-parametric metric of the first-order distribution divergence between the two domains, and minimizing it enables the network to better extract domain-invariant features; after introducing the kernel, the distribution difference loss can be expressed as:

L_MMD(D^S, D^T) = (1/n_S²) Σ_i Σ_j k(x_i^S, x_j^S) + (1/n_T²) Σ_i Σ_j k(x_i^T, x_j^T) − (2/(n_S n_T)) Σ_i Σ_j k(x_i^S, x_j^T)

where x_i^S is the i-th element of the source domain, n_S and n_T are the batch sizes of the source domain and the target domain respectively, and k(·, ·) is the kernel function;

knowledge distillation loss L_distillation: the distillation loss is applied to the cross-attention module and mainly serves to align the feature distributions of the source and target domains; the source domain acts as the teacher and the target domain as the student, and the training of the source domain guides the target domain so that the network further aligns the feature distributions of the two domains;

Thus, the total loss of the network is:

where θ represents a learnable parameter in the network.
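As a rough illustration, the three loss terms may be combined as in the Python sketch below. Only the MSE regression term and the kernel MMD term follow the definitions given above. The form of `distillation_loss` (a mean squared gap between teacher and student features), the weighted sum in `total_loss`, and the weights `lam_mmd` and `lam_dist` are assumptions introduced for this example and are not taken from the disclosure.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mse(y_true: Vector, y_pred: Vector) -> float:
    """Regression loss: mean squared error over one batch."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def rbf_kernel(x: Vector, y: Vector, sigma: float = 1.0) -> float:
    """RBF kernel k(x, y); sigma is an assumed bandwidth."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_loss(source: List[Vector], target: List[Vector], sigma: float = 1.0) -> float:
    """Biased kernel estimate of the squared MMD between two feature batches."""
    n_s, n_t = len(source), len(target)
    k_ss = sum(rbf_kernel(a, b, sigma) for a in source for b in source) / n_s ** 2
    k_tt = sum(rbf_kernel(a, b, sigma) for a in target for b in target) / n_t ** 2
    k_st = sum(rbf_kernel(a, b, sigma) for a in source for b in target) / (n_s * n_t)
    return k_ss + k_tt - 2.0 * k_st


def distillation_loss(teacher: List[Vector], student: List[Vector]) -> float:
    """ASSUMED form: mean squared gap between teacher (source) and
    student (target) features; the text does not give this formula."""
    gaps = [mse(t, s) for t, s in zip(teacher, student)]
    return sum(gaps) / len(gaps)


def total_loss(y_true, y_pred, source, target, lam_mmd=1.0, lam_dist=1.0):
    """ASSUMED weighted sum; lam_mmd and lam_dist are hypothetical weights."""
    return (mse(y_true, y_pred)
            + lam_mmd * mmd_loss(source, target)
            + lam_dist * distillation_loss(source, target))
```

When the source and target feature batches coincide, both alignment terms vanish and only the regression error remains, which is the behaviour one would expect of any such combination.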

In another aspect, the present invention also provides a bearing life-based cross-adaptive deep migration learning system, comprising:

the acquisition module is used for acquiring original vibration signals of the rolling bearing and acquiring multi-section vibration signal data by adopting a sliding window segmentation method;

the curve construction module is used for acquiring a root mean square value by utilizing the multi-section vibration signal data, calculating the gradient of the root mean square value, and constructing a gradient curve and a conversion curve of the root mean square value through normalization;

the stage calculation module is used for adaptively and automatically identifying stage crossing points of the acquired gradient curve by using an AGIP algorithm so as to obtain different stages of the bearing;

the preprocessing module is used for preprocessing data of the original vibration signals and extracting characteristics to obtain characteristic parameters;

the dividing module is used for dividing a training set and a testing set according to the obtained dividing identification information and the characteristic parameters, and marking corresponding labels according to the fading rate;

the training module is used for feeding the data of each stage in the training set into the time-series prediction network model for training to obtain prediction models for the different stages; the time-series prediction network model is an RUL prediction model based on multi-feature cross adaptive transfer learning;

the prediction module is used for feeding the divided test set into the time-series prediction network trained by the training module for prediction, and outputting the final prediction of the remaining life of the rolling bearing.

Further, the method comprises the steps of:

in the stage calculation module, adaptively and automatically identifying the stage crossing points with the AGIP algorithm comprises:

(31) calculating the mean g_0 of the first 5% of the gradient values and generating f(g_0) with the AGIP algorithm as the reference gradient threshold [f(g_0)], where f(g) is the function that generates the gradient threshold; notably, f(g) ≠ g;

(32) comparing each gradient value with [f(g_0)]; when a gradient value crosses [f(g_0)], recording that point as the stage crossing point P_1, the gradient value at this point being g_1, and then iterating the AGIP algorithm to update the threshold to [f(g_1)];

(33) using the new threshold [f(g_1)] to find the next stage crossing point P_2, and continuing to update the threshold iteratively in the same way until all stage dividing points P_1, P_2 and P_3 are found, which divides the data into four stages.

Further, the method comprises the steps of:

in the dividing module, the dividing the training set and the testing set includes:

Calculating the decay rate of each decay stage:

the tag is then calculated:

On the above basis, the present invention also provides a computer storage medium having stored thereon a computer program which, when executed by a computer processor, implements the method described above.

Beneficial effects: (1) The invention provides a novel division algorithm that adaptively divides the multistage decay phases under different working conditions. Compared with traditional division methods, it iteratively updates the division threshold of each stage according to the dynamic data of the bearing, accurately divides the stages of multistage bearing degradation, and improves computational efficiency by about 60%.

(2) The invention provides a method for calculating the remaining-life label of a bearing from the decay rates of its different stages. The method better matches how bearings are used in real industrial settings and improves prediction accuracy by about 5% over the traditional method.

(3) The invention provides a domain adaptation method based on multi-feature cross migration in the early decay stage. Unlike traditional transfer learning, it does not rely on full-life-cycle data: with only early decay data it can learn the middle-stage decay features that cannot be observed, so that transfer between the two domains is completed effectively even when the target domain data are incomplete.

(4) The invention designs a hierarchical self-adaptive RUL model based on multi-feature cross self-adaptive transfer learning. The multi-feature cross migration module in the model effectively transfers multi-stage decay features from one working condition to multiple working conditions, reducing the RMSE loss by about 0.04 and improving prediction accuracy by about 5% compared with the traditional transfer model.

Drawings

FIG. 1 is a general flow chart of a multi-feature cross-adaptive deep migration learning method for bearing residual service life considering diversity of a decay process according to the present invention;

FIG. 2 is a schematic diagram of a multi-stage decay phase adaptive partitioning and recognition method based on an AGIP algorithm according to the present invention;

FIG. 3 is a schematic diagram of a multi-feature cross-migration network architecture according to the present invention;

FIG. 4 is a diagram comparing the results of the method of the present invention with those of a model without the multi-feature migration module.

Detailed Description

So that the above objects, features and advantages may be understood more clearly, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. The invention may be applied in various ways; it is not limited to bearing life prediction, nor to the examples set forth herein. Rather, these examples are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. The specific examples illustrated in the accompanying drawings are given to aid understanding and are not intended to limit the embodiments of the invention. As shown in FIG. 1, a multi-feature cross adaptive deep transfer learning method for bearing remaining service life that accounts for the diversity of the decay process includes the following steps:

(1) Collecting original vibration signals of rolling bearing

The original vibration signal data of the rolling bearing are collected and divided by a sliding window segmentation method into multi-segment vibration signal data. The data have two dimensions: a horizontal vibration signal and a vertical vibration signal. They are acquired by an acceleration sensor mounted on the bearing base over the whole life cycle of the bearing, and acquisition ends when the vibration amplitude of the bearing exceeds the threshold of 20 g. The sliding window segmentation method divides the original vibration signal into equidistant segments with step length l, which preprocesses the data. In this experiment l is set to 64.
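The equidistant segmentation may be sketched in Python as follows. The text fixes only the step length l = 64; the window length (set equal to the step here, giving non-overlapping segments), the function name `segment_signal` and the toy record are assumptions made for the example.

```python
from typing import List, Sequence


def segment_signal(signal: Sequence[float], window: int, step: int) -> List[List[float]]:
    """Cut a 1-D vibration record into windows of `window` samples,
    advancing `step` samples between consecutive windows."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Incomplete trailing samples that do not fill a window are dropped.
    return [list(signal[s:s + window])
            for s in range(0, len(signal) - window + 1, step)]


# Toy record standing in for one acceleration channel (not real bearing data).
record = [float(i) for i in range(256)]
segments = segment_signal(record, window=64, step=64)
```

The horizontal and vertical channels would each be segmented in the same way; choosing a step smaller than the window yields overlapping segments.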

(2) Time-frequency domain feature extraction on sliding window data

For data processing, the invention first performs time-frequency-domain feature processing and extracts 10 features in turn, including the root mean square value, standard deviation and waveform factor. Each feature vector is then standardized to reduce the inaccuracy caused by the different scales of the variables. The invention uses Min-Max normalization to scale the features to [0, 1]. The specific formula is as follows:

x̂_i^j = (x_i^j − X_i^min) / (X_i^max − X_i^min)

where X_i^max and X_i^min are the maximum and the minimum of feature i over all samples, and x_i^j is the i-th feature value of sample j. Normalization speeds up network training and constrains the range of the samples.
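A minimal Python sketch of the feature extraction and Min-Max scaling is given below for three of the named features. The waveform factor is taken here as RMS divided by the mean absolute value, which is a common definition but an assumption with respect to this text; the function names, the handling of a constant feature (mapped to 0) and the toy windows are likewise assumptions.

```python
import math
from typing import List, Sequence


def time_features(x: Sequence[float]) -> List[float]:
    """Return [RMS, standard deviation, waveform factor] of one window."""
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    mean_abs = sum(abs(v) for v in x) / n
    shape = rms / mean_abs if mean_abs > 0 else 0.0  # assumed definition
    return [rms, std, shape]


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Scale every feature column to [0, 1] over all samples."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    # A constant column has no range; it is mapped to 0.0 (assumption).
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]


# Toy windows, not real bearing data.
windows = [[1.0, -1.0, 1.0, -1.0], [2.0, -2.0, 2.0, -2.0], [3.0, 0.0, -3.0, 0.0]]
features = [time_features(w) for w in windows]
scaled = min_max_normalize(features)
```

Each row of `scaled` is one window and each column one feature, with the smallest value of a column mapped to 0 and the largest to 1.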

(3) Acquiring degradation index and corresponding gradient change curve

Time-domain features are statistics that measure how much the data vary and represent the deviation from the mean well, but a single index may not be comprehensive enough. The invention therefore reduces the multiple time-domain features by PCA to a single comprehensive time-domain feature, divides it with a sliding window into a plurality of segments, and uses the MMD between each segment and the first segment as the degradation index of the bearing. The MMD is calculated as follows:

In the present invention each sample set is a group of sliding window data of length 256, and the MMD between every subsequently acquired window and the first window is calculated to obtain the degradation index. The trend of the acquired degradation index reflects the trend of the data well, so the trend of the RMS is characterized by calculating the gradient, which is computed as follows:

Experiments show that the larger the value of a, the more stable the obtained feature, and the smaller the value, the more it fluctuates. To make the obtained gradient data more distinct, the value of a in the invention is set to 3.
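The degradation index and its gradient may be sketched as below. The MMD uses the usual biased kernel estimate with an RBF kernel. The gradient formula is not reproduced in this text, so the backward difference over a span of a points used in `gradient` is an assumed form, as are the kernel bandwidth `sigma`, the function names and the toy data (a window of 8 is used instead of 256 to keep the example small).

```python
import math
from typing import List, Sequence


def rbf(a: float, b: float, sigma: float) -> float:
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))


def mmd2(x: Sequence[float], y: Sequence[float], sigma: float = 1.0) -> float:
    """Biased estimate of the squared MMD between two 1-D samples."""
    k_xx = sum(rbf(a, b, sigma) for a in x for b in x) / len(x) ** 2
    k_yy = sum(rbf(a, b, sigma) for a in y for b in y) / len(y) ** 2
    k_xy = sum(rbf(a, b, sigma) for a in x for b in y) / (len(x) * len(y))
    return k_xx + k_yy - 2.0 * k_xy


def degradation_index(feature: Sequence[float], window: int,
                      sigma: float = 1.0) -> List[float]:
    """MMD between every window of the comprehensive feature and the first one."""
    reference = feature[:window]
    return [mmd2(reference, feature[s:s + window], sigma)
            for s in range(0, len(feature) - window + 1, window)]


def gradient(values: Sequence[float], a: int = 3) -> List[float]:
    """ASSUMED form: slope over a span of `a` points, (v[t] - v[t-a]) / a."""
    return [(values[t] - values[t - a]) / a for t in range(a, len(values))]


# Toy comprehensive feature: flat (healthy), then drifting (decaying).
feature = [0.0] * 32 + [0.1 * i for i in range(32)]
di = degradation_index(feature, window=8)
grad = gradient(di, a=3)
```

With this toy input the index stays at zero while the windows match the first one and rises once the feature drifts away from it.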

(4) Automatic identification of stage crossing points by AGIP algorithm to realize division of multistage fading stages

The full life cycle of the bearing can be divided into a healthy phase and a decay phase. The point separating them is the FPT point, at which the data distribution of the bearing changes. The invention further subdivides the decay phase, so that the full-life-cycle data of the bearing are divided into 4 phases:

1) Health stage: the bearing operates stably at this stage and the health index shows no obvious downward trend.

2) Mild decay phase: the bearing begins to decay and the health index shows a slight downward trend.

3) Slow decay phase: compared with the mild decay phase, the health index shows many irregular fluctuations, which indicates that the bearing has started to decay significantly.

4) Accelerated decay phase: the health index begins to drop at a faster rate, and the bearing eventually fails in this phase.

The AGIP algorithm proposed by the invention divides the multistage decay phases of the bearing on this basis. Its main principle is that data in different phases are distributed differently, and this information is conveyed through the gradient, which makes multistage division possible. As shown in FIG. 2, the AGIP algorithm proceeds as follows:

1) Calculate the mean g_0 of the first 5% of the gradient values and generate f(g_0) with the AGIP algorithm as the reference gradient threshold [f(g_0)], where f(g) is the function that generates the gradient threshold; notably, f(g) ≠ g;

2) Compare each gradient value with [f(g_0)]; when a gradient value crosses [f(g_0)], record that point as the stage crossing point P_1, the gradient value at this point being g_1, and iteratively update the threshold to [f(g_1)];

3) Use the new threshold [f(g_1)] to find the next stage crossing point P_2, and continue to update the threshold iteratively in the same way until all stage dividing points P_1, P_2 and P_3 are found, which divides the data into four stages.

This completes the adaptive, automatic division of the multistage decay phases of the bearing by the AGIP algorithm.
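The iterative threshold search can be illustrated with the following Python sketch. The threshold-generating function f(g) is not specified in this text beyond f(g) ≠ g, so it is passed in as a parameter, and the example uses the hypothetical choice f(g) = 3g. Treating a crossing as "gradient greater than threshold", the function name `agip` and the toy gradient curve are also assumptions.

```python
from typing import Callable, List, Sequence


def agip(gradients: Sequence[float],
         f: Callable[[float], float],
         n_points: int = 3,
         head_fraction: float = 0.05) -> List[int]:
    """Return the indices of the stage crossing points P_1 ... P_n."""
    head = max(1, int(len(gradients) * head_fraction))
    g_ref = sum(gradients[:head]) / head   # g_0: mean of the first 5%
    threshold = f(g_ref)                   # reference threshold [f(g_0)]
    points: List[int] = []
    for idx in range(head, len(gradients)):
        if gradients[idx] > threshold:     # assumed crossing rule
            points.append(idx)
            if len(points) == n_points:
                break
            # Update the threshold from the gradient at the new crossing point.
            threshold = f(gradients[idx])
    return points


# Toy gradient curve with four plateaus (healthy, mild, slow, accelerated).
toy_gradients = [0.01] * 20 + [0.05] * 20 + [0.3] * 20 + [2.0] * 20
points = agip(toy_gradients, f=lambda g: 3.0 * g)
```

Here the first crossing point plays the role of the FPT and the later ones the role of the stage transition points; on noisy real data some smoothing of the gradient before the search would probably be needed.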

(5) Training set test set partitioning for time-frequency feature data

The invention aims to align the two domains when the target domain is incomplete, thereby predicting the target-domain RUL. The training set therefore comprises the data of multiple decay phases of the source domain together with the mild-decay data of the target domain, and the test set comprises different decay phases of the target domain.

(6) Calculating decay rate for data tag fabrication

The training set from step (5) is written as X_train = (x_1, x_2, …, x_K), where x_i = (x^1, x^2, …, x^k), and K and k are the number of samples and the number of feature parameters respectively; the test set has the same form. Since the original vibration signal contains no labels, a label set must be constructed. Based on actual industrial conditions, the invention proposes that the degree of degradation of a given decay stage should be the same under different working conditions, while the life cycles differ because the decay rates differ.

Based on this concept, the method uses the decay rate to construct the RUL label as follows. First, the decay rate of each decay phase is calculated:

where V_n is the decay rate of the n-th decay phase, S_n is the degree of decay of the n-th decay phase (the three decay phases are defined here as 100%-60%, 60%-10% and 10%-0 respectively), and T_n is the decay time of that phase. The label is then calculated:

where y_RULi is the label of the i-th sample, and the stage boundary points appearing in the formula are the cross-phase points obtained from the AGIP algorithm in step (4).
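One possible reading of the decay-rate labelling is sketched below. Neither formula is reproduced in this text, so the sketch assumes that the decay rate is the degree of decay divided by the decay time, V_n = S_n / T_n, and that the label falls linearly at rate V_n within each phase; the label of 100% in the healthy phase, the function name `rul_labels` and the example indices are further assumptions.

```python
from typing import List, Tuple

# Health level at the start and end of each decay phase, from the text:
# 100%-60% (mild), 60%-10% (slow), 10%-0 (accelerated).
STAGE_LEVELS: List[Tuple[float, float]] = [(1.0, 0.6), (0.6, 0.1), (0.1, 0.0)]


def rul_labels(fpt: int, stp1: int, stp2: int, end: int) -> List[float]:
    """Piecewise-linear labels for samples 0..end (ASSUMED label form).

    fpt, stp1 and stp2 are the stage crossing points and `end` is the
    index of the failure sample.
    """
    if not 0 <= fpt < stp1 < stp2 < end:
        raise ValueError("expected 0 <= fpt < stp1 < stp2 < end")
    bounds = [fpt, stp1, stp2, end]
    labels = [1.0] * fpt                        # healthy phase (assumed 100%)
    for n, (start_level, end_level) in enumerate(STAGE_LEVELS):
        t_n = bounds[n + 1] - bounds[n]         # decay time T_n
        v_n = (start_level - end_level) / t_n   # assumed V_n = S_n / T_n
        labels += [start_level - v_n * k for k in range(t_n)]
    labels.append(0.0)                          # failure sample
    return labels


labels = rul_labels(fpt=10, stp1=20, stp2=40, end=50)
```

Under this reading two bearings share the same label at each stage boundary (60% and 10%) while the slope between boundaries depends on how long each phase lasts, which matches the stated idea that the degree of decay per phase is fixed and only the rate differs.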

(7) The training set is fed into the RUL prediction model based on multi-feature cross self-adaptive transfer learning and trained for cross-working-condition bearing RUL prediction. The multi-feature cross migration network uses the multiple decay phases of the source domain and the mild decay phase of the target domain to transfer to multiple incomplete target domains. The remaining-life prediction models for the different decay phases (a mild decay model, a slow decay model and an accelerated decay model) are trained segment by segment, and the model loss of each decay phase is reduced by a hierarchical self-adaptive tuning method. As shown in FIG. 3, the model consists of 3 main parts: a feature extraction layer, a multi-feature cross migration layer and a network prediction layer.

The invention divides the bearing degradation process into 4 stages, where 0, 1, 2 and 3 denote the healthy, mild, slow and accelerated stages of the bearing. T1 therefore denotes the mild stage of the target domain. Because the invention removes the need of conventional transfer for full-life-cycle target domain data, only early target domain data are used in the example of this method. The core transfer idea of the method is as follows:

First, the differences between the degradation stages within the source domain are calculated, and the learned physical degradation mechanism is aligned on a spatial plane along the degradation direction (within the source domain) by spatial mapping. In addition, for the limited unlabeled data in the target domain, the method uses a cross adaptation layer to align the mild-degradation features of the source domain and the target domain, so that the degradation trajectories are aligned in one spatial plane (between the source domain and the target domain) and degradation information is transferred from the source domain to the target domain. On this basis it is assumed that the unseen information of the degradation phases follows the same physical degradation mechanism in the source domain and the target domain. Through these two spatial mappings, the unseen degradation information of an incomplete target domain can be aligned along the degradation direction, which enables RUL prediction without run-to-failure data.

The feature extraction layer is a Transformer network structure. The Transformer abandons the traditional sequential structure and thereby gains better parallelism. In particular, position coding lets it retain the location of each piece of information, which makes parallel training on distributed GPUs easier and improves the efficiency of model training. The Transformer mainly comprises the following structures:

Input layer: comprises a source domain data embedding layer, a target domain data embedding layer and a position encoder layer. The embedding layer maps the data to vectors of dimension d_model as required by the network computation. Since the Transformer contains neither recursion nor convolution, the model adds position coding to represent the absolute or relative position of the information in the sequence, allowing the network to use the sequence information. The position code uses sine and cosine functions, as follows:

PE(pos, 2i) = sin(pos / 10000^(2i/d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))

where pos is the position and i is the dimension. Each dimension of the position code thus corresponds to a sinusoidal signal, and experiments show that this function lets the model learn the positional pattern easily.
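The sine and cosine position code can be computed as in the short Python sketch below, which follows the standard Transformer formulation; the function name `positional_encoding` and the example sizes are assumptions.

```python
import math
from typing import List


def positional_encoding(length: int, d_model: int) -> List[List[float]]:
    """Return a length x d_model table of sinusoidal position codes."""
    table = [[0.0] * d_model for _ in range(length)]
    for pos in range(length):
        for k in range(d_model):
            # Columns 2i and 2i+1 share one frequency; k - k % 2 equals 2i.
            angle = pos / (10000 ** ((k - k % 2) / d_model))
            table[pos][k] = math.sin(angle) if k % 2 == 0 else math.cos(angle)
    return table


pe = positional_encoding(length=4, d_model=6)
```

The table is added element-wise to the embedded source and target sequences before they enter the encoder.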

Encoder: consists of a stack of N encoder layers, each of which contains two sub-layers connected in sequence: a multi-head attention mechanism (MHA) and a position-wise feed-forward network (PWFFN). Each sub-layer uses a residual skip connection and layer normalization, so that the sub-layer parameters are fully trained and convergence is accelerated. MHA addresses the defect that a self-attention mechanism attends excessively to its own position when encoding the current position. It uses h different attention heads, each applying its own linear projections to the data to learn different queries (Q), keys (K) and values (V), and computes the h heads in parallel. Finally, the outputs of the h heads are concatenated and transformed by another learnable linear projection to produce the final output. The calculation formula is as follows:

$$\mathrm{MH}(Q,K,V)=\mathrm{Concat}(\mathrm{head}_1,\mathrm{head}_2,\ldots,\mathrm{head}_h)W^O \tag{8}$$

where $\mathrm{head}_i=\mathrm{Attention}(QW_i^Q,KW_i^K,VW_i^V)$.

The learnable parameters include the projection matrices $W_i^Q$, $W_i^K$, $W_i^V$ and $W^O$. Based on this design, each head can focus on a different part of the input data.
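The multi-head computation of formula (8) can be sketched with NumPy as below. The text does not spell out the Attention function, so scaled dot-product attention is assumed; the shapes, the random weights and all function names are illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention (assumed form of Attention in formula (8))."""
    d_k = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d_k)) @ v

def multi_head(q, k, v, w_q, w_k, w_v, w_o):
    """MH(Q, K, V) = Concat(head_1, ..., head_h) W^O."""
    heads = [attention(q @ wq, k @ wk, v @ wv)
             for wq, wk, wv in zip(w_q, w_k, w_v)]  # one projection triple per head
    return np.concatenate(heads, axis=-1) @ w_o

rng = np.random.default_rng(0)
seq_len, d_model, h = 5, 8, 2
d_k = d_model // h
x = rng.normal(size=(seq_len, d_model))      # embedded input sequence
w_q = rng.normal(size=(h, d_model, d_k))     # W_i^Q for each head
w_k = rng.normal(size=(h, d_model, d_k))     # W_i^K for each head
w_v = rng.normal(size=(h, d_model, d_k))     # W_i^V for each head
w_o = rng.normal(size=(h * d_k, d_model))    # W^O
out = multi_head(x, x, x, w_q, w_k, w_v, w_o)  # self-attention: Q = K = V = x
```

Because every head has its own projections, the heads can attend to different parts of the sequence before `w_o` mixes them back into a single `d_model`-dimensional representation.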

Decoder: consists of a stack of N decoder layers, each decoder layer having three sub-layers connected in sequence: a masked multi-head attention layer (masked MHA), MHA and PWFFN. The masked multi-head attention layer is the same as the MHA in the encoder block except that a mask is added. The mask hides certain values so that they have no effect when the parameters are updated. In time-series prediction, the decoder output at time step t should depend only on the outputs before time t, so masked MHA is applied to the decoder input to obtain the information of the previous predictions, which corresponds to recording the relationships among the inputs up to the current time.
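The effect of the mask can be illustrated with a small NumPy sketch. It assumes the common lower-triangular (causal) mask, in which time step t may attend to steps up to and including t; the function names are hypothetical.

```python
import numpy as np

def causal_mask(seq_len):
    """Boolean mask: entry (t, s) is True when step t may attend to step s <= t."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def masked_softmax(scores, mask):
    """Softmax over the last axis with masked-out positions forced to weight 0."""
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # row max is finite (diagonal is kept)
    e = np.exp(scores)                                    # exp(-inf) = 0 for hidden positions
    return e / e.sum(axis=-1, keepdims=True)

# With equal raw scores, each step spreads its attention evenly over the past.
weights = masked_softmax(np.zeros((4, 4)), causal_mask(4))
```

Future positions receive zero attention weight, so they contribute nothing to the decoder output or to its gradients.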

Output layer: after passing through the N layers of encoders and decoders, the input features undergo a linear transformation and are activated by an activation function to give the final output. In the RUL prediction of bearings, sigmoid is typically chosen as the activation function of the output layer.
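A minimal sketch of such an output layer is given below, assuming a single linear unit followed by a sigmoid so that the predicted RUL falls in the interval (0, 1). The weights and feature values are made up for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def output_layer(features, weights, bias):
    """Linear transformation of the decoder features followed by sigmoid."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# Hypothetical decoder features and layer parameters.
rul_hat = output_layer([0.2, -0.4, 0.1], [1.5, 0.5, -2.0], 0.3)
```

A bounded activation suits RUL labels that are normalized to a fixed range.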

The multi-feature cross migration layer is composed of a plurality of cross-attention layers, in which the source domain serves as the query (Q) and the target domain serves as the key (K) and value (V). Through this cross-learning mode, the network enables the source domain to learn the knowledge of the target domain, thereby better reducing the distribution difference between the two domains.
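One cross-attention layer with the source domain as Q and the target domain as K and V might look like the NumPy sketch below. Scaled dot-product attention is assumed, a single head is shown for brevity, and the feature shapes and weights are hypothetical.

```python
import numpy as np

def cross_attention(source_feat, target_feat, w_q, w_k, w_v):
    """Source features form the queries; target features form the keys and values."""
    q = source_feat @ w_q
    k = target_feat @ w_k
    v = target_feat @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(1)
n_s, n_t, d = 6, 4, 8                       # source length, target length, feature size
source_feat = rng.normal(size=(n_s, d))     # features of a source-domain decay phase
target_feat = rng.normal(size=(n_t, d))     # features of the target-domain mild decay phase
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
fused, weights = cross_attention(source_feat, target_feat, w_q, w_k, w_v)
```

Each row of `fused` is a mixture of target-domain values selected by a source-domain query, which is how target-domain information enters the source-domain representation.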

The multi-feature cross migration method provided by the invention uses the multiple decay phases of the source domain for internal alignment, and aligns the features of the target-domain mild decay phase with those of the source-domain mild decay phase, so that unknown target domain data can also be well aligned. In back propagation, in order to update the network weights effectively and reduce the inter-domain distribution difference, a new objective loss function is proposed, which consists of three parts:

1) Regression loss $L_{regression}$: $L_{regression}$ is used to minimize the prediction error on the training data and adaptively adjusts the network training parameters so that the predicted RUL is closer to the actual result. The present invention selects the radial basis function (RBF) as the kernel function and calculates $L_{regression}$ by the mean square error (MSE) as follows:

$$L_{regression}=\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2$$

where $y_i$ and $\hat{y}_i$ respectively represent the real RUL label and the RUL label predicted by the model, and n represents the batch size during training.
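The mean square error over one training batch can be written directly in Python. The function name and the label values below are illustrative.

```python
def mse_loss(y_true, y_pred):
    """Mean square error between real and predicted RUL labels of one batch."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction batches must have the same size")
    n = len(y_true)  # batch size
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n

# Hypothetical normalized RUL labels and predictions for a batch of three samples.
loss = mse_loss([1.0, 0.5, 0.0], [0.9, 0.5, 0.2])
```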

2) Distribution difference loss $L_{MMD}(D^S,D^T)$: $L_{MMD}(D^S,D^T)$ represents the distribution difference of high-dimensional features between the source domain and the target domain and enables the network to better extract domain-invariant features. As a non-parametric measure, MMD can measure the first-order distribution divergence between two domains. Thus, after kernel learning is introduced, the distribution difference loss can be expressed as:

$$L_{MMD}(D^S,D^T)=\frac{1}{n_S^2}\sum_{i=1}^{n_S}\sum_{j=1}^{n_S}k(x_i^S,x_j^S)+\frac{1}{n_T^2}\sum_{i=1}^{n_T}\sum_{j=1}^{n_T}k(x_i^T,x_j^T)-\frac{2}{n_S n_T}\sum_{i=1}^{n_S}\sum_{j=1}^{n_T}k(x_i^S,x_j^T)$$

where $x_i^S$ represents the i-th element of the source domain, $n_S$ and $n_T$ represent the batch sizes of the source and target domains, respectively, and k(·,·) represents the kernel function.
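A kernel MMD between a source batch and a target batch can be sketched as follows. This uses the standard biased estimator of the squared MMD with an RBF kernel; the bandwidth `sigma` and the sample values are assumptions made for the example.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two batches of feature vectors."""
    n_s, n_t = len(source), len(target)
    k_ss = sum(rbf_kernel(a, b, sigma) for a in source for b in source) / n_s ** 2
    k_tt = sum(rbf_kernel(a, b, sigma) for a in target for b in target) / n_t ** 2
    k_st = sum(rbf_kernel(a, b, sigma) for a in source for b in target) / (n_s * n_t)
    return k_ss + k_tt - 2.0 * k_st

source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
target = [[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]]   # same shape of cloud, shifted
same = mmd_squared(source, source)
shifted = mmd_squared(source, target)
```

The value is zero when the two batches coincide and grows as their feature distributions move apart, which is why minimizing it pulls the domains together.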

3) Knowledge distillation loss $L_{distillation}$: the distillation loss is used mainly in the cross-attention module to align the source domain and the target domain. In this module the source domain and the target domain act as teacher and student, respectively, and training on the source domain guides the target domain, so that the network aligns the feature distributions of the source and target domains.

A knowledge distillation framework comprises a student model and a teacher model. During training, the output of the student model is constrained, forcing its predicted distribution toward the feature distribution of the teacher model. Following this principle, in the cross-domain adaptation layer (that is, the migration between the source domain and the target domain, where the outputs of both domains are compared through the knowledge distillation loss), the feature distribution of the target domain (student model) should learn the feature distribution of the source domain (teacher model). Meanwhile, the knowledge distillation loss evaluates the difference between the source domain and the target domain, monitoring how much knowledge the target domain has learned from the source domain.
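The text does not give a formula for the distillation loss. One common choice, shown here purely as an assumption, is the Kullback-Leibler divergence between temperature-softened distributions of the teacher (source domain) and student (target domain) outputs; the temperature value and all names are hypothetical.

```python
import math

def softened(values, temperature=1.0):
    """Softmax of the values divided by a temperature (numerically stable)."""
    scaled = [v / temperature for v in values]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_out, student_out, temperature=2.0):
    """KL(teacher || student) on softened distributions; an assumed form of the loss."""
    p = softened(teacher_out, temperature)   # source domain acts as teacher
    q = softened(student_out, temperature)   # target domain acts as student
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [2.0, 0.5, -1.0]
matched = distillation_loss(teacher, teacher)
mismatched = distillation_loss(teacher, [-1.0, 0.5, 2.0])
```

The loss vanishes when the student reproduces the teacher distribution, so its value during training indicates how much the target domain has learned from the source domain.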

Thus, the total loss of the network is:

$$L(\theta)=L_{regression}+L_{MMD}(D^S,D^T)+L_{distillation}$$

where θ represents the learnable parameters of the network.

The total loss is the sum of three losses. The first trains the predictor. The second acts inside the source domain: the source domain is divided into three phases, and through the network and the MMD difference calculation these phases are aligned in a common direction (one can picture the source-domain data as distributed along a curve, which the alignment between phases straightens into a line). The third, the distillation loss, aligns the source domain and the target domain. In other words, MMD provides feature learning inside the source domain, the distillation loss provides feature learning across the domains, and the three losses are optimized jointly.
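The joint objective can be sketched as a plain sum of the three terms. The optional weights `w_mmd` and `w_distillation` are hypothetical additions that default to 1 so that the function reduces to the sum described above; the loss values are made up.

```python
def total_loss(l_regression, l_mmd, l_distillation, w_mmd=1.0, w_distillation=1.0):
    """Combine the regression, MMD and distillation losses into one objective."""
    return l_regression + w_mmd * l_mmd + w_distillation * l_distillation

# Hypothetical per-batch loss values.
loss = total_loss(l_regression=0.02, l_mmd=0.10, l_distillation=0.05)
```

Because a single scalar is back-propagated, every update of the network parameters trades off prediction accuracy against both alignment terms at once.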

(8) Obtaining the prediction result of the test set

The test set is fed into the RUL prediction model trained in step (6) for prediction, and the final remaining-life prediction result of the rolling bearing is output.

In order to verify the present invention, a number of experiments were performed under 7 cross-domain conditions at two operating conditions on the XJTU-SY dataset to predict the RUL of a rolling bearing.

The experimental results are shown in Fig. 4, where (a) and (b) of Fig. 4 are the RUL prediction results of bearings 1_1 and 1_3, and (c) and (d) of Fig. 4 are the RUL prediction results of bearing 2_5. Bearing data under different working conditions were selected in the experiment to predict the mild decay phase, the slow decay phase and the accelerated decay phase, respectively. The red curve represents the prediction results of the proposed multi-feature cross-migration network with the FCA method applied, and the blue curve represents the prediction results without the FCA method. The results show that the red curve fits better across working conditions. The method not only achieves good accuracy but also solves a problem that traditional migration methods cannot, namely that domain alignment cannot be completed well when the target domain is incomplete. The migration of multi-stage domain-invariant features from one working condition to another is realized, and the prediction accuracy is significantly improved.

the training module is used for sending the data of each stage in the training set into the time sequence prediction network for training to obtain the prediction models of different stages;

The other technical features of the cross-adaptive deep migration learning system based on bearing life are the same as those of the cross-adaptive deep migration learning method based on bearing life and are not repeated here.

Based on the above embodiments, in the embodiments of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the cross-adaptive deep migration learning method based on bearing life in any of the above method embodiments.

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims and the equivalents thereof, the present invention is also intended to include such modifications and variations.

Claims

1. The cross self-adaptive deep migration learning method based on the service life of the bearing is characterized by comprising the following steps of:

(4) Dividing the time domain comprehensive characteristics in the step (3) into a plurality of sections by utilizing a sliding window, and calculating the maximum mean difference MMD between each section and the first section to obtain a bearing degradation index and a corresponding gradient curve;

(7) The data of each stage in the training set is sent into a time sequence prediction network model for training to obtain prediction models of different stages;

2. The method for learning the cross-adaptive deep migration based on the bearing life of claim 1, wherein the step (1) specifically comprises: the original vibration signal data of the rolling bearing comprise horizontal vibration signal data and vertical vibration signal data, both collected by an acceleration sensor arranged on the bearing base to obtain full life cycle data of bearing operation, with the vibration amplitude of the bearing exceeding a certain threshold serving as the signal to end the collection; the sliding window segmentation method segments the whole original vibration signal data equidistantly with a certain step length l.

3. The method for learning the cross-adaptive deep migration based on the bearing life of claim 1, wherein in the step (4), the bearing degradation index is obtained by reducing the dimension of the obtained multi-dimensional time domain features to obtain a comprehensive time domain feature, segmenting it with a sliding window, and calculating the MMD between each segment and the first segment window.

4. The method for learning the cross-adaptive deep migration based on the bearing life according to claim 1, wherein in the step (5), the AGIP algorithm comprises:

5. The method for learning the cross-adaptive deep migration based on the bearing life according to claim 1, wherein the step (5) of dividing the training set and the test set includes:

Calculating the decay rate of each decay stage:

wherein $V_n$ represents the decay rate of the n-th decay phase; $S_n$ is the degree of degradation of the n-th decay phase, the three decay phases being specified here as 100%-60%, 60%-10% and 10%-0, denoted the mild, slow and accelerated decay phases, respectively; and $T_n$ is the decay time corresponding to the decay phase;

the tag is then calculated:

6. The method for learning the cross-adaptive deep migration based on the bearing life according to claim 5, wherein the step (6) is characterized in that data of each stage in a training set is sent to a time sequence prediction network model, the time sequence prediction network model comprises a feature extraction layer, a multi-feature cross-migration layer and a network prediction layer, the feature extraction layer is a transform network structure, the degradation data of each stage in a source domain and the degradation data of a first degradation stage in a target domain are input into the transform network structure, after the input layer and an N-layer encoder and decoder are passed, the input features are subjected to linear transformation, and activated by an activation function, so that finally output features are obtained, the multi-feature cross-migration layer comprises a plurality of cross-attention layers, the source domain is used as a query Q, the target domain is used as a key K and a value V, and a new target loss function is proposed.

7. The bearing life-based cross-adaptive deep migration learning method of claim 6, wherein the objective loss function comprises:

regression loss $L_{regression}$: $L_{regression}$ is used to minimize the prediction error on the training data and adaptively adjusts the network training parameters so that the predicted RUL is closer to the actual result; the invention selects the radial basis function RBF as the kernel function and calculates $L_{regression}$ by the mean square error MSE as follows:

distribution difference loss $L_{MMD}(D^S,D^T)$: $L_{MMD}(D^S,D^T)$ represents the distribution difference of high-dimensional features between the source domain and the target domain and enables the network to better extract domain-invariant features; as a non-parametric metric, MMD measures the first-order distribution divergence between the two domains; thus, after kernel learning is introduced, the distribution difference loss can be expressed as:

Thus, the total loss of the network is:

where θ represents a learnable parameter in the network.

8. A cross-adaptive deep migration learning system based on bearing life, comprising:

the training module is used for sending the data of each stage in the training set into the time sequence prediction network model for training to obtain the prediction models of different stages;

9. The bearing life-based cross-adaptive deep migration learning system of claim 8, wherein the stage calculation module, using AGIP algorithm adaptation, automatically identifies stage crossing points, comprises:

10. The bearing life-based cross-adaptive deep migration learning system of claim 9, wherein the partitioning of the training set and the test set in the partitioning module comprises:

calculating the decay rate of each decay stage:

the tag is then calculated: