CN116415485A - Multi-source domain transfer learning remaining useful life prediction method based on dynamic distribution adaptation - Google Patents

Multi-source domain transfer learning remaining useful life prediction method based on dynamic distribution adaptation

Info

Publication number
CN116415485A
CN116415485A (application CN202211706024.XA)
Authority
CN
China
Prior art keywords
domain
degradation
data
target domain
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211706024.XA
Other languages
Chinese (zh)
Inventor
吕燚
温振飞
张启晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China Zhongshan Institute
Original Assignee
University of Electronic Science and Technology of China Zhongshan Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China Zhongshan Institute filed Critical University of Electronic Science and Technology of China Zhongshan Institute
Priority to CN202211706024.XA priority Critical patent/CN116415485A/en
Publication of CN116415485A publication Critical patent/CN116415485A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Geometry (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation, which comprises the following steps: 1) Providing existing source domain and target domain degradation data; 2) Preprocessing the degradation data; 3) Extracting degradation feature representations of the source domain and target domain degradation data; 4) Aligning the degradation feature distribution of each source domain with that of the target domain to obtain multiple degradation feature representations of the target domain; 5) Fusing the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label. Transfer learning can exploit similarities between data, tasks, or models to apply models and knowledge learned in old domains to new domains. The RUL prediction method based on transfer learning trains a prediction model on existing degradation data sets and transfers the learned knowledge to data sets collected under different working conditions, thereby realizing cross-domain RUL prediction.

Description

Multi-source domain transfer learning remaining useful life prediction method based on dynamic distribution adaptation
Technical Field
The invention relates to the technical field of fault processing, and in particular to a remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation.
Background
With the rapid development of the intelligent industrial age, artificial intelligence techniques have been widely applied to prognostics and health management (PHM) of mechanical equipment, greatly improving the operational reliability of the equipment while reducing manpower and material costs. As one of the key technologies of PHM, remaining useful life prediction uses the condition monitoring data and failure mechanisms of equipment to build a degradation model and analyzes the degradation trend to estimate when the equipment will fail; it has broad prospects in fields such as manufacturing and aerospace. The aim of this technology is to give early warning for equipment that is about to fail, preventing the huge losses and safety problems caused by sudden failures, reducing maintenance costs, and improving operational reliability. However, the accuracy of RUL prediction is susceptible to various uncertainty factors, such as operating conditions, the operating environment, and noise in the monitored data. Therefore, RUL prediction for mechanical equipment remains a challenging task.
In recent years, many RUL prediction methods have been proposed, and they can be classified into three categories: model-based methods, data-driven methods, and hybrid methods combining the two. Model-based methods build a mathematical or physical degradation model from the system failure mechanism to predict the RUL of a mechanical component. Data-driven methods use large amounts of historical data to extract degradation features of the equipment, establish a mapping between the degradation features and the remaining useful life, and fit a degradation curve to predict the RUL. Hybrid methods use the historical operating data and the failure mechanism of the equipment at the same time, combining the advantages of both approaches. Because model-based methods require extensive prior knowledge, and building corresponding degradation models for complex equipment is very difficult, data-driven methods have become a research hotspot for RUL prediction. Deep learning has advanced rapidly in the field of remaining useful life prediction owing to its strong feature extraction capability and accurate regression analysis, but a key problem remains. RUL prediction for most equipment generally assumes that the test set and the training set come from the same working conditions and follow the same distribution, so the model produces accurate predictions only under those working conditions. In actual operation, however, the working conditions of most equipment differ, and the distributions of the data collected by the sensors differ accordingly, so the accuracy of RUL prediction drops drastically.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides a remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation.
In order to solve the problems, the invention adopts the following technical scheme.
A remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation comprises the following steps:
1) Providing existing source domain and target domain degradation data;
2) Preprocessing the degradation data;
3) Extracting degradation characteristic representations of degradation data of a source domain and a target domain;
4) Aligning degradation characteristic distribution of each source domain and each target domain to obtain multiple degradation characteristic representations of the target domain;
5) Fusing the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label.
As a further improvement of the present invention,
in step 1), the source domain and target domain degradation data:
given existing multisensor degradation data { X } s×b As shown in formula (1).
Figure SMS_1
The degradation data is represented in a matrix form, where s represents the number of sensors that can monitor the degradation state and n is the length of the degradation data, typically on a time period scale, to characterize the lifetime of the device.
As a further improvement of the present invention,
in step 2), since the magnitudes of the monitoring data from the different sensors differ greatly, normalization is required before the data are fed to the model; the degradation data are preprocessed with the max-min normalization method, calculated as shown in (2).
x'_i = (x_i − min(x_i)) / (max(x_i) − min(x_i))    (2)
where max(x_i) and min(x_i) are the maximum and minimum of the i-th characteristic signal x_i in the data sample x, and the normalized data satisfy x'_i ∈ [0, 1]. A sliding time window is then used to convert the degradation data into time-series inputs; the window size is T_w and the time step is t_d. The input data may be expressed as:
[Formula (3): the windowed input sequences obtained by sliding a window of size T_w with step t_d over X]
As a further improvement of the present invention, steps 3) to 5) involve three modules: a degradation feature extraction module, a dynamic distribution adaptation module, and a regression prediction module.
Degradation feature extraction module: this module consists of a common feature extractor and domain-specific feature extractors, and extracts degradation feature representations from the source domain and target domain degradation data.
Common feature extractor: this part mainly extracts low-level feature representations of the multiple source domains and the target domain. The common feature extractor is composed of four convolution blocks for extracting the low-level feature representations of the source and target domains. A convolutional neural network (CNN) is a deep learning structure built on the convolution operation, which allows the network to reduce the feature space, filter the input effectively, and prevent overfitting; through convolution, a CNN can filter out noise in a time series and generate a series of robust features that exclude outliers. The convolution layers use a two-dimensional convolution whose kernel size can be expressed as (kernel_size, 1), so the convolution is performed only along the feature dimension and the temporal dependency along the time dimension is not destroyed.
Domain-specific feature extractor: this part mainly extracts the features unique to each specific domain; the low-level feature representations obtained by each source domain and the target domain from the previous module are passed through their respective domain-specific feature extractors to obtain the final degradation features. Each domain-specific feature extractor consists of four layers of GRU units, through which the low-level representations are converted into high-level feature representations of the source domain and the target domain. The GRU is a variant of the recurrent neural network that introduces a reset gate and an update gate to modify how the hidden state is computed, which alleviates the tendency of gradients in recurrent networks to vanish or explode and better captures the long-range dependencies across time steps in the degradation data, which form a time series.
Dynamic distribution adaptation module: this module dynamically adjusts the influence of the marginal distribution difference and the conditional distribution difference, aligns the degradation feature distribution of each source domain with that of the target domain, and obtains multiple degradation feature representations of the target domain. The dynamic distribution adaptation method recognizes that marginal distribution adaptation and conditional distribution adaptation are not equally important. Dynamic distribution adaptation dynamically adjusts the distance between the two distributions with a balance factor μ:
D(D_s, D_t) ≈ (1 − μ)·D(p_s(x), p_t(x)) + μ·D(p_s(y|x), p_t(y|x))    (4)
When μ is close to 0, the source domain and target domain degradation data differ greatly and marginal distribution adaptation is more important; when μ is close to 1, the source domain and target domain data sets are highly similar and conditional distribution adaptation is more important.
The invention adopts the multi-kernel maximum mean discrepancy (MK-MMD) to measure the marginal distribution difference D(p_s(x), p_t(x)) between the source domain and target domain data; marginal distribution adaptation is achieved by minimizing MK-MMD, which is calculated as shown in (5).
MMD^2(X_s, X_t) = || (1/|X_s|) Σ_{x^s ∈ X_s} φ(x^s) − (1/|X_t|) Σ_{x^t ∈ X_t} φ(x^t) ||_H^2    (5)
where x^s denotes the source domain degradation features obeying distribution p, x^t denotes the target domain degradation features obeying distribution q, and φ(·) is a mapping function that maps the degradation data into a reproducing kernel Hilbert space (RKHS) for measurement. Because φ(·) is difficult to choose, it is not defined explicitly; instead a kernel function is introduced to compute the inner product of φ(·), so the MMD is computed indirectly. The invention adopts a Gaussian kernel function, calculated as shown in (6).
k(x^s, x^t) = exp(−||x^s − x^t||^2 / (2σ^2))    (6)
where σ is the kernel width. In MK-MMD, several values of σ are used to compute several kernel matrices, which are then summed to obtain the final Gaussian kernel matrix.
For the conditional distribution difference D(p_s(y|x), p_t(y|x)), the invention designs a conditional maximum mean discrepancy (CMMD) based on MK-MMD. First, the RUL labels of the degradation data samples are divided into four classes, as shown in (7).
[Formula (7): the rule that maps the RUL label y_RUL (with maximum life cycle Y) to the class label y_cls ∈ {0, 1, 2, 3}]
where y_cls and y_RUL denote the class label and the RUL label, respectively, and Y is the maximum life cycle. The classification scheme treats the degradation data samples in the healthy state as one class and divides the degradation-stage samples into three further classes of progressively increasing degradation. Because the target domain degradation data are unlabeled, during training the model first pretrains a label classifier on the source domain; the target domain data then obtain classification pseudo-labels from this classifier, and as training iterates the classifier accuracy gradually improves, yielding accurate classification labels. The invention uses the cross-entropy loss function to compute the label classification loss for the source domain, as shown in (8).
L_cls = −(1/N_s) Σ_{i=1}^{N_s} Σ_{c=0}^{3} c_i^(c) · log(ĉ_i^(c))    (8)
where N_s is the number of source domain samples, and c_i and ĉ_i denote the true class label and the predicted class label of the i-th source domain sample, respectively. Based on the classification results of the source domain and the target domain, the CMMD is calculated as shown in (9).
CMMD^2(X_s, X_t) = Σ_{c=0}^{3} || (1/|X_s^(c)|) Σ_{x^s ∈ X_s^(c)} φ(x^s) − (1/|X_t^(c)|) Σ_{x^t ∈ X_t^(c)} φ(x^t) ||_H^2    (9)
where c ∈ {0, 1, 2, 3} indexes the degradation data sample classes, and X_s^(c) and X_t^(c) denote the degradation data samples belonging to class c in the source domain and the target domain, respectively.
In the training process of the model, the calculation method of the dynamic distribution adaptive factor mu is shown as (10).
[Formula (10): the estimate of the balance factor μ computed from the A-distances d_A(X_s, X_t) and d_A(X_s^(c), X_t^(c)) described below]
where d_A(X_s, X_t) measures the feature difference between the source domain and target domain degradation data samples and thus reflects how well the marginal distributions of the source and target features are aligned during training, d_A(X_s^(c), X_t^(c)) measures the feature difference between the source and target degradation data samples belonging to class c and thus reflects how well the conditional distributions are aligned during training, and ε denotes the error of a support-vector-based classifier in distinguishing source domain samples from target domain samples.
From equations (5), (9) and (10), the objective function of the dynamic distribution adaptation module is:
L_dda = (1 − μ)·MMD^2(X_s, X_t) + μ·CMMD^2(X_s, X_t)    (12)
By minimizing this objective function, the degradation features of each source domain and the target domain are mapped into the same feature space so that the feature distribution differences are reduced, and multiple degradation feature representations are finally obtained.
Regression prediction module: first, the source domain and target domain degradation features obtained from the dynamic distribution adaptation module are passed through the domain-specific regression predictors to obtain the RUL prediction labels of the source domain and the target domain. The invention uses the RMSE performance evaluation index as the prediction loss function, as shown in (13).
L_pred = sqrt( (1/N_s) Σ_{i=1}^{N_s} (ŷ_i − y_i)^2 )    (13)
Finally, the module fuses the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label, using a mean fusion method. To ensure that the decision boundaries of each domain pair are aligned, identical target samples predicted by different regressors should receive identical predictions, so the model needs to minimize the differences among all domain-specific regressors. The objective function that aligns the prediction results of the regressors is shown in (14).
[Formula (14): the objective that penalizes the discrepancy between the predictions ŷ_i^m and ŷ_i^n of any two domain-specific regressors m and n on the same samples]
where S and N_s denote the number of source domains and the number of samples per source domain, respectively, and ŷ_i^m and ŷ_i^n denote the predictions of regressors m and n for the i-th data sample, respectively.
Joint loss function: the joint loss function of the MDDAN model consists of four parts: the regression prediction loss L_pred, the label classification loss L_cls, the dynamic distribution adaptation objective L_dda, and the regressor prediction alignment objective L_align.
Thus, the joint loss function of the model can be expressed as:
[Formula (15): the joint loss combining L_pred, L_cls, L_dda and L_align with the trade-off coefficient λ and the time-varying coefficient β]
where λ is a trade-off coefficient that controls the relative weight of the corresponding loss terms, and β = 2/(1 + e^(−10×(i+1)/epochs)) − 1 is a time-varying coefficient that changes with each training iteration, i being the current iteration number and epochs the total number of iterations.
In summary, the multi-source domain data and the target domain data are first passed through the common feature extractor to obtain shallow feature representations; the different source domains and the target domain are then passed through the domain-specific feature extractors to obtain the source domain and target domain degradation features; the dynamic distribution adaptation module computes the dynamic distribution difference of the degradation features; the degradation features are then fed to the regression predictors to obtain predictions for each source domain and the target domain, and the errors are computed; the multiple predictions for the target domain are averaged to obtain the final RUL prediction label; finally, the model computes the joint loss error, performs back-propagation with stochastic gradient descent, and optimizes the model parameters, so that the degradation features of each domain pair are aligned, the target domain obtains multiple degradation feature representations, and the accuracy and generalization ability of the prediction model are improved.
The beneficial effects of the invention are that
Compared with the prior art, the invention has the advantages that:
in order to improve the accuracy of cross-condition RUL prediction, existing techniques mostly use single-source-domain transfer learning, but the cross-domain invariant features extracted by single-source-domain transfer learning are limited. There is also a risk of negative transfer if the data distributions of the source domain and the target domain differ too much. The present method uses training data sets collected under different working conditions as source domains and exploits the multiple distributions of these data sets, which avoids the negative transfer caused by an excessive distribution gap between a single source domain and the target domain; the multiple degradation feature representations of the target domain fully characterize complex degradation data, thereby improving the accuracy of cross-condition RUL prediction.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Fig. 2 is a diagram of a convolutional block structure of the present invention.
FIG. 3 is a block diagram of a GRU unit of the invention.
Fig. 4 is a flowchart illustrating the operation of the MDDAN model according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention; it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments, and that all other embodiments obtained by persons of ordinary skill in the art without making creative efforts based on the embodiments in the present invention are within the protection scope of the present invention.
1. Source domain and target domain degradation data
Given existing multi-sensor degradation data X ∈ R^(s×n), as shown in formula (1).
X = [ x_{1,1}  x_{1,2}  …  x_{1,n}
      x_{2,1}  x_{2,2}  …  x_{2,n}
        ⋮        ⋮            ⋮
      x_{s,1}  x_{s,2}  …  x_{s,n} ]    (1)
The degradation data is represented in a matrix form, where s represents the number of sensors that can monitor the degradation state and n is the length of the degradation data, typically on a time period scale, to characterize the lifetime of the device.
2. Data preprocessing strategy
First, since the magnitude and size of the monitoring data of the plurality of sensors vary widely, normalization is required before application to the model. The invention adopts a maximum and minimum normalization method to preprocess degradation data, and a calculation formula is shown as (2).
x'_i = (x_i − min(x_i)) / (max(x_i) − min(x_i))    (2)
where max(x_i) and min(x_i) are the maximum and minimum of the i-th characteristic signal x_i in the data sample x, and the normalized data satisfy x'_i ∈ [0, 1]. A sliding time window is then used to convert the degradation data into time-series inputs; the window size is T_w and the time step is t_d. The input data may be expressed as:
[Formula (3): the windowed input sequences obtained by sliding a window of size T_w with step t_d over X]
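For illustration, the preprocessing described in formulas (2) and (3) can be sketched as follows. This is a minimal example, not part of the patent; the array layout (time steps × sensors), the 14-sensor/200-step sizes, and the window settings T_w = 30, t_d = 1 are assumptions chosen only for the demonstration.

```python
import numpy as np

def min_max_normalize(data: np.ndarray) -> np.ndarray:
    """Normalize each sensor channel of an (n, s) degradation matrix to [0, 1], as in formula (2)."""
    d_min = data.min(axis=0, keepdims=True)
    d_max = data.max(axis=0, keepdims=True)
    return (data - d_min) / (d_max - d_min + 1e-12)  # small epsilon avoids division by zero

def sliding_windows(data: np.ndarray, window_size: int, step: int) -> np.ndarray:
    """Cut an (n, s) series into overlapping windows of shape (window_size, s), as in formula (3)."""
    windows = [data[i:i + window_size]
               for i in range(0, len(data) - window_size + 1, step)]
    return np.stack(windows)  # shape: (num_windows, window_size, s)

# Example with assumed sizes: 14 sensors, 200 time steps, window T_w = 30, step t_d = 1
raw = np.random.rand(200, 14)
inputs = sliding_windows(min_max_normalize(raw), window_size=30, step=1)
print(inputs.shape)  # (171, 30, 14)
```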
3. Transfer learning prediction model based on multi-source dynamic distribution adaptation
The invention provides a multi-source domain transfer learning model based on dynamic distribution adaptation for RUL prediction, abbreviated MDDAN. The model mainly comprises three modules: a degradation feature extraction module, a dynamic distribution adaptation module, and a regression prediction module.
(1) Degradation feature extraction module: this module consists of a common feature extractor and domain-specific feature extractors, and extracts degradation feature representations from the source domain and target domain degradation data.
a. Common feature extractor: this part mainly extracts low-level feature representations of the multiple source domains and the target domain. The common feature extractor is composed of four convolution blocks for extracting the low-level feature representations of the source and target domains. A convolutional neural network (CNN) is a deep learning structure built on the convolution operation, which allows the network to reduce the feature space, filter the input effectively, and prevent overfitting. Furthermore, through convolution a CNN can filter out noise in a time series and generate a series of robust features that exclude outliers. Note that, as shown in FIG. 2, a general convolution block is composed of a convolution layer, a batch normalization layer, and a pooling layer; however, the convolution blocks used by the common feature extractor contain no pooling layer, because pooling would lose the positional information of the time-series data and render the pooled features meaningless as input to the subsequent GRU. The convolution layers use a two-dimensional convolution whose kernel size can be expressed as (kernel_size, 1), so the convolution is performed only along the feature dimension and the temporal dependency along the time dimension is not destroyed.
b. Domain-specific feature extractor: this part mainly extracts the features unique to each specific domain; the low-level feature representations obtained by each source domain and the target domain from the previous module are passed through their respective domain-specific feature extractors to obtain the final degradation features. Each domain-specific feature extractor consists of two layers of GRU units, through which the low-level representations are converted into high-level feature representations of the source domain and the target domain. The GRU is a variant of the recurrent neural network that introduces a reset gate and an update gate to modify how the hidden state is computed, which alleviates the tendency of gradients in recurrent networks to vanish or explode and better captures the long-range dependencies across time steps in the degradation data, which form a time series. As shown in FIG. 3, the GRU controls the flow of information through learnable gates.
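For illustration, a minimal PyTorch sketch of this two-stage feature extractor is given below. It is not the patent's implementation: the channel widths, the GRU hidden size, and the GRU depth are assumptions (this passage mentions two GRU layers while other passages of the document mention four), and only the structural points fixed in the text are reproduced: four convolution blocks without pooling, a (kernel_size, 1) convolution acting only along the feature dimension, and a stacked GRU per domain.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Convolution + batch normalization + ReLU; no pooling, so the time axis is preserved for the GRU."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 5):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=(kernel_size, 1),
                      padding=(kernel_size // 2, 0)),   # convolve along the feature dimension only
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.block(x)

class CommonFeatureExtractor(nn.Module):
    """Four convolution blocks shared by all source domains and the target domain."""
    def __init__(self, channels=(1, 8, 16, 16, 8)):
        super().__init__()
        self.blocks = nn.Sequential(*[ConvBlock(channels[i], channels[i + 1]) for i in range(4)])

    def forward(self, x):              # x: (batch, 1, features, time)
        return self.blocks(x)

class DomainSpecificExtractor(nn.Module):
    """Stacked GRU that turns the shared low-level features into per-domain degradation features."""
    def __init__(self, input_size: int, hidden_size: int = 32, num_layers: int = 2):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first=True)

    def forward(self, x):              # x: (batch, time, input_size)
        out, _ = self.gru(x)
        return out[:, -1, :]           # last time step as the degradation feature

# Shape check with assumed sizes: batch 4, 14 sensor features, 30 time steps
low = CommonFeatureExtractor()(torch.randn(4, 1, 14, 30))       # (4, 8, 14, 30)
seq = low.flatten(1, 2).permute(0, 2, 1)                        # (4, 30, 112)
print(DomainSpecificExtractor(input_size=8 * 14)(seq).shape)    # torch.Size([4, 32])
```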
(2) Dynamic distribution adaptation module: this module dynamically adjusts the influence of the marginal distribution difference and the conditional distribution difference, aligns the degradation feature distribution of each source domain with that of the target domain, and obtains multiple degradation feature representations of the target domain. The dynamic distribution adaptation method recognizes that marginal distribution adaptation and conditional distribution adaptation are not equally important; it can adaptively adjust the importance of the marginal and conditional distributions during adaptation according to the actual distribution of the degradation data. Precisely, dynamic distribution adaptation dynamically adjusts the distance between the two distributions with a balance factor μ:
D(D_s, D_t) ≈ (1 − μ)·D(p_s(x), p_t(x)) + μ·D(p_s(y|x), p_t(y|x))    (4)
When μ is close to 0, the source domain and target domain degradation data differ greatly and marginal distribution adaptation is more important; when μ is close to 1, the source domain and target domain data sets are highly similar and conditional distribution adaptation is more important.
The invention adopts the multi-kernel maximum mean discrepancy (MK-MMD) to measure the marginal distribution difference D(p_s(x), p_t(x)) between the source domain and target domain data; marginal distribution adaptation is achieved by minimizing MK-MMD, which is calculated as shown in (5).
MMD^2(X_s, X_t) = || (1/|X_s|) Σ_{x^s ∈ X_s} φ(x^s) − (1/|X_t|) Σ_{x^t ∈ X_t} φ(x^t) ||_H^2    (5)
where x^s denotes the source domain degradation features obeying distribution p, and x^t denotes the target domain degradation features obeying distribution q. φ(·) is a mapping function that maps the degradation data into a reproducing kernel Hilbert space (RKHS) for measurement. However, φ(·) is difficult to choose and is not defined explicitly; instead, a kernel function is introduced to compute the inner product of φ(·), so the MMD is computed indirectly. The invention uses a Gaussian kernel function (RBF kernel), calculated as shown in (6).
k(x^s, x^t) = exp(−||x^s − x^t||^2 / (2σ^2))    (6)
where σ is the kernel width. In MK-MMD, several values of σ are used to compute several kernel matrices, which are then summed to obtain the final Gaussian kernel matrix.
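For illustration, MK-MMD with the multi-bandwidth Gaussian kernel of formulas (5) and (6) can be sketched in a few lines; the bandwidth values and the use of the biased estimator (diagonal terms included) are assumptions of this sketch.

```python
import torch

def gaussian_kernel(a: torch.Tensor, b: torch.Tensor, sigmas) -> torch.Tensor:
    """Sum of Gaussian kernels k(a_i, b_j) over several bandwidths sigma, as in formula (6)."""
    dist = torch.cdist(a, b).pow(2)                       # squared Euclidean distances
    return sum(torch.exp(-dist / (2.0 * s ** 2)) for s in sigmas)

def mk_mmd(xs: torch.Tensor, xt: torch.Tensor, sigmas=(1.0, 2.0, 4.0, 8.0)) -> torch.Tensor:
    """Kernel-trick estimate of MMD^2 between source features xs and target features xt (formula (5))."""
    k_ss = gaussian_kernel(xs, xs, sigmas).mean()
    k_tt = gaussian_kernel(xt, xt, sigmas).mean()
    k_st = gaussian_kernel(xs, xt, sigmas).mean()
    return k_ss + k_tt - 2.0 * k_st

xs, xt = torch.randn(64, 32), torch.randn(64, 32) + 0.5   # stand-in degradation features
print(mk_mmd(xs, xt))   # grows as the two feature distributions drift apart
```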
For the conditional distribution difference D(p_s(y|x), p_t(y|x)), the invention designs a conditional maximum mean discrepancy (CMMD) based on MK-MMD. First, the RUL labels of the degradation data samples are divided into four classes, as shown in (7).
[Formula (7): the rule that maps the RUL label y_RUL (with maximum life cycle Y) to the class label y_cls ∈ {0, 1, 2, 3}]
where y_cls and y_RUL denote the class label and the RUL label, respectively, and Y is the maximum life cycle. The classification scheme treats the degradation data samples in the healthy state as one class and divides the degradation-stage samples into three further classes of progressively increasing degradation. Because the target domain degradation data are unlabeled, during training the model first pretrains a label classifier on the source domain; the target domain data then obtain classification pseudo-labels from this classifier, and as training iterates the classifier accuracy gradually improves, yielding accurate classification labels. The invention uses the cross-entropy loss function to compute the label classification loss for the source domain, as shown in (8).
L_cls = −(1/N_s) Σ_{i=1}^{N_s} Σ_{c=0}^{3} c_i^(c) · log(ĉ_i^(c))    (8)
where N_s is the number of source domain samples, and c_i and ĉ_i denote the true class label and the predicted class label of the i-th source domain sample, respectively. Based on the classification results of the source domain and the target domain, the CMMD is calculated as shown in (9).
CMMD^2(X_s, X_t) = Σ_{c=0}^{3} || (1/|X_s^(c)|) Σ_{x^s ∈ X_s^(c)} φ(x^s) − (1/|X_t^(c)|) Σ_{x^t ∈ X_t^(c)} φ(x^t) ||_H^2    (9)
where c ∈ {0, 1, 2, 3} indexes the degradation data sample classes, and X_s^(c) and X_t^(c) denote the degradation data samples belonging to class c in the source domain and the target domain, respectively.
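For illustration, the class bucketing and the per-class CMMD accumulation of formulas (7)-(9) can be sketched as follows; the equal-width thresholds 0.25/0.5/0.75 and the maximum life of 125 cycles are assumptions (the text only fixes one healthy class plus three degradation stages), and mk_mmd is the function from the MK-MMD sketch above.

```python
import torch

def rul_to_class(y_rul: torch.Tensor, max_life: float) -> torch.Tensor:
    """Map an RUL label to classes 0..3 (0 = healthy, 3 = most degraded), in the spirit of formula (7)."""
    degradation = 1.0 - (y_rul / max_life).clamp(0.0, 1.0)
    return torch.bucketize(degradation, torch.tensor([0.25, 0.5, 0.75]))

def cmmd(xs, cls_s, xt, cls_t, mmd_fn) -> torch.Tensor:
    """Accumulate MMD over samples sharing the same (pseudo-)class c in both domains (formula (9))."""
    loss = xs.new_zeros(())
    for c in range(4):
        xs_c, xt_c = xs[cls_s == c], xt[cls_t == c]
        if len(xs_c) > 1 and len(xt_c) > 1:               # skip empty or degenerate classes
            loss = loss + mmd_fn(xs_c, xt_c)
    return loss

# cls_t would come from the source-pretrained classifier (pseudo-labels);
# random tensors stand in here for real degradation features.
xs, xt = torch.randn(64, 32), torch.randn(64, 32)
cls_s = rul_to_class(torch.rand(64) * 125.0, max_life=125.0)
cls_t = torch.randint(0, 4, (64,))                        # stand-in pseudo-labels
print(cmmd(xs, cls_s, xt, cls_t, mk_mmd))                 # mk_mmd from the previous sketch
```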
In the training process of the model, the calculation method of the dynamic distribution adaptive factor mu is shown as (10).
[Formula (10): the estimate of the balance factor μ computed from the A-distances d_A(X_s, X_t) and d_A(X_s^(c), X_t^(c)) defined in (11)]
d_A(X_s, X_t) = 2(1 − 2ε(X_s, X_t))    (11)
where d_A(X_s, X_t) measures the feature difference between the source domain and target domain degradation data samples and thus reflects how well the marginal distributions of the source and target features are aligned during training, d_A(X_s^(c), X_t^(c)) measures the feature difference between the source and target degradation data samples belonging to class c and thus reflects how well the conditional distributions are aligned during training, and ε denotes the error of a support-vector-based classifier in distinguishing source domain samples from target domain samples.
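For illustration, the proxy A-distance of formula (11) can be estimated with a linear support-vector classifier that tries to separate source features from target features; its error ε then gives d_A = 2(1 − 2ε). The combination of the global and per-class distances into μ shown in estimate_mu is an assumption of this sketch; the patent's exact expression is formula (10).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

def proxy_a_distance(xs: np.ndarray, xt: np.ndarray) -> float:
    """d_A = 2 * (1 - 2 * eps), eps = error of an SVM separating source from target samples (formula (11))."""
    x = np.concatenate([xs, xt])
    y = np.concatenate([np.zeros(len(xs)), np.ones(len(xt))])
    x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.5, random_state=0)
    eps = 1.0 - LinearSVC(max_iter=5000).fit(x_tr, y_tr).score(x_te, y_te)
    return 2.0 * (1.0 - 2.0 * eps)

def estimate_mu(xs, xt, cls_s, cls_t, num_classes: int = 4) -> float:
    """Assumed combination: weigh the marginal d_A against the per-class (conditional) d_A values."""
    d_marginal = proxy_a_distance(xs, xt)
    d_cond = [proxy_a_distance(xs[cls_s == c], xt[cls_t == c])
              for c in range(num_classes)
              if (cls_s == c).sum() > 1 and (cls_t == c).sum() > 1]
    return float(np.clip(1.0 - d_marginal / (d_marginal + sum(d_cond) + 1e-12), 0.0, 1.0))

xs, xt = np.random.randn(200, 32), np.random.randn(200, 32) + 0.3
cls_s, cls_t = np.random.randint(0, 4, 200), np.random.randint(0, 4, 200)
print(proxy_a_distance(xs, xt), estimate_mu(xs, xt, cls_s, cls_t))
```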
From equations (5), (9) and (10), the objective function of the dynamic distribution adaptation module is:
L_dda = (1 − μ)·MMD^2(X_s, X_t) + μ·CMMD^2(X_s, X_t)    (12)
By minimizing this objective function, the degradation features of each source domain and the target domain are mapped into the same feature space so that the feature distribution differences are reduced, and multiple degradation feature representations are finally obtained.
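Under the reconstruction of formula (12) given above, the module's objective is a μ-weighted sum of the two discrepancies; a short sketch reusing the mk_mmd and cmmd functions from the earlier snippets:

```python
def dda_loss(xs, cls_s, xt, cls_t, mu: float):
    """(1 - mu) * marginal MK-MMD + mu * conditional CMMD, following formula (12)."""
    marginal = mk_mmd(xs, xt)                          # marginal discrepancy, formula (5)
    conditional = cmmd(xs, cls_s, xt, cls_t, mk_mmd)   # conditional discrepancy, formula (9)
    return (1.0 - mu) * marginal + mu * conditional
```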
(3) Regression prediction module: first, the source domain and target domain degradation features obtained from the dynamic distribution adaptation module are passed through the domain-specific regression predictors to obtain the RUL prediction labels of the source domain and the target domain. The invention uses the RMSE performance evaluation index as the prediction loss function, as shown in (13).
L_pred = sqrt( (1/N_s) Σ_{i=1}^{N_s} (ŷ_i − y_i)^2 )    (13)
Finally, the module fuses the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label. The invention adopts a mean fusion method to ensure that the decision boundaries of each domain pair are aligned; identical target samples predicted by different regressors should receive identical predictions. Thus, the model needs to minimize the differences among all domain-specific regressors. The objective function that aligns the prediction results of the regressors is shown in (14).
[Formula (14): the objective that penalizes the discrepancy between the predictions ŷ_i^m and ŷ_i^n of any two domain-specific regressors m and n on the same samples]
where S and N_s denote the number of source domains and the number of samples per source domain, respectively, and ŷ_i^m and ŷ_i^n denote the predictions of regressors m and n for the i-th data sample, respectively.
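For illustration, the per-domain regressors, the mean fusion, and a pairwise disagreement penalty in the spirit of formula (14) can be sketched as follows; the small fully connected regressor, the feature size of 32, and the absolute-difference form of the disagreement term are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class DomainRegressor(nn.Module):
    """Small fully connected RUL predictor attached to one specific source domain."""
    def __init__(self, feat_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)

def fuse_and_align(target_feats, regressors):
    """Return the averaged target RUL and the pairwise disagreement among domain-specific regressors."""
    preds = torch.stack([reg(f) for reg, f in zip(regressors, target_feats)])   # (S, N)
    fused = preds.mean(dim=0)                                                   # mean fusion
    s = preds.shape[0]
    disagreement = sum(torch.mean(torch.abs(preds[m] - preds[n]))
                       for m in range(s) for n in range(m + 1, s))
    return fused, disagreement

regressors = [DomainRegressor() for _ in range(3)]            # three source domains (assumed)
target_feats = [torch.randn(64, 32) for _ in range(3)]        # per-domain-pair target features
rul_pred, align_loss = fuse_and_align(target_feats, regressors)
print(rul_pred.shape, float(align_loss))
```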
4. Joint loss function
The joint loss function of the MDDAN model consists of four parts: the regression prediction loss L_pred, the label classification loss L_cls, the dynamic distribution adaptation objective L_dda, and the regressor prediction alignment objective L_align.
Thus, the joint loss function of the model can be expressed as:
[Formula (15): the joint loss combining L_pred, L_cls, L_dda and L_align with the trade-off coefficient λ and the time-varying coefficient β]
where λ is a trade-off coefficient that controls the relative weight of the corresponding loss terms, and β = 2/(1 + e^(−10×(i+1)/epochs)) − 1 is a time-varying coefficient that changes with each training iteration, i being the current iteration number and epochs the total number of iterations.
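For illustration, the β schedule quoted above and one assumed way of combining the four losses can be written as follows; which terms λ and β actually scale is fixed by formula (15) in the patent and is treated here as an assumption.

```python
import math

def beta_schedule(i: int, epochs: int) -> float:
    """beta = 2 / (1 + exp(-10 * (i + 1) / epochs)) - 1; ramps from near 0 toward 1 over training."""
    return 2.0 / (1.0 + math.exp(-10.0 * (i + 1) / epochs)) - 1.0

def joint_loss(l_pred, l_cls, l_dda, l_align, lam: float = 1.0, beta: float = 1.0):
    """Assumed combination of the four MDDAN loss terms."""
    return l_pred + l_cls + lam * l_dda + beta * l_align

print([round(beta_schedule(i, 50), 3) for i in (0, 10, 25, 49)])   # monotone ramp toward 1
```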
5. RUL prediction procedure
The overall operation flow of the MDDAN model proposed by the invention is shown in FIG. 4. The multi-source domain data and the target domain data are first passed through the common feature extractor to obtain shallow feature representations; the different source domains and the target domain are then passed through the domain-specific feature extractors to obtain the source domain and target domain degradation features, and the dynamic distribution difference of the degradation features is computed in the dynamic distribution adaptation module. The degradation features are then fed to the regression predictors to obtain the prediction results for each source domain and the target domain, and the errors are computed. The multiple prediction results of the target domain are averaged to obtain the final RUL prediction label. Finally, the model computes the joint loss error, performs back-propagation with stochastic gradient descent, and optimizes the model parameters, so that the degradation features of each domain pair are aligned, the target domain obtains multiple degradation feature representations, and the accuracy and generalization ability of the prediction model are improved.
The above is merely a preferred embodiment of the present invention, and the scope of the invention is not limited thereto. Any modification or equivalent substitution that a person skilled in the art can make within the technical scope of the present disclosure, based on the technical solution of the present invention, falls within the protection scope of the present invention.

Claims (4)

1. A remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation, characterized by comprising the following steps:
1) Providing existing source domain and target domain degradation data;
2) Preprocessing the degradation data;
3) Extracting degradation characteristic representations of degradation data of a source domain and a target domain;
4) Aligning degradation characteristic distribution of each source domain and each target domain to obtain multiple degradation characteristic representations of the target domain;
5) Fusing the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label.
2. The remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation according to claim 1, characterized in that:
in step 1), the source domain and target domain degradation data:
given existing multi-sensor degradation data X ∈ R^(s×n), as shown in formula (1).
X = [ x_{1,1}  x_{1,2}  …  x_{1,n}
      x_{2,1}  x_{2,2}  …  x_{2,n}
        ⋮        ⋮            ⋮
      x_{s,1}  x_{s,2}  …  x_{s,n} ]    (1)
The degradation data is represented in a matrix form, where s represents the number of sensors that can monitor the degradation state and n is the length of the degradation data, typically on a time period scale, to characterize the lifetime of the device.
3. The remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation according to claim 1, characterized in that:
in step 2), since the magnitudes of the monitoring data from the different sensors differ greatly, normalization is required before the data are fed to the model; the degradation data are preprocessed with the max-min normalization method, calculated as shown in (2).
x'_i = (x_i − min(x_i)) / (max(x_i) − min(x_i))    (2)
where max(x_i) and min(x_i) are the maximum and minimum of the i-th characteristic signal x_i in the data sample x, and the normalized data satisfy x'_i ∈ [0, 1]. A sliding time window is then used to convert the degradation data into time-series inputs; the window size is T_w and the time step is t_d. The input data may be expressed as:
[Formula (3): the windowed input sequences obtained by sliding a window of size T_w with step t_d over X]
4. The remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation according to claim 1, characterized in that:
steps 3) to 5) involve three modules: a degradation feature extraction module, a dynamic distribution adaptation module, and a regression prediction module.
Degradation feature extraction module: this module consists of a common feature extractor and domain-specific feature extractors, and extracts degradation feature representations from the source domain and target domain degradation data.
Common feature extractor: this part mainly extracts low-level feature representations of the multiple source domains and the target domain; the common feature extractor is composed of four convolution blocks for extracting the low-level feature representations of the source and target domains. A convolutional neural network (CNN) is a deep learning structure built on the convolution operation, which allows the network to reduce the feature space, filter the input effectively, and prevent overfitting. Furthermore, through convolution a CNN can filter out noise in a time series and generate a series of robust features that exclude outliers. The convolution layers of this part use a two-dimensional convolution whose kernel size can be expressed as (kernel_size, 1), so the convolution is performed only along the feature dimension and the temporal dependency along the time dimension is not destroyed.
Domain-specific feature extractor: this part mainly extracts the features unique to each specific domain; the low-level feature representations obtained by each source domain and the target domain from the previous module are passed through their respective domain-specific feature extractors to obtain the final degradation features. Each domain-specific feature extractor consists of four layers of GRU units, through which the low-level representations are converted into high-level feature representations of the source domain and the target domain. The GRU is a variant of the recurrent neural network that introduces a reset gate and an update gate to modify how the hidden state is computed, which alleviates the tendency of gradients in recurrent networks to vanish or explode and better captures the long-range dependencies across time steps in the degradation data, which form a time series.
Dynamic distribution adaptation module: this module dynamically adjusts the influence of the marginal distribution difference and the conditional distribution difference, aligns the degradation feature distribution of each source domain with that of the target domain, and obtains multiple degradation feature representations of the target domain. The dynamic distribution adaptation method recognizes that marginal distribution adaptation and conditional distribution adaptation are not equally important; it can adaptively adjust the importance of the marginal and conditional distributions during adaptation according to the actual distribution of the degradation data. Dynamic distribution adaptation dynamically adjusts the distance between the two distributions with a balance factor μ:
D(D_s, D_t) ≈ (1 − μ)·D(p_s(x), p_t(x)) + μ·D(p_s(y|x), p_t(y|x))    (4)
When μ is close to 0, the source domain and target domain degradation data differ greatly and marginal distribution adaptation is more important; when μ is close to 1, the source domain and target domain data sets are highly similar and conditional distribution adaptation is more important.
The invention adopts the multi-kernel maximum mean discrepancy (MK-MMD) to measure the marginal distribution difference D(p_s(x), p_t(x)) between the source domain and target domain data; marginal distribution adaptation is achieved by minimizing MK-MMD, which is calculated as shown in (5).
MMD^2(X_s, X_t) = || (1/|X_s|) Σ_{x^s ∈ X_s} φ(x^s) − (1/|X_t|) Σ_{x^t ∈ X_t} φ(x^t) ||_H^2    (5)
where x^s denotes the source domain degradation features obeying distribution p, x^t denotes the target domain degradation features obeying distribution q, and φ(·) is a mapping function that maps the degradation data into a reproducing kernel Hilbert space (RKHS) for measurement. Because φ(·) is difficult to choose, it is not defined explicitly; instead a kernel function is introduced to compute the inner product of φ(·), so the MMD is computed indirectly. The invention adopts a Gaussian kernel function, calculated as shown in (6).
k(x^s, x^t) = exp(−||x^s − x^t||^2 / (2σ^2))    (6)
where σ is the kernel width. In MK-MMD, several values of σ are used to compute several kernel matrices, which are then summed to obtain the final Gaussian kernel matrix.
For the conditional distribution difference D(p_s(y|x), p_t(y|x)), a conditional maximum mean discrepancy (CMMD) based on MK-MMD is designed. First, the RUL labels of the degradation data samples are divided into four classes, as shown in (7).
[Formula (7): the rule that maps the RUL label y_RUL (with maximum life cycle Y) to the class label y_cls ∈ {0, 1, 2, 3}]
where y_cls and y_RUL denote the class label and the RUL label, respectively, and Y is the maximum life cycle. The classification scheme treats the degradation data samples in the healthy state as one class and divides the degradation-stage samples into three further classes of progressively increasing degradation. Because the target domain degradation data are unlabeled, during training the model first pretrains a label classifier on the source domain; the target domain data then obtain classification pseudo-labels from this classifier, and as training iterates the classifier accuracy gradually improves, yielding accurate classification labels. The invention uses the cross-entropy loss function to compute the label classification loss for the source domain, as shown in (8).
L_cls = −(1/N_s) Σ_{i=1}^{N_s} Σ_{c=0}^{3} c_i^(c) · log(ĉ_i^(c))    (8)
where N_s is the number of source domain samples, and c_i and ĉ_i denote the true class label and the predicted class label of the i-th source domain sample, respectively. Based on the classification results of the source domain and the target domain, the CMMD is calculated as shown in (9).
CMMD^2(X_s, X_t) = Σ_{c=0}^{3} || (1/|X_s^(c)|) Σ_{x^s ∈ X_s^(c)} φ(x^s) − (1/|X_t^(c)|) Σ_{x^t ∈ X_t^(c)} φ(x^t) ||_H^2    (9)
where c ∈ {0, 1, 2, 3} indexes the degradation data sample classes, and X_s^(c) and X_t^(c) denote the degradation data samples belonging to class c in the source domain and the target domain, respectively.
In the training process of the model, the calculation method of the dynamic distribution adaptive factor mu is shown as (10),
[Formula (10): the estimate of the balance factor μ computed from the A-distances d_A(X_s, X_t) and d_A(X_s^(c), X_t^(c)) defined in (11)]
d_A(X_s, X_t) = 2(1 − 2ε(X_s, X_t))    (11)
where d_A(X_s, X_t) measures the feature difference between the source domain and target domain degradation data samples and thus reflects how well the marginal distributions of the source and target features are aligned during training, d_A(X_s^(c), X_t^(c)) measures the feature difference between the source and target degradation data samples belonging to class c and thus reflects how well the conditional distributions are aligned during training, and ε denotes the error of a support-vector-based classifier in distinguishing source domain samples from target domain samples.
From equations (5), (9) and (10), the objective function of the dynamic distribution adaptation module is:
L_dda = (1 − μ)·MMD^2(X_s, X_t) + μ·CMMD^2(X_s, X_t)    (12)
By minimizing this objective function, the degradation features of each source domain and the target domain are mapped into the same feature space so that the feature distribution differences are reduced, and the target domain finally obtains multiple degradation feature representations.
Regression prediction module: first, the source domain and target domain degradation features obtained from the dynamic distribution adaptation module are passed through the domain-specific regression predictors to obtain the RUL prediction labels of the source domain and the target domain. The invention uses the RMSE performance evaluation index as the prediction loss function, as shown in (13).
L_pred = sqrt( (1/N_s) Σ_{i=1}^{N_s} (ŷ_i − y_i)^2 )    (13)
Finally, the module fuses the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label. The module adopts a mean fusion method to ensure that the decision boundaries of each domain pair are aligned; identical target samples predicted by different regressors should receive identical predictions, so the model needs to minimize the differences among all domain-specific regressors. The objective function that aligns the prediction results of the regressors is shown in (14).
[Formula (14): the objective that penalizes the discrepancy between the predictions ŷ_i^m and ŷ_i^n of any two domain-specific regressors m and n on the same samples]
where S and N_s denote the number of source domains and the number of samples per source domain, respectively, and ŷ_i^m and ŷ_i^n denote the predictions of regressors m and n for the i-th data sample, respectively.
Joint loss function: the joint loss function of the MDDAN model consists of four parts: the regression prediction loss L_pred, the label classification loss L_cls, the dynamic distribution adaptation objective L_dda, and the regressor prediction alignment objective L_align.
Thus, the joint loss function of the model can be expressed as:
[Formula (15): the joint loss combining L_pred, L_cls, L_dda and L_align with the trade-off coefficient λ and the time-varying coefficient β]
where λ is a trade-off coefficient that controls the relative weight of the corresponding loss terms, and β = 2/(1 + e^(−10×(i+1)/epochs)) − 1 is a time-varying coefficient that changes with each training iteration, i being the current iteration number and epochs the total number of iterations.
In summary, the multi-source domain data and the target domain data are first passed through the common feature extractor to obtain shallow feature representations; the different source domains and the target domain are then passed through the domain-specific feature extractors to obtain the source domain and target domain degradation features; the dynamic distribution adaptation module computes the dynamic distribution difference of the degradation features; the degradation features are then fed to the regression predictors to obtain predictions for each source domain and the target domain, and the errors are computed; the multiple predictions for the target domain are averaged to obtain the final RUL prediction label; finally, the model computes the joint loss error, performs back-propagation with stochastic gradient descent, and optimizes the model parameters, so that the degradation features of each domain pair are aligned, the target domain obtains multiple degradation feature representations, and the accuracy and generalization ability of the prediction model are improved.
CN202211706024.XA 2022-12-29 2022-12-29 Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption Pending CN116415485A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211706024.XA CN116415485A (en) 2022-12-29 2022-12-29 Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211706024.XA CN116415485A (en) 2022-12-29 2022-12-29 Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption

Publications (1)

Publication Number Publication Date
CN116415485A true CN116415485A (en) 2023-07-11

Family

ID=87048666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211706024.XA Pending CN116415485A (en) 2022-12-29 2022-12-29 Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption

Country Status (1)

Country Link
CN (1) CN116415485A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117252083A (en) * 2023-07-12 2023-12-19 中国科学院空间应用工程与技术中心 Bearing residual life prediction method and system combining degradation phase division and sub-domain self-adaption



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination