CN116415485A - Multi-source domain transfer learning remaining useful life prediction method based on dynamic distribution adaptation - Google Patents

Multi-source domain transfer learning remaining useful life prediction method based on dynamic distribution adaptation

Info

Publication number
CN116415485A
CN116415485A (application CN202211706024.XA)
Authority
CN
China
Prior art keywords
domain
degradation
data
target domain
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211706024.XA
Other languages
Chinese (zh)
Inventor
吕燚
温振飞
张启晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China Zhongshan Institute
Original Assignee
University of Electronic Science and Technology of China Zhongshan Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China Zhongshan Institute filed Critical University of Electronic Science and Technology of China Zhongshan Institute
Priority to CN202211706024.XA priority Critical patent/CN116415485A/en
Publication of CN116415485A publication Critical patent/CN116415485A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Geometry (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation, which comprises the following steps: 1) Providing existing source domain and target domain degradation data; 2) Preprocessing the degradation data; 3) Extracting degradation feature representations of the source domain and target domain degradation data; 4) Aligning the degradation feature distribution of each source domain with that of the target domain to obtain multiple degradation feature representations of the target domain; 5) Fusing the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label. Transfer learning can exploit similarities between data, tasks, or models to apply models and knowledge learned in old domains to new domains. The RUL prediction method based on transfer learning trains a prediction model on existing degradation data sets and transfers the learned knowledge to data sets collected under different working conditions, thereby realizing cross-domain RUL prediction.

Description

Multi-source domain transfer learning remaining useful life prediction method based on dynamic distribution adaptation
Technical Field
The invention relates to the technical field of fault processing, and in particular to a remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation.
Background
With the rapid development of the intelligent industrial age, artificial intelligence techniques have been widely applied to prognostics and health management (PHM) of mechanical equipment, greatly improving the operational reliability of the equipment while reducing manpower and material costs. As one of the key technologies of PHM, remaining useful life prediction uses the condition monitoring data and failure mechanisms of equipment to build a degradation model and analyzes the degradation trend to estimate when the equipment will fail; it has broad prospects in fields such as manufacturing and aerospace. The aim of this technology is to give early warning for equipment that is about to fail, preventing the huge losses and safety problems caused by sudden failures, reducing maintenance costs, and improving operational reliability. However, the accuracy of RUL prediction is susceptible to various uncertainty factors, such as operating conditions, the operating environment, and noise in the monitored data. Therefore, RUL prediction for mechanical equipment remains a challenging task.
In recent years, many RUL prediction methods have been proposed, and they can be classified into three categories: model-based methods, data-driven methods, and hybrid methods combining the two. Model-based methods build a mathematical or physical degradation model from the system failure mechanism to predict the RUL of a mechanical component. Data-driven methods use large amounts of historical data to extract degradation features of the equipment, establish a mapping between the degradation features and the remaining useful life, and fit a degradation curve to predict the RUL. Hybrid methods use the historical operating data and the failure mechanism of the equipment at the same time, combining the advantages of both approaches. Because model-based methods require extensive prior knowledge, and building corresponding degradation models for complex equipment is very difficult, data-driven methods have become a research hotspot for RUL prediction. Deep learning has advanced rapidly in the field of remaining useful life prediction owing to its strong feature extraction capability and accurate regression analysis, but a key problem remains. RUL prediction for most equipment generally assumes that the test set and the training set come from the same working conditions and follow the same distribution, so the model produces accurate predictions only under those working conditions. In actual operation, however, the working conditions of most equipment differ, and the distributions of the data collected by the sensors differ accordingly, so the accuracy of RUL prediction drops drastically.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides a remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation.
In order to solve the problems, the invention adopts the following technical scheme.
A remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation comprises the following steps:
1) Providing existing source domain and target domain degradation data;
2) Preprocessing the degradation data;
3) Extracting degradation characteristic representations of degradation data of a source domain and a target domain;
4) Aligning degradation characteristic distribution of each source domain and each target domain to obtain multiple degradation characteristic representations of the target domain;
5) Fusing the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label.
As a further improvement of the present invention,
in step 1), the source domain and target domain degradation data:
given existing multisensor degradation data { X } s×b As shown in formula (1).
Figure SMS_1
The degradation data is represented in a matrix form, where s represents the number of sensors that can monitor the degradation state and n is the length of the degradation data, typically on a time period scale, to characterize the lifetime of the device.
As a further improvement of the present invention,
in step 2), since the magnitudes of the monitoring data from the different sensors differ greatly, normalization is required before the data are fed to the model; the degradation data are preprocessed with the max-min normalization method, calculated as shown in (2).
x'_i = (x_i − min(x_i)) / (max(x_i) − min(x_i))    (2)
where max(x_i) and min(x_i) are the maximum and minimum of the i-th characteristic signal x_i in the data sample x, and the normalized data satisfy x'_i ∈ [0, 1]. A sliding time window is then used to convert the degradation data into time-series inputs; the window size is T_w and the time step is t_d. The input data may be expressed as:
[Formula (3): the windowed input sequences obtained by sliding a window of size T_w with step t_d over X]
As a further improvement of the present invention, steps 3) to 5) involve three modules: a degradation feature extraction module, a dynamic distribution adaptation module, and a regression prediction module.
Degradation feature extraction module: this module consists of a common feature extractor and domain-specific feature extractors, and extracts degradation feature representations from the source domain and target domain degradation data.
Common feature extractor: this part mainly extracts low-level feature representations of the multiple source domains and the target domain. The common feature extractor is composed of four convolution blocks for extracting the low-level feature representations of the source and target domains. A convolutional neural network (CNN) is a deep learning structure built on the convolution operation, which allows the network to reduce the feature space, filter the input effectively, and prevent overfitting; through convolution, a CNN can filter out noise in a time series and generate a series of robust features that exclude outliers. The convolution layers use a two-dimensional convolution whose kernel size can be expressed as (kernel_size, 1), so the convolution is performed only along the feature dimension and the temporal dependency along the time dimension is not destroyed.
Domain-specific feature extractor: this part mainly extracts the features unique to each specific domain; the low-level feature representations obtained by each source domain and the target domain from the previous module are passed through their respective domain-specific feature extractors to obtain the final degradation features. Each domain-specific feature extractor consists of four layers of GRU units, through which the low-level representations are converted into high-level feature representations of the source domain and the target domain. The GRU is a variant of the recurrent neural network that introduces a reset gate and an update gate to modify how the hidden state is computed, which alleviates the tendency of gradients in recurrent networks to vanish or explode and better captures the long-range dependencies across time steps in the degradation data, which form a time series.
Dynamic distribution adaptation module: this module dynamically adjusts the influence of the marginal distribution difference and the conditional distribution difference, aligns the degradation feature distribution of each source domain with that of the target domain, and obtains multiple degradation feature representations of the target domain. The dynamic distribution adaptation method recognizes that marginal distribution adaptation and conditional distribution adaptation are not equally important. Dynamic distribution adaptation dynamically adjusts the distance between the two distributions with a balance factor μ:
D(D_s, D_t) ≈ (1 − μ)·D(p_s(x), p_t(x)) + μ·D(p_s(y|x), p_t(y|x))    (4)
When μ is close to 0, the source domain and target domain degradation data differ greatly and marginal distribution adaptation is more important; when μ is close to 1, the source domain and target domain data sets are highly similar and conditional distribution adaptation is more important.
The invention adopts the multi-kernel maximum mean discrepancy (MK-MMD) to measure the marginal distribution difference D(p_s(x), p_t(x)) between the source domain and target domain data; marginal distribution adaptation is achieved by minimizing MK-MMD, which is calculated as shown in (5).
MMD^2(X_s, X_t) = || (1/|X_s|) Σ_{x^s ∈ X_s} φ(x^s) − (1/|X_t|) Σ_{x^t ∈ X_t} φ(x^t) ||_H^2    (5)
where x^s denotes the source domain degradation features obeying distribution p, x^t denotes the target domain degradation features obeying distribution q, and φ(·) is a mapping function that maps the degradation data into a reproducing kernel Hilbert space (RKHS) for measurement. Because φ(·) is difficult to choose, it is not defined explicitly; instead a kernel function is introduced to compute the inner product of φ(·), so the MMD is computed indirectly. The invention adopts a Gaussian kernel function, calculated as shown in (6).
k(x^s, x^t) = exp(−||x^s − x^t||^2 / (2σ^2))    (6)
where σ is the kernel width. In MK-MMD, several values of σ are used to compute several kernel matrices, which are then summed to obtain the final Gaussian kernel matrix.
For the conditional distribution difference D(p_s(y|x), p_t(y|x)), the invention designs a conditional maximum mean discrepancy (CMMD) based on MK-MMD. First, the RUL labels of the degradation data samples are divided into four classes, as shown in (7).
[Formula (7): the rule that maps the RUL label y_RUL (with maximum life cycle Y) to the class label y_cls ∈ {0, 1, 2, 3}]
where y_cls and y_RUL denote the class label and the RUL label, respectively, and Y is the maximum life cycle. The classification scheme treats the degradation data samples in the healthy state as one class and divides the degradation-stage samples into three further classes of progressively increasing degradation. Because the target domain degradation data are unlabeled, during training the model first pretrains a label classifier on the source domain; the target domain data then obtain classification pseudo-labels from this classifier, and as training iterates the classifier accuracy gradually improves, yielding accurate classification labels. The invention uses the cross-entropy loss function to compute the label classification loss for the source domain, as shown in (8).
L_cls = −(1/N_s) Σ_{i=1}^{N_s} Σ_{c=0}^{3} c_i^(c) · log(ĉ_i^(c))    (8)
where N_s is the number of source domain samples, and c_i and ĉ_i denote the true class label and the predicted class label of the i-th source domain sample, respectively. Based on the classification results of the source domain and the target domain, the CMMD is calculated as shown in (9).
CMMD^2(X_s, X_t) = Σ_{c=0}^{3} || (1/|X_s^(c)|) Σ_{x^s ∈ X_s^(c)} φ(x^s) − (1/|X_t^(c)|) Σ_{x^t ∈ X_t^(c)} φ(x^t) ||_H^2    (9)
where c ∈ {0, 1, 2, 3} indexes the degradation data sample classes, and X_s^(c) and X_t^(c) denote the degradation data samples belonging to class c in the source domain and the target domain, respectively.
In the training process of the model, the calculation method of the dynamic distribution adaptive factor mu is shown as (10).
[Formula (10): the estimate of the balance factor μ computed from the A-distances d_A(X_s, X_t) and d_A(X_s^(c), X_t^(c)) described below]
where d_A(X_s, X_t) measures the feature difference between the source domain and target domain degradation data samples and thus reflects how well the marginal distributions of the source and target features are aligned during training, d_A(X_s^(c), X_t^(c)) measures the feature difference between the source and target degradation data samples belonging to class c and thus reflects how well the conditional distributions are aligned during training, and ε denotes the error of a support-vector-based classifier in distinguishing source domain samples from target domain samples.
From equations (5), (9) and (10), the objective function of the dynamic distribution adaptation module is:
L_dda = (1 − μ)·MMD^2(X_s, X_t) + μ·CMMD^2(X_s, X_t)    (12)
By minimizing this objective function, the degradation features of each source domain and the target domain are mapped into the same feature space so that the feature distribution differences are reduced, and multiple degradation feature representations are finally obtained.
Regression prediction module: first, the source domain and target domain degradation features obtained from the dynamic distribution adaptation module are passed through the domain-specific regression predictors to obtain the RUL prediction labels of the source domain and the target domain. The invention uses the RMSE performance evaluation index as the prediction loss function, as shown in (13).
L_pred = sqrt( (1/N_s) Σ_{i=1}^{N_s} (ŷ_i − y_i)^2 )    (13)
Finally, the module fuses the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label, using a mean fusion method. To ensure that the decision boundaries of each domain pair are aligned, identical target samples predicted by different regressors should receive identical predictions, so the model needs to minimize the differences among all domain-specific regressors. The objective function that aligns the prediction results of the regressors is shown in (14).
[Formula (14): the objective that penalizes the discrepancy between the predictions ŷ_i^m and ŷ_i^n of any two domain-specific regressors m and n on the same samples]
where S and N_s denote the number of source domains and the number of samples per source domain, respectively, and ŷ_i^m and ŷ_i^n denote the predictions of regressors m and n for the i-th data sample, respectively.
Joint loss function: the joint loss function of the MDDAN model consists of four parts: the regression prediction loss L_pred, the label classification loss L_cls, the dynamic distribution adaptation objective L_dda, and the regressor prediction alignment objective L_align.
Thus, the joint loss function of the model can be expressed as:
[Formula (15): the joint loss combining L_pred, L_cls, L_dda and L_align with the trade-off coefficient λ and the time-varying coefficient β]
where λ is a trade-off coefficient that controls the relative weight of the corresponding loss terms, and β = 2/(1 + e^(−10×(i+1)/epochs)) − 1 is a time-varying coefficient that changes with each training iteration, i being the current iteration number and epochs the total number of iterations.
In summary, the multi-source domain data and the target domain data are first passed through the common feature extractor to obtain shallow feature representations; the different source domains and the target domain are then passed through the domain-specific feature extractors to obtain the source domain and target domain degradation features; the dynamic distribution adaptation module computes the dynamic distribution difference of the degradation features; the degradation features are then fed to the regression predictors to obtain predictions for each source domain and the target domain, and the errors are computed; the multiple predictions for the target domain are averaged to obtain the final RUL prediction label; finally, the model computes the joint loss error, performs back-propagation with stochastic gradient descent, and optimizes the model parameters, so that the degradation features of each domain pair are aligned, the target domain obtains multiple degradation feature representations, and the accuracy and generalization ability of the prediction model are improved.
The beneficial effects of the invention are that
Compared with the prior art, the invention has the advantages that:
in order to improve the accuracy of cross-condition RUL prediction, existing techniques mostly use single-source-domain transfer learning, but the cross-domain invariant features extracted by single-source-domain transfer learning are limited. There is also a risk of negative transfer if the data distributions of the source domain and the target domain differ too much. The present method uses training data sets collected under different working conditions as source domains and exploits the multiple distributions of these data sets, which avoids the negative transfer caused by an excessive distribution gap between a single source domain and the target domain; the multiple degradation feature representations of the target domain fully characterize complex degradation data, thereby improving the accuracy of cross-condition RUL prediction.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Fig. 2 is a diagram of a convolutional block structure of the present invention.
FIG. 3 is a block diagram of a GRU unit of the invention.
Fig. 4 is a flowchart illustrating the operation of the MDDAN model according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention; it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments, and that all other embodiments obtained by persons of ordinary skill in the art without making creative efforts based on the embodiments in the present invention are within the protection scope of the present invention.
1. Source domain and target domain degradation data
Given existing multi-sensor degradation data X ∈ R^(s×n), as shown in formula (1).
X = [ x_{1,1}  x_{1,2}  …  x_{1,n}
      x_{2,1}  x_{2,2}  …  x_{2,n}
        ⋮        ⋮            ⋮
      x_{s,1}  x_{s,2}  …  x_{s,n} ]    (1)
The degradation data is represented in a matrix form, where s represents the number of sensors that can monitor the degradation state and n is the length of the degradation data, typically on a time period scale, to characterize the lifetime of the device.
2. Data preprocessing strategy
First, since the magnitude and size of the monitoring data of the plurality of sensors vary widely, normalization is required before application to the model. The invention adopts a maximum and minimum normalization method to preprocess degradation data, and a calculation formula is shown as (2).
x'_i = (x_i − min(x_i)) / (max(x_i) − min(x_i))    (2)
where max(x_i) and min(x_i) are the maximum and minimum of the i-th characteristic signal x_i in the data sample x, and the normalized data satisfy x'_i ∈ [0, 1]. A sliding time window is then used to convert the degradation data into time-series inputs; the window size is T_w and the time step is t_d. The input data may be expressed as:
[Formula (3): the windowed input sequences obtained by sliding a window of size T_w with step t_d over X]
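For illustration, the preprocessing described in formulas (2) and (3) can be sketched as follows. This is a minimal example, not part of the patent; the array layout (time steps × sensors), the 14-sensor/200-step sizes, and the window settings T_w = 30, t_d = 1 are assumptions chosen only for the demonstration.

```python
import numpy as np

def min_max_normalize(data: np.ndarray) -> np.ndarray:
    """Normalize each sensor channel of an (n, s) degradation matrix to [0, 1], as in formula (2)."""
    d_min = data.min(axis=0, keepdims=True)
    d_max = data.max(axis=0, keepdims=True)
    return (data - d_min) / (d_max - d_min + 1e-12)  # small epsilon avoids division by zero

def sliding_windows(data: np.ndarray, window_size: int, step: int) -> np.ndarray:
    """Cut an (n, s) series into overlapping windows of shape (window_size, s), as in formula (3)."""
    windows = [data[i:i + window_size]
               for i in range(0, len(data) - window_size + 1, step)]
    return np.stack(windows)  # shape: (num_windows, window_size, s)

# Example with assumed sizes: 14 sensors, 200 time steps, window T_w = 30, step t_d = 1
raw = np.random.rand(200, 14)
inputs = sliding_windows(min_max_normalize(raw), window_size=30, step=1)
print(inputs.shape)  # (171, 30, 14)
```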
3. Transfer learning prediction model based on multi-source dynamic distribution adaptation
The invention provides a multi-source domain transfer learning model based on dynamic distribution adaptation for RUL prediction, abbreviated MDDAN. The model mainly comprises three modules: a degradation feature extraction module, a dynamic distribution adaptation module, and a regression prediction module.
(1) Degradation feature extraction module: this module consists of a common feature extractor and domain-specific feature extractors, and extracts degradation feature representations from the source domain and target domain degradation data.
a. Common feature extractor: this part mainly extracts low-level feature representations of the multiple source domains and the target domain. The common feature extractor is composed of four convolution blocks for extracting the low-level feature representations of the source and target domains. A convolutional neural network (CNN) is a deep learning structure built on the convolution operation, which allows the network to reduce the feature space, filter the input effectively, and prevent overfitting. Furthermore, through convolution a CNN can filter out noise in a time series and generate a series of robust features that exclude outliers. Note that, as shown in FIG. 2, a general convolution block is composed of a convolution layer, a batch normalization layer, and a pooling layer; however, the convolution blocks used by the common feature extractor contain no pooling layer, because pooling would lose the positional information of the time-series data and render the pooled features meaningless as input to the subsequent GRU. The convolution layers use a two-dimensional convolution whose kernel size can be expressed as (kernel_size, 1), so the convolution is performed only along the feature dimension and the temporal dependency along the time dimension is not destroyed.
b. Domain-specific feature extractor: this part mainly extracts the features unique to each specific domain; the low-level feature representations obtained by each source domain and the target domain from the previous module are passed through their respective domain-specific feature extractors to obtain the final degradation features. Each domain-specific feature extractor consists of two layers of GRU units, through which the low-level representations are converted into high-level feature representations of the source domain and the target domain. The GRU is a variant of the recurrent neural network that introduces a reset gate and an update gate to modify how the hidden state is computed, which alleviates the tendency of gradients in recurrent networks to vanish or explode and better captures the long-range dependencies across time steps in the degradation data, which form a time series. As shown in FIG. 3, the GRU controls the flow of information through learnable gates.
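For illustration, a minimal PyTorch sketch of this two-stage feature extractor is given below. It is not the patent's implementation: the channel widths, the GRU hidden size, and the GRU depth are assumptions (this passage mentions two GRU layers while other passages of the document mention four), and only the structural points fixed in the text are reproduced: four convolution blocks without pooling, a (kernel_size, 1) convolution acting only along the feature dimension, and a stacked GRU per domain.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Convolution + batch normalization + ReLU; no pooling, so the time axis is preserved for the GRU."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 5):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=(kernel_size, 1),
                      padding=(kernel_size // 2, 0)),   # convolve along the feature dimension only
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.block(x)

class CommonFeatureExtractor(nn.Module):
    """Four convolution blocks shared by all source domains and the target domain."""
    def __init__(self, channels=(1, 8, 16, 16, 8)):
        super().__init__()
        self.blocks = nn.Sequential(*[ConvBlock(channels[i], channels[i + 1]) for i in range(4)])

    def forward(self, x):              # x: (batch, 1, features, time)
        return self.blocks(x)

class DomainSpecificExtractor(nn.Module):
    """Stacked GRU that turns the shared low-level features into per-domain degradation features."""
    def __init__(self, input_size: int, hidden_size: int = 32, num_layers: int = 2):
        super().__init__()
        self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first=True)

    def forward(self, x):              # x: (batch, time, input_size)
        out, _ = self.gru(x)
        return out[:, -1, :]           # last time step as the degradation feature

# Shape check with assumed sizes: batch 4, 14 sensor features, 30 time steps
low = CommonFeatureExtractor()(torch.randn(4, 1, 14, 30))       # (4, 8, 14, 30)
seq = low.flatten(1, 2).permute(0, 2, 1)                        # (4, 30, 112)
print(DomainSpecificExtractor(input_size=8 * 14)(seq).shape)    # torch.Size([4, 32])
```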
(2) Dynamic distribution adaptation module: this module dynamically adjusts the influence of the marginal distribution difference and the conditional distribution difference, aligns the degradation feature distribution of each source domain with that of the target domain, and obtains multiple degradation feature representations of the target domain. The dynamic distribution adaptation method recognizes that marginal distribution adaptation and conditional distribution adaptation are not equally important; it can adaptively adjust the importance of the marginal and conditional distributions during adaptation according to the actual distribution of the degradation data. Precisely, dynamic distribution adaptation dynamically adjusts the distance between the two distributions with a balance factor μ:
D(D_s, D_t) ≈ (1 − μ)·D(p_s(x), p_t(x)) + μ·D(p_s(y|x), p_t(y|x))    (4)
When μ is close to 0, the source domain and target domain degradation data differ greatly and marginal distribution adaptation is more important; when μ is close to 1, the source domain and target domain data sets are highly similar and conditional distribution adaptation is more important.
The invention adopts the multi-kernel maximum mean discrepancy (MK-MMD) to measure the marginal distribution difference D(p_s(x), p_t(x)) between the source domain and target domain data; marginal distribution adaptation is achieved by minimizing MK-MMD, which is calculated as shown in (5).
MMD^2(X_s, X_t) = || (1/|X_s|) Σ_{x^s ∈ X_s} φ(x^s) − (1/|X_t|) Σ_{x^t ∈ X_t} φ(x^t) ||_H^2    (5)
where x^s denotes the source domain degradation features obeying distribution p, and x^t denotes the target domain degradation features obeying distribution q. φ(·) is a mapping function that maps the degradation data into a reproducing kernel Hilbert space (RKHS) for measurement. However, φ(·) is difficult to choose and is not defined explicitly; instead, a kernel function is introduced to compute the inner product of φ(·), so the MMD is computed indirectly. The invention uses a Gaussian kernel function (RBF kernel), calculated as shown in (6).
k(x^s, x^t) = exp(−||x^s − x^t||^2 / (2σ^2))    (6)
where σ is the kernel width. In MK-MMD, several values of σ are used to compute several kernel matrices, which are then summed to obtain the final Gaussian kernel matrix.
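For illustration, MK-MMD with the multi-bandwidth Gaussian kernel of formulas (5) and (6) can be sketched in a few lines; the bandwidth values and the use of the biased estimator (diagonal terms included) are assumptions of this sketch.

```python
import torch

def gaussian_kernel(a: torch.Tensor, b: torch.Tensor, sigmas) -> torch.Tensor:
    """Sum of Gaussian kernels k(a_i, b_j) over several bandwidths sigma, as in formula (6)."""
    dist = torch.cdist(a, b).pow(2)                       # squared Euclidean distances
    return sum(torch.exp(-dist / (2.0 * s ** 2)) for s in sigmas)

def mk_mmd(xs: torch.Tensor, xt: torch.Tensor, sigmas=(1.0, 2.0, 4.0, 8.0)) -> torch.Tensor:
    """Kernel-trick estimate of MMD^2 between source features xs and target features xt (formula (5))."""
    k_ss = gaussian_kernel(xs, xs, sigmas).mean()
    k_tt = gaussian_kernel(xt, xt, sigmas).mean()
    k_st = gaussian_kernel(xs, xt, sigmas).mean()
    return k_ss + k_tt - 2.0 * k_st

xs, xt = torch.randn(64, 32), torch.randn(64, 32) + 0.5   # stand-in degradation features
print(mk_mmd(xs, xt))   # grows as the two feature distributions drift apart
```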
For the conditional distribution difference D(p_s(y|x), p_t(y|x)), the invention designs a conditional maximum mean discrepancy (CMMD) based on MK-MMD. First, the RUL labels of the degradation data samples are divided into four classes, as shown in (7).
[Formula (7): the rule that maps the RUL label y_RUL (with maximum life cycle Y) to the class label y_cls ∈ {0, 1, 2, 3}]
where y_cls and y_RUL denote the class label and the RUL label, respectively, and Y is the maximum life cycle. The classification scheme treats the degradation data samples in the healthy state as one class and divides the degradation-stage samples into three further classes of progressively increasing degradation. Because the target domain degradation data are unlabeled, during training the model first pretrains a label classifier on the source domain; the target domain data then obtain classification pseudo-labels from this classifier, and as training iterates the classifier accuracy gradually improves, yielding accurate classification labels. The invention uses the cross-entropy loss function to compute the label classification loss for the source domain, as shown in (8).
L_cls = −(1/N_s) Σ_{i=1}^{N_s} Σ_{c=0}^{3} c_i^(c) · log(ĉ_i^(c))    (8)
where N_s is the number of source domain samples, and c_i and ĉ_i denote the true class label and the predicted class label of the i-th source domain sample, respectively. Based on the classification results of the source domain and the target domain, the CMMD is calculated as shown in (9).
CMMD^2(X_s, X_t) = Σ_{c=0}^{3} || (1/|X_s^(c)|) Σ_{x^s ∈ X_s^(c)} φ(x^s) − (1/|X_t^(c)|) Σ_{x^t ∈ X_t^(c)} φ(x^t) ||_H^2    (9)
where c ∈ {0, 1, 2, 3} indexes the degradation data sample classes, and X_s^(c) and X_t^(c) denote the degradation data samples belonging to class c in the source domain and the target domain, respectively.
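For illustration, the class bucketing and the per-class CMMD accumulation of formulas (7)-(9) can be sketched as follows; the equal-width thresholds 0.25/0.5/0.75 and the maximum life of 125 cycles are assumptions (the text only fixes one healthy class plus three degradation stages), and mk_mmd is the function from the MK-MMD sketch above.

```python
import torch

def rul_to_class(y_rul: torch.Tensor, max_life: float) -> torch.Tensor:
    """Map an RUL label to classes 0..3 (0 = healthy, 3 = most degraded), in the spirit of formula (7)."""
    degradation = 1.0 - (y_rul / max_life).clamp(0.0, 1.0)
    return torch.bucketize(degradation, torch.tensor([0.25, 0.5, 0.75]))

def cmmd(xs, cls_s, xt, cls_t, mmd_fn) -> torch.Tensor:
    """Accumulate MMD over samples sharing the same (pseudo-)class c in both domains (formula (9))."""
    loss = xs.new_zeros(())
    for c in range(4):
        xs_c, xt_c = xs[cls_s == c], xt[cls_t == c]
        if len(xs_c) > 1 and len(xt_c) > 1:               # skip empty or degenerate classes
            loss = loss + mmd_fn(xs_c, xt_c)
    return loss

# cls_t would come from the source-pretrained classifier (pseudo-labels);
# random tensors stand in here for real degradation features.
xs, xt = torch.randn(64, 32), torch.randn(64, 32)
cls_s = rul_to_class(torch.rand(64) * 125.0, max_life=125.0)
cls_t = torch.randint(0, 4, (64,))                        # stand-in pseudo-labels
print(cmmd(xs, cls_s, xt, cls_t, mk_mmd))                 # mk_mmd from the previous sketch
```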
In the training process of the model, the calculation method of the dynamic distribution adaptive factor mu is shown as (10).
[Formula (10): the estimate of the balance factor μ computed from the A-distances d_A(X_s, X_t) and d_A(X_s^(c), X_t^(c)) defined in (11)]
d_A(X_s, X_t) = 2(1 − 2ε(X_s, X_t))    (11)
where d_A(X_s, X_t) measures the feature difference between the source domain and target domain degradation data samples and thus reflects how well the marginal distributions of the source and target features are aligned during training, d_A(X_s^(c), X_t^(c)) measures the feature difference between the source and target degradation data samples belonging to class c and thus reflects how well the conditional distributions are aligned during training, and ε denotes the error of a support-vector-based classifier in distinguishing source domain samples from target domain samples.
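For illustration, the proxy A-distance of formula (11) can be estimated with a linear support-vector classifier that tries to separate source features from target features; its error ε then gives d_A = 2(1 − 2ε). The combination of the global and per-class distances into μ shown in estimate_mu is an assumption of this sketch; the patent's exact expression is formula (10).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

def proxy_a_distance(xs: np.ndarray, xt: np.ndarray) -> float:
    """d_A = 2 * (1 - 2 * eps), eps = error of an SVM separating source from target samples (formula (11))."""
    x = np.concatenate([xs, xt])
    y = np.concatenate([np.zeros(len(xs)), np.ones(len(xt))])
    x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.5, random_state=0)
    eps = 1.0 - LinearSVC(max_iter=5000).fit(x_tr, y_tr).score(x_te, y_te)
    return 2.0 * (1.0 - 2.0 * eps)

def estimate_mu(xs, xt, cls_s, cls_t, num_classes: int = 4) -> float:
    """Assumed combination: weigh the marginal d_A against the per-class (conditional) d_A values."""
    d_marginal = proxy_a_distance(xs, xt)
    d_cond = [proxy_a_distance(xs[cls_s == c], xt[cls_t == c])
              for c in range(num_classes)
              if (cls_s == c).sum() > 1 and (cls_t == c).sum() > 1]
    return float(np.clip(1.0 - d_marginal / (d_marginal + sum(d_cond) + 1e-12), 0.0, 1.0))

xs, xt = np.random.randn(200, 32), np.random.randn(200, 32) + 0.3
cls_s, cls_t = np.random.randint(0, 4, 200), np.random.randint(0, 4, 200)
print(proxy_a_distance(xs, xt), estimate_mu(xs, xt, cls_s, cls_t))
```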
From equations (5), (9) and (10), the objective function of the dynamic distribution adaptation module is:
L_dda = (1 − μ)·MMD^2(X_s, X_t) + μ·CMMD^2(X_s, X_t)    (12)
By minimizing this objective function, the degradation features of each source domain and the target domain are mapped into the same feature space so that the feature distribution differences are reduced, and multiple degradation feature representations are finally obtained.
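Under the reconstruction of formula (12) given above, the module's objective is a μ-weighted sum of the two discrepancies; a short sketch reusing the mk_mmd and cmmd functions from the earlier snippets:

```python
def dda_loss(xs, cls_s, xt, cls_t, mu: float):
    """(1 - mu) * marginal MK-MMD + mu * conditional CMMD, following formula (12)."""
    marginal = mk_mmd(xs, xt)                          # marginal discrepancy, formula (5)
    conditional = cmmd(xs, cls_s, xt, cls_t, mk_mmd)   # conditional discrepancy, formula (9)
    return (1.0 - mu) * marginal + mu * conditional
```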
(3) Regression prediction module: first, the source domain and target domain degradation features obtained from the dynamic distribution adaptation module are passed through the domain-specific regression predictors to obtain the RUL prediction labels of the source domain and the target domain. The invention uses the RMSE performance evaluation index as the prediction loss function, as shown in (13).
L_pred = sqrt( (1/N_s) Σ_{i=1}^{N_s} (ŷ_i − y_i)^2 )    (13)
Finally, the module fuses the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label. The invention adopts a mean fusion method to ensure that the decision boundaries of each domain pair are aligned; identical target samples predicted by different regressors should receive identical predictions. Thus, the model needs to minimize the differences among all domain-specific regressors. The objective function that aligns the prediction results of the regressors is shown in (14).
[Formula (14): the objective that penalizes the discrepancy between the predictions ŷ_i^m and ŷ_i^n of any two domain-specific regressors m and n on the same samples]
where S and N_s denote the number of source domains and the number of samples per source domain, respectively, and ŷ_i^m and ŷ_i^n denote the predictions of regressors m and n for the i-th data sample, respectively.
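For illustration, the per-domain regressors, the mean fusion, and a pairwise disagreement penalty in the spirit of formula (14) can be sketched as follows; the small fully connected regressor, the feature size of 32, and the absolute-difference form of the disagreement term are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class DomainRegressor(nn.Module):
    """Small fully connected RUL predictor attached to one specific source domain."""
    def __init__(self, feat_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)

def fuse_and_align(target_feats, regressors):
    """Return the averaged target RUL and the pairwise disagreement among domain-specific regressors."""
    preds = torch.stack([reg(f) for reg, f in zip(regressors, target_feats)])   # (S, N)
    fused = preds.mean(dim=0)                                                   # mean fusion
    s = preds.shape[0]
    disagreement = sum(torch.mean(torch.abs(preds[m] - preds[n]))
                       for m in range(s) for n in range(m + 1, s))
    return fused, disagreement

regressors = [DomainRegressor() for _ in range(3)]            # three source domains (assumed)
target_feats = [torch.randn(64, 32) for _ in range(3)]        # per-domain-pair target features
rul_pred, align_loss = fuse_and_align(target_feats, regressors)
print(rul_pred.shape, float(align_loss))
```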
4. Joint loss function
The joint loss function of the MDDAN model consists of four parts: the regression prediction loss L_pred, the label classification loss L_cls, the dynamic distribution adaptation objective L_dda, and the regressor prediction alignment objective L_align.
Thus, the joint loss function of the model can be expressed as:
[Formula (15): the joint loss combining L_pred, L_cls, L_dda and L_align with the trade-off coefficient λ and the time-varying coefficient β]
where λ is a trade-off coefficient that controls the relative weight of the corresponding loss terms, and β = 2/(1 + e^(−10×(i+1)/epochs)) − 1 is a time-varying coefficient that changes with each training iteration, i being the current iteration number and epochs the total number of iterations.
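For illustration, the β schedule quoted above and one assumed way of combining the four losses can be written as follows; which terms λ and β actually scale is fixed by formula (15) in the patent and is treated here as an assumption.

```python
import math

def beta_schedule(i: int, epochs: int) -> float:
    """beta = 2 / (1 + exp(-10 * (i + 1) / epochs)) - 1; ramps from near 0 toward 1 over training."""
    return 2.0 / (1.0 + math.exp(-10.0 * (i + 1) / epochs)) - 1.0

def joint_loss(l_pred, l_cls, l_dda, l_align, lam: float = 1.0, beta: float = 1.0):
    """Assumed combination of the four MDDAN loss terms."""
    return l_pred + l_cls + lam * l_dda + beta * l_align

print([round(beta_schedule(i, 50), 3) for i in (0, 10, 25, 49)])   # monotone ramp toward 1
```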
5. RUL prediction procedure
The overall operation flow of the MDDAN model proposed by the invention is shown in FIG. 4. The multi-source domain data and the target domain data are first passed through the common feature extractor to obtain shallow feature representations; the different source domains and the target domain are then passed through the domain-specific feature extractors to obtain the source domain and target domain degradation features, and the dynamic distribution difference of the degradation features is computed in the dynamic distribution adaptation module. The degradation features are then fed to the regression predictors to obtain the prediction results for each source domain and the target domain, and the errors are computed. The multiple prediction results of the target domain are averaged to obtain the final RUL prediction label. Finally, the model computes the joint loss error, performs back-propagation with stochastic gradient descent, and optimizes the model parameters, so that the degradation features of each domain pair are aligned, the target domain obtains multiple degradation feature representations, and the accuracy and generalization ability of the prediction model are improved.
The above is merely a preferred embodiment of the present invention, and the scope of the invention is not limited thereto. Any modification or equivalent substitution that a person skilled in the art can make within the technical scope of the present disclosure, based on the technical solution of the present invention, falls within the protection scope of the present invention.

Claims (4)

1. A remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation, characterized by comprising the following steps:
1) Providing existing source domain and target domain degradation data;
2) Preprocessing the degradation data;
3) Extracting degradation characteristic representations of degradation data of a source domain and a target domain;
4) Aligning degradation characteristic distribution of each source domain and each target domain to obtain multiple degradation characteristic representations of the target domain;
5) Fusing the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label.
2. The remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation according to claim 1, characterized in that:
in step 1), the source domain and target domain degradation data:
given existing multi-sensor degradation data X ∈ R^(s×n), as shown in formula (1).
X = [ x_{1,1}  x_{1,2}  …  x_{1,n}
      x_{2,1}  x_{2,2}  …  x_{2,n}
        ⋮        ⋮            ⋮
      x_{s,1}  x_{s,2}  …  x_{s,n} ]    (1)
The degradation data is represented in a matrix form, where s represents the number of sensors that can monitor the degradation state and n is the length of the degradation data, typically on a time period scale, to characterize the lifetime of the device.
3. The remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation according to claim 1, characterized in that:
in step 2), since the magnitudes of the monitoring data from the different sensors differ greatly, normalization is required before the data are fed to the model; the degradation data are preprocessed with the max-min normalization method, calculated as shown in (2).
x'_i = (x_i − min(x_i)) / (max(x_i) − min(x_i))    (2)
where max(x_i) and min(x_i) are the maximum and minimum of the i-th characteristic signal x_i in the data sample x, and the normalized data satisfy x'_i ∈ [0, 1]. A sliding time window is then used to convert the degradation data into time-series inputs; the window size is T_w and the time step is t_d. The input data may be expressed as:
[Formula (3): the windowed input sequences obtained by sliding a window of size T_w with step t_d over X]
4. The remaining useful life prediction method based on multi-source domain transfer learning with dynamic distribution adaptation according to claim 1, characterized in that:
steps 3) to 5) involve three modules: a degradation feature extraction module, a dynamic distribution adaptation module, and a regression prediction module.
Degradation feature extraction module: this module consists of a common feature extractor and domain-specific feature extractors, and extracts degradation feature representations from the source domain and target domain degradation data.
Common feature extractor: this part mainly extracts low-level feature representations of the multiple source domains and the target domain; the common feature extractor is composed of four convolution blocks for extracting the low-level feature representations of the source and target domains. A convolutional neural network (CNN) is a deep learning structure built on the convolution operation, which allows the network to reduce the feature space, filter the input effectively, and prevent overfitting. Furthermore, through convolution a CNN can filter out noise in a time series and generate a series of robust features that exclude outliers. The convolution layers of this part use a two-dimensional convolution whose kernel size can be expressed as (kernel_size, 1), so the convolution is performed only along the feature dimension and the temporal dependency along the time dimension is not destroyed.
Domain-specific feature extractor: this part mainly extracts the features unique to each specific domain; the low-level feature representations obtained by each source domain and the target domain from the previous module are passed through their respective domain-specific feature extractors to obtain the final degradation features. Each domain-specific feature extractor consists of four layers of GRU units, through which the low-level representations are converted into high-level feature representations of the source domain and the target domain. The GRU is a variant of the recurrent neural network that introduces a reset gate and an update gate to modify how the hidden state is computed, which alleviates the tendency of gradients in recurrent networks to vanish or explode and better captures the long-range dependencies across time steps in the degradation data, which form a time series.
Dynamic distribution adaptation module: this module dynamically adjusts the influence of the marginal distribution difference and the conditional distribution difference, aligns the degradation feature distribution of each source domain with that of the target domain, and obtains multiple degradation feature representations of the target domain. The dynamic distribution adaptation method recognizes that marginal distribution adaptation and conditional distribution adaptation are not equally important; it can adaptively adjust the importance of the marginal and conditional distributions during adaptation according to the actual distribution of the degradation data. Dynamic distribution adaptation dynamically adjusts the distance between the two distributions with a balance factor μ:
D(D_s, D_t) ≈ (1 − μ)·D(p_s(x), p_t(x)) + μ·D(p_s(y|x), p_t(y|x))    (4)
When μ is close to 0, the source domain and target domain degradation data differ greatly and marginal distribution adaptation is more important; when μ is close to 1, the source domain and target domain data sets are highly similar and conditional distribution adaptation is more important.
The invention adopts the multi-kernel maximum mean discrepancy (MK-MMD) to measure the marginal distribution difference D(p_s(x), p_t(x)) between the source domain and target domain data; marginal distribution adaptation is achieved by minimizing MK-MMD, which is calculated as shown in (5).
MMD^2(X_s, X_t) = || (1/|X_s|) Σ_{x^s ∈ X_s} φ(x^s) − (1/|X_t|) Σ_{x^t ∈ X_t} φ(x^t) ||_H^2    (5)
where x^s denotes the source domain degradation features obeying distribution p, x^t denotes the target domain degradation features obeying distribution q, and φ(·) is a mapping function that maps the degradation data into a reproducing kernel Hilbert space (RKHS) for measurement. Because φ(·) is difficult to choose, it is not defined explicitly; instead a kernel function is introduced to compute the inner product of φ(·), so the MMD is computed indirectly. The invention adopts a Gaussian kernel function, calculated as shown in (6).
k(x^s, x^t) = exp(−||x^s − x^t||^2 / (2σ^2))    (6)
where σ is the kernel width. In MK-MMD, several values of σ are used to compute several kernel matrices, which are then summed to obtain the final Gaussian kernel matrix.
For the conditional distribution difference D(p_s(y|x), p_t(y|x)), a conditional maximum mean discrepancy (CMMD) based on MK-MMD is designed. First, the RUL labels of the degradation data samples are divided into four classes, as shown in (7).
[Formula (7): the rule that maps the RUL label y_RUL (with maximum life cycle Y) to the class label y_cls ∈ {0, 1, 2, 3}]
where y_cls and y_RUL denote the class label and the RUL label, respectively, and Y is the maximum life cycle. The classification scheme treats the degradation data samples in the healthy state as one class and divides the degradation-stage samples into three further classes of progressively increasing degradation. Because the target domain degradation data are unlabeled, during training the model first pretrains a label classifier on the source domain; the target domain data then obtain classification pseudo-labels from this classifier, and as training iterates the classifier accuracy gradually improves, yielding accurate classification labels. The invention uses the cross-entropy loss function to compute the label classification loss for the source domain, as shown in (8).
L_cls = −(1/N_s) Σ_{i=1}^{N_s} Σ_{c=0}^{3} c_i^(c) · log(ĉ_i^(c))    (8)
where N_s is the number of source domain samples, and c_i and ĉ_i denote the true class label and the predicted class label of the i-th source domain sample, respectively. Based on the classification results of the source domain and the target domain, the CMMD is calculated as shown in (9).
CMMD^2(X_s, X_t) = Σ_{c=0}^{3} || (1/|X_s^(c)|) Σ_{x^s ∈ X_s^(c)} φ(x^s) − (1/|X_t^(c)|) Σ_{x^t ∈ X_t^(c)} φ(x^t) ||_H^2    (9)
where c ∈ {0, 1, 2, 3} indexes the degradation data sample classes, and X_s^(c) and X_t^(c) denote the degradation data samples belonging to class c in the source domain and the target domain, respectively.
In the training process of the model, the calculation method of the dynamic distribution adaptive factor mu is shown as (10),
[Formula (10): the estimate of the balance factor μ computed from the A-distances d_A(X_s, X_t) and d_A(X_s^(c), X_t^(c)) defined in (11)]
d_A(X_s, X_t) = 2(1 − 2ε(X_s, X_t))    (11)
where d_A(X_s, X_t) measures the feature difference between the source domain and target domain degradation data samples and thus reflects how well the marginal distributions of the source and target features are aligned during training, d_A(X_s^(c), X_t^(c)) measures the feature difference between the source and target degradation data samples belonging to class c and thus reflects how well the conditional distributions are aligned during training, and ε denotes the error of a support-vector-based classifier in distinguishing source domain samples from target domain samples.
From equations (5), (9) and (10), the objective function of the dynamic distribution adaptation module is:
L_dda = (1 − μ)·MMD^2(X_s, X_t) + μ·CMMD^2(X_s, X_t)    (12)
By minimizing this objective function, the degradation features of each source domain and the target domain are mapped into the same feature space so that the feature distribution differences are reduced, and the target domain finally obtains multiple degradation feature representations.
Regression prediction module: first, the source domain and target domain degradation features obtained from the dynamic distribution adaptation module are passed through the domain-specific regression predictors to obtain the RUL prediction labels of the source domain and the target domain. The invention uses the RMSE performance evaluation index as the prediction loss function, as shown in (13).
L_pred = sqrt( (1/N_s) Σ_{i=1}^{N_s} (ŷ_i − y_i)^2 )    (13)
Finally, the module fuses the RUL labels obtained from the multiple degradation feature representations of the target domain through the domain-specific predictors into the final RUL prediction label. The module adopts a mean fusion method to ensure that the decision boundaries of each domain pair are aligned; identical target samples predicted by different regressors should receive identical predictions, so the model needs to minimize the differences among all domain-specific regressors. The objective function that aligns the prediction results of the regressors is shown in (14).
[Formula (14): the objective that penalizes the discrepancy between the predictions ŷ_i^m and ŷ_i^n of any two domain-specific regressors m and n on the same samples]
where S and N_s denote the number of source domains and the number of samples per source domain, respectively, and ŷ_i^m and ŷ_i^n denote the predictions of regressors m and n for the i-th data sample, respectively.
Joint loss function: the joint loss function of the MDDAN model consists of four parts: the regression prediction loss L_pred, the label classification loss L_cls, the dynamic distribution adaptation objective L_dda, and the regressor prediction alignment objective L_align.
Thus, the joint loss function of the model can be expressed as:
[Formula (15): the joint loss combining L_pred, L_cls, L_dda and L_align with the trade-off coefficient λ and the time-varying coefficient β]
where λ is a trade-off coefficient that controls the relative weight of the corresponding loss terms, and β = 2/(1 + e^(−10×(i+1)/epochs)) − 1 is a time-varying coefficient that changes with each training iteration, i being the current iteration number and epochs the total number of iterations.
In summary, the multi-source domain data and the target domain data are first passed through the common feature extractor to obtain shallow feature representations; the different source domains and the target domain are then passed through the domain-specific feature extractors to obtain the source domain and target domain degradation features; the dynamic distribution adaptation module computes the dynamic distribution difference of the degradation features; the degradation features are then fed to the regression predictors to obtain predictions for each source domain and the target domain, and the errors are computed; the multiple predictions for the target domain are averaged to obtain the final RUL prediction label; finally, the model computes the joint loss error, performs back-propagation with stochastic gradient descent, and optimizes the model parameters, so that the degradation features of each domain pair are aligned, the target domain obtains multiple degradation feature representations, and the accuracy and generalization ability of the prediction model are improved.
CN202211706024.XA 2022-12-29 2022-12-29 Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption Pending CN116415485A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211706024.XA CN116415485A (en) 2022-12-29 2022-12-29 Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211706024.XA CN116415485A (en) 2022-12-29 2022-12-29 Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption

Publications (1)

Publication Number Publication Date
CN116415485A true CN116415485A (en) 2023-07-11

Family

ID=87048666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211706024.XA Pending CN116415485A (en) 2022-12-29 2022-12-29 Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption

Country Status (1)

Country Link
CN (1) CN116415485A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117252083A (en) * 2023-07-12 2023-12-19 中国科学院空间应用工程与技术中心 Bearing residual life prediction method and system combining degradation phase division and sub-domain self-adaption



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination