CN116415485A - Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption
- Publication number
- CN116415485A (application CN202211706024.XA)
- Authority
- CN
- China
- Prior art keywords
- domain
- degradation
- data
- target domain
- source
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2119/00—Details relating to the type or aim of the analysis or the optimisation
- G06F2119/02—Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]
Abstract
The invention discloses a remaining useful life (RUL) prediction method using multi-source domain transfer learning based on dynamic distribution adaptation, which comprises the following steps: 1) obtaining existing degradation data for the source domains and the target domain; 2) preprocessing the degradation data; 3) extracting degradation feature representations from the source domain and target domain degradation data; 4) aligning the degradation feature distribution of each source domain with that of the target domain to obtain multiple degradation feature representations of the target domain; 5) fusing the RUL labels that the domain-specific predictors produce from the multiple degradation features of the target domain into the final RUL prediction label. Transfer learning exploits similarities between data, tasks, or models to apply models and knowledge learned in an old domain to a new domain. The transfer-learning-based RUL prediction method trains a prediction model on existing degradation data sets and transfers the learned knowledge to data sets collected under different working conditions, achieving cross-domain RUL prediction.
Description
Technical Field
The invention relates to the technical field of fault processing, and in particular to a remaining useful life (RUL) prediction method using multi-source domain transfer learning based on dynamic distribution adaptation.
Background
With the rapid development of the intelligent industrial age, artificial intelligence technology has been widely applied to prognostics and health management (PHM) of mechanical equipment, greatly improving operational reliability while reducing labor and material costs. As one of the key technologies of PHM, remaining useful life (RUL) prediction uses the condition monitoring data and failure mechanism of the equipment to build a degradation model and analyzes the degradation trend to estimate the time of failure; it has broad prospects in manufacturing, aerospace, and other fields. The technology aims to give early warning for equipment that is about to fail, preventing the heavy losses and safety problems caused by sudden failure, reducing maintenance costs, and improving operational reliability. However, the accuracy of RUL prediction is easily affected by uncertainty factors such as operating conditions, operating environment, and noise in the monitoring data. RUL prediction for mechanical equipment therefore remains a challenging task.
In recent years many RUL prediction methods have been proposed. They fall into three categories: model-based methods, data-driven methods, and hybrid methods that combine the two. Model-based methods build a mathematical or physical degradation model from the system failure mechanism to predict the RUL of a mechanical component. Data-driven methods use a large amount of historical data to extract the degradation features of the equipment, establish a mapping between the degradation features and the remaining useful life, and fit a degradation curve to predict the RUL. Hybrid methods use both the historical operating data and the failure mechanism of the equipment, combining the advantages of the two approaches. Because model-based methods require a great deal of prior knowledge, and because building a degradation model for complex equipment is very difficult, data-driven methods have become the research focus of RUL prediction. Deep learning has advanced RUL prediction thanks to its strong feature extraction and accurate regression capabilities, but a key problem remains. RUL prediction for most equipment assumes that the test set and the training set come from the same operating conditions and follow the same distribution, so the model gives accurate predictions only under those conditions. In actual operation, however, the working conditions of most equipment differ, and the distributions of the data collected by the sensors differ accordingly, which sharply reduces the accuracy of RUL prediction.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide a remaining useful life (RUL) prediction method using multi-source domain transfer learning based on dynamic distribution adaptation.
In order to solve the problems, the invention adopts the following technical scheme.
A remaining useful life (RUL) prediction method using multi-source domain transfer learning based on dynamic distribution adaptation comprises the following steps:
1) Obtaining existing degradation data for the source domains and the target domain;
2) Preprocessing the degradation data;
3) Extracting degradation feature representations from the source domain and target domain degradation data;
4) Aligning the degradation feature distribution of each source domain with that of the target domain to obtain multiple degradation feature representations of the target domain;
5) Fusing the RUL labels that the domain-specific predictors produce from the multiple degradation features of the target domain into the final RUL prediction label.
As a further improvement of the present invention,
in step 1), the source domain and target domain degradation data are as follows:
the existing multi-sensor degradation data $\{X\}_{s \times n}$ are given as shown in formula (1).
The degradation data are represented as a matrix, where $s$ is the number of sensors that can monitor the degradation state and $n$ is the length of the degradation data, typically measured in time cycles, which characterizes the lifetime of the device.
As a further improvement of the present invention,
in step 2), because the monitoring data of the sensors differ greatly in magnitude and range, they must be normalized before being fed to the model. The degradation data are preprocessed with min-max normalization, calculated as shown in formula (2):
$x'_i = \dfrac{x_i - \min(x_i)}{\max(x_i) - \min(x_i)} \quad (2)$
where $\max(x_i)$ and $\min(x_i)$ are the maximum and minimum of the $i$-th feature signal $x_i$ in the data sample $x$, and the normalized data satisfy $x'_i \in [0, 1]$. A sliding time window is then used to convert the degradation data into time-series input. The input time window has size $T_w$ and the time step is $t_d$. The input data may be expressed as:
As a further improvement of the present invention, steps 3) to 5) involve three modules: a degradation feature extraction module, a dynamic distribution adaptation module, and a regression prediction module.
Degradation feature extraction module: this module consists of a common feature extractor and domain-specific feature extractors, and extracts degradation feature representations from the source domain and target domain degradation data.
Common feature extractor: this part extracts the low-level feature representations shared by the multiple source domains and the target domain. It consists of four convolution blocks. A convolutional neural network (CNN) is a deep learning structure built on the convolution operation, which lets the network reduce the feature space, filter the input effectively, and prevent overfitting. Through convolution, a CNN can also filter noise in a time series and produce robust features that exclude outliers. The convolution layer uses a two-dimensional convolution whose kernel size can be expressed as (kernel_size, 1), so the convolution runs only along the feature dimension and does not destroy the temporal dependencies along the time dimension.
Domain-specific feature extractor: this part extracts the features unique to each domain. The low-level feature representations that each source domain and the target domain obtain from the previous module are passed through their respective domain-specific feature extractors to produce the final degradation features. The domain-specific feature extractor consists of four layers of GRU units, which turn the low-level feature representation into a high-level feature representation of the source domain and the target domain. The GRU is a variant of the recurrent neural network that introduces a reset gate and an update gate to change how the hidden state is computed. This alleviates the vanishing and exploding gradients common in recurrent neural networks and better captures dependencies that span large time-step distances in the degradation time series.
Dynamic distribution adaptation module: this module dynamically adjusts the influence of the marginal distribution difference and the conditional distribution difference, aligns the degradation feature distribution of each source domain with that of the target domain, and obtains multiple degradation feature representations of the target domain. The dynamic distribution adaptation method holds that marginal distribution adaptation and conditional distribution adaptation are not equally important. It adjusts the distance between two distributions dynamically through a balancing factor $\mu$:
$D(D_s, D_t) \approx (1 - \mu)\, D(p_s(x), p_t(x)) + \mu\, D(p_s(y|x), p_t(y|x)) \quad (4)$
When $\mu$ is close to 0, the source domain and target domain degradation data differ greatly and marginal distribution adaptation is more important; when $\mu$ is close to 1, the source domain and target domain data sets are highly similar and conditional distribution adaptation is more important.
The invention uses the multi-kernel maximum mean discrepancy (MK-MMD) to measure the marginal distribution difference $D(p_s(x), p_t(x))$ between the source domain and target domain data. Marginal distribution adaptation is achieved by minimizing MK-MMD, which is calculated as shown in (5).
Here $x_s$ denotes the source domain degradation features, which follow distribution $p$, and $x_t$ denotes the target domain degradation features, which follow distribution $q$. $\phi(\cdot)$ is a mapping function that maps the degradation data into a reproducing kernel Hilbert space (RKHS) for measurement. Because a suitable $\phi(\cdot)$ is difficult to choose, it is not defined explicitly; instead, a kernel function is introduced to compute the inner product of $\phi(\cdot)$, and the MMD is calculated indirectly. The invention uses a Gaussian kernel function, calculated as shown in (6).
Here $\sigma$ is the kernel width. In MK-MMD, several values of $\sigma$ are used to compute several kernel matrices, which are then summed to give the final Gaussian kernel matrix.
For the conditional distribution difference $D(p_s(y|x), p_t(y|x))$, the invention designs a conditional maximum mean discrepancy (CMMD) based on MK-MMD. First, the RUL labels of the degradation data samples are divided into four classes, as shown in (7).
Here $y_{cls}$ and $y_{RUL}$ denote the class label and the RUL label, respectively, and $Y$ is the maximum life cycle. Degradation data samples in the healthy state form one class, and samples in the degradation stage are divided into three classes whose degree of degradation increases from first to last. Because the target domain degradation data are unlabeled, during training the model first pre-trains a label classifier on the source domain, and the target domain data then obtain classification pseudo-labels from this classifier. As training iterates, the accuracy of the classifier gradually improves, so that accurate class labels are obtained. The invention uses the cross-entropy loss function to calculate the label classification loss on the source domain, as shown in (8).
Here $N_s$ is the number of source domain samples, and the two label terms denote the true label ($c_i$) and the class label output by the classifier for the $i$-th source domain sample. Based on the classification results of the source domain and the target domain, the CMMD is calculated as shown in (9).
Here $C = \{c \mid 0, 1, 2, 3\}$ is the set of degradation data sample classes, and the two class-indexed sample terms denote the degradation data samples belonging to class $c$ in the source domain and the target domain, respectively.
During model training, the dynamic distribution adaptation factor $\mu$ is calculated as shown in (10).
Here $d_A(X_s, X_t)$ measures the feature difference between the source domain and target domain degradation data samples and reflects how well the marginal distributions of the source and target features are aligned during training. $d_A(X_s^{(c)}, X_t^{(c)})$ measures the feature difference between the source domain and target domain degradation data samples belonging to class $c$ and reflects how well the conditional distributions are aligned during training. $\varepsilon$ is the error of a support-vector-based classifier in distinguishing source domain data samples from target domain data samples.
From equations (5), (9), and (10), the objective function of the dynamic distribution adaptation module is:
By minimizing this objective function, the degradation features of each source domain and the target domain are mapped into the same feature space, which reduces the feature distribution differences and finally yields multiple degradation feature representations.
Regression prediction module: first, the degradation features of the source domain and the target domain produced by the dynamic distribution adaptation module are passed through the domain-specific regression predictors to obtain RUL prediction labels for the source domain and the target domain. The invention uses the root mean square error (RMSE) evaluation metric as the prediction loss function, as shown in (13).
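As an illustration of the prediction loss, the following minimal Python sketch computes the RMSE between true and predicted RUL labels. It assumes the standard RMSE definition, since formula (13) is not reproduced in this text; the function name `rmse_loss` and the sample values are hypothetical.

```python
import numpy as np

def rmse_loss(y_true, y_pred):
    """Root mean square error between true and predicted RUL values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Illustrative RUL labels (in cycles) and predictions.
true_rul = [120.0, 100.0, 80.0, 60.0]
pred_rul = [118.0, 104.0, 80.0, 57.0]
loss = rmse_loss(true_rul, pred_rul)
print(round(loss, 4))
```

A lower value means the regressor's predictions are closer to the labelled RUL on average, with large individual errors penalized more heavily than small ones.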
Finally, the module uses average fusion to combine the RUL labels that the domain-specific predictors produce from the multiple degradation features of the target domain into the final RUL prediction label. To keep the decision boundaries of the domain pairs aligned, the same target sample should receive the same prediction from different regressors, so the model must minimize the differences among all domain-specific regressors. The objective function for aligning the regressors' predictions is shown in (14).
Here $S$ and $N_s$ are the number of source domains and the number of samples per source domain, respectively, and the two prediction terms denote the predictions of regressors $m$ and $n$ for the $i$-th data sample.
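The fusion and alignment steps can be sketched in Python as follows. Formula (14) is not reproduced in this text, so the sketch assumes one common form: the mean absolute difference between the predictions of every pair of domain-specific regressors on the same target samples. The function names and the sample predictions are hypothetical.

```python
import numpy as np
from itertools import combinations

def average_fusion(preds):
    """Average the predictions of all domain-specific regressors.

    preds has shape (S, N): S regressors, N target samples.
    """
    return np.asarray(preds, dtype=float).mean(axis=0)

def alignment_loss(preds):
    """Mean absolute disagreement over all regressor pairs (assumed form)."""
    preds = np.asarray(preds, dtype=float)
    pairs = list(combinations(range(preds.shape[0]), 2))
    if not pairs:
        return 0.0  # a single regressor cannot disagree with itself
    return float(np.mean([np.abs(preds[m] - preds[n]).mean()
                          for m, n in pairs]))

# Three regressors predicting the RUL of the same two target samples.
preds = [[100.0, 80.0],
         [90.0, 80.0],
         [110.0, 80.0]]
fused = average_fusion(preds)
disagreement = alignment_loss(preds)
print(fused)  # [100.  80.]
```

Driving the alignment term toward zero pushes the regressors to agree on each target sample, so the averaged prediction is not dominated by one poorly aligned source domain.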
Joint loss function: the joint loss function of the MDDAN model consists of four parts: the regression prediction loss, the label classification loss, the dynamic distribution adaptation objective function, and the prediction alignment objective function. The joint loss function of the model can therefore be expressed as:
Here $\lambda$ is a trade-off coefficient that controls the share of the total loss taken by the weighted loss terms, and $\beta = \dfrac{2}{1 + e^{-10 \times (i+1)/epochs}} - 1$ is a time-varying coefficient that changes with each training iteration, where $i$ is the current iteration number and $epochs$ is the total number of iterations.
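The time-varying coefficient can be evaluated directly from its definition. The short Python sketch below only computes the coefficient; it does not combine the four loss terms, because the joint loss expression itself is not reproduced in this text. The function name `beta_schedule` is hypothetical.

```python
import math

def beta_schedule(i, epochs):
    """Time-varying coefficient 2 / (1 + exp(-10 * (i + 1) / epochs)) - 1."""
    if epochs <= 0:
        raise ValueError("epochs must be positive")
    return 2.0 / (1.0 + math.exp(-10.0 * (i + 1) / epochs)) - 1.0

epochs = 100
for i in (0, 9, 49, 99):
    # The coefficient starts near 0 and approaches 1 as training proceeds.
    print(i, round(beta_schedule(i, epochs), 4))
```

Because the coefficient starts near 0 and rises toward 1, the loss terms it weights have little influence early in training and progressively more influence as the features and pseudo-labels become more reliable.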
In summary, the multi-source domain data and the target domain data first pass through the common feature extractor to obtain shallow feature representations. Each source domain and the target domain then pass through the domain-specific feature extractors to obtain the source domain and target domain degradation features. The dynamic distribution adaptation module calculates the dynamic distribution difference of the degradation features. The degradation features are then fed to the regression predictors to obtain predictions for each source domain and the target domain, the errors are calculated, and the multiple predictions for the target domain are averaged to obtain the final RUL prediction label. Finally, the model calculates the joint loss and back-propagates it using stochastic gradient descent to optimize the model parameters. This aligns the degradation features of each domain pair, gives the target domain multiple degradation feature representations, and improves the accuracy and generalization ability of the prediction model.
Beneficial effects of the invention
Compared with the prior art, the invention has the following advantages:
To improve the accuracy of cross-working-condition RUL prediction, existing techniques mostly use single-source-domain transfer learning, but the cross-domain invariant features that single-source-domain transfer learning extracts are limited. There is also a risk of negative transfer if the data distributions of the source domain and the target domain differ too much. The present method uses training data sets from different working conditions as source domains and exploits their multiple distributions, which avoids the negative transfer caused by an excessive distribution difference between a single source domain and the target domain. The multiple degradation features of the target domain represent the complex degradation data more fully, thereby improving the accuracy of cross-working-condition RUL prediction.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
Fig. 2 is a diagram of a convolutional block structure of the present invention.
FIG. 3 is a block diagram of a GRU unit of the invention.
Fig. 4 is a flowchart illustrating the operation of the MDDAN model according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention; it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments, and that all other embodiments obtained by persons of ordinary skill in the art without making creative efforts based on the embodiments in the present invention are within the protection scope of the present invention.
1. Source domain and target domain degradation data
The existing multi-sensor degradation data $\{X\}_{s \times n}$ are given as shown in formula (1).
The degradation data are represented as a matrix, where $s$ is the number of sensors that can monitor the degradation state and $n$ is the length of the degradation data, typically measured in time cycles, which characterizes the lifetime of the device.
2. Data preprocessing strategy
First, because the monitoring data of the sensors differ greatly in magnitude and range, they must be normalized before being fed to the model. The invention preprocesses the degradation data with min-max normalization, calculated as shown in (2):
$x'_i = \dfrac{x_i - \min(x_i)}{\max(x_i) - \min(x_i)} \quad (2)$
where $\max(x_i)$ and $\min(x_i)$ are the maximum and minimum of the $i$-th feature signal $x_i$ in the data sample $x$, and the normalized data satisfy $x'_i \in [0, 1]$. A sliding time window is then used to convert the degradation data into time-series input. The input time window has size $T_w$ and the time step is $t_d$. The resulting windowed samples form the model input, expressed as formula (3).
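The preprocessing described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions: the data are stored as an (s, n) array with one row per sensor, the window size and step stand in for $T_w$ and $t_d$, and the function names and toy values are hypothetical.

```python
import numpy as np

def min_max_normalize(x):
    """Scale each sensor (row) of an (s, n) array to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=1, keepdims=True)
    hi = x.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # constant sensors map to 0
    return (x - lo) / span

def sliding_windows(x, window, step=1):
    """Cut an (s, n) array into an array of shape (num_windows, s, window)."""
    s, n = x.shape
    if window > n:
        raise ValueError("window is longer than the series")
    starts = range(0, n - window + 1, step)
    return np.stack([x[:, i:i + window] for i in starts])

# Toy data: 2 sensors, 6 time steps (values are illustrative only).
raw = np.array([[10.0, 12.0, 14.0, 16.0, 18.0, 20.0],
                [0.5, 0.4, 0.3, 0.2, 0.1, 0.0]])
norm = min_max_normalize(raw)
windows = sliding_windows(norm, window=3, step=1)
print(windows.shape)  # (4, 2, 3)
```

Each window keeps all sensors and a contiguous slice of time, so consecutive windows overlap by `window - step` time steps.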
3. Transfer learning prediction model based on multi-source dynamic distribution adaptation
The invention provides a multi-source domain transfer learning model based on dynamic distribution adaptation for RUL prediction, called MDDAN for short. The model comprises three main modules: a degradation feature extraction module, a dynamic distribution adaptation module, and a regression prediction module.
(1) Degradation feature extraction module: this module consists of a common feature extractor and domain-specific feature extractors, and extracts degradation feature representations from the source domain and target domain degradation data.
a. Common feature extractor: this part extracts the low-level feature representations shared by the multiple source domains and the target domain. It consists of four convolution blocks. A convolutional neural network (CNN) is a deep learning structure built on the convolution operation, which lets the network reduce the feature space, filter the input effectively, and prevent overfitting. Through convolution, a CNN can also filter noise in a time series and produce robust features that exclude outliers. As shown in fig. 2, a general convolution block is composed of a convolution layer, a batch normalization layer, and a pooling layer. The convolution blocks used by the common feature extractor, however, contain no pooling layer: pooling discards the positional information of time-series data, so passing pooled features on to the GRU would be meaningless. The convolution layer uses a two-dimensional convolution whose kernel size can be expressed as (kernel_size, 1), so the convolution runs only along the feature dimension and does not destroy the temporal dependencies along the time dimension.
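To illustrate why a (kernel_size, 1) kernel leaves the time axis intact, the following NumPy sketch applies such a kernel to a small array. It assumes the input is laid out as (features, time) and omits batch normalization, channels, and learned weights; the function name and values are hypothetical.

```python
import numpy as np

def conv_feature_axis(x, kernel):
    """Valid cross-correlation along axis 0 (features) with a (k, 1) kernel.

    x has shape (features, time). Every time step (column) is filtered
    independently, so the ordering along the time axis is untouched.
    """
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float).reshape(-1)
    k = kernel.size
    out_rows = x.shape[0] - k + 1
    if out_rows < 1:
        raise ValueError("kernel is longer than the feature axis")
    out = np.zeros((out_rows, x.shape[1]))
    for j in range(k):
        out += kernel[j] * x[j:j + out_rows, :]
    return out

x = np.arange(12, dtype=float).reshape(4, 3)  # 4 features, 3 time steps
y = conv_feature_axis(x, kernel=[1.0, -1.0])
print(y.shape)  # (3, 3)
```

The output keeps all three time steps, and each output column depends only on the same column of the input, which is the property the text relies on before handing the features to the GRU.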
b. Domain-specific feature extractor: this part extracts the features unique to each domain. The low-level feature representations that each source domain and the target domain obtain from the previous module are passed through their respective domain-specific feature extractors to produce the final degradation features. The domain-specific feature extractor consists of two layers of GRU units, which turn the low-level feature representation into a high-level feature representation of the source domain and the target domain. The GRU is a variant of the recurrent neural network that introduces a reset gate and an update gate to change how the hidden state is computed. This alleviates the vanishing and exploding gradients common in recurrent neural networks and better captures dependencies that span large time-step distances in the degradation time series. As shown in fig. 3, the GRU controls the flow of information through learnable gates.
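The gating mechanism can be sketched with a single-layer GRU cell in NumPy. This is an illustrative sketch of the standard GRU equations, not the configuration used by the invention: the weights are random, the parameter names (`Wz`, `Uz`, and so on) and sizes are hypothetical, and the sign convention for the update gate varies between references.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_gru_params(input_size, hidden_size, seed=0):
    """Small random weights for the update (z), reset (r), and candidate (h) paths."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in ("z", "r", "h"):
        p["W" + gate] = rng.normal(0.0, 0.1, (hidden_size, input_size))
        p["U" + gate] = rng.normal(0.0, 0.1, (hidden_size, hidden_size))
        p["b" + gate] = np.zeros(hidden_size)
    return p

def gru_step(x, h, p):
    """One GRU time step: gates decide how much of the old state to keep."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])  # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])  # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])
    return (1.0 - z) * h + z * h_cand

def gru_forward(seq, p, hidden_size):
    """Run the cell over a (time, features) sequence; return the last state."""
    h = np.zeros(hidden_size)
    for x in seq:
        h = gru_step(x, h, p)
    return h

seq = np.random.default_rng(1).normal(size=(30, 14))  # 30 steps, 14 features
params = init_gru_params(input_size=14, hidden_size=8)
h_last = gru_forward(seq, params, hidden_size=8)
print(h_last.shape)  # (8,)
```

The new state is a gated blend of the previous state and a candidate state, which is what lets gradients flow across many time steps more easily than in a plain recurrent cell.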
(2) Dynamic distribution adaptation module: this module dynamically adjusts the influence of the marginal distribution difference and the conditional distribution difference, aligns the degradation feature distribution of each source domain with that of the target domain, and obtains multiple degradation feature representations of the target domain. The dynamic distribution adaptation method holds that marginal distribution adaptation and conditional distribution adaptation are not equally important. It adaptively adjusts the importance of the marginal and conditional distributions in the adaptation process according to the distribution of the actual degradation data. Specifically, dynamic distribution adaptation adjusts the distance between two distributions through a balancing factor $\mu$:
$D(D_s, D_t) \approx (1 - \mu)\, D(p_s(x), p_t(x)) + \mu\, D(p_s(y|x), p_t(y|x)) \quad (4)$
When $\mu$ is close to 0, the source domain and target domain degradation data differ greatly and marginal distribution adaptation is more important; when $\mu$ is close to 1, the source domain and target domain data sets are highly similar and conditional distribution adaptation is more important.
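The weighting in formula (4) can be illustrated with a few lines of Python. The function name and the distance values below are hypothetical; the two distances would in practice come from the marginal and conditional discrepancy measures.

```python
def dynamic_distance(marginal_dist, conditional_dist, mu):
    """Weighted distance (1 - mu) * marginal + mu * conditional, as in (4)."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return (1.0 - mu) * marginal_dist + mu * conditional_dist

# Illustrative distances only.
d_marginal, d_conditional = 0.8, 0.2
for mu in (0.0, 0.25, 1.0):
    # mu = 0 uses only the marginal term; mu = 1 only the conditional term.
    print(mu, dynamic_distance(d_marginal, d_conditional, mu))
```

Intermediate values of the balancing factor interpolate linearly between the two extremes, so a single scalar controls which kind of alignment the training objective emphasizes.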
The invention uses the multi-kernel maximum mean discrepancy (MK-MMD) to measure the marginal distribution difference $D(p_s(x), p_t(x))$ between the source domain and target domain data. Marginal distribution adaptation is achieved by minimizing MK-MMD, which is calculated as shown in (5).
Here $x_s$ denotes the source domain degradation features, which follow distribution $p$, and $x_t$ denotes the target domain degradation features, which follow distribution $q$. $\phi(\cdot)$ is a mapping function that maps the degradation data into a reproducing kernel Hilbert space (RKHS) for measurement. Because a suitable $\phi(\cdot)$ is difficult to choose, it is not defined explicitly. Instead, a kernel function is introduced to compute the inner product of $\phi(\cdot)$, and the MMD is calculated indirectly. The invention uses a Gaussian kernel function (RBF kernel), calculated as shown in (6).
Here $\sigma$ is the kernel width. In MK-MMD, several values of $\sigma$ are used to compute several kernel matrices, which are then summed to give the final Gaussian kernel matrix.
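A minimal NumPy sketch of this measure is given below. Formulas (5) and (6) are not reproduced in this text, so the sketch assumes the commonly used forms: a Gaussian kernel $\exp(-\lVert a - b \rVert^2 / (2\sigma^2))$ and the biased empirical estimate of the squared MMD. The kernel widths, function names, and synthetic data are hypothetical.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (np.sum(a * a, axis=1)[:, None]
               + np.sum(b * b, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))

def mk_mmd(xs, xt, sigmas=(0.5, 1.0, 2.0, 4.0)):
    """Biased squared-MMD estimate using a sum of Gaussian kernels."""
    def k(a, b):
        # Summing the kernel matrices over several widths gives the
        # multi-kernel matrix described in the text.
        return sum(gaussian_kernel(a, b, s) for s in sigmas)
    return k(xs, xs).mean() + k(xt, xt).mean() - 2.0 * k(xs, xt).mean()

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(200, 4))       # source features
target_near = rng.normal(0.0, 1.0, size=(200, 4))  # same distribution
target_far = rng.normal(3.0, 1.0, size=(200, 4))   # shifted distribution
near = mk_mmd(source, target_near)
far = mk_mmd(source, target_far)
print(near < far)  # True
```

Only kernel evaluations between samples are needed, so the mapping into the RKHS never has to be written down explicitly.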
For the conditional distribution difference D (p s (y|),p t (y|)) the present invention contemplates a Conditional Maximum Mean Difference (CMMD) based on MK-MMD. First, RUL tags of the degraded data samples are classified into four categories, and the tag classification manner is shown in (7).
Here, $y_{cls}$ and $y_{RUL}$ denote the class label and the RUL label, respectively, and Y is the maximum life cycle. Degradation data samples in the healthy state form one class, and samples in the degradation stage are divided into three classes whose degree of degradation increases over time. Because the target domain degradation data are unlabeled, the model first pre-trains the label classifier on the source domain during training, and the target domain data then obtain classification pseudo-labels from this classifier. As training iterates, the accuracy of the classifier gradually improves, so the pseudo-labels become increasingly accurate. The invention uses the cross-entropy loss function to calculate the label classification loss on the source domain, as shown in (8).
Here, $N_s$ denotes the number of source domain samples, and $c_i$ and the corresponding classifier output denote the true class label and the predicted class label of the i-th source domain sample, respectively. Based on the classification results of the source domain and the target domain, the CMMD is calculated as shown in (9).
Here, $c \in \{0, 1, 2, 3\}$ denotes the class of a degradation data sample, and the two sample sets in (9) are the degradation data samples belonging to class c in the source domain and the target domain, respectively.
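The class-wise discrepancy can be sketched as follows. This is a minimal illustration, not the patented implementation. Formulas (7) and (9) are not reproduced in this text, so the class boundaries in `rul_to_class` (healthy when the RUL is at the cap Y, then three equal-width degradation stages) are hypothetical, as is the choice to sum the per-class terms and to skip classes absent from either domain. Any marginal discrepancy, such as MK-MMD, can be passed as `distance`; a linear-kernel MMD is used here only as a short stand-in.

```python
def rul_to_class(y_rul, y_max):
    """Map an RUL label to one of four classes (assumed thresholds).

    0: healthy (RUL at or above the cap y_max); 1, 2, 3: early, middle, late degradation.
    """
    if y_rul >= y_max:
        return 0
    if y_rul >= 2.0 * y_max / 3.0:
        return 1
    if y_rul >= y_max / 3.0:
        return 2
    return 3


def linear_mmd2(xs, xt):
    """Squared distance between feature means (MMD with a linear kernel)."""
    dim = len(xs[0])
    mean_s = [sum(x[j] for x in xs) / len(xs) for j in range(dim)]
    mean_t = [sum(x[j] for x in xt) / len(xt) for j in range(dim)]
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))


def cmmd2(xs, cls_s, xt, cls_t, distance=linear_mmd2, classes=(0, 1, 2, 3)):
    """Sum the per-class discrepancy between source samples and pseudo-labelled target samples."""
    total = 0.0
    for c in classes:
        src_c = [x for x, y in zip(xs, cls_s) if y == c]
        tgt_c = [x for x, y in zip(xt, cls_t) if y == c]
        if src_c and tgt_c:  # skip classes missing from either domain
            total += distance(src_c, tgt_c)
    return total
```

In use, `cls_s` would come from applying `rul_to_class` to the source RUL labels and `cls_t` from the classifier's pseudo-labels on the target domain.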
During model training, the dynamic distribution adaptation factor μ is calculated as shown in (10).
$$d_A(X_s, X_t) = 2\big(1 - 2\varepsilon(X_s, X_t)\big) \tag{11}$$
Here, $d_A(X_s, X_t)$ measures the feature difference between the source domain and target domain degradation data samples and reflects how well the marginal (edge) distributions of the source and target features are aligned during training. $d_A(X_s^{(c)}, X_t^{(c)})$ measures the feature difference between the source domain and target domain degradation data samples belonging to class c and reflects how well the conditional distributions of the source and target features are aligned during training. ε denotes the error of a support-vector-based classifier in distinguishing source domain data samples from target domain data samples.
From equations (5), (9), and (10), the objective function of the dynamic distribution adaptation module is:
By minimizing this objective function, the degradation features of each source domain and the target domain are mapped into the same feature space, which reduces the difference between their feature distributions, and the target domain finally obtains multiple degradation feature representations.
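The estimation of μ and its use as a weight can be sketched as follows. This is a minimal illustration under stated assumptions: formula (10) is not reproduced in this text, so the sketch assumes the estimate commonly used in the dynamic distribution adaptation literature, μ = 1 - d_M / (d_M + Σ_c d_c), where d_M is the marginal distance and d_c the per-class distance, each obtained from formula (11). The function names, the clamping to [0, 1], and the fallback value are hypothetical.

```python
def a_distance(error):
    """Distance from a domain classifier's error, as in formula (11)."""
    return 2.0 * (1.0 - 2.0 * error)


def balance_factor(err_marginal, err_per_class):
    """Estimate mu from classifier errors (assumed form: 1 - d_M / (d_M + sum of d_c))."""
    d_m = a_distance(err_marginal)
    d_c = sum(a_distance(e) for e in err_per_class)
    denom = d_m + d_c
    if denom <= 0.0:
        return 0.5  # assumed fallback when neither distance is informative
    return min(1.0, max(0.0, 1.0 - d_m / denom))


def dynamic_distance(mu, marginal, conditional):
    """Weighted discrepancy of formula (4)."""
    return (1.0 - mu) * marginal + mu * conditional
```

When the domains are easy to tell apart as a whole (low marginal error), d_M dominates and μ moves toward 0, which puts more weight on marginal adaptation, in line with the behaviour described for formula (4).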
(3) Regression prediction module: first, the source domain and target domain degradation features obtained from the dynamic distribution adaptation module are passed through the domain-specific regression predictors to obtain RUL prediction labels for the source domain and the target domain. The invention uses the root mean square error (RMSE), which also serves as the performance evaluation index, as the prediction loss function, as shown in (13).
Finally, the module fuses the RUL labels that the multiple degradation feature representations of the target domain obtain from the domain-specific predictors, and uses the result as the final RUL prediction label. The invention adopts mean fusion. To keep the decision boundaries of the domain pairs aligned, the same target sample should receive the same prediction from different regressors, so the model needs to minimize the differences among all domain-specific regressors. The objective function that aligns the prediction results of the regressors is shown in (14).
Here, S and $N_s$ denote the number of source domains and the number of samples in each source domain, respectively, and the two prediction terms in (14) are the outputs of regressors m and n for the i-th data sample.
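The prediction loss, the mean fusion, and the regressor alignment term can be sketched as follows. This is a minimal illustration, not the patented implementation: formulas (13) and (14) are not reproduced in this text, so the alignment term is assumed to be the mean absolute difference between every pair of regressors on the same samples, and all function names are hypothetical.

```python
import math
from itertools import combinations


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted RUL labels."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fuse_mean(preds_per_regressor):
    """Average the predictions for each target sample across the domain-specific regressors."""
    n_reg = len(preds_per_regressor)
    return [sum(col) / n_reg for col in zip(*preds_per_regressor)]


def alignment_loss(preds_per_regressor):
    """Mean absolute difference between every pair of regressors (assumed form)."""
    pairs = list(combinations(preds_per_regressor, 2))
    if not pairs:
        return 0.0  # a single regressor has nothing to align with
    total = sum(abs(a - b) for pm, pn in pairs for a, b in zip(pm, pn))
    return total / (len(pairs) * len(preds_per_regressor[0]))
```

Each inner list in `preds_per_regressor` holds one regressor's predictions for the same target samples, so the alignment term is zero exactly when all regressors agree.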
4. Joint loss function
The joint loss function of the MDDAN model consists of four parts: the regression prediction loss, the label classification loss, the dynamic distribution adaptation objective function, and the prediction alignment objective function. The joint loss function of the model can therefore be expressed as:
where λ is a trade-off coefficient that controls the proportion of the corresponding loss terms in the joint loss, and $\beta = \dfrac{2}{1 + e^{-10 \times (i+1)/\mathrm{epochs}}} - 1$ is a time-varying coefficient that changes with each training iteration, where i is the current iteration number and epochs is the total number of iterations.
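The time-varying coefficient β can be computed directly from its definition. The short sketch below is illustrative only; the function name `beta_schedule` is hypothetical, and the formula is the one stated in the text.

```python
import math


def beta_schedule(i, epochs):
    """beta = 2 / (1 + exp(-10 * (i + 1) / epochs)) - 1 for iteration i (0-based)."""
    return 2.0 / (1.0 + math.exp(-10.0 * (i + 1) / epochs)) - 1.0
```

The expression is algebraically equal to tanh(5(i+1)/epochs), so β starts near 0 and approaches 1 as training proceeds, phasing in the terms it weights gradually.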
RUL prediction procedure
The overall workflow of the proposed MDDAN model is shown in Figure 4. The multi-source domain data and the target domain data first pass through the common feature extractor to obtain shallow feature representations. The domain-specific feature extractors then extract the source domain degradation features and the target domain degradation features for each source-target pair, and the dynamic distribution adaptation module calculates the dynamic distribution difference of the degradation features. Next, regression prediction on the degradation features gives a prediction result for each source domain and for the target domain, and the errors are calculated. The multiple prediction results for the target domain are averaged to obtain the final RUL prediction label. Finally, the model calculates the joint loss and performs back-propagation with stochastic gradient descent to optimize the model parameters. This aligns the degradation features of each domain pair, gives the target domain multiple degradation feature representations, and improves the accuracy and generalization ability of the prediction model.
In summary, the above is merely a preferred embodiment of the present invention, and the scope of protection of the invention is not limited to it. Any equivalent substitution or modification of the technical solution and its improvements made by a person skilled in the art within the technical scope disclosed herein falls within the scope of protection of the present invention.
Claims (4)
1. A multi-source domain transfer learning method for remaining useful life (RUL) prediction based on dynamic distribution adaptation, characterized by comprising the following steps:
1) providing existing source domain and target domain degradation data;
2) preprocessing the degradation data;
3) extracting degradation feature representations of the source domain and target domain degradation data;
4) aligning the degradation feature distributions of each source domain and the target domain to obtain multiple degradation feature representations of the target domain;
5) fusing the RUL labels that the multiple degradation feature representations of the target domain obtain from the domain-specific predictors, and using the result as the final RUL prediction label.
2. The multi-source domain transfer learning remaining useful life prediction method based on dynamic distribution adaptation according to claim 1, characterized in that:
in step 1), the source domain and target domain degradation data are as follows:
existing multi-sensor degradation data $\{X\}_{s \times n}$ are given, as shown in formula (1).
The degradation data are represented as a matrix, where s is the number of sensors that monitor the degradation state and n is the length of the degradation data, typically measured in time cycles, which characterizes the lifetime of the device.
3. The multi-source domain transfer learning remaining useful life prediction method based on dynamic distribution adaptation according to claim 1, characterized in that:
in step 2), because the magnitudes and value ranges of the monitoring data from different sensors differ greatly, the data must be normalized before being fed to the model; the degradation data are preprocessed by min-max normalization, as shown in formula (2).
Here, $\max(x_i)$ and $\min(x_i)$ denote the maximum and minimum of the i-th feature signal $x_i$ in data sample x, and the normalized data satisfy $x'_i \in [0, 1]$. A sliding time window is then used to convert the degradation data into time-series inputs. The size of the input time window is $T_W$ and the time step is $t_d$. The input data may be expressed as:
4. The multi-source domain transfer learning remaining useful life prediction method based on dynamic distribution adaptation according to claim 1, characterized in that:
steps 3) to 5) are carried out by three modules: a degradation feature extraction module, a dynamic distribution adaptation module, and a regression prediction module.
Degradation feature extraction module: this module consists of a common feature extractor and domain-specific feature extractors, and extracts degradation feature representations of the source domain and target domain degradation data.
Common feature extractor: this part extracts low-level feature representations of the multiple source domains and the target domain, and consists of four convolution blocks. A convolutional neural network (CNN) is a deep learning structure built on the convolution operation, which lets the network reduce the feature space, filter the input effectively, and help prevent overfitting. CNNs can also filter noise in time series through convolution, producing robust features that exclude outliers. The convolution layers in this part use two-dimensional convolution with a kernel of size (kernel_size, 1), so that convolution is performed only along the feature dimension and the temporal dependence along the time dimension is preserved.
Domain-specific feature extractor: this part extracts the features unique to a specific domain. The low-level feature representations that each source domain and the target domain obtain from the preceding module are passed through the domain-specific feature extractor, whose output serves as the final degradation features. The domain-specific feature extractor consists of four layers of GRU units, which turn the low-level feature representation into a high-level feature representation of the source domain and the target domain. The GRU is a variant of the recurrent neural network. By introducing a reset gate and an update gate, it modifies how the hidden state is computed, which mitigates the vanishing and exploding gradient problem of recurrent neural networks and better captures dependencies between distant time steps in the degradation time series.
Dynamic distribution adaptation module: this module dynamically adjusts the influence of the marginal (edge) distribution difference and the conditional distribution difference, aligns the degradation feature distributions of each source domain with the target domain, and obtains multiple degradation feature representations of the target domain. The dynamic distribution adaptation method holds that marginal distribution adaptation and conditional distribution adaptation are not equally important, and it adaptively adjusts the importance of the two during adaptation according to the distribution of the actual degradation data. Dynamic distribution adaptation adjusts the distance between two distributions through a balancing factor μ:
$$D(D_s, D_t) \approx (1-\mu)\,D\big(p_s(x), p_t(x)\big) + \mu\,D\big(p_s(y|x), p_t(y|x)\big) \tag{4}$$
When μ is close to 0, the degradation data of the source domain and the target domain differ greatly, and marginal distribution adaptation is more important; when μ is close to 1, the source domain and target domain data sets are highly similar, and conditional distribution adaptation is more important.
The invention uses the multi-kernel maximum mean discrepancy (MK-MMD) to measure the marginal (edge) distribution difference $D(p_s(x), p_t(x))$ between the source domain and the target domain. Marginal distribution adaptation is achieved by minimizing MK-MMD, which is calculated as shown in (5).
Here, $x_s$ denotes the source domain degradation features, which follow distribution p, and $x_t$ denotes the target domain degradation features, which follow distribution q. φ(·) is a mapping function that maps the degradation data into a reproducing kernel Hilbert space (RKHS) for measurement. However, φ(·) is difficult to choose and is therefore not defined explicitly; instead, a kernel function is introduced to compute the inner product of φ(·), so that the MMD is calculated indirectly. The invention uses a Gaussian kernel function, as shown in (6).
Here, σ is the width (bandwidth) of the kernel. In MK-MMD, kernel matrices are computed for several values of σ and then summed to obtain the final Gaussian kernel matrix.
For the conditional distribution difference $D(p_s(y|x), p_t(y|x))$, a conditional maximum mean discrepancy (CMMD) based on MK-MMD is designed. First, the RUL labels of the degradation data samples are divided into four classes, as shown in (7).
Here, $y_{cls}$ and $y_{RUL}$ denote the class label and the RUL label, respectively, and Y is the maximum life cycle. Degradation data samples in the healthy state form one class, and samples in the degradation stage are divided into three classes whose degree of degradation increases over time. Because the target domain degradation data are unlabeled, the model first pre-trains the label classifier on the source domain during training, and the target domain data then obtain classification pseudo-labels from this classifier. As training iterates, the accuracy of the classifier gradually improves, so the pseudo-labels become increasingly accurate. The invention uses the cross-entropy loss function to calculate the label classification loss on the source domain, as shown in (8).
Here, $N_s$ denotes the number of source domain samples, and $c_i$ and the corresponding classifier output denote the true class label and the predicted class label of the i-th source domain sample, respectively. Based on the classification results of the source domain and the target domain, the CMMD is calculated as shown in (9).
Here, $c \in \{0, 1, 2, 3\}$ denotes the class of a degradation data sample, and the two sample sets in (9) are the degradation data samples belonging to class c in the source domain and the target domain, respectively.
During model training, the dynamic distribution adaptation factor μ is calculated as shown in (10).
$$d_A(X_s, X_t) = 2\big(1 - 2\varepsilon(X_s, X_t)\big) \tag{11}$$
Here, $d_A(X_s, X_t)$ measures the feature difference between the source domain and target domain degradation data samples and reflects how well the marginal (edge) distributions of the source and target features are aligned during training; $d_A(X_s^{(c)}, X_t^{(c)})$ measures the feature difference between the source domain and target domain degradation data samples belonging to class c and reflects how well the conditional distributions of the source and target features are aligned during training; and ε denotes the error of a support-vector-based classifier in distinguishing source domain data samples from target domain data samples.
From equations (5), (9), and (10), the objective function of the dynamic distribution adaptation module is:
By minimizing this objective function, the degradation features of each source domain and the target domain are mapped into the same feature space, which reduces the difference between their feature distributions, and the target domain finally obtains multiple degradation feature representations.
Regression prediction module: first, the source domain and target domain degradation features obtained from the dynamic distribution adaptation module are passed through the domain-specific regression predictors to obtain RUL prediction labels for the source domain and the target domain. The invention uses the root mean square error (RMSE), which also serves as the performance evaluation index, as the prediction loss function, as shown in (13).
Finally, the module fuses the RUL labels that the multiple degradation feature representations of the target domain obtain from the domain-specific predictors, and uses the result as the final RUL prediction label. The module adopts mean fusion. To keep the decision boundaries of the domain pairs aligned, the same target sample should receive the same prediction from different regressors, so the model needs to minimize the differences among all domain-specific regressors. The objective function that aligns the prediction results of the regressors is shown in (14).
Here, S and $N_s$ denote the number of source domains and the number of samples in each source domain, respectively, and the two prediction terms in (14) are the outputs of regressors m and n for the i-th data sample.
Joint loss function: the joint loss function of the MDDAN model consists of four parts: the regression prediction loss, the label classification loss, the dynamic distribution adaptation objective function, and the prediction alignment objective function. The joint loss function of the model can therefore be expressed as:
where λ is a trade-off coefficient that controls the proportion of the corresponding loss terms in the joint loss, and $\beta = \dfrac{2}{1 + e^{-10 \times (i+1)/\mathrm{epochs}}} - 1$ is a time-varying coefficient that changes with each training iteration, where i is the current iteration number and epochs is the total number of iterations.
In summary, the multi-source domain data and the target domain data first pass through the common feature extractor to obtain shallow feature representations. The domain-specific feature extractors then extract the source domain degradation features and the target domain degradation features for each source-target pair, and the dynamic distribution adaptation module calculates the dynamic distribution difference of the degradation features. Next, regression prediction on the degradation features gives a prediction result for each source domain and for the target domain, and the errors are calculated. The multiple prediction results for the target domain are averaged to obtain the final RUL prediction label. Finally, the model calculates the joint loss and performs back-propagation with stochastic gradient descent to optimize the model parameters, which aligns the degradation features of each domain pair, gives the target domain multiple degradation feature representations, and improves the accuracy and generalization ability of the prediction model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211706024.XA CN116415485A (en) | 2022-12-29 | 2022-12-29 | Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116415485A (en) | 2023-07-11 |
Family
ID=87048666
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202211706024.XA | Pending | CN116415485A (en) | 2022-12-29 | 2022-12-29 | Multi-source domain migration learning residual service life prediction method based on dynamic distribution self-adaption |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116415485A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117252083A (en) * | 2023-07-12 | 2023-12-19 | 中国科学院空间应用工程与技术中心 | Bearing residual life prediction method and system combining degradation phase division and sub-domain self-adaption |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||