CN115204227A - Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning - Google Patents
- Publication number
- CN115204227A (application CN202210820966.4A)
- Authority
- CN
- China
- Prior art keywords
- distribution
- uncertainty
- fault diagnosis
- fault
- calibration
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01M—TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
- G01M7/00—Vibration-testing of structures; Shock-testing of structures
- G01M7/02—Vibration-testing by means of a shake table
- G01M7/025—Measuring arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/041—Abduction
Abstract
The invention provides a deep-learning-based method for quantifying and calibrating uncertainty in fault diagnosis of electromechanical equipment, comprising the following steps: preprocessing the monitoring signals of the equipment to obtain a basic data set; determining the type and scale of a deep neural network and constructing a Bayesian deep neural network for fault diagnosis; selecting a calibration error to evaluate the uncertainty calibration loss, and determining the loss functions on the in-distribution and out-of-distribution fault monitoring data sets; and determining the hyper-parameters of the model, training alternately on the in-distribution and out-of-distribution fault monitoring data sets, and, once the model has converged, outputting the fault diagnosis and uncertainty quantification results for the test data. The uncertainty modeling method for electromechanical equipment fault diagnosis integrates the calibration loss to achieve both uncertainty quantification and calibration, obtains calibrated quantification results for the inherent, cognitive, distribution and prediction uncertainty by combining uncertainty decomposition with Monte Carlo sampling, and effectively improves both the fault diagnosis accuracy and the uncertainty quantification accuracy of the equipment.
Description
Technical Field
The invention belongs to the technical field of fault diagnosis within prognostics and health management, and particularly relates to a deep-learning-based method for quantifying and calibrating uncertainty in fault diagnosis of electromechanical equipment.
Background
With the rapid development of the economy, science and technology, equipment is becoming increasingly complex, which poses unprecedented challenges for fault diagnosis. For safety-critical applications in particular, determining accurately and promptly whether a fault has occurred, and of which category, effectively avoids serious economic losses and casualties. As an important means of determining the health state of equipment, fault diagnosis plays an important role in prognostics and health management. The development and wide application of sensor technology has made real-time monitoring of equipment operating states possible and provides the data basis for data-driven diagnostic techniques. In addition, deep learning, with its strong representation learning ability, is widely applied to fault diagnosis.
Although deep-learning-based fault diagnosis methods achieve excellent performance, they cannot give a reliable uncertainty quantification for the diagnosis result, which greatly limits their practical application. In fault diagnosis, uncertainty is generally divided into three categories: inherent (aleatoric), cognitive (epistemic) and distribution uncertainty. Inherent uncertainty captures the noise inherent in the observed data and reflects uncertainty caused by unknown or missing information, such as measurement error; it cannot be reduced by adding training data. Cognitive uncertainty reflects uncertainty caused by lack of knowledge and can be reduced by adding training data. Distribution uncertainty reflects uncertainty caused by a change in the data distribution; it captures test samples of a kind not seen during model training, that is, the uncertainty caused by out-of-distribution fault monitoring data. When quantifying uncertainty with deep learning, many studies take the softmax output as the predictive distribution, but this distribution is often overconfident and cannot quantify the true prediction uncertainty. Bayesian deep learning, by contrast, combines the uncertainty quantification capability of Bayesian methods with the representation learning capability of deep learning and has therefore become an active research topic. To make Bayesian deep neural networks computationally tractable, dropout-based approximation has attracted attention for its computational efficiency and good performance. However, because of errors in model selection and the use of approximate inference, uncertainty quantification based on Bayesian deep neural networks is often inaccurate. For example, among a set of samples predicted with probability 0.9, the fraction that is correctly predicted is typically not 90%.
Therefore, to improve the accuracy of uncertainty quantification while quantifying uncertainty, a Bayesian-deep-learning-based method for calibrating the uncertainty quantification in the fault diagnosis process needs to be studied.
Disclosure of Invention
The technical problem to be solved by the invention is to construct an uncertainty modeling method based on a Bayesian deep neural network that integrates a calibration loss, achieves both uncertainty quantification and calibration, obtains calibrated quantification results for the inherent, cognitive, distribution and prediction uncertainty by combining uncertainty decomposition with Monte Carlo sampling, and improves both the fault diagnosis accuracy and the uncertainty quantification accuracy.
To solve these problems, the invention provides a deep-learning-based method for quantifying and calibrating uncertainty in fault diagnosis of electromechanical equipment, comprising the following steps:
S1. Preprocess the fault monitoring signals of the electromechanical equipment to obtain a basic data set: the preprocessing comprises signal screening, feature extraction, data normalization and set division; the resulting in-distribution and out-of-distribution fault monitoring data sets are each divided into a training set, a validation set and a test set;
S2. Determine the type and scale of the fault diagnosis deep neural network: select a deep neural network according to the characteristics of the equipment fault monitoring signals, and determine the scale of the network, namely the number of neurons and the number of layers, according to the size of the data set;
S3. Construct a Bayesian deep neural network for electromechanical equipment fault diagnosis: select probability distributions that capture the inherent, cognitive and distribution uncertainty in fault diagnosis, construct these distributions with a Bayesian deep neural network, and determine how each distribution is estimated;
S31. Integrate the quantification of cognitive uncertainty into the fault diagnosis network: the posterior distribution of the model parameters is estimated in the Bayesian neural network by variational inference combined with Monte Carlo sampling;
S32. Integrate the quantification of distribution uncertainty into the fault diagnosis network: construct a KL divergence for the in-distribution fault monitoring data set and for the out-of-distribution fault monitoring data set respectively, so as to determine the probability distribution corresponding to each;
S33. Integrate the quantification of inherent uncertainty into the fault diagnosis network to obtain the KL divergence of the Bayesian neural network;
S4. Construct the uncertainty calibration loss for fault diagnosis: determine a calibration evaluation index, evaluate the calibration loss, and integrate the calibration loss into the overall loss function, so that uncertainty in the fault diagnosis of the electromechanical equipment is calibrated at the same time as it is quantified;
S5. Determine the loss function: construct the loss function of the whole model;
S51. Determine the loss function on the in-distribution fault monitoring data set;
S52. Determine the loss function on the out-of-distribution fault monitoring data set;
S6. Determine the hyper-parameters of the electromechanical equipment fault diagnosis model: determine the hyper-parameters of the model, including the learning rate and the batch size, by trial and error, and determine the optimal combination of hyper-parameters by grid search;
S7. Train the electromechanical equipment fault diagnosis model: select an optimization method and, with the selected hyper-parameters, train the model on the in-distribution and out-of-distribution fault monitoring data sets in turn;
S8. Judge whether the electromechanical equipment fault diagnosis model has converged: judge whether the change in the parameters of the optimal model before and after training is smaller than a specified threshold; if so, go to step S9, otherwise return to step S7;
S9. Output the fault diagnosis result and the uncertainty quantification results for the electromechanical equipment: the trained model outputs the fault diagnosis result together with the quantification results for its inherent, cognitive, distribution and prediction uncertainty.
Further, step S31 specifically comprises the following steps:
S311. Capture the cognitive uncertainty with the posterior distribution of the parameters of the electromechanical equipment fault diagnosis model. For historical data $X=\{x_n\}_{n=1}^{N}$ and $Y=\{y_n\}_{n=1}^{N}$, where X is the fault observation data and Y the fault category labels, a classification task with C categories has $y_n \in \{1,2,\dots,C\}$; $x_n$ denotes a single fault observation and N the number of observations. The cognitive uncertainty in fault diagnosis is modeled as the posterior distribution $p(\omega \mid X,Y)$, where $\omega$ denotes the model parameters;
S312. By means of variational inference, introduce an inference distribution $q_\theta(\omega)$ to approximate $p(\omega \mid X,Y)$, and use the KL divergence to measure the distance between $q_\theta(\omega)$ and the true posterior distribution $p(\omega \mid X,Y)$:
$$\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega \mid X,Y)\big)=\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega)\big)-\int q_\theta(\omega)\log p(Y \mid X,\omega)\,d\omega \tag{1}$$
where $p(Y \mid X,\omega)$ is the likelihood function on the historical data set, $p(\omega)$ is the prior distribution of the weights, $\mathrm{KL}(\cdot\,\|\,\cdot)$ denotes the KL divergence, and $\theta$ denotes the variational parameters. The equality holds up to the additive constant $\log p(Y \mid X)$, which does not depend on $\theta$. An estimate of the posterior distribution is obtained by optimizing $\theta$ to minimize $\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega \mid X,Y)\big)$;
S313. For a deep neural network with L layers and $K_l$ units in layer l, the model parameters $\omega$ are expressed as:
$$\omega=\{W_l\}_{l=1}^{L}$$
where $W_l$ denotes the model parameters of layer l of the deep neural network;
S314. Apply Concrete Dropout to the deep neural network so that it approximates a Bayesian deep neural network, that is, treat the fixed model parameters $\omega$ as random variables that follow the inference distribution:
$$q_\theta(\omega)=\prod_{l=1}^{L} q_{M_l}(W_l)$$
where the variational parameters $\theta$ are:
$$\theta=\{M_l,\,p_l\}_{l=1}^{L}$$
where $M_l$ is the mean weight matrix of dimension $K_{l+1}\times K_l$ and $p_l$ is the dropout probability of layer l. The inference distribution of each layer is expressed as:
$$q_{M_l}(W_l)=M_l\cdot\mathrm{diag}\big[\mathrm{Bernoulli}(1-p_l)^{K_l}\big]$$
where $\mathrm{Bernoulli}(\cdot)$ denotes the Bernoulli distribution;
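S315. Select the prior distribution of the fault diagnosis model parameters $\omega$ as:
$$p(\omega)=\prod_{l=1}^{L} p(W_l)$$
where $p(W_l)$ is the prior distribution of the model parameters of layer l of the deep neural network, with $p(W_l)=\mathcal{N}(0,\,I/\upsilon_l^{2})$, and $\upsilon_l$ is a control parameter representing the smoothness of the function;
S316. Combine Monte Carlo sampling to obtain an analytical expression of $\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega \mid X,Y)\big)$, up to an additive constant:
$$\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega \mid X,Y)\big)\approx-\sum_{i=1}^{N}\log p(y_i \mid x_i,\omega)+\sum_{l=1}^{L}\left[\frac{\upsilon_l^{2}(1-p_l)}{2}\,\|M_l\|^{2}-K_l\,h(p_l)\right],\qquad \omega\sim q_\theta(\omega)$$
where $p(y_i \mid x_i,\omega)$ is the likelihood function of each sample and $h(p_l)$ is the entropy of a Bernoulli random variable, $h(p_l)=-p_l\log p_l-(1-p_l)\log(1-p_l)$;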
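As a rough illustration of steps S314 to S316, the sketch below draws one dropout realization of a layer's weights and evaluates the Bernoulli entropy h(p). It uses hard Bernoulli masks rather than the continuous relaxation that Concrete Dropout optimizes, and the function names (`bernoulli_entropy`, `sample_dropout_weights`) are hypothetical, not taken from the patent.

```python
import math
import random


def bernoulli_entropy(p):
    """h(p) = -p log p - (1 - p) log(1 - p), taking h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def sample_dropout_weights(mean_weights, drop_prob, rng):
    """Draw one W = M . diag(z) with z_k ~ Bernoulli(1 - drop_prob).

    mean_weights is a list of rows (output units) over columns (input units),
    so each mask entry z_k switches one input unit on or off.
    """
    n_in = len(mean_weights[0])
    keep = [1.0 if rng.random() >= drop_prob else 0.0 for _ in range(n_in)]
    return [[w * k for w, k in zip(row, keep)] for row in mean_weights]


# Example: one Monte Carlo draw of a 2 x 3 layer with dropout probability 0.2.
rng = random.Random(0)
M = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]
W = sample_dropout_weights(M, 0.2, rng)
```

Repeating the draw S times and averaging the network outputs gives a Monte Carlo estimate of the kind used later at test time.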
Further, step S32 specifically comprises the following steps:
S321. The distribution uncertainty of the input fault monitoring data x is captured by placing a Dirichlet distribution over the output probabilities $\mu$ of the model with parameters $\omega$:
$$p(\mu \mid x,\omega)=\mathrm{Dir}\big(\mu \mid \alpha(x,\omega)\big)=\frac{\Gamma\big(\alpha_0(x,\omega)\big)}{\prod_{c=1}^{C}\Gamma\big(\alpha_c(x,\omega)\big)}\prod_{c=1}^{C}\mu_c^{\alpha_c(x,\omega)-1}$$
where $\Gamma(\cdot)$ is the gamma function, $\mathrm{Dir}(\cdot)$ is the Dirichlet distribution, and $\mu=[\mu_1,\dots,\mu_C]^{T}=[p(y=1),\dots,p(y=C)]^{T}$ are the class probabilities output by the model, with $\mu_1$ the output probability of class 1, $\mu_C$ the output probability of class C, and so on. $\alpha(x,\omega)=[\alpha_1(x,\omega),\dots,\alpha_C(x,\omega)]^{T}>0$ are the parameters of the Dirichlet distribution, and $\alpha_0(x,\omega)=\sum_{c=1}^{C}\alpha_c(x,\omega)$ is its precision parameter, which reflects the dispersion of the Dirichlet distribution: the larger its value, the more concentrated the distribution;
S322. For in-distribution fault monitoring data $x_n$ and out-of-distribution fault monitoring data $x_t$, the corresponding predicted Dirichlet distributions are $\mathrm{Dir}\big(\mu \mid \alpha(x_n,\omega)\big)$ and $\mathrm{Dir}\big(\mu \mid \alpha(x_t,\omega)\big)$. To give the model the ability to discriminate between in-distribution and out-of-distribution fault monitoring data, a more concentrated Dirichlet distribution $\mathrm{Dir}(\mu \mid \alpha_n)$ is chosen as the target distribution for the in-distribution input $x_n$ and a more dispersed Dirichlet distribution $\mathrm{Dir}(\mu \mid \alpha_t)$ for the out-of-distribution input $x_t$, and the KL divergence is used to measure the distance between the target and the predicted Dirichlet distributions:
for intra-distribution fault monitoring data:
for out-of-distribution fault monitoring data:
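where T is the sample size of the out-of-distribution fault monitoring data set. For any two Dirichlet distributions with parameters $\alpha=[\alpha_1,\dots,\alpha_C]^{T}$ and $\beta=[\beta_1,\dots,\beta_C]^{T}$, the KL divergence has the analytic form:
$$\mathrm{KL}\big(\mathrm{Dir}(\mu \mid \alpha)\,\|\,\mathrm{Dir}(\mu \mid \beta)\big)=\log\Gamma(\alpha_0)-\log\Gamma(\beta_0)+\sum_{c=1}^{C}\big[\log\Gamma(\beta_c)-\log\Gamma(\alpha_c)\big]+\sum_{c=1}^{C}(\alpha_c-\beta_c)\big[\psi(\alpha_c)-\psi(\alpha_0)\big]$$
where $\psi(\cdot)$ is the digamma function, and $\alpha_0=\sum_{c=1}^{C}\alpha_c$ and $\beta_0=\sum_{c=1}^{C}\beta_c$ are the precision parameters of the two Dirichlet distributions;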
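The closed-form KL divergence between two Dirichlet distributions can be checked numerically. The sketch below is a minimal standard-library version; the helper names `digamma` and `dirichlet_kl` are hypothetical, and the digamma function is approximated with a recurrence plus an asymptotic series, which is an implementation choice made here rather than anything specified in the patent.

```python
import math


def digamma(x):
    """Digamma for x > 0: shift x up to >= 6, then use the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def dirichlet_kl(alpha, beta):
    """KL(Dir(alpha) || Dir(beta)) for two positive parameter vectors."""
    a0, b0 = sum(alpha), sum(beta)
    kl = math.lgamma(a0) - math.lgamma(b0)
    for a, b in zip(alpha, beta):
        kl += math.lgamma(b) - math.lgamma(a)
        kl += (a - b) * (digamma(a) - digamma(a0))
    return kl


# Example: a concentrated target against a flat prediction over three classes.
print(dirichlet_kl([50.0, 1.0, 1.0], [1.0, 1.0, 1.0]))
```

The divergence is zero when the two parameter vectors coincide and grows as the predicted distribution moves away from the target, which is what makes it usable as a training signal for both the concentrated and the dispersed targets.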
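Further, step S33 specifically comprises the following steps:
S331. The inherent uncertainty in the fault diagnosis process is captured by the categorical distribution over the predicted classes:
$$p(y \mid \mu)=\mathrm{Cat}(y \mid \mu)=\prod_{c=1}^{C}\mu_c^{\mathbb{I}\{y=c\}}$$
where $\mathbb{I}\{\cdot\}$ is the indicator function and $\mathrm{Cat}(\cdot)$ is the categorical distribution;
S332. Calculate the likelihood function:
$$p(y_n \mid x_n,\omega)=\int p(y_n \mid \mu)\,p(\mu \mid x_n,\omega)\,d\mu=\frac{\alpha_{y_n}(x_n,\omega)}{\alpha_0(x_n,\omega)}$$
where $\alpha_{y_n}(x_n,\omega)/\alpha_0(x_n,\omega)$, with $\alpha_0(x_n,\omega)=\sum_{c=1}^{C}\alpha_c(x_n,\omega)$, is the mean of $\mathrm{Dir}\big(\mu \mid \alpha(x_n,\omega)\big)$ for class $y_n$. Substituting this likelihood into equation (1) then gives $\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega \mid X,Y)\big)$ for the Bayesian neural network.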
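The likelihood in S332 is simply the Dirichlet mean for the true class. A minimal sketch follows, assuming the Dirichlet parameters are produced by an exponential activation on the network's raw outputs (as the embodiment describes for its output layer); the function names are hypothetical and class labels are 0-based in the code, whereas the text counts classes from 1.

```python
import math


def dirichlet_params(raw_outputs):
    """Map raw network outputs to positive Dirichlet parameters via exp."""
    return [math.exp(z) for z in raw_outputs]


def class_likelihood(alpha, label):
    """p(y = label | x, omega): the Dirichlet mean alpha_label / alpha_0."""
    return alpha[label] / sum(alpha)


# Example: three classes, the first one strongly favoured.
alpha = dirichlet_params([math.log(8.0), 0.0, 0.0])   # about [8, 1, 1]
likelihood = class_likelihood(alpha, 0)                # about 0.8
precision = sum(alpha)                                 # alpha_0, about 10
```

The precision alpha_0 does not affect the mean, so two predictions can share the same class probabilities while differing in how concentrated the Dirichlet distribution is.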
Further, step S4 specifically comprises the following steps:
S41. Define calibration of uncertainty in the fault diagnosis process: calibration means that the predicted probability of a fault category reflects the accuracy of the prediction. When the fault diagnosis model is calibrated, among a set of samples whose predicted probability for a fault category is $\kappa$, a fraction $\kappa$ should be correctly classified. Thus, for each category $c\in\{1,\dots,C\}$:
$$P\big(Y=c \mid \hat{p}_c(X)=\kappa\big)=\kappa$$
where $Y \mid X\sim P$ and $\hat{p}_c(X)$ is the predicted probability of category c for input X;
S42. Evaluate the uncertainty calibration error in fault diagnosis: to evaluate the calibration error quantitatively, the adaptive calibration error is selected as the evaluation index. For each category, the adaptive calibration error adjusts the probability intervals according to the number of samples so that every interval contains an equal number of samples, and the accuracy and the average predicted probability are calculated in each interval; if the two are approximately equal, the model is calibrated. Let the number of probability intervals for category c be M and denote the m-th interval by $U_{c,m}$; the intervals are obtained by sorting the predicted probabilities of category c, and the samples falling in $U_{c,m}$ form the corresponding sample set. The accuracy within the interval, $\mathrm{acc}(U_{c,m})$, is the fraction of samples in this set whose true category is c, and the average predicted probability, $\mathrm{conf}(U_{c,m})$, is the mean predicted probability of category c over the same set. The adaptive calibration error is obtained by averaging the calibration errors over all categories and intervals:
$$\mathrm{ACE}=\frac{1}{CM}\sum_{c=1}^{C}\sum_{m=1}^{M}\big|\mathrm{acc}(U_{c,m})-\mathrm{conf}(U_{c,m})\big|$$
The adaptive calibration error considers the predicted probabilities of all categories, not only that of the finally predicted category, and therefore evaluates the calibration error of a multi-class task more comprehensively. In addition, when the sample size of the fault monitoring data is small or the predicted probabilities are concentrated near 0 and 1, the adaptive calibration error adjusts the length of the probability intervals so that each interval holds enough samples, and the calibration error is evaluated objectively;
S43. Construct the uncertainty calibration loss in fault diagnosis: to improve the accuracy of uncertainty quantification during model training, the calibration loss is integrated into the overall loss function and is evaluated by a calibration error. The Bayesian deep neural network is trained with mini-batch gradient descent, and a single mini-batch may not contain enough fault monitoring samples to evaluate the calibration loss with fixed probability intervals. The adaptive calibration error, evaluated on each mini-batch, is therefore selected as the calibration loss $\mathrm{CL}(\cdot)$, the uncertainty calibration loss in fault diagnosis.
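The adaptive calibration error described in S42 can be sketched in a few lines. The version below is an evaluation-only illustration under stated assumptions: equal-mass bins are formed by sorting and slicing, empty bins (possible when there are fewer samples than bins) are skipped, class labels are 0-based, and the function name is hypothetical.

```python
def adaptive_calibration_error(probs, labels, num_bins):
    """Average |accuracy - mean predicted probability| over classes and bins.

    probs:  list of per-sample probability vectors (length C each)
    labels: list of true class indices, 0-based
    """
    n = len(probs)
    num_classes = len(probs[0])
    total, used_bins = 0.0, 0
    for c in range(num_classes):
        # Sort samples by the predicted probability of class c.
        order = sorted(range(n), key=lambda i: probs[i][c])
        for m in range(num_bins):
            members = order[m * n // num_bins:(m + 1) * n // num_bins]
            if not members:
                continue  # fewer samples than bins
            acc = sum(1 for i in members if labels[i] == c) / len(members)
            conf = sum(probs[i][c] for i in members) / len(members)
            total += abs(acc - conf)
            used_bins += 1
    return total / used_bins


# Example: always predicting 50/50 on data that is entirely class 0.
ace = adaptive_calibration_error([[0.5, 0.5]] * 4, [0, 0, 0, 0], 1)
```

Because the bins are defined by rank rather than by fixed probability edges, every bin holds roughly the same number of samples even when the predictions cluster near 0 and 1.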
Further, step S51 specifically comprises the following steps:
S511. For in-distribution fault monitoring data, the cognitive and inherent uncertainty in fault diagnosis are modeled by minimizing $\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega \mid X,Y)\big)$, and the distribution uncertainty is modeled by minimizing the KL divergence between the target and the predicted Dirichlet distributions on the in-distribution data. The loss function for uncertainty quantification in Bayesian-deep-learning-based fault diagnosis of electromechanical equipment is the sum of the former and $\lambda$ times the latter, where $\lambda$ is a hyper-parameter controlling the contribution of the distribution uncertainty loss to the total loss. In addition, the predicted Dirichlet distribution of in-distribution fault monitoring data should be concentrated, so the precision parameter of the target distribution $\mathrm{Dir}(\mu \mid \alpha_n)$ should be large. To ensure prediction accuracy, the mean of the target distribution should be the corresponding 0-1 (one-hot) label, and the precision parameter can be regarded as a constant independent of the input. This, however, makes most of the parameters in $\alpha_n$ equal to 0, so that the corresponding KL divergence is difficult to optimize. The problem is avoided by assigning a small probability to the regions of the Dirichlet distribution whose probability density is 0, using a small smoothing factor $\xi$; the values of all parameters of the Dirichlet distribution are obtained on this basis;
S512. Integrate the uncertainty calibration loss into the overall loss so that uncertainty is calibrated while the fault diagnosis model is trained: the loss function for in-distribution fault monitoring data is the loss of S511 plus $\gamma$ times the calibration loss, where $\gamma$ is a hyper-parameter controlling the contribution of the calibration loss to the total loss.
Further, step S52 specifically comprises the following steps:
S521. For out-of-distribution fault monitoring data, the cognitive and inherent uncertainty need not be modeled; only the distribution uncertainty is modeled, so that such data can be distinguished from in-distribution fault monitoring data. Because the predicted values and uncertainty quantification for out-of-distribution fault monitoring data are not trustworthy, no uncertainty calibration is applied, and the corresponding loss function is the KL divergence between the target and the predicted Dirichlet distributions on the out-of-distribution data. The predicted Dirichlet distribution of out-of-distribution fault monitoring data should be more dispersed, so the precision parameter of the corresponding target distribution $\mathrm{Dir}(\mu \mid \alpha_t)$ takes a smaller value. The model output for an out-of-distribution sample should be equiprobable, indicating that the prediction is unreliable; that is, all distribution parameters in $\alpha_t$ are set to 1.
Preferably, step S9 specifically comprises the following steps:
S91. Uncertainty in fault diagnosis is measured by entropy, so the prediction uncertainty is expressed as:
$$H\big[p(y \mid x,X,Y)\big]=-\sum_{c=1}^{C}p(y=c \mid x,X,Y)\log p(y=c \mid x,X,Y)$$
where $H[\cdot]$ denotes the entropy of a distribution and $p(y=c \mid x,X,Y)$ is the predicted probability of category c, which is obtained by sampling S sets of model parameters $\omega_s$ from $q_\theta(\omega)$:
$$p(y=c \mid x,X,Y)\approx\frac{1}{S}\sum_{s=1}^{S}p(y=c \mid x,\omega_s)$$
The prediction uncertainty is further decomposed into:
$$H\big[p(y \mid x,X,Y)\big]=I\big[y,\mu \mid x,X,Y\big]+E_{\mu \mid x,X,Y}\big[H[p(y \mid \mu)]\big]$$
where $I[\cdot]$ denotes mutual information, $E_{\mu \mid x,X,Y}[\cdot]$ denotes the expectation under the distribution $p(\mu \mid x,X,Y)$, and $p(\mu \mid x,X,Y)=\int p(\mu \mid x,\omega)\,p(\omega \mid X,Y)\,d\omega$. Here $E_{\mu \mid x,X,Y}\big[H[p(y \mid \mu)]\big]$ measures the inherent uncertainty in fault diagnosis, and the mutual information $I[y,\mu \mid x,X,Y]$ measures the cognitive and distribution uncertainty together; the distribution uncertainty alone is measured by the expected differential entropy of the predicted Dirichlet distribution;
S92. For fault monitoring test data $x^{*}$, S Monte Carlo dropout forward passes are executed in the testing stage to obtain a set of sampled values;
S93. Calculate the fault prediction probability and obtain the final predicted category and the uncertainty quantification results: given a threshold $\varepsilon$, if the distribution uncertainty of $x^{*}$ exceeds $\varepsilon$, then $x^{*}$ is a sample outside the fault monitoring data distribution, and its final predicted category and other uncertainty quantification results need not be calculated; otherwise $x^{*}$ is a sample within the fault monitoring data distribution, and its final predicted fault category $c^{*}$ is:
$$c^{*}=\arg\max_{c}\,p(y=c \mid x^{*},X,Y)$$
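The Monte Carlo prediction and entropy decomposition of S91 to S93 can be sketched as follows. This is a simplified illustration: each sampled forward pass is represented only by its class-probability vector (for example the Dirichlet mean), so the spread within each Dirichlet distribution is ignored and the expected-differential-entropy test against the threshold is not shown. The names `entropy` and `mc_predict` are hypothetical, and classes are 0-based.

```python
import math


def entropy(p):
    """Shannon entropy of a categorical distribution (natural log)."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mc_predict(sampled_probs):
    """Combine S Monte Carlo dropout samples of class probabilities."""
    s = len(sampled_probs)
    num_classes = len(sampled_probs[0])
    mean = [sum(p[c] for p in sampled_probs) / s for c in range(num_classes)]
    total = entropy(mean)                                  # prediction uncertainty
    expected = sum(entropy(p) for p in sampled_probs) / s  # expected entropy
    return {
        "probs": mean,
        "predicted_class": max(range(num_classes), key=mean.__getitem__),
        "predictive_entropy": total,
        "expected_entropy": expected,
        "mutual_information": total - expected,
    }


# Example: two passes that disagree completely give maximal mutual information.
result = mc_predict([[1.0, 0.0], [0.0, 1.0]])
```

When all sampled passes agree, the mutual information is zero and the prediction uncertainty is entirely the expected entropy; disagreement between passes shows up as mutual information.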
Compared with the prior art, the technical effects of this scheme are as follows:
Compared with traditional fault diagnosis methods, the deep-learning-based uncertainty quantification method for equipment fault diagnosis provided by the invention jointly considers the combined influence of inherent, cognitive and distribution uncertainty during uncertainty quantification, and therefore quantifies the uncertainty in fault diagnosis more effectively and accurately. The uncertainty calibration method for Bayesian-deep-learning-based fault diagnosis of electromechanical equipment builds on the specific diagnosis task and the adaptive calibration error to propose an uncertainty calibration loss, which provides an important basis for accurate quantification of uncertainty in electromechanical equipment fault diagnosis. For the trained network, the quantification results for the various uncertainties are obtained by combining uncertainty decomposition with Monte Carlo sampling, which effectively improves both the diagnosis accuracy and the uncertainty quantification accuracy.
Drawings
FIG. 1 is a flowchart of the deep-learning-based method of the present invention for quantifying and calibrating uncertainty in fault diagnosis of electromechanical equipment;
FIG. 2 is a diagram of the prediction network constructed for the bearing data sets;
FIG. 3 shows the statistics of the distribution uncertainty obtained by the present invention on the out-of-distribution and in-distribution test data of the PU data set;
FIG. 4 shows the statistics of the distribution uncertainty obtained by the present invention on the out-of-distribution and in-distribution test data of the IMS data set.
Detailed Description
The present application is described in further detail below with reference to the drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the invention and do not restrict it. For convenience of description, only the portions related to the invention are shown in the drawings. The embodiments and the features of the embodiments in the present application may be combined with each other where no conflict arises.
FIG. 1 shows a deep-learning-based method for quantifying and calibrating uncertainty in fault diagnosis of electromechanical equipment, which comprises the following steps:
S1. Preprocess the monitoring signals to obtain a basic data set: two bearing data sets are analyzed and preprocessed. The first is the Paderborn University (PU) data set, produced and provided by Paderborn University, Germany. It was collected on a modular test rig and covers four operating conditions defined by three operating parameters: rotational speed, load torque and radial force. In this data set, the monitoring data of each bearing under each operating condition consists of 20 data files, and each data file stores vibration signals and motor current data sampled at 64 kHz for 4 seconds. Because the vibration signal effectively reflects the health state of the bearing in fault diagnosis, only the vibration signal data is selected for analysis. The bearings in the data set fall into three categories according to the cause of the fault: healthy, artificially damaged, and damaged by accelerated lifetime testing. The condition monitoring data of the healthy and artificially damaged bearings under the operating condition of rotational speed 900 rpm, load torque 0.7 Nm and radial force 1000 N is selected as the experimental data set. The faults of the artificially damaged bearings were produced by machining, with the following damage methods: electrical discharge machining, drilling and electric engraving. In addition, the damage location and damage severity differ between bearings; details of the selected bearings are given in Table 1. Bearings damaged by electrical discharge machining are selected as out-of-distribution samples and bearings with the other damage methods as in-distribution samples, and the proportion of the samples in a training set, a verification set and a test set in the in-distribution and out-of-distribution fault monitoring data is set to be 7.
TABLE 1
The second is the Intelligent Maintenance Systems (IMS) data set, provided by the Center for Intelligent Maintenance Systems in the United States, which contains three sub-data sets. Each sub-data set consists of multiple data files, and each file stores the vibration signals of 4 bearings sampled at 20 kHz for 1 second. The data of bearing 1 in the second sub-data set is selected for the experiment; it comprises 984 data files acquired at intervals of 10 minutes. The bearing data is grouped according to the degree of degradation of the bearing, as shown in Table 2. Similarly, the late-degradation and fault data are selected as out-of-distribution samples and the rest as in-distribution samples, and the proportion of samples in the training set, validation set and test set is 7. Compared with the PU data set, this experimental data set describes a complete life cycle of the bearing, and the faults arise during operation rather than from artificial machining. The data set is therefore closer to reality and has higher observation noise. In addition, the goal of fault diagnosis on this data set is to determine the current state of health of the bearing in order to develop a maintenance and safety strategy.
TABLE 2
S2, determining the type and scale of the deep neural network: a recurrent neural network (RNN) is selected as the basic framework according to the time-series characteristics of the bearing monitoring signals. Time-domain features of the vibration signal, such as the mean, kurtosis, skewness and root-mean-square value, are used as the input of the RNN, which extracts further features and makes the prediction. To capture long-term dependencies, two RNN variants, the long short-term memory (LSTM) network and the gated recurrent unit (GRU), are selected to build the model. To verify the generality of the proposed method, LSTM and GRU are used to model the PU and IMS data sets respectively, and multiple recurrent layers are stacked to obtain stronger expressive power. A fully connected layer is attached to the last time step of the last recurrent layer to output the parameters of the Dirichlet distribution. Because the Dirichlet parameters cannot be negative, the output layer uses an exponential activation function, as shown in Fig. 2. After repeated training and validation, three LSTM or GRU layers are selected to build the network, with 128, 64 and 32 neurons respectively.
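The time-domain features named in S2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the non-overlapping windowing and the use of population (divide-by-n) moments are all assumptions.

```python
import math

def time_domain_features(window):
    """Return (mean, rms, skewness, kurtosis) of one vibration window."""
    n = len(window)
    mean = sum(window) / n
    rms = math.sqrt(sum(v * v for v in window) / n)
    # Central moments in population form (dividing by n); an assumption.
    m2 = sum((v - mean) ** 2 for v in window) / n
    m3 = sum((v - mean) ** 3 for v in window) / n
    m4 = sum((v - mean) ** 4 for v in window) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0  # non-excess kurtosis
    return mean, rms, skewness, kurtosis

def to_feature_sequence(signal, window_size):
    """Split a signal into non-overlapping windows and featurize each one.

    The resulting list of feature tuples is the kind of sequence that
    could be fed to the recurrent network, one tuple per time step.
    """
    return [time_domain_features(signal[i:i + window_size])
            for i in range(0, len(signal) - window_size + 1, window_size)]
```

With a sequence length of 30, for example, 30 consecutive windows would form one input sequence for the network.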
S3, establishing a Bayesian deep neural network for fault diagnosis: Concrete Dropout is applied to each layer of the network so that it approximates a Bayesian deep neural network. To keep the Bayesian inference valid, variational dropout is used when applying dropout, that is, the dropout mask is the same at every time step. In addition, probability distributions are selected to capture the inherent, cognitive and distribution uncertainty; these distributions are constructed with the Bayesian deep neural network, and a method for estimating them is determined.
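The requirement that the dropout mask be identical at every time step can be illustrated with a small sketch. The function names are hypothetical, and the dropout probability is fixed here for simplicity, whereas Concrete Dropout would learn it during training.

```python
import random

def sample_mask(size, drop_prob, rng):
    """Sample one Bernoulli keep/drop mask, rescaled by 1 / keep probability."""
    keep = 1.0 - drop_prob
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(size)]

def apply_variational_dropout(sequence, drop_prob, rng):
    """Apply the SAME mask to every time step of one input sequence.

    sequence: list of time steps, each a list of feature values.
    Returns the masked sequence and the mask that was used.
    """
    mask = sample_mask(len(sequence[0]), drop_prob, rng)
    masked = [[x * m for x, m in zip(step, mask)] for step in sequence]
    return masked, mask
```

A new mask is drawn for each sequence (and for each Monte Carlo pass at test time), but never between time steps of the same sequence.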
S31, the quantification of cognitive uncertainty is integrated into the network: the posterior distribution of the model parameters is estimated in the Bayesian neural network by variational inference combined with Monte Carlo sampling.
S311, capturing cognitive uncertainty with the posterior distribution of the parameters of the electromechanical equipment fault diagnosis model: for historical data $X=\{x_n\}_{n=1}^{N}$ and $Y=\{y_n\}_{n=1}^{N}$, where $X$ is the fault observation data and $Y$ is the fault category labels, a classification task with $C$ categories has $y_n\in\{1,2,\dots,C\}$; $x_n$ denotes a single fault observation and $N$ denotes the number of observations. The cognitive uncertainty in fault diagnosis is modeled as the posterior distribution $p(\omega\mid X,Y)$, where $\omega$ denotes the model parameters;
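S312, by means of variational inference, an inferred distribution $q_\theta(\omega)$ is introduced to approximate $p(\omega\mid X,Y)$, and the KL divergence is used to measure the distance between $q_\theta(\omega)$ and the true posterior distribution $p(\omega\mid X,Y)$:
$$\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega\mid X,Y)\big)=\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega)\big)-\int q_\theta(\omega)\log p(Y\mid X,\omega)\,d\omega \qquad (1)$$
where $p(Y\mid X,\omega)$ denotes the likelihood function on the historical data set, $p(\omega)$ denotes the prior distribution of the weights, $\mathrm{KL}(\cdot\,\|\,\cdot)$ denotes the KL divergence, and $\theta$ denotes the variational parameters. Equation (1) holds up to the additive constant $\log p(Y\mid X)$, which does not depend on $\theta$. $\theta$ is optimized to minimize $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$ and thereby obtain an estimate of the posterior distribution;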
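The relation in equation (1) follows from Bayes' rule in a few lines. The short derivation below is a sketch using only the symbols defined in S311 and S312; it makes explicit the constant term $\log p(Y\mid X)$, which does not depend on $\theta$ and therefore does not affect the minimization.

```latex
\begin{aligned}
\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega\mid X,Y)\big)
  &= \int q_\theta(\omega)\,\log\frac{q_\theta(\omega)}{p(\omega\mid X,Y)}\,d\omega \\
  &= \int q_\theta(\omega)\,\log\frac{q_\theta(\omega)\,p(Y\mid X)}{p(Y\mid X,\omega)\,p(\omega)}\,d\omega \\
  &= \mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega)\big)
     - \int q_\theta(\omega)\,\log p(Y\mid X,\omega)\,d\omega
     + \log p(Y\mid X)
\end{aligned}
```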
S313, for a deep neural network with $L$ layers and $K_l$ units in layer $l$, the model parameters $\omega$ are expressed as:
$$\omega=\{W_l\}_{l=1}^{L}$$
where $W_l$ denotes the model parameters of the $l$-th layer of the deep neural network;
S314, Concrete Dropout is applied to the deep neural network so that it approximates a Bayesian deep neural network, that is, the fixed model parameters $\omega$ are treated as random variables that follow the inferred distribution:
$$\omega\sim q_\theta(\omega)$$
where the variational parameters $\theta$ are:
$$\theta=\{M_l,\,p_l\}_{l=1}^{L}$$
where $M_l$ denotes the mean weight matrix of dimension $K_{l+1}\times K_l$ and $p_l$ denotes the dropout probability of the $l$-th layer. The inferred distribution of each layer is expressed as:
$$q_{M_l}(W_l)=M_l\cdot\operatorname{diag}\big[\mathrm{Bernoulli}(1-p_l)^{K_l}\big]$$
where $\mathrm{Bernoulli}(\cdot)$ denotes the Bernoulli distribution;
S315, the prior distribution of the fault diagnosis model parameters $\omega$ is selected as:
$$p(\omega)=\prod_{l=1}^{L}p(W_l)$$
where $p(W_l)$ denotes the prior distribution of the model parameters of the $l$-th layer of the deep neural network, and $\upsilon_l$ denotes a control parameter representing the smoothness of the function;
S316, combining the Monte Carlo sampling method, an analytical expression of $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$ is obtained:
where $p(y_i\mid x_i,\omega)$ denotes the likelihood function of each sample, and $h(p_l)$ denotes the entropy of a Bernoulli random variable, with $h(p_l)=-p_l\log p_l-(1-p_l)\log(1-p_l)$;
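A form consistent with the definitions in S314 to S316 can be written down under one assumption: that the method follows the standard Concrete Dropout formulation, with a zero-mean Gaussian prior whose scale is set by $\upsilon_l$. The source does not confirm this, so the expressions below are a hedged sketch that holds only up to additive constants and mini-batch scaling. The symbol $\hat{\omega}$ is hypothetical and denotes one sample drawn from $q_\theta(\omega)$.

```latex
% Assumed Gaussian prior on each layer (Concrete Dropout convention)
p(W_l) = \mathcal{N}\!\left(0,\ \upsilon_l^{-2} I\right)

% Assumed single-sample Monte Carlo estimate of the objective
\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega\mid X,Y)\big)
  \approx \sum_{l=1}^{L}\left[\frac{\upsilon_l^{2}\,(1-p_l)}{2}\,\lVert M_l\rVert^{2} - K_l\,h(p_l)\right]
  - \sum_{i=1}^{N}\log p\big(y_i\mid x_i,\hat{\omega}\big),
  \qquad \hat{\omega}\sim q_\theta(\omega)
```

In this reading, the first sum regularizes the weights and rewards dropout probabilities with high entropy, while the second sum is the data-fit term estimated by sampling.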
S32, the quantification of distribution uncertainty is integrated into the network: KL divergences are constructed separately for the in-distribution and out-of-distribution fault monitoring data sets in order to estimate the corresponding probability distributions.
S321, the distribution uncertainty of the input fault monitoring data $x$ is captured by placing a Dirichlet distribution over the output probabilities $\mu$ under the model parameters $\omega$:
$$p(\mu\mid x,\omega)=\mathrm{Dir}\big(\mu\mid\alpha(x,\omega)\big)=\frac{\Gamma(\alpha_0)}{\prod_{c=1}^{C}\Gamma(\alpha_c)}\prod_{c=1}^{C}\mu_c^{\alpha_c-1}$$
where $\Gamma(\cdot)$ denotes the gamma function, $\mathrm{Dir}(\cdot)$ denotes the Dirichlet distribution, and $\mu=[\mu_1,\dots,\mu_C]^T=[p(y=1),\dots,p(y=C)]^T$ denotes the class probabilities output by the model, where $\mu_1$ is the output probability of class 1, $\mu_C$ is the output probability of class $C$, and so on. $\alpha(x,\omega)=[\alpha_1(x,\omega),\dots,\alpha_C(x,\omega)]^T>0$ are the parameters of the Dirichlet distribution, with $\alpha_c$ written as shorthand for $\alpha_c(x,\omega)$, and $\alpha_0=\sum_{c=1}^{C}\alpha_c$ is the precision parameter of the Dirichlet distribution, which reflects its dispersion: the larger the value, the more concentrated the distribution;
S322, for the in-distribution fault monitoring data $\{x_n\}_{n=1}^{N}$ and the out-of-distribution fault monitoring data $\{x_t\}_{t=1}^{T}$, the corresponding predicted Dirichlet distributions are $\mathrm{Dir}(\mu\mid\alpha(x_n,\omega))$ and $\mathrm{Dir}(\mu\mid\alpha(x_t,\omega))$. To give the model the ability to discriminate between in-distribution and out-of-distribution fault monitoring data, a more concentrated Dirichlet distribution $\mathrm{Dir}(\mu\mid\alpha_n)$ and a more dispersed Dirichlet distribution $\mathrm{Dir}(\mu\mid\alpha_t)$ are selected as the target distributions for the in-distribution input $x_n$ and the out-of-distribution input $x_t$ respectively, and the KL divergence is used to measure the distance between the target distribution and the predicted Dirichlet distribution:
for intra-distribution fault monitoring data:
for out-of-distribution fault monitoring data:
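where $T$ is the sample size of the out-of-distribution fault monitoring data set. For any two Dirichlet distributions with parameters $\alpha=[\alpha_1,\dots,\alpha_C]^T$ and $\beta=[\beta_1,\dots,\beta_C]^T$, the KL divergence has an analytic form:
$$\mathrm{KL}\big(\mathrm{Dir}(\mu\mid\alpha)\,\|\,\mathrm{Dir}(\mu\mid\beta)\big)=\ln\Gamma(\alpha_0)-\ln\Gamma(\beta_0)+\sum_{c=1}^{C}\big[\ln\Gamma(\beta_c)-\ln\Gamma(\alpha_c)\big]+\sum_{c=1}^{C}(\alpha_c-\beta_c)\big[\psi(\alpha_c)-\psi(\alpha_0)\big]$$
where $\psi(\cdot)$ is the digamma function, and $\alpha_0=\sum_{c=1}^{C}\alpha_c$ and $\beta_0=\sum_{c=1}^{C}\beta_c$ denote the precision parameters of the two Dirichlet distributions;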
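The analytic KL divergence between two Dirichlet distributions can be evaluated with the standard library alone. The sketch below uses the standard closed form for this divergence; the function names are hypothetical, and the digamma function is approximated by a recurrence plus an asymptotic series because the Python standard library does not provide one.

```python
import math

def digamma(x):
    """Digamma function for x > 0 via recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:  # shift upward using psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def dirichlet_kl(alpha, beta):
    """KL(Dir(alpha) || Dir(beta)) for two positive parameter vectors."""
    a0, b0 = sum(alpha), sum(beta)
    kl = math.lgamma(a0) - math.lgamma(b0)
    for a, b in zip(alpha, beta):
        kl += math.lgamma(b) - math.lgamma(a)
        kl += (a - b) * (digamma(a) - digamma(a0))
    return kl
```

For two classes the result can be checked by hand: `dirichlet_kl([2.0, 1.0], [1.0, 1.0])` is the divergence between the densities $2\mu$ and 1 on $[0,1]$, which equals $\ln 2 - 1/2$.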
S33, the quantification of inherent uncertainty is integrated into the network to obtain the KL divergence of the Bayesian neural network.
S331, the inherent uncertainty in the fault diagnosis process is captured by a categorical distribution over the predicted classes:
$$p(y\mid\mu)=\mathrm{Cat}(y\mid\mu)=\prod_{c=1}^{C}\mu_c^{\,I\{y=c\}}$$
where $I\{\cdot\}$ denotes the indicator function and $\mathrm{Cat}(\cdot)$ denotes the categorical distribution;
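S332, calculating the likelihood function:
$$p(y_n\mid x_n,\omega)=\int p(y_n\mid\mu)\,p(\mu\mid x_n,\omega)\,d\mu=\hat{\mu}_{y_n}$$
where $\hat{\mu}=\alpha(x_n,\omega)/\alpha_0$ is the mean of $\mathrm{Dir}(\mu\mid\alpha(x_n,\omega))$; then $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$ is expressed as: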
Step S3 is a key inventive point of the invention: a Bayesian deep learning network is constructed to model the inherent, cognitive and distribution uncertainty in fault diagnosis, and the method for estimating the corresponding distributions is determined, which provides an important basis for quantifying the uncertainty.
S4, establishing the calibration loss: the adaptive calibration error is selected to evaluate the calibration loss. For the PU data set and the IMS data set, the number of intervals $M$ is set to 10 and 20 respectively, and the number of categories $C$ is 7 and 3 respectively.
S41, defining calibration of uncertainty in the fault diagnosis process: a model is calibrated when the predicted probability of a fault category reflects the accuracy of the prediction. When the fault diagnosis model is calibrated, among a set of samples whose predicted probability for a fault category is $\kappa$, a fraction $\kappa$ should be correctly classified. Thus, for each category $c\in\{1,\dots,C\}$, the following holds:
$$P\big(Y=c\mid\hat{p}_c(X)=\kappa\big)=\kappa$$
where $Y\mid X\sim P$, and $\hat{p}_c(X)$ is the predicted probability of category $c$ for the input $X$;
S42, evaluating the uncertainty calibration error in fault diagnosis: the adaptive calibration error is selected as the evaluation index for quantifying the calibration error. For each category, the adaptive calibration error adjusts the probability intervals according to the number of samples so that every interval contains the same number of samples, and the accuracy and the average predicted probability are calculated within each interval. If the two are approximately equal, the model is considered calibrated. For category $c$ with $M$ probability intervals, the $m$-th probability interval is denoted $U_{c,m}$; the intervals are obtained by sorting the predicted probabilities of category $c$, and each interval has a corresponding sample set. The accuracy within the interval is then calculated:
and average prediction probability:
the adaptive calibration error is obtained by averaging the calibration errors for each class and interval:
The adaptive calibration error takes the predicted probabilities of all categories into account, not only that of the final predicted category, and therefore gives a more comprehensive evaluation of the calibration error of a multi-class task. In addition, when the sample size of the fault monitoring data is small or the predicted probabilities are concentrated near 0 and 1, the adaptive calibration error adjusts the length of the probability intervals so that each interval contains enough samples, and the calibration error is evaluated objectively;
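The procedure described in S42 can be sketched as follows: for each category, sort the samples by predicted probability, cut them into equal-count intervals, and average the gap between accuracy and mean predicted probability. This is a minimal sketch of the commonly used adaptive calibration error and may differ in detail from the formulas of the source; the function name and the handling of empty intervals are assumptions.

```python
def adaptive_calibration_error(probs, labels, num_bins):
    """Adaptive calibration error with equal-count probability intervals.

    probs:  list of per-sample class-probability lists, shape N x C.
    labels: list of N integer class ids in 0..C-1.
    """
    n = len(probs)
    num_classes = len(probs[0])
    total, count = 0.0, 0
    for c in range(num_classes):
        # Sort sample indices by the predicted probability of class c.
        order = sorted(range(n), key=lambda i: probs[i][c])
        for m in range(num_bins):
            # Integer slice boundaries give intervals of (almost) equal size.
            lo, hi = m * n // num_bins, (m + 1) * n // num_bins
            members = order[lo:hi]
            if not members:  # assumption: skip empty intervals
                continue
            acc = sum(1 for i in members if labels[i] == c) / len(members)
            conf = sum(probs[i][c] for i in members) / len(members)
            total += abs(acc - conf)
            count += 1
    return total / count
```

A model that always predicts 0.5 for both classes on data whose labels are all class 0 has an error of 0.5 with a single interval, while perfectly confident and correct predictions give 0.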
S43, constructing the uncertainty calibration loss in fault diagnosis: to improve the accuracy of uncertainty quantification during model training, a calibration loss, evaluated by a calibration error, is integrated into the overall loss function. The Bayesian deep neural network is trained with mini-batch gradient descent, and a single mini-batch may not contain enough fault monitoring samples to evaluate the calibration loss; the adaptive calibration error is therefore selected to evaluate the calibration loss:
where $\mathrm{CL}(\cdot)$ denotes the uncertainty calibration loss in fault diagnosis;
Step S4 is a key inventive point of the invention: an uncertainty calibration loss for fault diagnosis is proposed on the basis of the adaptive calibration error, which provides an important basis for accurately quantifying uncertainty in fault diagnosis.
S5, determining the loss function: the loss function of the fault diagnosis model is constructed.
S51, determining the loss function on the in-distribution fault monitoring data set.
S511, for the in-distribution fault monitoring data, the cognitive and inherent uncertainty in fault diagnosis are modeled by minimizing $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$, and the distribution uncertainty is modeled by minimizing the KL divergence between the target Dirichlet distribution and the predicted Dirichlet distribution. The loss function for uncertainty quantification in electromechanical equipment fault diagnosis based on Bayesian deep learning is therefore:
where $\lambda$ is a hyper-parameter controlling the contribution of the distribution-uncertainty loss to the total loss. Because the predicted Dirichlet distribution of in-distribution fault monitoring data should be concentrated, the precision parameter of the target distribution $\mathrm{Dir}(\mu\mid\alpha_n)$ should be large. To ensure prediction accuracy, the mean of the target distribution should equal the corresponding 0-1 label, and the precision parameter is treated as a constant independent of the input. However, this would set most of the parameters in $\alpha_n$ to 0, which makes the corresponding KL divergence difficult to optimize. The problem is avoided by assigning a small probability to the regions of the Dirichlet distribution whose probability density would otherwise be 0:
where $\xi$ is a small smoothing factor, from which the values of all parameters of the Dirichlet distribution are obtained;
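One way to build such a smoothed target is ordinary label smoothing scaled by a precision value. The sketch below is an assumption, since the exact smoothing formula of the source is not available; the function name and the argument `precision` (standing for the precision parameter of the target distribution) are hypothetical.

```python
def target_dirichlet(label, num_classes, precision, xi):
    """Target Dirichlet parameters for one in-distribution sample.

    The target mean is a smoothed one-hot vector: xi for every wrong
    class and 1 - (C - 1) * xi for the true class, so it still sums to 1.
    Multiplying by the precision gives strictly positive parameters.
    This smoothing scheme is an assumption, not the source's formula.
    """
    mean = [xi] * num_classes
    mean[label] = 1.0 - (num_classes - 1) * xi
    return [precision * m for m in mean]
```

With three classes, a precision of 100 and $\xi = 0.01$, the true class receives a parameter of about 98 and each other class about 1, so no parameter is 0 and the KL divergence stays finite.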
S512, the uncertainty calibration loss in fault diagnosis is integrated into the overall loss so that uncertainty is calibrated while the fault diagnosis model is trained; the loss function for in-distribution fault monitoring data is written as:
where $\gamma$ is a hyper-parameter controlling the contribution of the calibration loss to the total loss;
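Read together, S511 and S512 describe the in-distribution objective as a weighted sum of three terms. A hedged way to write it is given below, where $\mathcal{L}_{\mathrm{in}}$ and $\mathcal{L}^{\mathrm{Dir}}_{\mathrm{in}}$ are hypothetical names for the total in-distribution loss and for the Dirichlet KL term on in-distribution data; the exact normalization of each term is not specified here.

```latex
\mathcal{L}_{\mathrm{in}}(\theta)
  = \mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega\mid X,Y)\big)
  + \lambda\,\mathcal{L}^{\mathrm{Dir}}_{\mathrm{in}}
  + \gamma\,\mathrm{CL}
```

Setting $\gamma=0$ would recover the uncalibrated uncertainty quantification loss of S511.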
S52, determining the loss function on the out-of-distribution fault monitoring data set.
S521, for the out-of-distribution fault monitoring data, the cognitive and inherent uncertainty do not need to be modeled; only the distribution uncertainty is modeled so that these data can be distinguished from in-distribution fault monitoring data. Because the predictions and uncertainty estimates for out-of-distribution data are not trustworthy, no uncertainty calibration is applied to them, and the corresponding loss function is expressed as:
The predicted Dirichlet distribution of out-of-distribution fault monitoring data should be more dispersed, so the precision parameter of the target distribution $\mathrm{Dir}(\mu\mid\alpha_t)$ takes a small value. The model output for an out-of-distribution sample should be equiprobable, indicating that the prediction is not trustworthy; that is, all distribution parameters in $\alpha_t$ are set to 1;
After repeated training and validation, the hyper-parameters selected for the loss function, such as the control parameter of each layer and the hyper-parameters controlling the contribution of each loss term, are shown in Table 3.
TABLE 3
S6, determining the hyper-parameters of the fault diagnosis model: for the PU data set and the IMS data set, the selected sequence lengths are 30 and 20 respectively, and the initial dropout probability of the network is set to 0.2 in both cases. The Adam algorithm is selected to optimize the model on the in-distribution and out-of-distribution fault monitoring data sets. Various hyper-parameter combinations are tried using a trial-and-error strategy combined with grid search, and the optimal combination, selected through training and validation, is shown in Table 4.
TABLE 4
S7, training the electromechanical equipment fault diagnosis model: the model is trained alternately on the in-distribution and out-of-distribution fault monitoring data sets, and the loss on the validation set is recorded after each training cycle. If the minimum validation loss has not decreased for two consecutive cycles, the model is considered to have converged, the loop is terminated, and the final model is saved;
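The alternating training loop and its stopping rule can be sketched as below. The three callables are hypothetical placeholders for one training pass on in-distribution data, one pass on out-of-distribution data, and one validation pass; the interpretation of "two consecutive cycles" as a patience of 2 is an assumption.

```python
def has_converged(val_losses, patience=2):
    """True if the best validation loss did not improve in the last cycles."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return min(val_losses[-patience:]) >= best_before

def train(run_in_distribution_epoch, run_out_of_distribution_epoch,
          validate, max_cycles=100):
    """Alternate between the two data sets until validation loss stalls.

    Returns the list of validation losses, one per completed cycle.
    """
    val_losses = []
    for _ in range(max_cycles):
        run_in_distribution_epoch()       # minimize the in-distribution loss
        run_out_of_distribution_epoch()   # minimize the out-of-distribution loss
        val_losses.append(validate())
        if has_converged(val_losses):
            break
    return val_losses
```

In practice the model weights with the lowest validation loss would be saved inside the loop; that bookkeeping is omitted here.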
S8, judging whether the electromechanical equipment fault diagnosis model has converged: step S9 is executed when the change in the parameters of the optimal model before and after training is smaller than a specified threshold;
S9, outputting the fault diagnosis result and the uncertainty quantification results of the electromechanical equipment: the trained model outputs the fault diagnosis result together with the quantified inherent uncertainty, cognitive uncertainty, distribution uncertainty and prediction uncertainty of that result.
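S91, uncertainty is measured by entropy, and the prediction uncertainty is described as:
$$H\big[p(y\mid x,X,Y)\big]=-\sum_{c=1}^{C}p(y=c\mid x,X,Y)\log p(y=c\mid x,X,Y)$$
and the prediction uncertainty is further decomposed into:
$$H\big[p(y\mid x,X,Y)\big]=I[y,\mu\mid x,X,Y]+\mathbb{E}_{\mu\mid x,X,Y}\big[H[p(y\mid\mu)]\big]$$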
wherein,
where $\mathbb{E}_{\mu\mid x,X,Y}\big[H[p(y\mid\mu)]\big]$ measures the inherent uncertainty, and the mutual information $I[y,\mu\mid x,X,Y]$ measures the cognitive and distribution uncertainty together. In addition, the distribution uncertainty alone is measured by the expected differential entropy:
S92, for test data $x^*$, Monte Carlo dropout is executed $S$ times in the testing stage to obtain a set of sampled values:
S93, calculating the prediction probability and obtaining the final predicted class and the uncertainty quantification results:
Given a threshold $\varepsilon$, if the distribution uncertainty of $x^*$ exceeds $\varepsilon$, then $x^*$ is an out-of-distribution sample; in this case, the final predicted class and the other uncertainty quantification results do not need to be computed. Otherwise, $x^*$ is an in-distribution sample, and its final predicted class $c^*$ is:
In the testing stage, Monte Carlo dropout is repeated 1000 times to obtain 1000 sets of sampled values of the Dirichlet distribution, from which the fault diagnosis and uncertainty quantification results are calculated. Table 5 reports the mean uncertainty, accuracy and adaptive calibration error for each class of in-distribution test samples. Figs. 3 and 4 show the statistics of the distribution uncertainty of the in-distribution and out-of-distribution test samples.
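Steps S91 to S93 can be sketched with the standard library, given the Dirichlet parameter vectors produced by the Monte Carlo dropout passes. The closed forms used are standard results for the Dirichlet distribution. The function names, the dictionary keys and the direction of the threshold test (larger expected differential entropy means out-of-distribution) are assumptions.

```python
import math

def digamma(x):
    """Digamma for x > 0: recurrence up to x >= 6, then asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def entropy(p):
    """Shannon entropy of a categorical distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)

def expected_categorical_entropy(alpha):
    """E over Dir(alpha) of H[Cat(mu)], in closed form."""
    a0 = sum(alpha)
    return -sum(a / a0 * (digamma(a + 1) - digamma(a0 + 1)) for a in alpha)

def dirichlet_differential_entropy(alpha):
    """Differential entropy of Dir(alpha)."""
    a0 = sum(alpha)
    log_b = sum(math.lgamma(a) for a in alpha) - math.lgamma(a0)
    return (log_b + (a0 - len(alpha)) * digamma(a0)
            - sum((a - 1) * digamma(a) for a in alpha))

def summarize_uncertainty(alpha_samples):
    """alpha_samples: S Dirichlet parameter vectors from S MC-dropout passes."""
    s, c = len(alpha_samples), len(alpha_samples[0])
    mean_probs = [sum(a[k] / sum(a) for a in alpha_samples) / s for k in range(c)]
    prediction = entropy(mean_probs)
    inherent = sum(expected_categorical_entropy(a) for a in alpha_samples) / s
    distribution = sum(dirichlet_differential_entropy(a) for a in alpha_samples) / s
    return {"mean_probs": mean_probs, "prediction": prediction,
            "inherent": inherent,
            "mutual_information": prediction - inherent,
            "distribution": distribution}

def diagnose(summary, epsilon):
    """Return None for an out-of-distribution sample, else the class index."""
    if summary["distribution"] > epsilon:  # assumed direction of the test
        return None
    probs = summary["mean_probs"]
    return max(range(len(probs)), key=lambda k: probs[k])
```

For a flat Dirichlet with all parameters equal to 1 and three classes, the prediction entropy is $\ln 3$ and the differential entropy is $-\ln 2$, which is the largest value this measure can take for three classes.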
TABLE 5
Finally, it should be noted that although the invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made without departing from the spirit and scope of the invention, and all such modifications and substitutions are intended to fall within the scope of the claims of the invention.
Claims (5)
1. A method for quantitatively calibrating uncertainty in fault diagnosis of electromechanical equipment based on deep learning is characterized by comprising the following steps:
S1, preprocessing the fault monitoring signal of the electromechanical equipment to obtain a basic data set: the preprocessing includes signal screening, feature extraction, data normalization and set division, and the resulting in-distribution and out-of-distribution fault monitoring data sets are each divided into a training set, a validation set and a test set;
S2, determining the type and scale of the fault diagnosis deep neural network: a deep neural network is selected according to the characteristics of the equipment fault monitoring signal, and the scale of the network, including the number of neurons and the number of network layers, is determined according to the size of the data set;
S3, constructing a Bayesian deep neural network for electromechanical equipment fault diagnosis: probability distributions are selected to capture the inherent, cognitive and distribution uncertainty in fault diagnosis, the probability distributions are constructed with the Bayesian deep neural network, and a method for estimating the distributions is determined;
S31, integrating the quantification of cognitive uncertainty in fault diagnosis into the fault diagnosis network, and estimating the posterior distribution of the model parameters in the Bayesian neural network by variational inference combined with Monte Carlo sampling;
S32, integrating the quantification of distribution uncertainty in fault diagnosis into the fault diagnosis network, and constructing KL divergences separately for the in-distribution and out-of-distribution fault monitoring data sets so as to determine the corresponding probability distributions;
S33, integrating the quantification of inherent uncertainty in fault diagnosis into the fault diagnosis network to obtain the KL divergence of the Bayesian neural network;
S4, constructing the uncertainty calibration loss in fault diagnosis: a calibration evaluation index is determined and used to evaluate the calibration loss, and the calibration loss is integrated into the overall loss function so that uncertainty in the electromechanical equipment fault diagnosis is calibrated while it is quantified;
S5, determining the loss function: constructing the loss function of the whole model;
S51, determining the loss function on the in-distribution fault monitoring data set;
S52, determining the loss function on the out-of-distribution fault monitoring data set;
S6, determining the hyper-parameters of the electromechanical equipment fault diagnosis model: the hyper-parameters of the model, including the learning rate and the batch size, are determined through a trial-and-error strategy, and the optimal combination of hyper-parameters is determined through grid search;
S7, training the electromechanical equipment fault diagnosis model: an optimization method is selected and, with the selected hyper-parameters, the model is trained on the in-distribution fault monitoring data set and the out-of-distribution fault monitoring data set in turn;
S8, judging whether the electromechanical equipment fault diagnosis model has converged: judging whether the change in the parameters of the optimal model before and after training is smaller than a specified threshold; if so, executing step S9, otherwise executing step S7;
S9, outputting the fault diagnosis result and the uncertainty quantification results of the electromechanical equipment: the trained model outputs the fault diagnosis result together with the quantified inherent uncertainty, cognitive uncertainty, distribution uncertainty and prediction uncertainty of that result.
2. The method for quantifying and calibrating the uncertainty in the deep learning based fault diagnosis of the electromechanical device according to claim 1, wherein the step S31 specifically comprises the following steps:
S311, capturing cognitive uncertainty with the posterior distribution of the parameters of the electromechanical equipment fault diagnosis model: for historical data $X=\{x_n\}_{n=1}^{N}$ and $Y=\{y_n\}_{n=1}^{N}$, where $X$ is the fault observation data and $Y$ is the fault category labels, a classification task with $C$ categories has $y_n\in\{1,2,\dots,C\}$; $x_n$ denotes a single fault observation and $N$ denotes the number of observations. The cognitive uncertainty in fault diagnosis is modeled as the posterior distribution $p(\omega\mid X,Y)$, where $\omega$ denotes the model parameters;
S312, by means of variational inference, an inferred distribution $q_\theta(\omega)$ is introduced to approximate $p(\omega\mid X,Y)$, and the KL divergence is used to measure the distance between $q_\theta(\omega)$ and the true posterior distribution $p(\omega\mid X,Y)$:
$$\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega\mid X,Y)\big)=\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega)\big)-\int q_\theta(\omega)\log p(Y\mid X,\omega)\,d\omega \qquad (1)$$
where $p(Y\mid X,\omega)$ denotes the likelihood function on the historical data set, $p(\omega)$ denotes the prior distribution of the weights, $\mathrm{KL}(\cdot\,\|\,\cdot)$ denotes the KL divergence, and $\theta$ denotes the variational parameters; $\theta$ is optimized to minimize $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$ and thereby obtain an estimate of the posterior distribution;
S313, for a deep neural network with $L$ layers and $K_l$ units in layer $l$, the model parameters $\omega$ are expressed as:
$$\omega=\{W_l\}_{l=1}^{L}$$
where $W_l$ denotes the model parameters of the $l$-th layer of the deep neural network;
S314, Concrete Dropout is applied to the deep neural network so that it approximates a Bayesian deep neural network, that is, the fixed model parameters $\omega$ are treated as random variables that follow the inferred distribution:
$$\omega\sim q_\theta(\omega)$$
where the variational parameters $\theta$ are:
$$\theta=\{M_l,\,p_l\}_{l=1}^{L}$$
where $M_l$ denotes the mean weight matrix of dimension $K_{l+1}\times K_l$ and $p_l$ denotes the dropout probability of the $l$-th layer. The inferred distribution of each layer is expressed as:
$$q_{M_l}(W_l)=M_l\cdot\operatorname{diag}\big[\mathrm{Bernoulli}(1-p_l)^{K_l}\big]$$
where $\mathrm{Bernoulli}(\cdot)$ denotes the Bernoulli distribution;
S315, the prior distribution of the fault diagnosis model parameters $\omega$ is selected as:
$$p(\omega)=\prod_{l=1}^{L}p(W_l)$$
where $p(W_l)$ denotes the prior distribution of the model parameters of the $l$-th layer of the deep neural network, and $\upsilon_l$ denotes a control parameter representing the smoothness of the function;
S316, combining the Monte Carlo sampling method, an analytical expression of $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$ is obtained:
where $p(y_i\mid x_i,\omega)$ denotes the likelihood function of each sample, and $h(p_l)$ denotes the entropy of a Bernoulli random variable, with $h(p_l)=-p_l\log p_l-(1-p_l)\log(1-p_l)$;
The step S32 specifically includes the following steps:
S321, the distribution uncertainty of the input fault monitoring data $x$ is captured by placing a Dirichlet distribution over the output probabilities $\mu$ under the model parameters $\omega$:
$$p(\mu\mid x,\omega)=\mathrm{Dir}\big(\mu\mid\alpha(x,\omega)\big)=\frac{\Gamma(\alpha_0)}{\prod_{c=1}^{C}\Gamma(\alpha_c)}\prod_{c=1}^{C}\mu_c^{\alpha_c-1}$$
where $\Gamma(\cdot)$ denotes the gamma function, $\mathrm{Dir}(\cdot)$ denotes the Dirichlet distribution, and $\mu=[\mu_1,\dots,\mu_C]^T=[p(y=1),\dots,p(y=C)]^T$ denotes the class probabilities output by the model, where $\mu_1$ is the output probability of class 1, $\mu_C$ is the output probability of class $C$, and so on. $\alpha(x,\omega)=[\alpha_1(x,\omega),\dots,\alpha_C(x,\omega)]^T>0$ are the parameters of the Dirichlet distribution, with $\alpha_c$ written as shorthand for $\alpha_c(x,\omega)$, and $\alpha_0=\sum_{c=1}^{C}\alpha_c$ is the precision parameter of the Dirichlet distribution, which reflects its dispersion: the larger the value, the more concentrated the distribution;
S322, for the in-distribution fault monitoring data $\{x_n\}_{n=1}^{N}$ and the out-of-distribution fault monitoring data $\{x_t\}_{t=1}^{T}$, the corresponding predicted Dirichlet distributions are $\mathrm{Dir}(\mu\mid\alpha(x_n,\omega))$ and $\mathrm{Dir}(\mu\mid\alpha(x_t,\omega))$; to give the model the ability to discriminate between in-distribution and out-of-distribution fault monitoring data, a more concentrated Dirichlet distribution $\mathrm{Dir}(\mu\mid\alpha_n)$ and a more dispersed Dirichlet distribution $\mathrm{Dir}(\mu\mid\alpha_t)$ are selected as the target distributions for the in-distribution input $x_n$ and the out-of-distribution input $x_t$ respectively, and the KL divergence is used to measure the distance between the target distribution and the predicted Dirichlet distribution:
for intra-distribution fault monitoring data:
for out-of-distribution fault monitoring data:
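where $T$ is the sample size of the out-of-distribution fault monitoring data set. For any two Dirichlet distributions with parameters $\alpha=[\alpha_1,\dots,\alpha_C]^T$ and $\beta=[\beta_1,\dots,\beta_C]^T$, the KL divergence has an analytic form:
$$\mathrm{KL}\big(\mathrm{Dir}(\mu\mid\alpha)\,\|\,\mathrm{Dir}(\mu\mid\beta)\big)=\ln\Gamma(\alpha_0)-\ln\Gamma(\beta_0)+\sum_{c=1}^{C}\big[\ln\Gamma(\beta_c)-\ln\Gamma(\alpha_c)\big]+\sum_{c=1}^{C}(\alpha_c-\beta_c)\big[\psi(\alpha_c)-\psi(\alpha_0)\big]$$
where $\psi(\cdot)$ is the digamma function, and $\alpha_0=\sum_{c=1}^{C}\alpha_c$ and $\beta_0=\sum_{c=1}^{C}\beta_c$ denote the precision parameters of the two Dirichlet distributions;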
the step S33 specifically includes the following steps:
S331, the inherent uncertainty in the fault diagnosis process is captured by a categorical distribution over the predicted classes:
$$p(y\mid\mu)=\mathrm{Cat}(y\mid\mu)=\prod_{c=1}^{C}\mu_c^{\,I\{y=c\}}$$
where $I\{\cdot\}$ denotes the indicator function and $\mathrm{Cat}(\cdot)$ denotes the categorical distribution;
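S332, calculating the likelihood function:
$$p(y_n\mid x_n,\omega)=\int p(y_n\mid\mu)\,p(\mu\mid x_n,\omega)\,d\mu=\hat{\mu}_{y_n}$$
where $\hat{\mu}=\alpha(x_n,\omega)/\alpha_0$ is the mean of $\mathrm{Dir}(\mu\mid\alpha(x_n,\omega))$; then $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$ is expressed as: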
3. the method for quantitatively calibrating the uncertainty in the fault diagnosis of the electromechanical device based on the deep learning of claim 1, wherein the step S4 specifically comprises the following steps:
S41, defining calibration of uncertainty in the fault diagnosis process: a model is calibrated when the predicted probability of a fault category reflects the accuracy of the prediction. When the fault diagnosis model is calibrated, among a set of samples whose predicted probability for a fault category is $\kappa$, a fraction $\kappa$ should be correctly classified. Thus, for each category $c\in\{1,\dots,C\}$, the following holds:
$$P\big(Y=c\mid\hat{p}_c(X)=\kappa\big)=\kappa$$
where $Y\mid X\sim P$, and $\hat{p}_c(X)$ is the predicted probability of category $c$ for the input $X$;
S42, evaluating the uncertainty calibration error in fault diagnosis: the adaptive calibration error is selected as the evaluation index for quantifying the calibration error. For each category, the adaptive calibration error adjusts the probability intervals according to the number of samples so that every interval contains the same number of samples, and the accuracy and the average predicted probability are calculated within each interval. If the two are approximately equal, the model is considered calibrated. For category $c$ with $M$ probability intervals, the $m$-th probability interval is denoted $U_{c,m}$; the intervals are obtained by sorting the predicted probabilities of category $c$, and each interval has a corresponding sample set. The accuracy within the interval is then calculated:
and average prediction probability:
the adaptive calibration error is obtained by averaging the calibration errors for each class and interval:
The adaptive calibration error takes the predicted probabilities of all categories into account, not only that of the final predicted category, and therefore gives a more comprehensive evaluation of the calibration error of a multi-class task. In addition, when the sample size of the fault monitoring data is small or the predicted probabilities are concentrated near 0 and 1, the adaptive calibration error adjusts the length of the probability intervals so that each interval contains enough samples, and the calibration error is evaluated objectively;
S43, constructing the uncertainty calibration loss in fault diagnosis: to improve the accuracy of uncertainty quantification during model training, a calibration loss, evaluated by a calibration error, is integrated into the overall loss function. The Bayesian deep neural network is trained with mini-batch gradient descent, and a single mini-batch may not contain enough fault monitoring samples to evaluate the calibration loss; the adaptive calibration error is therefore selected to evaluate the calibration loss:
where $\mathrm{CL}(\cdot)$ denotes the uncertainty calibration loss in fault diagnosis.
4. The method for quantitatively calibrating the uncertainty in the fault diagnosis of the electromechanical device based on the deep learning of claim 1, wherein the step S51 specifically comprises the following steps:
S511, for in-distribution fault monitoring data, modeling of the cognitive and inherent uncertainty in fault diagnosis is realized by minimizing $\mathrm{KL}(q_\theta(\omega) \,\|\, p(\omega \mid X, Y))$, and modeling of the distribution uncertainty in fault diagnosis is realized by minimizing the KL divergence between the predicted Dirichlet distribution and the target distribution $\mathrm{Dir}(\mu \mid \alpha_n)$; the loss function corresponding to uncertainty quantification in electromechanical device fault diagnosis based on Bayesian deep learning is then:
where $\lambda$ is a hyperparameter controlling the contribution of the loss corresponding to the distribution uncertainty in fault diagnosis to the total loss; in addition, the predicted Dirichlet distribution of in-distribution fault monitoring data should be concentrated, so the corresponding parameter of the target distribution $\mathrm{Dir}(\mu \mid \alpha_n)$ should take a larger value; to ensure prediction accuracy, the mean of the target distribution should be the corresponding 0-1 label, and this parameter can be regarded as a constant independent of the input; however, this causes most of the parameters in $\alpha_n$ to take the value 0, making the corresponding KL divergence difficult to optimize; this problem is avoided by assigning a low probability to the regions of the Dirichlet distribution whose probability density is 0;
where $\xi$ is a small smoothing factor, and the values of all parameters of the Dirichlet distribution are obtained on this basis;
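The smoothing step can be sketched as follows. This is one common way to build a smoothed target, shown under stated assumptions rather than as the claimed formula: each wrong class receives mass ξ, the labeled class receives the remainder, and the mean is scaled by a large constant. The names `smoothed_target_alpha` and `precision` and the default values are hypothetical.

```python
from typing import List


def smoothed_target_alpha(
    label: int,
    num_classes: int,
    precision: float = 100.0,
    xi: float = 0.01,
) -> List[float]:
    """Target Dirichlet parameters whose mean is a smoothed 0-1 label.

    Assumptions (not taken from the source): the mean puts xi on each wrong
    class and 1 - (K - 1) * xi on the labeled class, and the parameters are
    the mean scaled by a fixed, input-independent precision.
    """
    if not 0.0 < xi < 1.0 / num_classes:
        raise ValueError("xi must be positive and smaller than 1 / num_classes")
    mean = [xi] * num_classes
    mean[label] = 1.0 - (num_classes - 1) * xi
    # Every parameter is strictly positive, so the KL divergence stays finite.
    return [precision * m for m in mean]
```

With four classes, label 1, and the defaults above, the labeled class receives a parameter close to 97 and every other class a parameter close to 1, so no parameter is exactly 0.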
S512, integrating the uncertainty calibration loss in fault diagnosis into the overall loss, so that uncertainty calibration is realized while the fault diagnosis model is trained; the loss function corresponding to in-distribution fault monitoring data is expressed as:
where $\gamma$ is a hyperparameter controlling the contribution of the calibration loss to the total loss;
the step S52 specifically includes the following steps:
S521, for out-of-distribution fault monitoring data, the cognitive and inherent uncertainty need not be modeled; only the distribution uncertainty needs to be modeled, so that such data can be distinguished from in-distribution fault monitoring data; because the predicted values and uncertainty quantification of out-of-distribution fault monitoring data are not credible, the uncertainty in fault diagnosis is not calibrated for them, and the corresponding loss function is expressed as:
the predicted Dirichlet distribution of out-of-distribution fault monitoring data should be relatively dispersed, so the corresponding parameter of the target distribution $\mathrm{Dir}(\mu \mid \alpha_t)$ takes a smaller value; the model output for an out-of-distribution sample should assign equal probability to every class, indicating that the predicted value is not credible, i.e., all distribution parameters in $\alpha_t$ are set to 1.
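To make the role of the flat target concrete, the sketch below evaluates the closed-form KL divergence between two Dirichlet distributions using only the standard library. The direction of the divergence used in the claimed loss is not recoverable from the text, so the argument order here is an assumption, and `digamma` and `dirichlet_kl` are hypothetical helper names.

```python
import math
from typing import Sequence


def digamma(x: float) -> float:
    """Digamma for x > 0: shift x upward by recurrence, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def dirichlet_kl(alpha: Sequence[float], beta: Sequence[float]) -> float:
    """KL(Dir(alpha) || Dir(beta)) in closed form."""
    a0, b0 = sum(alpha), sum(beta)
    # Difference of the log normalizing constants of the two distributions.
    log_norm = (
        math.lgamma(a0) - sum(math.lgamma(a) for a in alpha)
        - math.lgamma(b0) + sum(math.lgamma(b) for b in beta)
    )
    psi_a0 = digamma(a0)
    return log_norm + sum(
        (a - b) * (digamma(a) - psi_a0) for a, b in zip(alpha, beta)
    )


# Flat target for an out-of-distribution sample with three fault classes.
flat_target = [1.0, 1.0, 1.0]
penalty_flat = dirichlet_kl(flat_target, [1.0, 1.0, 1.0])     # prediction already flat
penalty_peaked = dirichlet_kl(flat_target, [50.0, 1.0, 1.0])  # overconfident prediction
```

A prediction that is already flat incurs no penalty, while a peaked prediction on an out-of-distribution sample is penalized, which is the behavior the step describes.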
5. The method for quantitatively calibrating the uncertainty in the fault diagnosis of the electromechanical device based on the deep learning of claim 1, wherein the step S9 specifically comprises the following steps:
S91, measuring the uncertainty in fault diagnosis by entropy, so that the predictive uncertainty is expressed as:
where $H[\cdot]$ denotes the entropy of a distribution and $p(y = c \mid x, X, Y)$ denotes the predicted probability of class $c$; $p(y = c \mid x, X, Y)$ is obtained by sampling $S$ model parameters $\omega_s$ from $q_\theta(\omega)$:
the predictive uncertainty is further decomposed into:
where $I[\cdot]$ denotes mutual information, $E_{\mu \mid x, X, Y}[\cdot]$ denotes the expectation under the distribution $p(\mu \mid x, X, Y)$, and $p(\mu \mid x, X, Y) = \int p(\mu \mid x, \omega)\, p(\omega \mid X, Y)\, d\omega$; then:
where $E_{\mu \mid x, X, Y}[H[p(y \mid \mu)]]$ measures the inherent uncertainty in fault diagnosis, the mutual information $I[y, \mu \mid x, X, Y]$ measures the cognitive and distribution uncertainty in fault diagnosis, and the distribution uncertainty alone is measured by the expected differential entropy:
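The entropy-based decomposition in S91 can be illustrated with a Monte Carlo estimate. This is a minimal sketch that assumes each row of `mu_samples` is one sampled categorical vector μ drawn from p(μ | x, X, Y); the function names and the dictionary keys are hypothetical.

```python
import math
from typing import Dict, Sequence


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(mu_samples: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Split predictive entropy into expected entropy plus mutual information."""
    s = len(mu_samples)
    k = len(mu_samples[0])
    # Predicted class probabilities: the average of the sampled vectors.
    mean = [sum(mu[c] for mu in mu_samples) / s for c in range(k)]
    total = entropy(mean)                               # predictive uncertainty
    inherent = sum(entropy(mu) for mu in mu_samples) / s  # estimate of E[H[p(y | mu)]]
    return {
        "total": total,
        "inherent": inherent,
        "mutual_information": total - inherent,
    }
```

When the sampled vectors agree with one another, the mutual information is close to 0 and the predictive uncertainty is almost entirely inherent; when they disagree, the mutual information grows.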
S92, for fault monitoring test data $x^*$, performing $S$ Monte Carlo dropout passes in the test stage to obtain a set of sampled values:
S93, calculating the fault prediction probability, and obtaining the final prediction class and the uncertainty quantification results:
given a threshold $\varepsilon$, if the distribution uncertainty of $x^*$ exceeds $\varepsilon$, then $x^*$ is a sample outside the fault monitoring data distribution, and the final prediction class and the other uncertainty quantification results need not be calculated; otherwise, $x^*$ is a sample within the fault monitoring data distribution, and its final fault prediction class $c^*$ is expressed as:
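The test-stage procedure of S92 and S93 might be sketched as below. Several points are assumptions rather than statements of the claimed method: each Monte Carlo dropout pass is taken to yield one vector of Dirichlet parameters, the distribution uncertainty is taken to be the average Dirichlet differential entropy over the passes, a sample is rejected when that value exceeds ε, and the final class is the arg max of the averaged Dirichlet means. All function names are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def digamma(x: float) -> float:
    """Digamma for x > 0: shift x upward by recurrence, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def dirichlet_differential_entropy(alpha: Sequence[float]) -> float:
    """Differential entropy of Dir(alpha); larger values mean a flatter distribution."""
    a0 = sum(alpha)
    psi_a0 = digamma(a0)
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(a0)
    return log_beta - sum((a - 1.0) * (digamma(a) - psi_a0) for a in alpha)


def diagnose(
    alpha_samples: Sequence[Sequence[float]],
    epsilon: float,
) -> Tuple[Optional[int], float]:
    """Return (predicted class or None if out-of-distribution, distribution uncertainty)."""
    s = len(alpha_samples)
    du = sum(dirichlet_differential_entropy(a) for a in alpha_samples) / s
    if du > epsilon:
        return None, du  # out-of-distribution: skip class prediction
    k = len(alpha_samples[0])
    # Average the Dirichlet means over the S dropout passes.
    probs = [sum(a[c] / sum(a) for a in alpha_samples) / s for c in range(k)]
    return max(range(k), key=probs.__getitem__), du


# Hypothetical outputs of two dropout passes for two test samples.
flat_class, flat_du = diagnose([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], epsilon=-2.0)
peak_class, peak_du = diagnose([[50.0, 1.0, 1.0], [40.0, 2.0, 1.0]], epsilon=-2.0)
```

The value of ε in the usage lines is arbitrary; in practice it would have to be chosen from validation data containing both in-distribution and out-of-distribution samples.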
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210820966.4A CN115204227A (en) | 2022-07-12 | 2022-07-12 | Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115204227A (en) | 2022-10-18 |
Family
ID=83580273
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210820966.4A Pending CN115204227A (en) | 2022-07-12 | 2022-07-12 | Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115204227A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116502073A (en) * | 2023-06-27 | 2023-07-28 | 北京理工大学 | High-reliability intelligent fault diagnosis and health management method for wind generating set |
CN116680589A (en) * | 2023-05-18 | 2023-09-01 | 哈尔滨工业大学 | DC charging pile remote verification method based on Dirichlet process and folded rod structural representation |
CN117371299A (en) * | 2023-12-08 | 2024-01-09 | 安徽大学 | Machine learning method for Tokamak new classical circumferential viscous torque |
CN117828481A (en) * | 2024-03-04 | 2024-04-05 | 烟台哈尔滨工程大学研究院 | Fuel system fault diagnosis method and medium for common rail ship based on dynamic integrated frame |
CN118519818A (en) * | 2024-07-23 | 2024-08-20 | 国富瑞(福建)信息技术产业园有限公司 | Deep recursion network-based big data computer system fault detection method |
CN118519818B (en) * | 2024-07-23 | 2024-09-27 | 国富瑞(福建)信息技术产业园有限公司 | Deep recursion network-based big data computer system fault detection method |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116680589A (en) * | 2023-05-18 | 2023-09-01 | 哈尔滨工业大学 | DC charging pile remote verification method based on Dirichlet process and folded rod structural representation |
CN116502073A (en) * | 2023-06-27 | 2023-07-28 | 北京理工大学 | High-reliability intelligent fault diagnosis and health management method for wind generating set |
CN117371299A (en) * | 2023-12-08 | 2024-01-09 | 安徽大学 | Machine learning method for Tokamak new classical circumferential viscous torque |
CN117371299B (en) * | 2023-12-08 | 2024-02-27 | 安徽大学 | Machine learning method for Tokamak new classical circumferential viscous torque |
CN117828481A (en) * | 2024-03-04 | 2024-04-05 | 烟台哈尔滨工程大学研究院 | Fuel system fault diagnosis method and medium for common rail ship based on dynamic integrated frame |
CN118519818A (en) * | 2024-07-23 | 2024-08-20 | 国富瑞(福建)信息技术产业园有限公司 | Deep recursion network-based big data computer system fault detection method |
CN118519818B (en) * | 2024-07-23 | 2024-09-27 | 国富瑞(福建)信息技术产业园有限公司 | Deep recursion network-based big data computer system fault detection method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115204227A (en) | Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning | |
CN109948117A (en) | A kind of satellite method for detecting abnormality fighting network self-encoding encoder | |
CN110598851A (en) | Time series data abnormity detection method fusing LSTM and GAN | |
CN111813084B (en) | Mechanical equipment fault diagnosis method based on deep learning | |
Yan et al. | A dynamic multi-scale Markov model based methodology for remaining life prediction | |
Chang et al. | A theoretical survey on Mahalanobis-Taguchi system | |
Turner et al. | Likelihood-free Bayesian analysis of memory models. | |
CN113255848A (en) | Water turbine cavitation sound signal identification method based on big data learning | |
CN102609612B (en) | Data fusion method for calibration of multi-parameter instruments | |
CN112488235A (en) | Elevator time sequence data abnormity diagnosis method based on deep learning | |
CN110555247A (en) | structure damage early warning method based on multipoint sensor data and BilSTM | |
CN112651119B (en) | Multi-performance parameter acceleration degradation test evaluation method for space harmonic reducer | |
CN115659583A (en) | Point switch fault diagnosis method | |
Chen et al. | A deep learning feature fusion based health index construction method for prognostics using multiobjective optimization | |
CN114118219A (en) | Data-driven real-time abnormal detection method for health state of long-term power-on equipment | |
CN110852906B (en) | Method and system for identifying electricity stealing suspicion based on high-dimensional random matrix | |
CN114512239A (en) | Cerebral apoplexy risk prediction method and system based on transfer learning | |
CN115185937A (en) | SA-GAN architecture-based time sequence anomaly detection method | |
Xie et al. | Internal defect inspection in magnetic tile by using acoustic resonance technology | |
CN113868957B (en) | Residual life prediction and uncertainty quantitative calibration method under Bayes deep learning | |
CN114495438B (en) | Disaster early warning method, system, equipment and storage medium based on multiple sensors | |
CN116431346A (en) | Compensation method for main memory capacity of electronic equipment | |
CN116384223A (en) | Nuclear equipment reliability assessment method and system based on intelligent degradation state identification | |
CN115153549A (en) | BP neural network-based man-machine interaction interface cognitive load prediction method | |
CN113627621B (en) | Active learning method for optical network transmission quality regression estimation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||