CN115204227A - Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning - Google Patents

Info

Publication number
CN115204227A
Authority
CN
China
Prior art keywords
distribution
uncertainty
fault diagnosis
fault
calibration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210820966.4A
Other languages
Chinese (zh)
Inventor
林焱辉
李港辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01M: TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M7/00: Vibration-testing of structures; Shock-testing of structures
    • G01M7/02: Vibration-testing by means of a shake table
    • G01M7/025: Measuring arrangements
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/04: Inference or reasoning models
    • G06N5/041: Abduction

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Testing Or Calibration Of Command Recording Devices (AREA)

Abstract

The invention provides a deep-learning-based method for quantifying and calibrating uncertainty in fault diagnosis of electromechanical equipment, comprising the following steps: preprocessing the monitoring signals of the equipment to obtain a basic data set; determining the type and scale of a deep neural network and constructing a Bayesian deep neural network for fault diagnosis; selecting a calibration error to evaluate the uncertainty calibration loss, and determining the loss functions on the in-distribution and out-of-distribution fault monitoring data sets; and determining the hyper-parameters of the model, training alternately on the in-distribution and out-of-distribution fault monitoring data sets, and, once the model has converged, outputting the fault diagnosis and uncertainty quantification results for the test data. The uncertainty modeling method integrates the calibration loss so that uncertainty is quantified and calibrated at the same time; by combining uncertainty decomposition with Monte Carlo sampling, it yields calibrated quantification results for the inherent, cognitive, distribution and prediction uncertainty, and effectively improves both the fault diagnosis accuracy and the uncertainty quantification accuracy.

Description

Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning
Technical Field
The invention belongs to the technical field of fault diagnosis in prediction and health management, and particularly relates to an uncertainty quantitative calibration method in fault diagnosis of electromechanical equipment based on deep learning.
Background
With the rapid development of the economy and of science and technology, the growing complexity of equipment poses unprecedented challenges for fault diagnosis. In safety-critical applications in particular, determining accurately and promptly whether a fault has occurred, and of which category, can effectively prevent serious economic loss and casualties. As an important means of determining the health state of equipment, fault diagnosis plays a key role in prognostics and health management. The development and wide application of sensor technology has made real-time monitoring of equipment operating states possible and provides the data basis for data-driven diagnostic techniques. In addition, deep learning is widely applied to fault diagnosis owing to its strong representation learning ability.
Although deep-learning-based fault diagnosis methods achieve excellent performance, they cannot attach a reliable uncertainty quantification to the diagnosis result, which greatly limits their practical application. In fault diagnosis, uncertainty is generally divided into three categories: inherent, cognitive and distribution uncertainty. Inherent uncertainty (often called aleatoric uncertainty) captures the inherent noise of the observed data, such as measurement error, and reflects uncertainty caused by unknown or missing information; it cannot be reduced by adding training data. Cognitive uncertainty (often called epistemic uncertainty) reflects uncertainty caused by lack of knowledge and can be reduced by adding training data. Distribution uncertainty reflects uncertainty caused by a shift in the data distribution; it captures test samples unlike any seen during model training, that is, the uncertainty caused by out-of-distribution fault monitoring data. When quantifying uncertainty with deep learning, many studies take the softmax output as the predictive distribution, but this distribution is often overconfident and cannot quantify the true prediction uncertainty. Bayesian deep learning, by contrast, combines the uncertainty quantification capability of Bayesian methods with the representation learning capability of deep learning and has therefore become a research hotspot. To keep Bayesian deep neural networks computationally tractable, dropout-based approximation has attracted attention for its computational efficiency and strong performance. However, because of errors in model selection and the use of approximate inference, uncertainty quantification based on Bayesian deep neural networks is often inaccurate. For example, among the samples predicted with probability 0.9, the fraction that is actually classified correctly is typically not 90%.
Therefore, to improve the accuracy of uncertainty quantification while quantifying uncertainty, a calibration method for uncertainty quantification in the fault diagnosis process based on Bayesian deep learning is needed.
Disclosure of Invention
The technical problem to be solved by the invention is to construct an uncertainty modeling method based on a Bayesian deep neural network that integrates a calibration loss, so that uncertainty is quantified and calibrated together; to obtain calibrated quantification results for the inherent, cognitive, distribution and prediction uncertainty by combining uncertainty decomposition with Monte Carlo sampling; and thereby to improve both the fault diagnosis accuracy and the uncertainty quantification accuracy.
In order to solve the problems, the invention provides a deep learning-based method for quantitatively calibrating uncertainty in fault diagnosis of electromechanical equipment, which comprises the following steps:
S1, preprocessing the fault monitoring signals of the electromechanical equipment to obtain a basic data set: the preprocessing includes signal screening, feature extraction, data normalization and set division; the resulting in-distribution and out-of-distribution fault monitoring data sets are each divided into a training set, a verification set and a test set;
S2, determining the type and scale of the fault diagnosis deep neural network: selecting a deep neural network according to the characteristics of the equipment fault monitoring signals, and determining the scale of the network, including the number of neurons and the number of layers, according to the size of the data set;
S3, constructing a Bayesian deep neural network for electromechanical equipment fault diagnosis: selecting probability distributions to capture the inherent, cognitive and distribution uncertainty in fault diagnosis, constructing these distributions with a Bayesian deep neural network, and determining the method for estimating them;
S31, integrating the quantification of cognitive uncertainty into the fault diagnosis network: estimating the posterior distribution of the model parameters in the Bayesian neural network by variational inference combined with Monte Carlo sampling;
S32, integrating the quantification of distribution uncertainty into the fault diagnosis network: constructing a KL divergence for the in-distribution and for the out-of-distribution fault monitoring data set, respectively, to determine the corresponding probability distributions;
S33, integrating the quantification of inherent uncertainty into the fault diagnosis network to obtain the KL divergence of the Bayesian neural network;
S4, constructing the uncertainty calibration loss in fault diagnosis: determining a calibration evaluation index, evaluating the calibration loss, and integrating the calibration loss into the overall loss function, so that uncertainty in the fault diagnosis of the electromechanical equipment is calibrated while it is quantified;
S5, determining the loss function: constructing the loss function of the whole model;
S51, determining the loss function on the in-distribution fault monitoring data set;
S52, determining the loss function on the out-of-distribution fault monitoring data set;
S6, determining the hyper-parameters of the electromechanical equipment fault diagnosis model: determining the hyper-parameters of the model, including the learning rate and batch size, by trial and error, and determining their optimal combination by grid search;
S7, training the electromechanical equipment fault diagnosis model: selecting an optimization method and, with the selected hyper-parameters, training the model on the in-distribution and the out-of-distribution fault monitoring data sets in turn;
S8, judging whether the electromechanical equipment fault diagnosis model has converged: judging whether the change in the parameters of the optimal model before and after training is smaller than a specified threshold; if so, executing step S9, otherwise returning to step S7;
S9, outputting the fault diagnosis result and the uncertainty quantification results for the electromechanical equipment: the trained model outputs the fault diagnosis result together with the quantification results of its inherent, cognitive, distribution and prediction uncertainty.
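The control flow of steps S7 and S8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `train_alternating`, `step_in`, `step_out`, `tol` and the toy update functions are hypothetical names, and the two update functions stand in for one optimizer step on an in-distribution batch and one on an out-of-distribution batch.

```python
from typing import Callable, List, Tuple

Params = List[float]


def max_abs_change(old: Params, new: Params) -> float:
    """Largest absolute change of any parameter between two snapshots."""
    return max(abs(a - b) for a, b in zip(old, new))


def train_alternating(
    params: Params,
    step_in: Callable[[Params], Params],
    step_out: Callable[[Params], Params],
    tol: float = 1e-6,
    max_rounds: int = 10_000,
) -> Tuple[Params, int]:
    """Alternate in-distribution and out-of-distribution updates (S7)
    until the parameter change drops below `tol` (S8)."""
    for round_idx in range(1, max_rounds + 1):
        previous = list(params)
        params = step_in(params)    # update on in-distribution data
        params = step_out(params)   # update on out-of-distribution data
        if max_abs_change(previous, params) < tol:
            return params, round_idx
    return params, max_rounds


# Toy stand-ins (assumption): each step pulls every parameter toward 1.0
# with a learning rate of 0.1, mimicking gradient steps on a quadratic loss.
def toy_step_in(p: Params) -> Params:
    return [w - 0.1 * (w - 1.0) for w in p]


def toy_step_out(p: Params) -> Params:
    return [w - 0.1 * (w - 1.0) for w in p]


final_params, rounds = train_alternating([0.0, 5.0], toy_step_in, toy_step_out)
```

In a real model the two step functions would each minimize the corresponding loss from S51 and S52; the convergence test is the only part taken directly from the text of S8.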
Further, the step S31 specifically includes the following steps:
S311, capturing cognitive uncertainty with the posterior distribution of the parameters of the electromechanical equipment fault diagnosis model: for historical data $X = \{x_n\}_{n=1}^{N}$ and $Y = \{y_n\}_{n=1}^{N}$, where X is the fault observation data and Y the fault category labels, with $y_n \in \{1, 2, \ldots, C\}$ for a classification task with C categories, $x_n$ a single fault observation and N the number of observations, the cognitive uncertainty in fault diagnosis is modeled as the posterior distribution $p(\omega \mid X, Y)$, where ω denotes the model parameters;
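S312, by means of variational inference, introducing an inferred distribution $q_\theta(\omega)$ to approximate $p(\omega \mid X, Y)$, and measuring the distance between $q_\theta(\omega)$ and the true posterior distribution $p(\omega \mid X, Y)$ with the KL divergence:
$$\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega \mid X, Y)\big) = \mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega)\big) - \int q_\theta(\omega)\,\log p(Y \mid X, \omega)\,d\omega \qquad (1)$$
where $p(Y \mid X, \omega)$ denotes the likelihood function on the historical data set, $p(\omega)$ the prior distribution of the weights, $\mathrm{KL}(\cdot\,\|\,\cdot)$ the KL divergence, and θ the variational parameters; equation (1) holds up to the additive constant $\log p(Y \mid X)$, which does not depend on θ and so does not affect the optimization; θ is optimized to minimize $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega \mid X, Y))$, which yields an estimate of the posterior distribution;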
S313, for a deep neural network with L layers in which layer l has $K_l$ units, the model parameters ω are expressed as:
$$\omega = \{W_l\}_{l=1}^{L}$$
where $W_l$ denotes the model parameters of layer l of the deep neural network;
S314, applying Concrete Dropout to the deep neural network so that it approximates a Bayesian deep neural network, that is, treating the fixed model parameters ω as random variables that follow the inferred distribution:
$$\omega \sim q_\theta(\omega)$$
where the variational parameters θ are:
$$\theta = \{M_l, p_l\}_{l=1}^{L}$$
where $M_l$ denotes the mean weight matrix of dimension $K_{l+1} \times K_l$ and $p_l$ the dropout probability of layer l; the inferred distribution of each layer is expressed as:
$$q_\theta(W_l) = M_l \cdot \mathrm{diag}\big[\mathrm{Bernoulli}(1 - p_l)^{K_l}\big]$$
where Bernoulli(·) denotes the Bernoulli distribution;
S315, selecting the prior distribution of the fault diagnosis model parameters ω as:
$$p(\omega) = \prod_{l=1}^{L} p(W_l)$$
where $p(W_l)$ denotes the prior distribution of the layer-l model parameters of the deep neural network, with
Figure BDA0003742394930000042
where $\upsilon_l$ is a control parameter governing the smoothness of the function;
S316, combining the Monte Carlo sampling method to obtain an analytical expression for $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega \mid X, Y))$:
Figure BDA0003742394930000043
where $p(y_i \mid x_i, \omega)$ denotes the likelihood function of each sample, and $h(p_l)$ denotes the entropy of a Bernoulli random variable, with $h(p_l) = -p_l \log p_l - (1 - p_l)\log(1 - p_l)$;
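The two building blocks named in S314 and S316, a dropout-perturbed weight sample and the Bernoulli entropy $h(p_l)$, can be sketched in a few lines. This is an illustration under assumptions: the function names are hypothetical, and hard Bernoulli masks are used here, whereas Concrete Dropout replaces them with a continuous relaxation during training so that the dropout probability can be learned by gradient descent.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def bernoulli_entropy(p: float) -> float:
    """h(p) = -p log p - (1 - p) log(1 - p), with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def sample_dropout_weights(mean_w: Matrix, drop_p: float, rng: random.Random) -> Matrix:
    """Draw one weight sample W = M * diag(z), z_k ~ Bernoulli(1 - drop_p).

    `mean_w` has one row per output unit and one column per input unit,
    so each mask entry z_k keeps or zeroes a whole column.
    """
    n_in = len(mean_w[0])
    keep = [1.0 if rng.random() >= drop_p else 0.0 for _ in range(n_in)]
    return [[m * z for m, z in zip(row, keep)] for row in mean_w]


# Usage: one Monte Carlo sample of a 2 x 3 layer with dropout probability 0.5.
mean_weights = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
sampled = sample_dropout_weights(mean_weights, 0.5, random.Random(1))
```

Repeating the sampling S times and averaging the network outputs gives the Monte Carlo estimate used later in step S9.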
Further, the step S32 specifically includes the following steps:
S321, capturing the distribution uncertainty of the input fault monitoring data x by placing a Dirichlet distribution over the output probabilities μ of the model with parameters ω:
$$p(\mu \mid x, \omega) = \mathrm{Dir}\big(\mu \mid \alpha(x, \omega)\big) = \frac{\Gamma\big(\alpha_0(x, \omega)\big)}{\prod_{c=1}^{C} \Gamma\big(\alpha_c(x, \omega)\big)} \prod_{c=1}^{C} \mu_c^{\alpha_c(x, \omega) - 1}$$
where Γ(·) denotes the gamma function, Dir(·) the Dirichlet distribution, and $\mu = [\mu_1, \ldots, \mu_C]^T = [p(y = 1), \ldots, p(y = C)]^T$ the class probabilities output by the model, $\mu_1$ being the output probability of class 1, $\mu_C$ that of class C, and so on; $\alpha(x, \omega) = [\alpha_1(x, \omega), \ldots, \alpha_C(x, \omega)]^T > 0$ are the parameters of the Dirichlet distribution, and
$$\alpha_0(x, \omega) = \sum_{c=1}^{C} \alpha_c(x, \omega)$$
is the precision parameter of the Dirichlet distribution, which reflects its dispersion: the larger the value, the more concentrated the distribution;
S322, for in-distribution fault monitoring data $\{x_n, y_n\}_{n=1}^{N}$ and out-of-distribution fault monitoring data $\{x_t\}_{t=1}^{T}$, the corresponding predicted Dirichlet distributions are $\mathrm{Dir}(\mu \mid \alpha(x_n, \omega))$ and $\mathrm{Dir}(\mu \mid \alpha(x_t, \omega))$. To give the model the ability to discriminate between in-distribution and out-of-distribution fault monitoring data, a concentrated Dirichlet distribution $\mathrm{Dir}(\mu \mid \alpha_n)$ is selected as the target distribution for in-distribution input data $x_n$ and a dispersed Dirichlet distribution $\mathrm{Dir}(\mu \mid \alpha_t)$ for out-of-distribution input data $x_t$, and the KL divergence is used to measure the distance between the target distribution and the predicted Dirichlet distribution:
for intra-distribution fault monitoring data:
Figure BDA0003742394930000051
for out-of-distribution fault monitoring data:
Figure BDA0003742394930000052
where T is the sample size of the out-of-distribution fault monitoring data set; for any two Dirichlet distributions with parameters $\alpha = [\alpha_1, \ldots, \alpha_C]^T$ and $\beta = [\beta_1, \ldots, \beta_C]^T$, the KL divergence has the analytic form:
$$\mathrm{KL}\big(\mathrm{Dir}(\mu \mid \alpha)\,\|\,\mathrm{Dir}(\mu \mid \beta)\big) = \ln\Gamma(\alpha_0) - \sum_{c=1}^{C} \ln\Gamma(\alpha_c) - \ln\Gamma(\beta_0) + \sum_{c=1}^{C} \ln\Gamma(\beta_c) + \sum_{c=1}^{C} (\alpha_c - \beta_c)\big(\psi(\alpha_c) - \psi(\alpha_0)\big)$$
where ψ(·) is the digamma function, and $\alpha_0 = \sum_{c=1}^{C} \alpha_c$ and $\beta_0 = \sum_{c=1}^{C} \beta_c$ are the precision parameters of the two Dirichlet distributions;
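The closed-form KL divergence between two Dirichlet distributions can be evaluated with the standard library alone. The sketch below is illustrative: `digamma` and `kl_dirichlet` are hypothetical helper names, and the digamma function is approximated by the usual recurrence plus asymptotic series because Python's `math` module does not provide it.

```python
import math
from typing import Sequence


def digamma(x: float) -> float:
    """Digamma function psi(x) for x > 0 (recurrence + asymptotic series)."""
    result = 0.0
    while x < 6.0:               # shift the argument up: psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def kl_dirichlet(alpha: Sequence[float], beta: Sequence[float]) -> float:
    """KL( Dir(alpha) || Dir(beta) ) for strictly positive parameter vectors."""
    a0 = sum(alpha)
    b0 = sum(beta)
    log_norm = math.lgamma(a0) - math.lgamma(b0) + sum(
        math.lgamma(b) - math.lgamma(a) for a, b in zip(alpha, beta)
    )
    psi_a0 = digamma(a0)
    mean_term = sum((a - b) * (digamma(a) - psi_a0) for a, b in zip(alpha, beta))
    return log_norm + mean_term


# Usage: a concentrated target is far from a flat prediction, and vice versa.
kl_sharp_vs_flat = kl_dirichlet([98.0, 1.0, 1.0], [1.0, 1.0, 1.0])
```

Note that the KL divergence is not symmetric, so which distribution is placed first matters for the resulting loss.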
further, the step S33 specifically includes the following steps:
S331, capturing the inherent uncertainty in the fault diagnosis process with the categorical distribution over the predicted categories:
$$p(y \mid \mu) = \mathrm{Cat}(y \mid \mu) = \prod_{c=1}^{C} \mu_c^{I\{y = c\}}$$
where I{·} is the indicator function and Cat(·) the categorical distribution;
s332, calculating a likelihood function:
Figure BDA0003742394930000057
wherein,
Figure BDA0003742394930000058
is the measured value of $\mathrm{Dir}(\mu \mid \alpha(x_n, \omega))$; then $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega \mid X, Y))$ is expressed as:
Figure BDA0003742394930000059
further, the step S4 specifically includes the following steps:
S41, defining calibration of uncertainty in the fault diagnosis process: a fault diagnosis model is calibrated when the predicted probability of a fault category reflects the accuracy of the prediction, that is, among the samples whose predicted probability for a fault category is κ, a fraction κ should actually belong to that category; thus, for each category $c \in \{1, \ldots, C\}$:
$$P\big(Y = c \mid \hat{p}_c(X) = \kappa\big) = \kappa, \quad \forall \kappa \in [0, 1]$$
where Y | X ~ P, and $\hat{p}_c(X)$ is the predicted probability of category c for the input X;
S42, evaluating the uncertainty calibration error in fault diagnosis: to evaluate the calibration error quantitatively, the adaptive calibration error is selected as the evaluation index. For each category, the adaptive calibration error adjusts the probability intervals to the number of samples so that every interval contains the same number of samples, and then computes the accuracy and the average predicted probability in each interval; if the two are approximately equal, the model is well calibrated. Let the number of probability intervals for category c be M and denote the m-th interval by $U_{c,m}$. Sorting the predicted probabilities of category c in ascending order and splitting them into M equal-count groups gives the sample set of each interval, written here as
$$B_{c,m} = \{\, n : \hat{p}_c(x_n) \in U_{c,m} \,\}$$
where $\hat{p}_c(x_n)$ is the predicted probability of category c for sample $x_n$. The accuracy within the interval is then:
$$\mathrm{acc}(c, m) = \frac{1}{|B_{c,m}|} \sum_{n \in B_{c,m}} I\{y_n = c\}$$
and the average predicted probability:
$$\mathrm{conf}(c, m) = \frac{1}{|B_{c,m}|} \sum_{n \in B_{c,m}} \hat{p}_c(x_n)$$
The adaptive calibration error is obtained by averaging the calibration errors over all categories and intervals:
$$\mathrm{ACE} = \frac{1}{C M} \sum_{c=1}^{C} \sum_{m=1}^{M} \big|\mathrm{acc}(c, m) - \mathrm{conf}(c, m)\big|$$
Because the adaptive calibration error considers the predicted probabilities of all categories, not only that of the final predicted category, it evaluates the calibration error of a multi-class task more comprehensively. In addition, when the sample size of the fault monitoring data is small or the predicted probabilities are concentrated near 0 and 1, adaptively adjusting the interval lengths ensures that every interval has enough samples, so the calibration error is evaluated objectively;
S43, constructing the uncertainty calibration loss in fault diagnosis: to improve the accuracy of uncertainty quantification during model training, a calibration loss, evaluated by a calibration error, is integrated into the overall loss function. Because the Bayesian deep neural network is trained with mini-batch gradient descent, a single mini-batch may not contain enough fault monitoring samples to evaluate the calibration loss; the adaptive calibration error is therefore selected to evaluate the calibration loss:
Figure BDA0003742394930000071
where CL (·) represents the uncertainty calibration loss in fault diagnosis.
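The adaptive calibration error described in S42 can be computed as in the following sketch. It is an illustration under assumptions: the function name and signature are hypothetical, class labels are 0-based here (the text uses 1 to C), and empty bins, which occur only when there are more bins than samples, contribute zero error.

```python
from typing import Sequence


def adaptive_calibration_error(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    n_bins: int,
) -> float:
    """Mean |accuracy - mean predicted probability| over classes and equal-count bins."""
    n_samples = len(probs)
    n_classes = len(probs[0])
    total = 0.0
    for c in range(n_classes):
        # Sort sample indices by the predicted probability of class c.
        order = sorted(range(n_samples), key=lambda n: probs[n][c])
        for m in range(n_bins):
            # Equal-count bin boundaries in the sorted order.
            lo = m * n_samples // n_bins
            hi = (m + 1) * n_samples // n_bins
            members = order[lo:hi]
            if not members:
                continue
            acc = sum(1.0 for n in members if labels[n] == c) / len(members)
            conf = sum(probs[n][c] for n in members) / len(members)
            total += abs(acc - conf)
    return total / (n_classes * n_bins)


# Usage: four samples, two classes, two equal-count bins per class.
example_probs = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.4, 0.6]]
example_labels = [0, 1, 0, 1]
ace = adaptive_calibration_error(example_probs, example_labels, n_bins=2)
```

As written, the sorting and hard bin assignment are not differentiable, so using this quantity as a training loss relies on the gradient flowing only through the averaged probabilities inside each bin.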
Further, the step S51 specifically includes the following steps:
S511, for in-distribution fault monitoring data, the cognitive and inherent uncertainty in fault diagnosis are modeled by minimizing $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega \mid X, Y))$, and the distribution uncertainty is modeled by minimizing
Figure BDA0003742394930000072
so the loss function for uncertainty quantification in Bayesian-deep-learning-based fault diagnosis of electromechanical equipment is:
Figure BDA0003742394930000073
where λ is a hyper-parameter controlling the contribution of the distribution uncertainty loss to the total loss. In addition, the predicted Dirichlet distribution of in-distribution fault monitoring data should be concentrated, so the parameter of the target distribution $\mathrm{Dir}(\mu \mid \alpha_n)$
Figure BDA0003742394930000074
should be large. To ensure prediction accuracy, the mean of the target distribution should be the corresponding 0-1 label, that is, the mean satisfies
Figure BDA0003742394930000075
Figure BDA0003742394930000076
can be regarded as a constant independent of the input. However, this makes most of the parameters in $\alpha_n$ equal to 0, which makes the corresponding KL divergence difficult to optimize; the problem is avoided by assigning a small probability to the regions where the Dirichlet density is 0:
Figure BDA0003742394930000077
where ξ is a smoothing factor with a small value, from which the values of all Dirichlet parameters are obtained;
s512, integrating uncertainty calibration loss in fault diagnosis into overall loss, realizing uncertainty calibration while training a fault diagnosis model, and expressing a loss function corresponding to fault monitoring data in distribution as follows:
Figure BDA0003742394930000078
wherein gamma is a hyper-parameter controlling the contribution of the calibration loss in the total loss;
further, the step S52 specifically includes the following steps:
S521, for out-of-distribution fault monitoring data, the cognitive and inherent uncertainty need not be modeled; only the distribution uncertainty is modeled, to distinguish these data from in-distribution fault monitoring data. Since the predictions and uncertainty quantification for out-of-distribution data are not trustworthy, no uncertainty calibration is applied, and the corresponding loss function is:
Figure BDA0003742394930000081
The predicted Dirichlet distribution of out-of-distribution fault monitoring data should be dispersed, so the parameter of the target distribution $\mathrm{Dir}(\mu \mid \alpha_t)$
Figure BDA0003742394930000082
takes a small value. The model output for an out-of-distribution sample should be equiprobable, indicating that the prediction is unreliable, that is, all distribution parameters in $\alpha_t$
Figure BDA0003742394930000083
are set to 1.
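The two kinds of target Dirichlet parameters described in S511 and S521 can be sketched as below. The out-of-distribution target (all parameters equal to 1) follows the text directly. The in-distribution smoothing rule is an assumption, since the patent's exact formula is in an equation image that is not reproduced here: the sketch gives every wrong class a mean of ξ and the true class the remainder, then scales the mean by a large precision. All function and argument names are hypothetical, and labels are 0-based.

```python
from typing import List


def in_distribution_target(label: int, n_classes: int, precision: float, xi: float) -> List[float]:
    """Concentrated target: smoothed one-hot mean scaled by a large precision.

    Assumed smoothing: mean = xi for wrong classes, 1 - (C - 1) * xi for the
    true class, so that no Dirichlet parameter is exactly zero.
    """
    mean = [xi] * n_classes
    mean[label] = 1.0 - (n_classes - 1) * xi
    return [precision * m for m in mean]


def out_of_distribution_target(n_classes: int) -> List[float]:
    """Flat target: all parameters equal to 1, i.e. a uniform Dirichlet."""
    return [1.0] * n_classes


# Usage: three fault classes, true class index 1, precision 100, xi = 0.01.
alpha_in = in_distribution_target(label=1, n_classes=3, precision=100.0, xi=0.01)
alpha_out = out_of_distribution_target(3)
```

With these values the in-distribution target is roughly [1, 98, 1], a sharp peak on the labelled class, while the out-of-distribution target spreads its mass evenly over the simplex.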
Preferably, the step S9 specifically includes the following steps:
S91, measuring the uncertainty in fault diagnosis by entropy, the prediction uncertainty is expressed as:
$$H\big[p(y \mid x, X, Y)\big] = -\sum_{c=1}^{C} p(y = c \mid x, X, Y)\,\log p(y = c \mid x, X, Y)$$
where H[·] denotes the entropy of a distribution and $p(y = c \mid x, X, Y)$ the predicted probability of category c, obtained by sampling S model parameters $\omega_s$ from $q_\theta(\omega)$:
$$p(y = c \mid x, X, Y) \approx \frac{1}{S} \sum_{s=1}^{S} p(y = c \mid x, \omega_s)$$
The prediction uncertainty is further decomposed into:
$$H\big[p(y \mid x, X, Y)\big] = I[y, \mu \mid x, X, Y] + E_{\mu \mid x, X, Y}\big[H[p(y \mid \mu)]\big]$$
where I[·] denotes mutual information, $E_{\mu \mid x, X, Y}[\cdot]$ the expectation under the distribution $p(\mu \mid x, X, Y)$, and $p(\mu \mid x, X, Y) = \int p(\mu \mid x, \omega)\,p(\omega \mid X, Y)\,d\omega$; then:
Figure BDA0003742394930000087
where $E_{\mu \mid x, X, Y}[H[p(y \mid \mu)]]$ measures the inherent uncertainty in fault diagnosis and the mutual information $I[y, \mu \mid x, X, Y]$ measures the cognitive and distribution uncertainty; the distribution uncertainty alone is measured by an expected differential entropy:
Figure BDA0003742394930000091
S92, for fault monitoring test data $x^*$, performing S Monte Carlo dropout forward passes in the testing stage to obtain a set of sampled values:
Figure BDA0003742394930000092
S93, calculating the fault prediction probability and obtaining the final predicted category and the uncertainty quantification results:
The prediction probability
Figure BDA0003742394930000093
is expressed as:
Figure BDA0003742394930000094
The distribution uncertainty in fault diagnosis
Figure BDA0003742394930000095
is expressed as:
Figure BDA0003742394930000096
Given a threshold ε, if
Figure BDA0003742394930000097
then $x^*$ is a sample outside the fault monitoring data distribution, and the final predicted category and the other uncertainty quantification results need not be calculated; otherwise, $x^*$ is a sample within the fault monitoring data distribution, and its final predicted fault category $c^*$ is expressed as:
Figure BDA0003742394930000098
where
Figure BDA0003742394930000099
denotes the function that returns the index of the largest element of a vector;
The inherent uncertainty in fault diagnosis
Figure BDA00037423949300000910
is expressed as:
Figure BDA00037423949300000911
The prediction uncertainty in fault diagnosis
Figure BDA00037423949300000912
is expressed as:
Figure BDA0003742394930000101
The cognitive uncertainty in fault diagnosis
Figure BDA0003742394930000102
is approximated from $I[y, \mu \mid x, X, Y]$ and is expressed as:
Figure BDA0003742394930000103
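The decomposition in step S9 can be illustrated with a small sketch that takes the S Dirichlet parameter vectors produced by the Monte Carlo dropout passes and returns the individual uncertainty measures. It is written under assumptions: the patent's exact estimators are in equation images not reproduced here, so the sketch uses the standard closed forms for the expected categorical entropy and the differential entropy of a Dirichlet distribution; all function names and dictionary keys are hypothetical, and the class index is 0-based.

```python
import math
from typing import Dict, List, Sequence


def digamma(x: float) -> float:
    """Digamma function psi(x) for x > 0 (recurrence + asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    result += math.log(x) - 0.5 * inv - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def expected_categorical_entropy(alpha: Sequence[float]) -> float:
    """E_{mu ~ Dir(alpha)}[ H[Cat(mu)] ] in closed form."""
    a0 = sum(alpha)
    return -sum((a / a0) * (digamma(a + 1.0) - digamma(a0 + 1.0)) for a in alpha)


def dirichlet_differential_entropy(alpha: Sequence[float]) -> float:
    """Differential entropy of Dir(alpha); larger means a flatter distribution."""
    a0 = sum(alpha)
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(a0)
    return log_beta - sum((a - 1.0) * (digamma(a) - digamma(a0)) for a in alpha)


def decompose_uncertainty(alpha_samples: Sequence[Sequence[float]]) -> Dict[str, object]:
    """Uncertainty measures from S sampled Dirichlet parameter vectors."""
    n_s = len(alpha_samples)
    n_c = len(alpha_samples[0])
    # Predicted probabilities: Dirichlet means averaged over the S samples.
    probs: List[float] = [sum(a[c] / sum(a) for a in alpha_samples) / n_s for c in range(n_c)]
    predictive = -sum(p * math.log(p) for p in probs if p > 0.0)
    inherent = sum(expected_categorical_entropy(a) for a in alpha_samples) / n_s
    distribution = sum(dirichlet_differential_entropy(a) for a in alpha_samples) / n_s
    return {
        "probs": probs,
        "predicted_class": max(range(n_c), key=probs.__getitem__),
        "predictive": predictive,                       # entropy of the mean prediction
        "inherent": inherent,                           # expected data uncertainty
        "mutual_information": predictive - inherent,    # cognitive + distribution part
        "distribution": distribution,                   # expected differential entropy
    }


# Usage: two dropout samples for a confidently classified three-class input.
result = decompose_uncertainty([[50.0, 1.0, 1.0], [40.0, 2.0, 1.0]])
```

A flat Dirichlet such as [1, 1, 1] gives a higher `distribution` value than a peaked one, which is why comparing this quantity with the threshold ε can flag out-of-distribution inputs.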
Compared with the prior art, the technical effects of this scheme are:
Compared with traditional fault diagnosis methods, the deep-learning-based uncertainty quantification method for equipment fault diagnosis provided by the invention jointly considers the influence of inherent, cognitive and distribution uncertainty, and therefore quantifies the uncertainty in fault diagnosis more effectively and accurately; the uncertainty calibration method for the Bayesian-deep-learning-based fault diagnosis process is tailored to the specific diagnosis task and, using the adaptive calibration error, introduces an uncertainty calibration loss to achieve uncertainty calibration, which provides an important basis for accurate quantification of uncertainty in the fault diagnosis of electromechanical equipment; for the trained network, the individual uncertainty quantification results are obtained by combining uncertainty decomposition with Monte Carlo sampling, which effectively improves both the diagnosis accuracy and the uncertainty quantification accuracy.
Drawings
FIG. 1 is a flowchart of the uncertainty quantification and calibration method in deep-learning-based fault diagnosis of electromechanical equipment according to the present invention;
FIG. 2 is a diagram of the prediction network constructed for the bearing data sets;
FIG. 3 shows the statistics of the distribution uncertainty obtained by the present invention on the out-of-distribution and in-distribution test data of the PU data set;
FIG. 4 shows the statistics of the distribution uncertainty obtained by the present invention on the out-of-distribution and in-distribution test data of the IMS data set;
Detailed Description
The present application will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. The embodiments and features of the embodiments in the present application may be combined with each other without conflict.
FIG. 1 shows the uncertainty quantification and calibration method in deep-learning-based fault diagnosis of electromechanical equipment, which includes the following steps:
S1, preprocessing the monitoring signals to obtain a basic data set: two bearing data sets were analyzed and preprocessed. The Paderborn University (PU) data set, produced and provided by Paderborn University, Germany, was collected on a modular test rig and covers four operating conditions defined by three parameters: rotational speed, load torque and radial force. In the data set, the monitoring data of each bearing under each operating condition consist of 20 data files, each storing vibration signals and motor current data sampled at 64 kHz for 4 seconds. Because the vibration signal effectively reflects the health state of a bearing in fault diagnosis, only the vibration signal data are analyzed. The bearings in the data set fall into three categories according to the cause of damage: healthy, artificially damaged, and damaged in accelerated tests. The condition monitoring data of the healthy and artificially damaged bearings under the operating condition of 900 rpm rotational speed, 0.7 Nm load torque and 1000 N radial force are selected as the experimental data set; the faults of the artificially damaged bearings were produced by machining, and the damage methods are electrical discharge machining, drilling and electric engraving. The location and extent of the damage differ between bearings, and the specific information on the selected bearings is given in Table 1. Bearings damaged by electrical discharge machining are selected as out-of-distribution samples and bearings with the other damage methods as in-distribution samples, and the proportion of samples in the training set, verification set and test set for both the in-distribution and the out-of-distribution fault monitoring data is set to 7.
Figure BDA0003742394930000111
TABLE 1
The Intelligent Maintenance Systems (IMS) data set is provided by the Center for Intelligent Maintenance Systems in the United States and contains three sub-data sets. Each sub-data set consists of multiple data files, each storing the vibration signals of 4 bearings sampled at 20 kHz for 1 second. The data of bearing 1 in the second sub-data set are selected for the experiment; they comprise 984 data files acquired at 10-minute intervals. The bearing data are grouped according to the degree of degradation of the bearing, as shown in Table 2. As before, the late-degradation and fault data are selected as out-of-distribution samples and the rest as in-distribution samples, and the proportion of samples in the training set, verification set and test set is 7. Unlike the PU data set, this experimental data set covers a complete bearing life cycle, and the faults arise during operation rather than from artificial machining. The data set is therefore closer to real conditions and has higher observation noise. In addition, the goal of fault diagnosis on this data set is to determine the current health state of the bearing in order to plan maintenance and support strategies.
Figure BDA0003742394930000112
Figure BDA0003742394930000121
TABLE 2
S2, determining the type and scale of the deep neural network: a recurrent neural network (RNN) is selected as the basic framework according to the time-series nature of the bearing monitoring signals. Time-domain features of the vibration signal, such as the mean, kurtosis, skewness and root mean square value, are used as the input of the RNN, which further extracts features and makes predictions. To capture long-term dependencies, RNN variants such as the long short-term memory (LSTM) network and the gated recurrent unit (GRU) are selected for model construction. To verify the generality of the proposed method, LSTM and GRU are chosen to model the PU and IMS data sets, respectively, and multiple RNN layers are stacked to obtain stronger expressive power. A fully connected layer is attached to the last time step of the last layer of the model to output the parameters of a Dirichlet distribution. Because the Dirichlet distribution parameters must be positive, an exponential activation function is selected for the model output layer, as shown in fig. 2. After repeated training and verification, three LSTM or GRU layers are selected to build the network, with 128, 64 and 32 neurons in the respective layers.
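The feature extraction that feeds the RNN in step S2 can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names, the non-overlapping windowing, and the use of population (uncorrected) moments are assumptions, since the text does not specify them.

```python
import math

def time_domain_features(window):
    """Return (mean, rms, skewness, kurtosis) of one vibration window."""
    n = len(window)
    mean = sum(window) / n
    rms = math.sqrt(sum(v * v for v in window) / n)
    # Central moments in population form (no bias correction): an assumption.
    m2 = sum((v - mean) ** 2 for v in window) / n
    m3 = sum((v - mean) ** 3 for v in window) / n
    m4 = sum((v - mean) ** 4 for v in window) / n
    if m2 == 0.0:
        # Constant window: skewness and kurtosis are undefined, report 0.
        return mean, rms, 0.0, 0.0
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # non-excess kurtosis (3 for a Gaussian)
    return mean, rms, skewness, kurtosis

def feature_sequence(signal, window_size, seq_len):
    """Cut a signal into non-overlapping windows and return the feature
    vectors of the first seq_len windows (one vector per RNN time step)."""
    starts = range(0, len(signal) - window_size + 1, window_size)
    windows = [signal[i:i + window_size] for i in starts]
    return [time_domain_features(w) for w in windows[:seq_len]]
```

With the sequence lengths given later in step S6, `seq_len` would be 30 for the PU data set and 20 for the IMS data set; the window size is a free parameter of this sketch.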
S3, establishing a Bayesian deep neural network for fault diagnosis: Concrete dropout is applied to each layer of the network so that the network approximates a Bayesian deep neural network. To keep the Bayesian inference correct, variational dropout is selected when applying dropout, that is, the dropout mask is the same at every time step. In addition, probability distributions are selected to capture the inherent (aleatoric), cognitive (epistemic) and distribution uncertainty; the probability distributions are constructed with the Bayesian deep neural network, and the method of estimating them is determined.
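The requirement that the dropout mask be identical at every time step can be illustrated with a short sketch. It uses a hard Bernoulli mask with a fixed dropout probability; the continuous relaxation that Concrete dropout uses to learn the probability is not shown, and the function names are hypothetical.

```python
import random

def sample_mask(n_units, drop_prob, rng):
    """Sample one Bernoulli keep-mask, rescaled by 1/keep so that the
    expected activation is unchanged (inverted dropout)."""
    keep = 1.0 - drop_prob
    return [(1.0 / keep if rng.random() < keep else 0.0)
            for _ in range(n_units)]

def apply_variational_dropout(sequence, drop_prob, rng):
    """Apply the SAME mask to every time step of one input sequence.

    sequence: list of time steps, each a list of unit activations.
    """
    mask = sample_mask(len(sequence[0]), drop_prob, rng)
    dropped = [[v * m for v, m in zip(step, mask)] for step in sequence]
    return dropped, mask
```

A new mask is drawn per sequence (and per Monte Carlo pass at test time), but never per time step, which is what distinguishes this from ordinary dropout applied inside a recurrent layer.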
S31, the quantification of cognitive uncertainty is integrated into the network: in the Bayesian neural network, the posterior distribution of the model parameters is estimated by variational inference combined with Monte Carlo sampling.
S311, capturing cognitive uncertainty with the posterior distribution of the parameters of the electromechanical equipment fault diagnosis model: for historical data $X=\{x_n\}_{n=1}^{N}$ and $Y=\{y_n\}_{n=1}^{N}$, where X is the fault observation data and Y is the set of fault category labels, a classification task with C categories has $y_n\in\{1,2,\dots,C\}$; $x_n$ denotes a single fault observation and N denotes the number of observations. The cognitive uncertainty in fault diagnosis is modeled as the posterior distribution $p(\omega\mid X,Y)$, where $\omega$ denotes the model parameters;
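S312, introducing, by means of variational inference, an inferred distribution $q_\theta(\omega)$ to approximate $p(\omega\mid X,Y)$, and using the KL divergence to measure the distance between the inferred distribution $q_\theta(\omega)$ and the true posterior distribution $p(\omega\mid X,Y)$:
$$\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega\mid X,Y)\big)=\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega)\big)-\int q_\theta(\omega)\log p(Y\mid X,\omega)\,d\omega \qquad (1)$$
wherein $p(Y\mid X,\omega)$ denotes the likelihood function on the historical data set, $p(\omega)$ denotes the prior distribution of the weights, $\mathrm{KL}(\cdot\,\|\,\cdot)$ denotes the KL divergence, and $\theta$ denotes the variational parameters. Equation (1) holds up to the additive constant $\log p(Y\mid X)$, which does not depend on $\theta$ and therefore does not affect the optimization. $\theta$ is optimized to minimize $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$, which yields an estimate of the posterior distribution;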
S313, for a deep neural network with L layers, in which the number of units in layer l is $K_l$, the model parameters $\omega$ are expressed as:
$$\omega=\{W_l\}_{l=1}^{L}$$
wherein $W_l$ denotes the model parameters of the l-th layer of the deep neural network;
S314, applying Concrete dropout to the deep neural network so that it approximates a Bayesian deep neural network, that is, the fixed model parameters $\omega$ are treated as random variables that follow the inferred distribution:
Figure BDA0003742394930000132
wherein the variational parameters $\theta$ are:
Figure BDA0003742394930000133
wherein $M_l$ denotes the mean weight matrix of dimension $K_{l+1}\times K_l$ and $p_l$ denotes the dropout probability of the l-th layer; the inferred distribution $q_\theta(\omega)$ for each layer is expressed as:
Figure BDA0003742394930000134
wherein Bernoulli (·) represents a Bernoulli distribution function;
S315, selecting the prior distribution of the fault diagnosis model parameters $\omega$ as:
Figure BDA0003742394930000135
wherein $p(W_l)$ denotes the prior distribution of the model parameters of the l-th layer of the deep neural network, with
Figure BDA0003742394930000136
where $\upsilon_l$ is a control parameter governing the smoothness of the function;
S316, combining the Monte Carlo sampling method to obtain an analytical expression for $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$:
Figure BDA0003742394930000137
wherein $p(y_i\mid x_i,\omega)$ denotes the likelihood function of each sample, and $h(p_l)$ denotes the entropy of a Bernoulli random variable, with $h(p_l)=-p_l\log p_l-(1-p_l)\log(1-p_l)$;
S32, the quantification of distribution uncertainty is integrated into the network: KL divergences are constructed separately for the in-distribution and out-of-distribution fault monitoring data sets so as to estimate the corresponding probability distributions.
S321, the distribution uncertainty of the input fault monitoring data x is captured by placing a Dirichlet distribution over the output probabilities $\mu$ under the model parameters $\omega$:
$$\mathrm{Dir}\big(\mu\mid\alpha(x,\omega)\big)=\frac{\Gamma\big(\alpha_0(x,\omega)\big)}{\prod_{c=1}^{C}\Gamma\big(\alpha_c(x,\omega)\big)}\prod_{c=1}^{C}\mu_c^{\alpha_c(x,\omega)-1}$$
wherein $\Gamma(\cdot)$ denotes the gamma function, $\mathrm{Dir}(\cdot)$ denotes the Dirichlet distribution, and $\mu=[\mu_1,\dots,\mu_C]^T=[p(y=1),\dots,p(y=C)]^T$ denotes the class probabilities output by the model, where $\mu_1$ is the output probability of class 1, $\mu_C$ is the output probability of class C, and so on; $\alpha(x,\omega)=[\alpha_1(x,\omega),\dots,\alpha_C(x,\omega)]^T>0$ are the parameters of the Dirichlet distribution, and $\alpha_0(x,\omega)=\sum_{c=1}^{C}\alpha_c(x,\omega)$ is the precision parameter of the Dirichlet distribution, which reflects its dispersion: the larger the value, the more concentrated the distribution;
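As a small illustration of how the exponential output activation of step S2 and the Dirichlet parameters of step S321 fit together, the following sketch maps raw network outputs to concentration parameters, the precision, and the expected class probabilities. The function names are hypothetical, and the expected probability $\alpha_c/\alpha_0$ is the standard mean of a Dirichlet distribution.

```python
import math

def dirichlet_from_logits(logits):
    """Exponential output activation: alpha_c = exp(z_c) > 0."""
    return [math.exp(z) for z in logits]

def precision(alpha):
    """Precision alpha_0 = sum of all concentration parameters."""
    return sum(alpha)

def expected_probs(alpha):
    """Mean of Dir(mu | alpha): E[mu_c] = alpha_c / alpha_0."""
    a0 = precision(alpha)
    return [a / a0 for a in alpha]
```

Two outputs with the same expected probabilities can still differ in precision; a large precision corresponds to the concentrated distributions targeted for in-distribution data, a small one to the dispersed distributions targeted for out-of-distribution data.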
S322, for the in-distribution fault monitoring data
Figure BDA0003742394930000143
and the out-of-distribution fault monitoring data
Figure BDA0003742394930000144
the corresponding predicted Dirichlet distributions are $\mathrm{Dir}(\mu\mid\alpha(x_n,\omega))$ and $\mathrm{Dir}(\mu\mid\alpha(x_t,\omega))$. To give the model the ability to distinguish in-distribution from out-of-distribution fault monitoring data, a more concentrated Dirichlet distribution $\mathrm{Dir}(\mu\mid\alpha_n)$ and a more dispersed Dirichlet distribution $\mathrm{Dir}(\mu\mid\alpha_t)$ are selected as the target distributions for the in-distribution input data $x_n$ and the out-of-distribution input data $x_t$, respectively, and the KL divergence is used to measure the distance between the target distribution and the predicted Dirichlet distribution:
for intra-distribution fault monitoring data:
Figure BDA0003742394930000145
for out-of-distribution fault monitoring data:
Figure BDA0003742394930000146
wherein T is the sample size of the out-of-distribution fault monitoring data set. For any two Dirichlet distributions with parameters $\alpha=[\alpha_1,\dots,\alpha_C]^T$ and $\beta=[\beta_1,\dots,\beta_C]^T$, the KL divergence has an analytic form, namely:
$$\mathrm{KL}\big(\mathrm{Dir}(\mu\mid\alpha)\,\|\,\mathrm{Dir}(\mu\mid\beta)\big)=\ln\Gamma(\alpha_0)-\sum_{c=1}^{C}\ln\Gamma(\alpha_c)-\ln\Gamma(\beta_0)+\sum_{c=1}^{C}\ln\Gamma(\beta_c)+\sum_{c=1}^{C}(\alpha_c-\beta_c)\big(\psi(\alpha_c)-\psi(\alpha_0)\big)$$
where $\psi(\cdot)$ is the digamma function, and $\alpha_0=\sum_{c=1}^{C}\alpha_c$ and $\beta_0=\sum_{c=1}^{C}\beta_c$ denote the precision parameters of the two Dirichlet distributions, respectively;
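The analytic KL divergence between two Dirichlet distributions can be evaluated directly. The sketch below uses only the standard library; because `math` provides `lgamma` but no digamma function, a small digamma approximation (recurrence plus asymptotic series) is included. The function names are hypothetical, and the formula is the standard closed form for the Dirichlet KL divergence.

```python
import math

def digamma(x):
    """Digamma psi(x) for x > 0: shift x upward with the recurrence
    psi(x) = psi(x + 1) - 1/x, then apply the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def kl_dirichlet(alpha, beta):
    """KL(Dir(mu | alpha) || Dir(mu | beta)) in closed form."""
    a0, b0 = sum(alpha), sum(beta)
    kl = math.lgamma(a0) - math.lgamma(b0)
    for a, b in zip(alpha, beta):
        kl += math.lgamma(b) - math.lgamma(a)
        kl += (a - b) * (digamma(a) - digamma(a0))
    return kl
```

For example, `kl_dirichlet([2.0, 1.0], [1.0, 1.0])` evaluates to ln 2 - 0.5, which matches the direct integral of 2x ln(2x) over [0, 1].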
S33, the quantification of inherent uncertainty is integrated into the network to obtain the KL divergence of the Bayesian neural network.
S331, capturing the inherent uncertainty in the fault diagnosis process by predicting a categorical distribution over the categories:
$$p(y\mid\mu)=\mathrm{Cat}(y\mid\mu)=\prod_{c=1}^{C}\mu_c^{\,I\{y=c\}}$$
wherein $I\{\cdot\}$ denotes the indicator function and $\mathrm{Cat}(\cdot)$ denotes the categorical distribution;
S332, calculating the likelihood function:
Figure BDA0003742394930000154
wherein
Figure BDA0003742394930000155
is the expected value of $\mathrm{Dir}(\mu\mid\alpha(x_n,\omega))$; $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$ is then expressed as:
Figure BDA0003742394930000156
Step S3 is a key inventive point of the invention: a Bayesian deep learning network is constructed to model the inherent, cognitive and distribution uncertainty in fault diagnosis, and the method of estimating its distributions is determined, which provides an important basis for quantifying the uncertainty.
S4, establishing the calibration loss: the adaptive calibration error is selected to evaluate the calibration loss; the number of intervals M is set to 10 and 20, and the number of classes C is 7 and 3, for the PU data set and the IMS data set, respectively.
S41, defining the calibration of uncertainty in the fault diagnosis process: calibration means that the predicted probability of a fault category reflects the accuracy of the prediction. When the fault diagnosis model is calibrated, a fraction κ of the samples whose predicted fault category probability is κ should be correctly classified; thus, for every category c ∈ {1, …, C}, the following holds:
Figure BDA0003742394930000157
wherein Y | X ~ P, and
Figure BDA0003742394930000158
is the predicted probability of category c for the input X;
S42, evaluating the uncertainty calibration error in fault diagnosis: to evaluate the calibration error quantitatively, the adaptive calibration error is selected as the evaluation index. For each category, the adaptive calibration error adjusts the probability intervals according to the number of samples so that every probability interval contains an equal number of samples, and the accuracy and the average predicted probability are calculated within each interval; if the two are approximately equal, the model is considered calibrated. For category c with M probability intervals, the probability interval of the m-th interval is denoted $U_{c,m}$ and is obtained by sorting
Figure BDA0003742394930000161
and the corresponding sample set is
Figure BDA0003742394930000162
The accuracy within this interval is then calculated:
Figure BDA0003742394930000163
and average prediction probability:
Figure BDA0003742394930000164
the adaptive calibration error is obtained by averaging the calibration errors for each class and interval:
Figure BDA0003742394930000165
The adaptive calibration error takes the predicted probabilities of all categories into account, not only that of the final predicted category, and therefore evaluates the calibration error of a multi-class task more comprehensively. In addition, when the sample size of the fault monitoring data is small, or the predicted probabilities are concentrated near 0 and 1, the adaptive calibration error adapts the length of the probability intervals so that every interval has enough samples, and the calibration error is evaluated objectively;
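The procedure described in step S42 can be sketched as follows: for each class, samples are sorted by their predicted probability for that class, split into M equal-count intervals, and the absolute gap between accuracy and mean predicted probability is averaged over all classes and intervals. This is a minimal sketch under that reading; the function name and the handling of interval boundaries are assumptions, and it is not claimed to reproduce the exact expression in the figures.

```python
def adaptive_calibration_error(probs, labels, n_bins):
    """Adaptive calibration error with equal-count probability intervals.

    probs:  list of per-sample class-probability lists.
    labels: list of true class indices (0-based).
    n_bins: number of probability intervals M per class.
    """
    n, n_classes = len(probs), len(probs[0])
    total, count = 0.0, 0
    for c in range(n_classes):
        # Sort sample indices by the predicted probability of class c.
        order = sorted(range(n), key=lambda i: probs[i][c])
        for m in range(n_bins):
            # Equal-count split: interval m holds ranks [lo, hi).
            lo, hi = m * n // n_bins, (m + 1) * n // n_bins
            idx = order[lo:hi]
            if not idx:
                continue  # only possible when n < n_bins
            acc = sum(1 for i in idx if labels[i] == c) / len(idx)
            conf = sum(probs[i][c] for i in idx) / len(idx)
            total += abs(acc - conf)
            count += 1
    return total / count if count else 0.0
```

For a perfectly confident and correct classifier the value is 0; for a two-class model that always outputs 0.5 while every label is class 0, both classes contribute a gap of 0.5, so the value is 0.5.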
S43, constructing the uncertainty calibration loss in fault diagnosis: to improve the accuracy of uncertainty quantification in fault diagnosis during model training, a calibration loss is integrated into the overall loss function, and the calibration loss is evaluated by a calibration error. Because a mini-batch gradient descent algorithm is used for model optimization when training the Bayesian deep neural network, and a single mini-batch may not contain enough fault monitoring data samples to evaluate the calibration loss, the adaptive calibration error is selected to evaluate the calibration loss:
Figure BDA0003742394930000166
wherein CL (·) represents uncertainty calibration loss in fault diagnosis;
Step S4 is a key inventive point of the invention: an uncertainty calibration loss for fault diagnosis is proposed on the basis of the adaptive calibration error, which provides an important basis for the accurate quantification of uncertainty in fault diagnosis.
S5, determining a loss function: and constructing a loss function of the fault diagnosis model.
And S51, determining a loss function on the fault monitoring data set in the distribution.
S511, for the in-distribution fault monitoring data, the cognitive and inherent uncertainty in fault diagnosis are modeled by minimizing $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$, and the distribution uncertainty in fault diagnosis is modeled by minimizing
Figure BDA0003742394930000171
The loss function corresponding to uncertainty quantification in electromechanical equipment fault diagnosis based on Bayesian deep learning is therefore:
Figure BDA0003742394930000172
wherein λ is a hyper-parameter controlling the contribution of the distribution-uncertainty loss to the total loss. In addition, the predicted Dirichlet distribution of in-distribution fault monitoring data should be concentrated, so the parameter of the target distribution $\mathrm{Dir}(\mu\mid\alpha_n)$,
Figure BDA0003742394930000173
should be large. To ensure prediction accuracy, the mean of the target distribution should be the corresponding one-hot (0-1) label, that is, the distribution mean satisfies
Figure BDA0003742394930000174
Figure BDA0003742394930000175
is treated as a constant independent of the input. However, this would make most of the parameters in $\alpha_n$ equal to 0, so that the corresponding KL divergence is difficult to optimize. The problem is avoided by assigning a small probability to the regions of the Dirichlet distribution whose probability density would otherwise be 0:
Figure BDA0003742394930000176
where ξ is a smoothing factor with a small value, from which the values of all parameters of the Dirichlet distribution are obtained;
S512, integrating the uncertainty calibration loss in fault diagnosis into the overall loss, so that uncertainty calibration is achieved while the fault diagnosis model is trained; the loss function for the in-distribution fault monitoring data is written as:
Figure BDA0003742394930000177
wherein γ is a hyper-parameter controlling the contribution of the calibration loss to the total loss;
Step S52 specifically includes the following steps:
S521, for the out-of-distribution fault monitoring data, the cognitive and inherent uncertainty need not be modeled; only the distribution uncertainty needs to be modeled so that these data can be distinguished from the in-distribution fault monitoring data. Because the predictions and the uncertainty quantification for out-of-distribution data are not credible, the uncertainty in fault diagnosis is not calibrated for them, and the corresponding loss function is expressed as:
Figure BDA0003742394930000178
The predicted Dirichlet distribution of out-of-distribution fault monitoring data should be more dispersed, so the parameter
Figure BDA0003742394930000181
of the corresponding target distribution $\mathrm{Dir}(\mu\mid\alpha_t)$ should take a small value. The model output for an out-of-distribution sample should assign equal probability to every class, indicating that the prediction is not credible; that is, all distribution parameters in $\alpha_t$,
Figure BDA0003742394930000182
are set to 1;
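The two kinds of target distribution can be illustrated with a short sketch. The out-of-distribution target (all parameters equal to 1) follows the text directly. For the in-distribution target, the exact smoothing expression is given only in the figure, so the form used here (ξ on every wrong class, the remainder on the true class, scaled by a chosen precision) is an assumption, as are the function names and the example values.

```python
def in_distribution_target(label, n_classes, target_precision, xi):
    """Concentrated target Dir(mu | alpha_n) for an in-distribution sample.

    ASSUMED smoothing: mean = xi on each wrong class and
    1 - xi * (C - 1) on the true class; alpha_n = precision * mean.
    """
    mean = [xi] * n_classes
    mean[label] = 1.0 - xi * (n_classes - 1)
    return [target_precision * m for m in mean]

def out_of_distribution_target(n_classes):
    """Dispersed target Dir(mu | alpha_t): every parameter set to 1."""
    return [1.0] * n_classes
```

With 3 classes, a precision of 100 and ξ = 0.01, the in-distribution target for class 1 is approximately [1, 98, 1]: no parameter is 0, so the KL divergence to the predicted distribution stays finite and optimizable.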
After repeated training and verification, the hyper-parameters in the loss function, such as the control parameter of each layer and the hyper-parameters controlling the contribution of each loss term, are selected as shown in Table 3.
Figure BDA0003742394930000183
TABLE 3
S6, determining the hyper-parameters of the fault diagnosis model: for the PU data set and the IMS data set, the selected sequence lengths are 30 and 20, respectively, and the initial dropout probability of the network is set to 0.2 for both. The Adam algorithm is selected to optimize the model on the in-distribution and out-of-distribution fault monitoring data sets; various hyper-parameter combinations are tried using a trial-and-error strategy combined with grid search, and the optimal combination, selected by training and verification, is shown in Table 4.
Figure BDA0003742394930000184
TABLE 4
S7, training the electromechanical equipment fault diagnosis model: the model is trained alternately on the in-distribution and out-of-distribution fault monitoring data sets, and the loss on the verification set is saved after each training cycle. If the minimum verification loss does not decrease over two consecutive cycles, the model is considered to have converged, the loop is terminated, and the final model is saved;
S8, judging whether the electromechanical equipment fault diagnosis model has converged: step S9 is executed when the change in the parameters of the optimal model before and after training is smaller than a specified threshold;
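The stopping rule of step S7 and the convergence check of step S8 can be sketched as below. The callback `run_cycle` is hypothetical and stands for one cycle of alternating training on the in-distribution and out-of-distribution sets followed by evaluation on the verification set. Measuring the parameter change as the largest absolute difference is an assumption, since the text does not define the measure.

```python
def train_with_early_stopping(run_cycle, max_cycles, patience=2):
    """Run training cycles until the best verification loss has not
    improved for `patience` consecutive cycles.

    run_cycle(cycle) -> verification loss after that cycle (hypothetical).
    Returns (best_cycle, best_loss, history).
    """
    best_loss, best_cycle, stale = float("inf"), -1, 0
    history = []
    for cycle in range(max_cycles):
        val_loss = run_cycle(cycle)
        history.append(val_loss)
        if val_loss < best_loss:
            best_loss, best_cycle, stale = val_loss, cycle, 0
        else:
            stale += 1
            if stale >= patience:
                break  # minimum loss did not decrease for `patience` cycles
    return best_cycle, best_loss, history

def has_converged(old_params, new_params, threshold):
    """S8-style check: largest absolute parameter change below threshold
    (the choice of max-abs change is an assumption)."""
    return max(abs(a - b) for a, b in zip(old_params, new_params)) < threshold
```

In a real training loop the model weights would be saved whenever `best_loss` improves, so that the model of `best_cycle` is the one kept.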
S9, outputting the fault diagnosis result and the uncertainty quantification results of the electromechanical equipment: the trained model outputs the fault diagnosis result together with the quantified inherent, cognitive, distribution and prediction uncertainty of that result.
S91, measuring uncertainty through entropy, and describing prediction uncertainty as follows:
Figure BDA0003742394930000185
and the prediction uncertainty is further decomposed into:
Figure BDA0003742394930000186
wherein,
Figure BDA0003742394930000191
wherein $\mathbb{E}_{\mu\mid x,X,Y}\big[H[p(y\mid\mu)]\big]$ measures the inherent uncertainty, and the mutual information $I[y,\mu\mid x,X,Y]$ measures the cognitive and distribution uncertainty. In addition, the distribution uncertainty alone is measured by the expected differential entropy:
Figure BDA0003742394930000192
S92, for test data $x^*$, Monte Carlo dropout is executed S times in the testing stage to obtain the set of sampled values:
Figure BDA0003742394930000193
S93, calculating the predicted probability and obtaining the final predicted category and the uncertainty quantification results:
the predicted probability
Figure BDA0003742394930000194
is expressed as:
Figure BDA0003742394930000195
the distribution uncertainty
Figure BDA0003742394930000196
is expressed as:
Figure BDA0003742394930000197
Given a threshold ε, if
Figure BDA0003742394930000198
then $x^*$ is an out-of-distribution sample; in this case the final predicted category and the other uncertainty quantification results need not be computed. Otherwise, $x^*$ is an in-distribution sample, and its final predicted category $c^*$ is expressed as:
Figure BDA0003742394930000199
the inherent uncertainty
Figure BDA00037423949300001910
is expressed as:
Figure BDA00037423949300001911
the prediction uncertainty
Figure BDA00037423949300001912
is expressed as:
Figure BDA0003742394930000201
the cognitive uncertainty
Figure BDA0003742394930000202
is approximated from $I[y,\mu\mid x,X,Y]$ and is expressed as:
Figure BDA0003742394930000203
In the testing stage, Monte Carlo dropout is repeated 1000 times to obtain 1000 sets of sampled values of the Dirichlet distribution, from which the fault diagnosis and uncertainty quantification results are calculated. Table 5 gives the mean uncertainty, accuracy and adaptive calibration error for each class of in-distribution test samples. Figs. 3 and 4 show the statistics of the distribution uncertainty of the in-distribution and out-of-distribution test samples.
Figure BDA0003742394930000204
TABLE 5
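The test-stage computation of steps S92 and S93 can be sketched from S sampled Dirichlet parameter vectors. The sketch relies on standard closed-form identities for the Dirichlet distribution (expected categorical entropy and differential entropy); it is not claimed to reproduce the exact expressions in the figures. The function names, the dictionary keys, the use of "greater than ε" as the out-of-distribution test, and the use of the mutual information as the cognitive-uncertainty estimate are assumptions.

```python
import math

def digamma(x):
    """Digamma psi(x) for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def entropy(p):
    """Shannon entropy of a categorical distribution."""
    return -sum(v * math.log(v) for v in p if v > 0.0)

def expected_entropy(alpha):
    """E_{Dir(mu|alpha)}[H[Cat(y|mu)]] in closed form."""
    a0 = sum(alpha)
    return -sum(a / a0 * (digamma(a + 1.0) - digamma(a0 + 1.0)) for a in alpha)

def differential_entropy(alpha):
    """Differential entropy of Dir(mu | alpha)."""
    a0 = sum(alpha)
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(a0)
    return (log_beta + (a0 - len(alpha)) * digamma(a0)
            - sum((a - 1.0) * digamma(a) for a in alpha))

def quantify(alpha_samples, epsilon):
    """Combine S Monte Carlo dropout samples of the Dirichlet parameters."""
    s = len(alpha_samples)
    n_classes = len(alpha_samples[0])
    # Distribution uncertainty: mean differential entropy over the samples.
    distribution = sum(differential_entropy(a) for a in alpha_samples) / s
    if distribution > epsilon:  # ASSUMED direction of the threshold test
        return {"ood": True, "distribution": distribution}
    # Predicted probability: average of the Dirichlet means.
    probs = [sum(a[c] / sum(a) for a in alpha_samples) / s
             for c in range(n_classes)]
    prediction = entropy(probs)                       # prediction uncertainty
    inherent = sum(expected_entropy(a) for a in alpha_samples) / s
    return {"ood": False, "distribution": distribution,
            "predicted_class": max(range(n_classes), key=lambda c: probs[c]),
            "probs": probs, "prediction": prediction, "inherent": inherent,
            "mutual_information": prediction - inherent}
```

In the embodiment above, `alpha_samples` would hold the 1000 parameter vectors produced by 1000 stochastic forward passes for one test input.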
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art will understand that modifications and equivalent substitutions may be made without departing from the spirit and scope of the invention, all of which are intended to be covered by the claims of the invention.

Claims (5)

1. A deep-learning-based method for quantifying and calibrating uncertainty in fault diagnosis of electromechanical equipment, characterized by comprising the following steps:
S1, preprocessing the fault monitoring signals of the electromechanical equipment to obtain a basic data set: the preprocessing includes signal screening, feature extraction, data normalization and set division, and the resulting in-distribution and out-of-distribution fault monitoring data sets are each divided into a training set, a verification set and a test set;
S2, determining the type and scale of the fault diagnosis deep neural network: a deep neural network is selected according to the characteristics of the equipment fault monitoring signals, and the scale of the network, including the number of neurons and the number of network layers, is determined according to the size of the data set;
S3, constructing a Bayesian deep neural network for electromechanical equipment fault diagnosis: probability distributions are selected to capture the inherent, cognitive and distribution uncertainty in fault diagnosis, the probability distributions are constructed with the Bayesian deep neural network, and the method of estimating them is determined;
S31, the quantification of cognitive uncertainty in fault diagnosis is integrated into the fault diagnosis network, and the posterior distribution of the model parameters is estimated in the Bayesian neural network by variational inference combined with Monte Carlo sampling;
S32, the quantification of distribution uncertainty in fault diagnosis is integrated into the fault diagnosis network, and KL divergences are constructed separately for the in-distribution and out-of-distribution fault monitoring data sets so as to determine the corresponding probability distributions;
S33, the quantification of inherent uncertainty in fault diagnosis is integrated into the fault diagnosis network to obtain the KL divergence of the Bayesian neural network;
S4, constructing the uncertainty quantification calibration loss in fault diagnosis: a calibration evaluation index is determined and used to evaluate the calibration loss, and the calibration loss is integrated into the overall loss function, so that calibration is achieved together with uncertainty quantification in the fault diagnosis of the electromechanical equipment;
S5, determining the loss function: the loss function of the whole model is constructed;
S51, determining the loss function on the in-distribution fault monitoring data set;
S52, determining the loss function on the out-of-distribution fault monitoring data set;
S6, determining the hyper-parameters of the electromechanical equipment fault diagnosis model: the hyper-parameters of the model, including the learning rate and the batch size, are determined through a trial-and-error strategy, and the optimal combination of hyper-parameters is determined through grid search;
S7, training the electromechanical equipment fault diagnosis model: an optimization method is selected and, with the selected hyper-parameters, the model is trained on the in-distribution and out-of-distribution fault monitoring data sets in turn;
S8, judging whether the electromechanical equipment fault diagnosis model has converged: it is judged whether the change in the parameters of the optimal model before and after training is smaller than a specified threshold; if so, step S9 is executed, otherwise the method returns to step S7;
S9, outputting the fault diagnosis result and the uncertainty quantification results of the electromechanical equipment: the trained model outputs the fault diagnosis result together with the quantified inherent, cognitive, distribution and prediction uncertainty of that result.
2. The deep-learning-based method for quantifying and calibrating uncertainty in fault diagnosis of electromechanical equipment according to claim 1, wherein step S31 specifically comprises the following steps:
S311, capturing cognitive uncertainty with the posterior distribution of the parameters of the electromechanical equipment fault diagnosis model: for historical data $X=\{x_n\}_{n=1}^{N}$ and $Y=\{y_n\}_{n=1}^{N}$, where X is the fault observation data and Y is the set of fault category labels, a classification task with C categories has $y_n\in\{1,2,\dots,C\}$; $x_n$ denotes a single fault observation and N denotes the number of observations; the cognitive uncertainty in fault diagnosis is modeled as the posterior distribution $p(\omega\mid X,Y)$, where $\omega$ denotes the model parameters;
S312, introducing, by means of variational inference, an inferred distribution $q_\theta(\omega)$ to approximate $p(\omega\mid X,Y)$, and using the KL divergence to measure the distance between the inferred distribution $q_\theta(\omega)$ and the true posterior distribution $p(\omega\mid X,Y)$:
$$\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega\mid X,Y)\big)=\mathrm{KL}\big(q_\theta(\omega)\,\|\,p(\omega)\big)-\int q_\theta(\omega)\log p(Y\mid X,\omega)\,d\omega \qquad (1)$$
wherein $p(Y\mid X,\omega)$ denotes the likelihood function on the historical data set, $p(\omega)$ denotes the prior distribution of the weights, $\mathrm{KL}(\cdot\,\|\,\cdot)$ denotes the KL divergence, and $\theta$ denotes the variational parameters; $\theta$ is optimized to minimize $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$, which yields an estimate of the posterior distribution;
S313, for a deep neural network with L layers, in which the number of units in layer l is $K_l$, the model parameters $\omega$ are expressed as:
$$\omega=\{W_l\}_{l=1}^{L}$$
wherein $W_l$ denotes the model parameters of the l-th layer of the deep neural network;
S314, applying Concrete dropout to the deep neural network so that it approximates a Bayesian deep neural network, that is, the fixed model parameters $\omega$ are treated as random variables that follow the inferred distribution:
Figure FDA0003742394920000024
wherein the variational parameters $\theta$ are:
Figure FDA0003742394920000025
wherein $M_l$ denotes the mean weight matrix of dimension $K_{l+1}\times K_l$ and $p_l$ denotes the dropout probability of the l-th layer; the inferred distribution $q_\theta(\omega)$ for each layer is expressed as:
Figure FDA0003742394920000026
wherein Bernoulli (·) represents a Bernoulli distribution function;
S315, selecting the prior distribution of the fault diagnosis model parameters $\omega$ as:
Figure FDA0003742394920000031
wherein $p(W_l)$ denotes the prior distribution of the model parameters of the l-th layer of the deep neural network, with
Figure FDA0003742394920000032
where $\upsilon_l$ is a control parameter governing the smoothness of the function;
S316, combining the Monte Carlo sampling method to obtain an analytical expression for $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$:
Figure FDA0003742394920000033
wherein $p(y_i\mid x_i,\omega)$ denotes the likelihood function of each sample, and $h(p_l)$ denotes the entropy of a Bernoulli random variable, with $h(p_l)=-p_l\log p_l-(1-p_l)\log(1-p_l)$;
Step S32 specifically includes the following steps:
S321, the distribution uncertainty of the input fault monitoring data x is captured by placing a Dirichlet distribution over the output probabilities $\mu$ under the model parameters $\omega$:
$$\mathrm{Dir}\big(\mu\mid\alpha(x,\omega)\big)=\frac{\Gamma\big(\alpha_0(x,\omega)\big)}{\prod_{c=1}^{C}\Gamma\big(\alpha_c(x,\omega)\big)}\prod_{c=1}^{C}\mu_c^{\alpha_c(x,\omega)-1}$$
wherein $\Gamma(\cdot)$ denotes the gamma function, $\mathrm{Dir}(\cdot)$ denotes the Dirichlet distribution, and $\mu=[\mu_1,\dots,\mu_C]^T=[p(y=1),\dots,p(y=C)]^T$ denotes the class probabilities output by the model, where $\mu_1$ is the output probability of class 1, $\mu_C$ is the output probability of class C, and so on; $\alpha(x,\omega)=[\alpha_1(x,\omega),\dots,\alpha_C(x,\omega)]^T>0$ are the parameters of the Dirichlet distribution, and $\alpha_0(x,\omega)=\sum_{c=1}^{C}\alpha_c(x,\omega)$ is the precision parameter of the Dirichlet distribution, which reflects its dispersion: the larger the value, the more concentrated the distribution;
S322, for the in-distribution fault monitoring data
Figure FDA0003742394920000036
and the out-of-distribution fault monitoring data
Figure FDA0003742394920000037
the corresponding predicted Dirichlet distributions are $\mathrm{Dir}(\mu\mid\alpha(x_n,\omega))$ and $\mathrm{Dir}(\mu\mid\alpha(x_t,\omega))$; to give the model the ability to distinguish in-distribution from out-of-distribution fault monitoring data, a more concentrated Dirichlet distribution $\mathrm{Dir}(\mu\mid\alpha_n)$ and a more dispersed Dirichlet distribution $\mathrm{Dir}(\mu\mid\alpha_t)$ are selected as the target distributions for the in-distribution input data $x_n$ and the out-of-distribution input data $x_t$, respectively, and the KL divergence is used to measure the distance between the target distribution and the predicted Dirichlet distribution:
for intra-distribution fault monitoring data:
Figure FDA0003742394920000041
for out-of-distribution fault monitoring data:
Figure FDA0003742394920000042
wherein T is the sample size of the out-of-distribution fault monitoring data set; for any two Dirichlet distributions with parameters $\alpha=[\alpha_1,\dots,\alpha_C]^T$ and $\beta=[\beta_1,\dots,\beta_C]^T$, the KL divergence has an analytic form, namely:
$$\mathrm{KL}\big(\mathrm{Dir}(\mu\mid\alpha)\,\|\,\mathrm{Dir}(\mu\mid\beta)\big)=\ln\Gamma(\alpha_0)-\sum_{c=1}^{C}\ln\Gamma(\alpha_c)-\ln\Gamma(\beta_0)+\sum_{c=1}^{C}\ln\Gamma(\beta_c)+\sum_{c=1}^{C}(\alpha_c-\beta_c)\big(\psi(\alpha_c)-\psi(\alpha_0)\big)$$
where $\psi(\cdot)$ is the digamma function, and $\alpha_0=\sum_{c=1}^{C}\alpha_c$ and $\beta_0=\sum_{c=1}^{C}\beta_c$ denote the precision parameters of the two Dirichlet distributions, respectively;
Step S33 specifically includes the following steps:
S331, capturing the inherent uncertainty in the fault diagnosis process by predicting a categorical distribution over the categories:
$$p(y\mid\mu)=\mathrm{Cat}(y\mid\mu)=\prod_{c=1}^{C}\mu_c^{\,I\{y=c\}}$$
wherein $I\{\cdot\}$ denotes the indicator function and $\mathrm{Cat}(\cdot)$ denotes the categorical distribution;
S332, calculating the likelihood function:
Figure FDA0003742394920000047
wherein
Figure FDA0003742394920000048
is the expected value of $\mathrm{Dir}(\mu\mid\alpha(x_n,\omega))$; $\mathrm{KL}(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$ is then expressed as:
Figure FDA0003742394920000049
3. The deep-learning-based method for quantifying and calibrating uncertainty in fault diagnosis of electromechanical equipment according to claim 1, wherein step S4 specifically comprises the following steps:
S41, defining the calibration of uncertainty in the fault diagnosis process: calibration means that the predicted probability of a fault category reflects the accuracy of the prediction; when the fault diagnosis model is calibrated, a fraction κ of the samples whose predicted fault category probability is κ should be correctly classified; thus, for every category c ∈ {1, …, C}, the following holds:
Figure FDA0003742394920000051
wherein Y | X ~ P, and
Figure FDA0003742394920000052
is the predicted probability of category c for the input X;
S42, evaluating the uncertainty calibration error in fault diagnosis: to evaluate the calibration error quantitatively, the adaptive calibration error is selected as the evaluation index; for each category, the adaptive calibration error adjusts the probability intervals according to the number of samples so that every probability interval contains an equal number of samples, and the accuracy and the average predicted probability are calculated within each interval; if the two are approximately equal, the model is considered calibrated; for category c with M probability intervals, the probability interval of the m-th interval is denoted $U_{c,m}$ and is obtained by sorting
Figure FDA0003742394920000053
and the corresponding sample set is
Figure FDA0003742394920000054
The accuracy within this interval is then calculated:
Figure FDA0003742394920000055
and average prediction probability:
Figure FDA0003742394920000056
the adaptive calibration error is obtained by averaging the calibration errors for each class and interval:
Figure FDA0003742394920000057
the adaptive calibration error considers the prediction probabilities of all categories rather than only that of the final predicted category, and therefore evaluates the calibration error of a multi-class task more comprehensively; in addition, when the sample size of the fault monitoring data is small or the prediction probabilities are concentrated near 0 and 1, the adaptive calibration error adaptively adjusts the length of each probability interval so that every interval has enough samples and the calibration error is evaluated objectively;
S43, constructing the uncertainty calibration loss in fault diagnosis: to improve the accuracy of uncertainty quantification in fault diagnosis during model training, a calibration loss is incorporated into the overall loss function, the calibration loss being evaluated by a calibration error; a mini-batch gradient descent algorithm is selected for model optimization when training the Bayesian deep neural network, and a single mini-batch may not contain enough fault monitoring data samples to evaluate the calibration loss; therefore, the adaptive calibration error is selected to evaluate the calibration loss:
Figure FDA0003742394920000061
where CL (·) represents the uncertainty calibration loss in fault diagnosis.
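The adaptive calibration error described in steps S42 and S43 can be illustrated with a minimal Python sketch. The function name `adaptive_calibration_error`, the argument names and the default of 10 intervals are hypothetical and are not taken from the claims; the sketch assumes that the calibration error of each interval is the absolute difference between the accuracy and the average prediction probability, averaged over all categories and intervals. It is an evaluation routine only and does not show how the error is made usable as a training loss.

```python
def adaptive_calibration_error(probs, labels, num_bins=10):
    """Sketch of an adaptive calibration error (names are hypothetical).

    probs:  list of N lists, each holding C predicted class probabilities
    labels: list of N integer class labels in {0, ..., C-1}
    """
    n = len(probs)
    num_classes = len(probs[0])
    total = 0.0
    for c in range(num_classes):
        # Sort sample indices by the predicted probability of class c.
        order = sorted(range(n), key=lambda i: probs[i][c])
        for m in range(num_bins):
            # Equal-count intervals: sizes differ by at most one sample.
            bin_idx = order[m * n // num_bins:(m + 1) * n // num_bins]
            if not bin_idx:
                continue  # an empty interval contributes no error
            # Fraction of samples in the interval that belong to class c.
            acc = sum(labels[i] == c for i in bin_idx) / len(bin_idx)
            # Average predicted probability of class c in the interval.
            conf = sum(probs[i][c] for i in bin_idx) / len(bin_idx)
            total += abs(acc - conf)
    # Average over all classes and intervals.
    return total / (num_classes * num_bins)
```

Because the intervals are formed by sample count rather than by fixed probability width, every interval is populated whenever the number of samples is at least the number of intervals, which is the property the claim relies on for small mini-batches.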
4. The method for quantitatively calibrating the uncertainty in the fault diagnosis of the electromechanical device based on the deep learning of claim 1, wherein the step S51 specifically comprises the following steps:
S511, for in-distribution fault monitoring data, modeling of the cognitive and inherent uncertainty in fault diagnosis can be realized by minimizing KL(q_θ(ω) ‖ p(ω | X, Y)), and modeling of the distribution uncertainty in fault diagnosis can be realized by minimizing
Figure FDA0003742394920000062
so that the loss function corresponding to uncertainty quantification in electromechanical device fault diagnosis based on Bayesian deep learning is:
Figure FDA0003742394920000063
wherein λ is a hyper-parameter controlling the contribution of the loss corresponding to the distribution uncertainty in fault diagnosis to the total loss; in addition, the predicted Dirichlet distribution of in-distribution fault monitoring data should be concentrated, so the parameter of the target distribution Dir(μ | α_n)
Figure FDA0003742394920000064
should be large; to ensure prediction accuracy, the mean of the target distribution should be the corresponding 0-1 label, i.e. the distribution mean satisfies
Figure FDA0003742394920000065
Figure FDA0003742394920000066
can be considered a constant independent of the input; however, this would cause most of the parameters in α_n to take the value 0, making the corresponding KL divergence difficult to optimize; this problem is avoided by assigning a small probability to the regions of the Dirichlet distribution whose probability density is 0:
Figure FDA0003742394920000067
wherein ξ is a smoothing factor with a small value, on the basis of which the values of all parameters of the Dirichlet distribution are obtained;
S512, incorporating the uncertainty calibration loss in fault diagnosis into the overall loss, so that uncertainty calibration is carried out while the fault diagnosis model is trained; the loss function corresponding to in-distribution fault monitoring data is expressed as:
Figure FDA0003742394920000068
wherein γ is a hyper-parameter controlling the contribution of the calibration loss to the total loss;
the step S52 specifically includes the following steps:
S521, for out-of-distribution fault monitoring data, the cognitive and inherent uncertainty need not be modeled; only the distribution uncertainty needs to be modeled so that the data can be distinguished from in-distribution fault monitoring data; because the predicted value and the uncertainty quantification of out-of-distribution fault monitoring data are not credible, the uncertainty in fault diagnosis is not calibrated for such data, and the corresponding loss function is expressed as:
Figure FDA0003742394920000071
the predicted Dirichlet distribution of out-of-distribution fault monitoring data should be relatively dispersed, so the parameter of the corresponding target distribution Dir(μ | α_t)
Figure FDA0003742394920000072
takes a small value; the model output for an out-of-distribution sample should assign equal probability to every category, indicating that the predicted value is not credible, i.e. all distribution parameters in α_t,
Figure FDA0003742394920000073
are set to 1.
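The target distributions of steps S511 and S521 can be sketched in Python as follows. This is an illustration under stated assumptions, not the formulas of the claims, which are only available as figures: the smoothed mean is assumed to place ξ on every non-target category and the remaining mass on the labelled category, the parameters are assumed to be the mean scaled by a large precision, and the closed-form KL divergence between two Dirichlet distributions is shown without asserting which argument order the claimed loss uses. All function names, the precision of 100 and the value ξ = 0.01 are hypothetical.

```python
import math

def digamma(x):
    """Digamma for x > 0: upward recurrence, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def in_distribution_target(label, num_classes, precision=100.0, xi=0.01):
    """Concentrated target: smoothed 0-1 label mean times a large precision."""
    mean = [xi] * num_classes                    # small mass on other classes
    mean[label] = 1.0 - (num_classes - 1) * xi   # remaining mass on the label
    return [precision * m for m in mean]

def out_of_distribution_target(num_classes):
    """Dispersed target: every Dirichlet parameter equals 1."""
    return [1.0] * num_classes

def kl_dirichlet(alpha, beta):
    """Closed-form KL(Dir(alpha) || Dir(beta))."""
    a0, b0 = sum(alpha), sum(beta)
    log_norm = (math.lgamma(a0) - sum(math.lgamma(a) for a in alpha)
                - math.lgamma(b0) + sum(math.lgamma(b) for b in beta))
    return log_norm + sum((a - b) * (digamma(a) - digamma(a0))
                          for a, b in zip(alpha, beta))
```

With three categories and label 1, `in_distribution_target(1, 3)` gives parameters close to (1, 98, 1), so no parameter is 0 and the KL divergence stays finite, which is the purpose of the smoothing factor.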
5. The method for quantitatively calibrating the uncertainty in the fault diagnosis of the electromechanical device based on the deep learning of claim 1, wherein the step S9 specifically comprises the following steps:
S91, measuring the uncertainty in fault diagnosis by entropy, the prediction uncertainty being expressed as:
Figure FDA0003742394920000074
wherein H[·] denotes the entropy of a distribution and p(y = c | x, X, Y) denotes the prediction probability of category c; p(y = c | x, X, Y) is obtained by sampling S sets of model parameters ω_s from q_θ(ω):
Figure FDA0003742394920000075
the prediction uncertainty is further broken down into:
Figure FDA0003742394920000076
wherein I[·] denotes mutual information, E_{μ|x,X,Y}[·] denotes the expectation under the distribution p(μ | x, X, Y), and p(μ | x, X, Y) = ∫ p(μ | x, ω) p(ω | X, Y) dω; then:
Figure FDA0003742394920000081
wherein E_{μ|x,X,Y}[H[p(y | μ)]] measures the inherent uncertainty in fault diagnosis, the mutual information I[y, μ | x, X, Y] measures the cognitive and distribution uncertainty in fault diagnosis, and the distribution uncertainty can be measured separately by the expected differential entropy:
Figure FDA0003742394920000082
S92, for fault monitoring test data x*, performing Monte Carlo dropout S times in the testing stage to obtain a set of sampled values:
Figure FDA0003742394920000083
S93, calculating the fault prediction probability and obtaining the final predicted category and the uncertainty quantification results:
the prediction probability
Figure FDA0003742394920000084
is expressed as:
Figure FDA0003742394920000085
the distribution uncertainty in fault diagnosis
Figure FDA0003742394920000086
is expressed as:
Figure FDA0003742394920000087
given a threshold value ε, if
Figure FDA0003742394920000088
then x* is a sample outside the fault monitoring data distribution, and its final predicted category and other uncertainty quantification results need not be calculated; otherwise, x* is a sample within the fault monitoring data distribution, and its final fault prediction category c* is expressed as:
Figure FDA0003742394920000089
wherein,
Figure FDA00037423949200000810
denotes the function that returns the index of the largest element of a vector;
inherent uncertainty in fault diagnosis
Figure FDA00037423949200000811
is expressed as:
Figure FDA0003742394920000091
prediction uncertainty in fault diagnosis
Figure FDA0003742394920000092
is expressed as:
Figure FDA0003742394920000093
the cognitive uncertainty in fault diagnosis
Figure FDA0003742394920000094
is approximated by I[y, μ | x, X, Y] and is expressed as:
Figure FDA0003742394920000095
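Steps S91 to S93 can be illustrated with the following Python sketch. It assumes that each of the S Monte Carlo dropout passes yields one vector of Dirichlet parameters, that the standard closed-form Dirichlet expressions are used for the differential entropy and for the expected entropy of the categorical distribution, and that a sample is treated as out-of-distribution when the averaged differential entropy exceeds the threshold ε; the direction of this comparison and all function and key names are assumptions, since the expressions of the claims are only available as figures.

```python
import math

def digamma(x):
    """Digamma for x > 0: upward recurrence, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def entropy(p):
    """Shannon entropy of a categorical distribution (natural logarithm)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def dirichlet_differential_entropy(alpha):
    """Differential entropy of Dir(alpha)."""
    a0 = sum(alpha)
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(a0)
    return (log_beta + (a0 - len(alpha)) * digamma(a0)
            - sum((a - 1.0) * digamma(a) for a in alpha))

def expected_categorical_entropy(alpha):
    """E[H[p(y | mu)]] for mu distributed as Dir(alpha)."""
    a0 = sum(alpha)
    return -sum(a / a0 * (digamma(a + 1.0) - digamma(a0 + 1.0)) for a in alpha)

def quantify_uncertainty(alpha_samples, epsilon):
    """alpha_samples: S Dirichlet parameter vectors, one per dropout pass."""
    s = len(alpha_samples)
    num_classes = len(alpha_samples[0])
    # Distribution uncertainty: differential entropy averaged over the passes.
    distribution = sum(dirichlet_differential_entropy(a)
                       for a in alpha_samples) / s
    result = {"distribution": distribution, "is_ood": distribution > epsilon}
    if result["is_ood"]:
        return result  # no class or further uncertainties for OOD samples
    # Prediction probability: Dirichlet means averaged over the passes.
    probs = [sum(a[c] / sum(a) for a in alpha_samples) / s
             for c in range(num_classes)]
    prediction = entropy(probs)                       # prediction uncertainty
    inherent = sum(expected_categorical_entropy(a)
                   for a in alpha_samples) / s        # inherent uncertainty
    result.update({
        "predicted_class": max(range(num_classes), key=lambda c: probs[c]),
        "prediction": prediction,
        "inherent": inherent,
        "cognitive": prediction - inherent,           # mutual information
    })
    return result
```

For example, with ε = -1.0 a flat sample such as (1, 1, 1) is flagged as out-of-distribution, while a concentrated sample such as (98, 1, 1) is kept and assigned category 0. In practice ε would be chosen on held-out data.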
CN202210820966.4A 2022-07-12 2022-07-12 Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning Pending CN115204227A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210820966.4A CN115204227A (en) 2022-07-12 2022-07-12 Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210820966.4A CN115204227A (en) 2022-07-12 2022-07-12 Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning

Publications (1)

Publication Number Publication Date
CN115204227A true CN115204227A (en) 2022-10-18

Family

ID=83580273

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210820966.4A Pending CN115204227A (en) 2022-07-12 2022-07-12 Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning

Country Status (1)

Country Link
CN (1) CN115204227A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116502073A (en) * 2023-06-27 2023-07-28 北京理工大学 High-reliability intelligent fault diagnosis and health management method for wind generating set
CN116680589A (en) * 2023-05-18 2023-09-01 哈尔滨工业大学 DC charging pile remote verification method based on Dirichlet process and folded rod structural representation
CN117371299A (en) * 2023-12-08 2024-01-09 安徽大学 Machine learning method for Tokamak new classical circumferential viscous torque
CN117828481A (en) * 2024-03-04 2024-04-05 烟台哈尔滨工程大学研究院 Fuel system fault diagnosis method and medium for common rail ship based on dynamic integrated frame
CN118519818A (en) * 2024-07-23 2024-08-20 国富瑞(福建)信息技术产业园有限公司 Deep recursion network-based big data computer system fault detection method
CN118519818B (en) * 2024-07-23 2024-09-27 国富瑞(福建)信息技术产业园有限公司 Deep recursion network-based big data computer system fault detection method

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116680589A (en) * 2023-05-18 2023-09-01 哈尔滨工业大学 DC charging pile remote verification method based on Dirichlet process and folded rod structural representation
CN116502073A (en) * 2023-06-27 2023-07-28 北京理工大学 High-reliability intelligent fault diagnosis and health management method for wind generating set
CN117371299A (en) * 2023-12-08 2024-01-09 安徽大学 Machine learning method for Tokamak new classical circumferential viscous torque
CN117371299B (en) * 2023-12-08 2024-02-27 安徽大学 Machine learning method for Tokamak new classical circumferential viscous torque
CN117828481A (en) * 2024-03-04 2024-04-05 烟台哈尔滨工程大学研究院 Fuel system fault diagnosis method and medium for common rail ship based on dynamic integrated frame
CN118519818A (en) * 2024-07-23 2024-08-20 国富瑞(福建)信息技术产业园有限公司 Deep recursion network-based big data computer system fault detection method
CN118519818B (en) * 2024-07-23 2024-09-27 国富瑞(福建)信息技术产业园有限公司 Deep recursion network-based big data computer system fault detection method

Similar Documents

Publication Publication Date Title
CN115204227A (en) Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning
CN109948117A (en) A kind of satellite method for detecting abnormality fighting network self-encoding encoder
CN110598851A (en) Time series data abnormity detection method fusing LSTM and GAN
CN111813084B (en) Mechanical equipment fault diagnosis method based on deep learning
Yan et al. A dynamic multi-scale Markov model based methodology for remaining life prediction
Chang et al. A theoretical survey on Mahalanobis-Taguchi system
Turner et al. Likelihood-free Bayesian analysis of memory models.
CN113255848A (en) Water turbine cavitation sound signal identification method based on big data learning
CN102609612B (en) Data fusion method for calibration of multi-parameter instruments
CN112488235A (en) Elevator time sequence data abnormity diagnosis method based on deep learning
CN110555247A (en) structure damage early warning method based on multipoint sensor data and BilSTM
CN112651119B (en) Multi-performance parameter acceleration degradation test evaluation method for space harmonic reducer
CN115659583A (en) Point switch fault diagnosis method
Chen et al. A deep learning feature fusion based health index construction method for prognostics using multiobjective optimization
CN114118219A (en) Data-driven real-time abnormal detection method for health state of long-term power-on equipment
CN110852906B (en) Method and system for identifying electricity stealing suspicion based on high-dimensional random matrix
CN114512239A (en) Cerebral apoplexy risk prediction method and system based on transfer learning
CN115185937A (en) SA-GAN architecture-based time sequence anomaly detection method
Xie et al. Internal defect inspection in magnetic tile by using acoustic resonance technology
CN113868957B (en) Residual life prediction and uncertainty quantitative calibration method under Bayes deep learning
CN114495438B (en) Disaster early warning method, system, equipment and storage medium based on multiple sensors
CN116431346A (en) Compensation method for main memory capacity of electronic equipment
CN116384223A (en) Nuclear equipment reliability assessment method and system based on intelligent degradation state identification
CN115153549A (en) BP neural network-based man-machine interaction interface cognitive load prediction method
CN113627621B (en) Active learning method for optical network transmission quality regression estimation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination