CN115204227A - Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning

Info

Publication number: CN115204227A
Application number: CN202210820966.4A
Authority: CN
Inventors: 林焱辉; 李港辉
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2022-07-12
Filing date: 2022-07-12
Publication date: 2022-10-18

Abstract

The invention provides a deep-learning-based method for quantifying and calibrating uncertainty in the fault diagnosis of electromechanical equipment, comprising the following steps: preprocessing the monitoring signals of the equipment to obtain a basic data set; determining the type and scale of a deep neural network, and constructing a Bayesian deep neural network for fault diagnosis; selecting a calibration error to evaluate the uncertainty calibration loss, and determining the loss functions on the in-distribution and out-of-distribution fault monitoring data sets; and determining the hyper-parameters of the model, training alternately on the in-distribution and out-of-distribution fault monitoring data sets, and, after the model has converged, outputting the fault diagnosis and uncertainty quantification results for the test data. The uncertainty modeling method integrates the calibration loss to achieve both quantification and calibration of uncertainty, obtains calibrated quantification results for the inherent, cognitive, distribution and prediction uncertainty by combining uncertainty decomposition with Monte Carlo sampling, and effectively improves both the fault diagnosis accuracy and the uncertainty quantification accuracy for the equipment.

Description

Uncertainty quantitative calibration method in equipment fault diagnosis based on deep learning

Technical Field

The invention belongs to the technical field of fault diagnosis in prognostics and health management, and particularly relates to a deep-learning-based method for quantifying and calibrating uncertainty in the fault diagnosis of electromechanical equipment.

Background

With the rapid development of the economy, science and technology, the increasing complexity of equipment poses unprecedented challenges to fault diagnosis. Particularly in safety-critical applications, judging accurately and promptly whether a fault has occurred and which category it belongs to can effectively prevent serious economic losses and casualties. As an important means of determining the health state of equipment, fault diagnosis plays an important role in prognostics and health management. With the development and wide application of sensor technology, real-time monitoring of the operating state of equipment has become possible, which provides the data basis for data-driven diagnostic techniques. In addition, deep learning, owing to its strong representation learning ability, is widely applied to fault diagnosis.

Although fault diagnosis methods based on deep learning achieve excellent performance, they cannot give a reliable uncertainty quantification for the diagnosis result, which greatly limits their practical application. In fault diagnosis, uncertainty is generally classified into three categories: inherent, cognitive and distribution uncertainty. Inherent uncertainty captures the noise inherent in the observed data and reflects uncertainty caused by unknown or missing information, such as measurement errors; it cannot be reduced by adding training data. Cognitive uncertainty reflects uncertainty caused by a lack of knowledge and can be reduced by adding training data. Distribution uncertainty reflects uncertainty caused by a change in the data distribution and captures test samples of a kind not seen during model training, that is, the uncertainty caused by out-of-distribution fault monitoring data. When uncertainty is quantified with deep learning, many studies take the softmax output as the prediction distribution, but this distribution is often over-confident and cannot quantify the true prediction uncertainty. Bayesian deep learning, by contrast, combines the uncertainty quantification capability of Bayesian methods with the representation learning capability of deep learning, and has therefore become an active research topic. To make Bayesian deep neural networks computationally tractable, dropout-based approximation has attracted attention for its high computational efficiency and good performance. However, because of errors in model selection and the use of approximate inference, uncertainty quantification based on Bayesian deep neural networks is often inaccurate. For example, among the samples predicted with a probability of 0.9, the fraction that is correctly classified is typically not 90%.
Therefore, in order to improve the accuracy of the quantified uncertainty, a Bayesian-deep-learning-based method for calibrating the uncertainty quantification in the fault diagnosis process needs to be studied.

Disclosure of Invention

The technical problem to be solved by the invention is to construct an uncertainty modeling method based on a Bayesian deep neural network that integrates a calibration loss to achieve both quantification and calibration of uncertainty, obtains calibrated quantification results for the inherent, cognitive, distribution and prediction uncertainty by combining uncertainty decomposition with Monte Carlo sampling, and improves both the fault diagnosis accuracy and the uncertainty quantification accuracy.

In order to solve the problems, the invention provides a deep learning-based method for quantitatively calibrating uncertainty in fault diagnosis of electromechanical equipment, which comprises the following steps:

S1, preprocessing the fault monitoring signals of the electromechanical equipment to obtain a basic data set: the preprocessing includes signal screening, feature extraction, data normalization and set division, and the resulting in-distribution and out-of-distribution fault monitoring data sets are each divided into a training set, a validation set and a test set;

S2, determining the type and scale of the fault diagnosis deep neural network: selecting a deep neural network according to the characteristics of the equipment fault monitoring signals, and determining the scale of the network, including the number of neurons and the number of network layers, according to the size of the data set;

S3, constructing a Bayesian deep neural network for electromechanical equipment fault diagnosis: selecting probability distributions to capture the inherent, cognitive and distribution uncertainty in fault diagnosis, constructing these distributions with a Bayesian deep neural network, and determining the method for estimating them;

S31, integrating the quantification of cognitive uncertainty into the fault diagnosis network: the posterior distribution of the model parameters of the Bayesian neural network is estimated by variational inference combined with Monte Carlo sampling;

S32, integrating the quantification of distribution uncertainty into the fault diagnosis network: KL divergences are constructed separately for the in-distribution and out-of-distribution fault monitoring data sets to determine their corresponding probability distributions;

S33, integrating the quantification of inherent uncertainty into the fault diagnosis network to obtain the KL divergence of the Bayesian neural network;

S4, constructing the uncertainty calibration loss in fault diagnosis: determining a calibration evaluation index with which to evaluate the calibration loss, and integrating the calibration loss into the overall loss function, so that the uncertainty in the fault diagnosis of the electromechanical equipment is calibrated at the same time as it is quantified;

S5, determining the loss function: constructing the loss function of the whole model;

S51, determining the loss function on the in-distribution fault monitoring data set;

S52, determining the loss function on the out-of-distribution fault monitoring data set;

S6, determining the hyper-parameters of the electromechanical equipment fault diagnosis model: the hyper-parameters, including the learning rate and the batch size, are determined by trial and error, and their optimal combination is determined by grid search;

S7, training the electromechanical equipment fault diagnosis model: selecting an optimization method and, with the selected hyper-parameters, training the model on the in-distribution fault monitoring data set and the out-of-distribution fault monitoring data set in turn;

S8, judging whether the electromechanical equipment fault diagnosis model has converged: judging whether the change in the model parameters before and after a round of training is smaller than a specified threshold; if so, executing step S9; otherwise, returning to step S7;

S9, outputting the fault diagnosis result and the uncertainty quantification results for the electromechanical equipment: the trained model outputs the fault diagnosis result together with the quantified inherent, cognitive, distribution and prediction uncertainty of that result.
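The control flow of steps S7 and S8 can be sketched as follows. This is a minimal illustration, not the patented training procedure: the names `train_alternating`, `step_in` and `step_out` are hypothetical, the two toy update functions merely stand in for one optimizer step on an in-distribution batch and one on an out-of-distribution batch, and the convergence test uses the largest absolute parameter change as one possible reading of "parameter variation smaller than a specified threshold".

```python
from typing import Callable, List, Tuple

Params = List[float]


def max_abs_change(old: Params, new: Params) -> float:
    """Largest absolute change of any parameter between two snapshots."""
    return max(abs(a - b) for a, b in zip(old, new))


def train_alternating(
    params: Params,
    step_in: Callable[[Params], Params],
    step_out: Callable[[Params], Params],
    tol: float = 1e-6,
    max_rounds: int = 1000,
) -> Tuple[Params, int, bool]:
    """Alternate the two updates (S7) until the parameter change between
    consecutive rounds drops below `tol` (S8)."""
    for round_idx in range(1, max_rounds + 1):
        previous = list(params)
        params = step_in(params)    # update on an in-distribution batch
        params = step_out(params)   # update on an out-of-distribution batch
        if max_abs_change(previous, params) < tol:
            return params, round_idx, True   # converged: proceed to S9
    return params, max_rounds, False


# Toy stand-ins for the two optimisation steps (purely illustrative).
def toy_step_in(p: Params) -> Params:
    return [x + 0.5 * (1.0 - x) for x in p]


def toy_step_out(p: Params) -> Params:
    return [x + 0.1 * (0.0 - x) for x in p]


final_params, rounds, converged = train_alternating(
    [5.0, -3.0], toy_step_in, toy_step_out
)
```

In a real model the two callables would wrap an optimizer step on the losses of S51 and S52, and the threshold `tol` would be one of the hyper-parameters chosen in S6.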

Further, the step S31 specifically includes the following steps:

S311, capturing cognitive uncertainty with the posterior distribution of the parameters of the electromechanical equipment fault diagnosis model: for historical data $X=\{x_n\}_{n=1}^{N}$ and $Y=\{y_n\}_{n=1}^{N}$, where X is the fault observation data, Y is the fault category labels and, for a classification task with C categories, $y_n\in\{1,2,\dots,C\}$, $x_n$ denotes a single fault observation and N the number of observations, the cognitive uncertainty in fault diagnosis is modeled as the posterior distribution $p(\omega\mid X,Y)$, where ω denotes the model parameters;

S312, introducing, by means of variational inference, an inferred distribution $q_\theta(\omega)$ to approximate $p(\omega\mid X,Y)$, and using the KL divergence to measure the distance between $q_\theta(\omega)$ and the true posterior distribution $p(\omega\mid X,Y)$:

$$KL\big(q_\theta(\omega)\,\|\,p(\omega\mid X,Y)\big)=KL\big(q_\theta(\omega)\,\|\,p(\omega)\big)-\int q_\theta(\omega)\log p(Y\mid X,\omega)\,d\omega \tag{1}$$

wherein $p(Y\mid X,\omega)$ denotes the likelihood function on the historical data set, $p(\omega)$ the prior distribution of the weights, $KL(\cdot\,\|\,\cdot)$ the KL divergence and θ the variational parameters; equation (1) holds up to the additive constant $\log p(Y\mid X)$, which does not depend on θ, so optimizing θ to minimize its right-hand side minimizes $KL(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$ and yields an estimate of the posterior distribution;

S313, for a deep neural network with L layers in which layer l has $K_l$ units, the model parameters ω are expressed as:

$$\omega=\{W_l\}_{l=1}^{L}$$

wherein $W_l$ denotes the model parameters of the l-th layer of the deep neural network;

S314, applying Concrete Dropout to the deep neural network so that it approximates a Bayesian deep neural network, that is, treating the fixed model parameters ω as random variables that follow the inferred distribution:

$$\omega\sim q_\theta(\omega)$$

wherein, for the variational parameters θ:

$$\theta=\{M_l,\,p_l\}_{l=1}^{L}$$

wherein $M_l$ denotes the mean weight matrix of dimension $K_{l+1}\times K_l$ and $p_l$ the dropout probability of the l-th layer; the inferred distribution of each layer is expressed as:

$$q_\theta(W_l)=M_l\cdot\mathrm{diag}\big[\mathrm{Bernoulli}(1-p_l)^{K_l}\big]$$

wherein Bernoulli(·) denotes the Bernoulli distribution;

S315, selecting the prior distribution of the fault diagnosis model parameters ω as:

$$p(\omega)=\prod_{l=1}^{L}p(W_l)$$

wherein $p(W_l)$ denotes the prior distribution of the model parameters of the l-th layer of the deep neural network, in which

$\upsilon_l$ is a control parameter representing the degree of smoothness of the function;

S316, combining the Monte Carlo sampling method to obtain an analytical expression of $KL(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$:

wherein $p(y_i\mid x_i,\omega)$ denotes the likelihood function of each sample and $h(p_l)$ denotes the entropy of a Bernoulli random variable, with $h(p_l)=-p_l\log p_l-(1-p_l)\log(1-p_l)$;
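Equation (1) and the Bernoulli entropy term $h(p_l)$ can be checked numerically on a toy problem. The sketch below assumes a model parameter that takes only three discrete values, so that every integral becomes a finite sum; the numbers in `prior`, `likelihood` and `q` are arbitrary illustrative values, not data from the invention. It shows that the two sides of equation (1) differ only by $\log p(Y\mid X)$, which is constant in θ and therefore irrelevant to the optimization.

```python
import math


def kl_discrete(q, p):
    """KL(q || p) for two discrete distributions given as lists."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def bernoulli_entropy(p):
    """h(p) = -p log p - (1 - p) log(1 - p), taking h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


# Toy problem: the "model parameter" takes one of three values.
prior = [0.5, 0.3, 0.2]           # p(omega)
likelihood = [0.10, 0.40, 0.70]   # p(Y | X, omega)
evidence = sum(p * l for p, l in zip(prior, likelihood))           # p(Y | X)
posterior = [p * l / evidence for p, l in zip(prior, likelihood)]  # p(omega | X, Y)
q = [0.2, 0.3, 0.5]               # variational distribution q_theta(omega)

lhs = kl_discrete(q, posterior)
expected_log_lik = sum(qi * math.log(li) for qi, li in zip(q, likelihood))
rhs = kl_discrete(q, prior) - expected_log_lik
constant = lhs - rhs              # equals log p(Y | X), independent of q
```

Changing `q` changes `lhs` and `rhs` by the same amount, which is why minimizing the right-hand side of equation (1) is sufficient.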

Further, the step S32 specifically includes the following steps:

S321, the distribution uncertainty of the input fault monitoring data x is captured by placing a Dirichlet distribution over the output probabilities μ under the model parameters ω:

$$p(\mu\mid x,\omega)=\mathrm{Dir}\big(\mu\mid\alpha(x,\omega)\big)=\frac{\Gamma\big(\alpha_0(x,\omega)\big)}{\prod_{c=1}^{C}\Gamma\big(\alpha_c(x,\omega)\big)}\prod_{c=1}^{C}\mu_c^{\alpha_c(x,\omega)-1}$$

wherein Γ(·) denotes the gamma function, Dir(·) denotes the Dirichlet distribution, and $\mu=[\mu_1,\dots,\mu_C]^{\top}=[p(y=1),\dots,p(y=C)]^{\top}$ denotes the class probabilities output by the model, where $\mu_1$ is the output probability of class 1, $\mu_C$ is the output probability of class C, and so on; $\alpha(x,\omega)=[\alpha_1(x,\omega),\dots,\alpha_C(x,\omega)]^{\top}>0$ are the parameters of the Dirichlet distribution, and

$$\alpha_0(x,\omega)=\sum_{c=1}^{C}\alpha_c(x,\omega)$$

is the precision parameter of the Dirichlet distribution, which reflects its degree of dispersion: the larger its value, the more concentrated the distribution;

S322, for in-distribution fault monitoring data $\{x_n\}_{n=1}^{N}$ and out-of-distribution fault monitoring data $\{x_t\}_{t=1}^{T}$, the corresponding predicted Dirichlet distributions are $\mathrm{Dir}(\mu\mid\alpha(x_n,\omega))$ and $\mathrm{Dir}(\mu\mid\alpha(x_t,\omega))$, respectively; to give the model the ability to discriminate between in-distribution and out-of-distribution fault monitoring data, a more concentrated Dirichlet distribution $\mathrm{Dir}(\mu\mid\alpha^{n})$ and a more dispersed Dirichlet distribution $\mathrm{Dir}(\mu\mid\alpha^{t})$ are selected as the target distributions for the in-distribution input $x_n$ and the out-of-distribution input $x_t$, respectively, and the KL divergence is used to measure the distance between the target distribution and the predicted Dirichlet distribution:

for intra-distribution fault monitoring data:

for out-of-distribution fault monitoring data:

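wherein T is the sample size of the out-of-distribution fault monitoring data set; for any two Dirichlet distributions with parameters $\alpha=[\alpha_1,\dots,\alpha_C]^{\top}$ and $\beta=[\beta_1,\dots,\beta_C]^{\top}$, the KL divergence has the analytic form:

$$KL\big(\mathrm{Dir}(\mu\mid\alpha)\,\|\,\mathrm{Dir}(\mu\mid\beta)\big)=\log\Gamma(\alpha_0)-\log\Gamma(\beta_0)+\sum_{c=1}^{C}\big[\log\Gamma(\beta_c)-\log\Gamma(\alpha_c)\big]+\sum_{c=1}^{C}(\alpha_c-\beta_c)\big[\psi(\alpha_c)-\psi(\alpha_0)\big]$$

where ψ(·) is the digamma function, and

$$\alpha_0=\sum_{c=1}^{C}\alpha_c\quad\text{and}\quad\beta_0=\sum_{c=1}^{C}\beta_c$$

denote the precision parameters of the two Dirichlet distributions, respectively;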
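The analytic KL divergence between two Dirichlet distributions mentioned in S322 can be evaluated with the standard library alone. The sketch below uses the standard closed form, which is assumed here to be the one the step refers to; `digamma` is a small helper written for this example (recurrence plus asymptotic series), because the Python standard library provides `math.lgamma` but no digamma function.

```python
import math


def digamma(x):
    """Digamma function psi(x) for x > 0 (recurrence + asymptotic series)."""
    result = 0.0
    while x < 6.0:                 # psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def dirichlet_kl(alpha, beta):
    """KL( Dir(mu | alpha) || Dir(mu | beta) ) in closed form."""
    alpha0, beta0 = sum(alpha), sum(beta)   # precision parameters
    kl = math.lgamma(alpha0) - math.lgamma(beta0)
    for a, b in zip(alpha, beta):
        kl += math.lgamma(b) - math.lgamma(a)
        kl += (a - b) * (digamma(a) - digamma(alpha0))
    return kl


# A concentrated prediction is far from a flat target, and vice versa.
kl_sharp_vs_flat = dirichlet_kl([100.0, 1.0, 1.0], [1.0, 1.0, 1.0])
kl_same = dirichlet_kl([3.0, 2.0, 5.0], [3.0, 2.0, 5.0])
```

For two classes the result can be verified by hand: `dirichlet_kl([2.0, 1.0], [1.0, 1.0])` is the KL divergence between the density 2μ and the uniform density on [0, 1], which equals log 2 − 1/2.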

Further, the step S33 specifically includes the following steps:

S331, the inherent uncertainty in the fault diagnosis process is captured by the predicted categorical distribution over the classes:

$$p(y\mid\mu)=\mathrm{Cat}(y\mid\mu)=\prod_{c=1}^{C}\mu_c^{\mathbb{I}\{y=c\}}$$

wherein $\mathbb{I}\{\cdot\}$ is the indicator function and Cat(·) denotes the categorical distribution;

s332, calculating a likelihood function:

wherein,

is the mean of $\mathrm{Dir}(\mu\mid\alpha(x_n,\omega))$; then $KL(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$ is expressed as:

Further, the step S4 specifically includes the following steps:

S41, defining the calibration of uncertainty in the fault diagnosis process: calibration means that the predicted probability of a fault category reflects the accuracy of the prediction; when the fault diagnosis model is calibrated, among the samples whose predicted probability for a fault category is κ, a fraction κ should be correctly classified, so that each category c ∈ {1, …, C} satisfies:

wherein Y | X ~ P, and

is the predicted probability of category c for the input X;

S42, evaluating the uncertainty calibration error in fault diagnosis: to evaluate the calibration error quantitatively, the adaptive calibration error is selected as the evaluation index. For each category, the adaptive calibration error adjusts the probability intervals according to the number of samples so that every interval contains the same number of samples, and the accuracy and the average prediction probability are computed within each interval; if the two are approximately equal, the model is regarded as calibrated. With M probability intervals for category c and the m-th interval denoted $U_{c,m}$, the predicted probabilities are sorted in order

to obtain the corresponding sample set

The accuracy within this interval can then be calculated:

and the average prediction probability:

The adaptive calibration error is obtained by averaging the calibration errors over all categories and intervals:

The adaptive calibration error considers the prediction probabilities of all categories rather than only that of the final predicted category, and therefore gives a more comprehensive evaluation of the calibration error in multi-class tasks. In addition, when the sample size of the fault monitoring data is small or the prediction probabilities are concentrated near 0 and 1, adaptively adjusting the length of the probability intervals ensures that every interval contains enough samples, so that the calibration error is evaluated objectively;

S43, constructing the uncertainty calibration loss in fault diagnosis: to improve the accuracy of uncertainty quantification during model training, the calibration loss, evaluated by a calibration error, is integrated into the overall loss function. Because the Bayesian deep neural network is trained with mini-batch gradient descent, a single mini-batch may not contain enough fault monitoring samples to evaluate the calibration loss with fixed intervals; the adaptive calibration error is therefore selected to evaluate the calibration loss:

where CL(·) denotes the uncertainty calibration loss in fault diagnosis.
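The adaptive calibration error described in S42 can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the patented formula: the function name `adaptive_calibration_error` is hypothetical, class labels are 0-based (the text uses 1 to C), the equal-count intervals are formed by sorting the samples by predicted probability and slicing, and empty intervals (possible when there are fewer samples than intervals) contribute zero.

```python
def adaptive_calibration_error(probs, labels, num_bins):
    """Average |accuracy - mean predicted probability| over all classes
    and equal-count probability intervals.

    probs  : list of per-sample class-probability lists
    labels : list of 0-based integer class labels
    """
    n = len(probs)
    num_classes = len(probs[0])
    total = 0.0
    for c in range(num_classes):
        # Sort sample indices by the predicted probability of class c.
        order = sorted(range(n), key=lambda i: probs[i][c])
        for m in range(num_bins):
            lo = m * n // num_bins
            hi = (m + 1) * n // num_bins
            bin_idx = order[lo:hi]          # samples of the m-th interval
            if not bin_idx:
                continue                    # empty interval contributes 0
            accuracy = sum(1 for i in bin_idx if labels[i] == c) / len(bin_idx)
            confidence = sum(probs[i][c] for i in bin_idx) / len(bin_idx)
            total += abs(accuracy - confidence)
    return total / (num_classes * num_bins)


# Over-confident toy predictions: class 0 is always given probability 0.9,
# but only half of the samples actually belong to class 0.
toy_probs = [[0.9, 0.1]] * 4
toy_labels = [0, 0, 1, 1]
toy_ace = adaptive_calibration_error(toy_probs, toy_labels, num_bins=1)
```

In the toy example the accuracy is 0.5 against an average predicted probability of 0.9 for class 0 and 0.1 for class 1, so each class contributes an error of 0.4.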

Further, the step S51 specifically includes the following steps:

S511, for the in-distribution fault monitoring data, the modeling of cognitive and inherent uncertainty in fault diagnosis is achieved by minimizing $KL(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$, and the modeling of distribution uncertainty is achieved by minimizing

Accordingly, the loss function corresponding to uncertainty quantification in electromechanical equipment fault diagnosis based on Bayesian deep learning is:

wherein λ is a hyper-parameter controlling the contribution of the distribution-uncertainty loss to the total loss. In addition, the predicted Dirichlet distribution of in-distribution fault monitoring data should be concentrated, so the parameter of the target distribution $\mathrm{Dir}(\mu\mid\alpha^{n})$

should be large; to ensure prediction accuracy, the mean of the target distribution should be the corresponding 0-1 label, i.e. the distribution mean satisfies

and can be regarded as a constant independent of the input. However, this makes most of the parameters in $\alpha^{n}$ equal to 0, which makes the corresponding KL divergence difficult to optimize; the problem is avoided by assigning a small probability to the regions where the probability density of the Dirichlet distribution is 0:

wherein ξ is a small smoothing factor, from which the values of all parameters of the Dirichlet distribution are obtained;

S512, integrating the uncertainty calibration loss in fault diagnosis into the overall loss, so that the uncertainty is calibrated while the fault diagnosis model is trained; the loss function corresponding to the in-distribution fault monitoring data is expressed as:

wherein γ is a hyper-parameter controlling the contribution of the calibration loss to the total loss;
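The smoothed target distribution and the weighted loss of S511 and S512 can be illustrated as follows. Both functions are sketches under explicit assumptions: `smoothed_target_alpha` assumes that the smoothed target mean assigns ξ to every wrong class and the remaining mass to the true class, and that the target parameters are this mean scaled by a large precision; `in_distribution_loss` assumes that the three terms are combined as a weighted sum with weights λ and γ. The exact expressions of the invention are not reproduced here, and all function and argument names are hypothetical.

```python
def smoothed_target_alpha(label, num_classes, precision, xi):
    """Target Dirichlet parameters for an in-distribution sample.

    label       : 0-based index of the true class
    precision   : large target precision (sum of the returned parameters)
    xi          : small smoothing factor given to every wrong class
    """
    mean = [xi] * num_classes
    mean[label] = 1.0 - (num_classes - 1) * xi   # mean still sums to 1
    return [precision * m for m in mean]         # every parameter stays > 0


def in_distribution_loss(kl_posterior, kl_dirichlet, calibration_loss,
                         lam, gamma):
    """Assumed weighted sum: posterior KL + lambda * Dirichlet KL
    + gamma * calibration loss."""
    return kl_posterior + lam * kl_dirichlet + gamma * calibration_loss


target = smoothed_target_alpha(label=1, num_classes=3, precision=100.0, xi=0.01)
loss = in_distribution_loss(1.0, 2.0, 3.0, lam=0.5, gamma=0.1)
```

With ξ = 0.01 and a precision of 100, the target parameters are approximately [1, 98, 1]: strongly concentrated on the true class, but with no parameter equal to 0.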

Further, the step S52 specifically includes the following steps:

S521, for the out-of-distribution fault monitoring data, the cognitive and inherent uncertainty need not be modeled; only the distribution uncertainty is modeled, so that such data can be distinguished from the in-distribution fault monitoring data. Because the predictions and the uncertainty quantification for out-of-distribution fault monitoring data are not trustworthy, the uncertainty is not calibrated on these data, and the corresponding loss function is expressed as:

the predicted Dirichlet distribution of out-of-distribution fault monitoring data should be more dispersed, so the parameter of the corresponding target distribution $\mathrm{Dir}(\mu\mid\alpha^{t})$

should take a small value; the model output for an out-of-distribution sample should be equiprobable to indicate that the prediction is unreliable, i.e. every distribution parameter in $\alpha^{t}$

is set to 1.

Preferably, the step S9 specifically includes the following steps:

S91, uncertainty in fault diagnosis is measured by entropy, so that the prediction uncertainty is expressed as:

wherein H[·] denotes the entropy of a distribution and $p(y=c\mid x,X,Y)$ denotes the predicted probability of class c, which is obtained by sampling S sets of model parameters $\omega_s$ from $q_\theta(\omega)$:

the prediction uncertainty is further decomposed into:

wherein $\mathbb{E}_{\mu\mid x,X,Y}\big[H[p(y\mid\mu)]\big]$ measures the inherent uncertainty in fault diagnosis and the mutual information $I[y,\mu\mid x,X,Y]$ measures the cognitive and distribution uncertainty in fault diagnosis; the distribution uncertainty is measured separately by the expected differential entropy:

S92, for fault monitoring test data $x_*$, S Monte Carlo dropout passes are executed in the testing stage to obtain a set of sampled values:

S93, calculating the fault prediction probability, and obtaining the final prediction class and the uncertainty quantification results:

the prediction probability

is expressed as:

the distribution uncertainty in fault diagnosis

is expressed as:

given a threshold ε, if

then $x_*$ is a sample outside the distribution of the fault monitoring data, and the final prediction class and the other uncertainty quantification results need not be calculated; otherwise $x_*$ is a sample within the distribution of the fault monitoring data, and its final fault prediction class $c^*$ is expressed as:

wherein,

denotes the function that returns the index of the largest element of a vector;

the inherent uncertainty in fault diagnosis

is expressed as:

the prediction uncertainty in fault diagnosis

is expressed as:

the cognitive uncertainty in fault diagnosis

is approximated from $I[y,\mu\mid x,X,Y]$, and is expressed as:

Compared with the prior art, the technical effects of this scheme are as follows:

Compared with traditional fault diagnosis methods, the uncertainty quantification method for deep-learning-based equipment fault diagnosis provided by the invention jointly considers the influence of inherent, cognitive and distribution uncertainty, and therefore quantifies the uncertainty in fault diagnosis more effectively and accurately. The uncertainty calibration method for Bayesian-deep-learning-based electromechanical equipment fault diagnosis starts from the specific diagnosis task and, building on the adaptive calibration error, proposes an uncertainty calibration loss to realize uncertainty calibration, which provides an important basis for accurately quantifying the uncertainty in electromechanical equipment fault diagnosis. For the trained network, the various uncertainty quantification results are obtained by combining uncertainty decomposition with Monte Carlo sampling, so that both the diagnosis accuracy and the uncertainty quantification accuracy are effectively improved.
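The Monte Carlo prediction step of S92 and S93 can be sketched as follows. The sketch rests on several assumptions that are not taken from the text: each dropout pass is assumed to return one vector of Dirichlet parameters, the class probabilities of a pass are taken as the mean of its Dirichlet distribution, the distribution uncertainty is the differential entropy averaged over the passes (standard closed form), and a sample is flagged as out-of-distribution when that value exceeds the threshold ε. The names `summarize_mc_passes` and `dirichlet_differential_entropy` are hypothetical, and `digamma` is a helper written for this example.

```python
import math


def digamma(x):
    """Digamma function psi(x) for x > 0 (recurrence + asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result


def dirichlet_differential_entropy(alpha):
    """Differential entropy of Dir(mu | alpha) in closed form."""
    alpha0 = sum(alpha)
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(alpha0)
    return (log_beta
            + (alpha0 - len(alpha)) * digamma(alpha0)
            - sum((a - 1.0) * digamma(a) for a in alpha))


def summarize_mc_passes(alpha_samples, eps):
    """alpha_samples: S lists of Dirichlet parameters, one per dropout pass."""
    num_passes = len(alpha_samples)
    num_classes = len(alpha_samples[0])
    dist_unc = sum(dirichlet_differential_entropy(a)
                   for a in alpha_samples) / num_passes
    if dist_unc > eps:                      # treated as out-of-distribution
        return {"ood": True, "distribution_uncertainty": dist_unc}
    # Class probabilities of each pass: the mean of its Dirichlet.
    per_pass = [[a / sum(alpha) for a in alpha] for alpha in alpha_samples]
    probs = [sum(p[c] for p in per_pass) / num_passes
             for c in range(num_classes)]
    pred_unc = -sum(p * math.log(p) for p in probs if p > 0.0)
    return {
        "ood": False,
        "distribution_uncertainty": dist_unc,
        "probs": probs,
        "predicted_class": max(range(num_classes), key=lambda c: probs[c]),
        "prediction_uncertainty": pred_unc,   # entropy of the averaged probs
    }


in_dist = summarize_mc_passes([[50.0, 1.0, 1.0], [40.0, 2.0, 1.0]], eps=-2.0)
flat = summarize_mc_passes([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]], eps=-2.0)
```

The flat case corresponds to the target of S521 (all parameters equal to 1): its differential entropy is −log 2 for three classes, higher than that of any concentrated Dirichlet, which is why a threshold on this quantity can separate the two kinds of input. The value of ε here is arbitrary.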

Drawings

FIG. 1 is a flowchart of the uncertainty quantitative calibration method in deep-learning-based fault diagnosis of electromechanical equipment according to the present invention;

FIG. 2 is a diagram of the prediction network constructed for the bearing data sets;

FIG. 3 shows the statistics of the distribution uncertainty obtained by the present invention on the in-distribution and out-of-distribution test data of the PU data set;

FIG. 4 shows the statistics of the distribution uncertainty obtained by the present invention on the in-distribution and out-of-distribution test data of the IMS data set;

Detailed Description

The present application will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings. The embodiments and features of the embodiments in the present application may be combined with each other without conflict.

FIG. 1 shows the uncertainty quantitative calibration method in deep-learning-based fault diagnosis of electromechanical equipment, which includes the following steps:

S1, preprocessing the monitoring signals to obtain a basic data set: two bearing data sets are analyzed and preprocessed. The Paderborn University (PU) data set, produced and provided by Paderborn University, Germany, was collected on a modular test stand and contains data for four working conditions defined by three operating parameters: rotational speed, load torque and radial force. In this data set, the monitoring data of each bearing under each working condition consist of 20 data files, each of which stores vibration signals and motor current data sampled at 64 kHz for 4 seconds. Because the vibration signal effectively reflects the health state of the bearing in fault diagnosis, only the vibration signal data are selected for analysis. The bearings in the data set fall into three categories according to the cause of the fault: healthy, artificially damaged, and damaged by accelerated lifetime tests. The condition monitoring data of the healthy and artificially damaged bearings under the working condition of rotational speed 900 rpm, load torque 0.7 Nm and radial force 1000 N are selected as the experimental data set. The faults of the artificially damaged bearings were produced by machining, and the damage modes include electrical discharge machining, drilling and electric engraving. In addition, the damaged location and the degree of damage differ from bearing to bearing; specific information on the selected bearings is shown in table 1. The bearings damaged by electrical discharge machining are selected as out-of-distribution samples and the bearings with the other damage modes as in-distribution samples, and the proportion of samples in the training set, validation set and test set, for both the in-distribution and the out-of-distribution fault monitoring data, is set to 7.

TABLE 1

The Intelligent Maintenance Systems (IMS) data set is provided by the Center for Intelligent Maintenance Systems in the United States and contains three sub-data sets. Each sub-data set consists of a number of data files, each of which stores the vibration signals of 4 bearings sampled at 20 kHz for 1 second. The data of bearing 1 in the second sub-data set are selected for the experiment; they comprise 984 data files acquired at intervals of 10 minutes. The bearing data are grouped according to the degree of degradation of the bearing, as shown in table 2. As before, the late-degradation and fault data are selected as out-of-distribution samples and the rest as in-distribution samples, and the proportion of samples in the training set, validation set and test set is 7. Compared with the PU data set, this experimental data set describes a complete life cycle of the bearing, and the faults arise during operation rather than being produced by machining. The data set is therefore closer to reality and has higher observation noise. In addition, the goal of fault diagnosis on this data set is to determine the current health state of the bearing in order to develop a maintenance and support strategy.

TABLE 2

S2, determining the type and scale of the deep neural network: a recurrent neural network (RNN) is selected as the basic framework according to the time-series characteristics of the bearing monitoring signals. Time-domain features of the vibration signal, such as the mean, kurtosis, skewness and root-mean-square value, are used as the input of the RNN, which further extracts features and makes predictions. To capture long-term dependencies, RNN variants, namely the long short-term memory (LSTM) network and the gated recurrent unit (GRU), are used to build the model. To verify the generality of the proposed method, LSTM and GRU are used to model the PU and IMS data sets, respectively, and multiple RNN layers are stacked to obtain stronger expressive power. At the last time step of the last layer, a fully connected layer is attached to output the parameters of the Dirichlet distribution. Because the Dirichlet parameters must be positive, the output layer uses an exponential activation function, as shown in fig. 2. After repeated training and validation, a network of three LSTM or GRU layers is adopted, with 128, 64 and 32 neurons per layer, respectively.
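The time-domain features named in S2 can be computed as in the sketch below. It is illustrative only: the functions `time_domain_features` and `feature_sequence` are hypothetical names, population (biased) moments and non-excess kurtosis are assumed because the text does not state which estimators are used, and non-overlapping windows are one possible way of turning a raw vibration signal into the sequence of feature vectors fed to the recurrent network.

```python
import math


def time_domain_features(window):
    """Return [mean, kurtosis, skewness, rms] of one signal window."""
    n = len(window)
    mean = sum(window) / n
    centered = [x - mean for x in window]
    variance = sum(d * d for d in centered) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    if variance == 0.0:
        skewness, kurtosis = 0.0, 0.0   # constant window: moments undefined
    else:
        skewness = (sum(d ** 3 for d in centered) / n) / variance ** 1.5
        kurtosis = (sum(d ** 4 for d in centered) / n) / variance ** 2
    return [mean, kurtosis, skewness, rms]


def feature_sequence(signal, window_size):
    """Split a signal into non-overlapping windows and featurise each one."""
    return [
        time_domain_features(signal[start:start + window_size])
        for start in range(0, len(signal) - window_size + 1, window_size)
    ]


# A square wave has zero mean, zero skewness, unit RMS and kurtosis 1.
square_wave = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
sequence = feature_sequence(square_wave, window_size=4)
```

Each element of `sequence` would be one time step of the RNN input; in practice the features would also be normalized as part of the preprocessing in S1.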

S3, constructing a Bayesian deep neural network for fault diagnosis: Concrete Dropout is applied to each layer of the network so that it approximates a Bayesian deep neural network. To keep the Bayesian inference valid, variational dropout is used when applying dropout, that is, the dropout mask is the same at every time step. In addition, probability distributions are selected to capture the inherent, cognitive and distribution uncertainty, these distributions are constructed with the Bayesian deep neural network, and the method for estimating them is determined.
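The requirement that the dropout mask be identical at every time step can be illustrated with a small sketch. It is a simplification under stated assumptions: a hard Bernoulli mask is used instead of the continuous relaxation of Concrete Dropout, the kept units are rescaled by 1/(1 − p) (inverted dropout, an assumption of this example), `drop_prob` must be smaller than 1, and the function names are hypothetical.

```python
import random


def sample_mask(size, drop_prob, rng):
    """Bernoulli keep-mask: 1.0 keeps a unit, 0.0 drops it."""
    return [0.0 if rng.random() < drop_prob else 1.0 for _ in range(size)]


def variational_dropout(sequence, drop_prob, rng):
    """Apply ONE mask, sampled once per sequence, to every time step.

    sequence : list of time steps, each a list of feature values
    """
    mask = sample_mask(len(sequence[0]), drop_prob, rng)
    keep_prob = 1.0 - drop_prob
    dropped = [
        [value * m / keep_prob for value, m in zip(step, mask)]
        for step in sequence
    ]
    return dropped, mask


rng = random.Random(0)
inputs = [[1.0] * 8 for _ in range(5)]      # 5 time steps, 8 units
dropped, mask = variational_dropout(inputs, drop_prob=0.5, rng=rng)
```

At test time, repeating this call S times with fresh masks gives the S Monte Carlo dropout passes from which the predictions are averaged.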

S31, integrating the quantification of cognitive uncertainty into the network: the posterior distribution of the model parameters of the Bayesian neural network is estimated by variational inference combined with Monte Carlo sampling.

S311, for historical data $X=\{x_n\}_{n=1}^{N}$ and $Y=\{y_n\}_{n=1}^{N}$, where X is the fault observation data, Y is the fault category labels and, for a classification task with C categories, $y_n\in\{1,2,\dots,C\}$, $x_n$ denotes a single fault observation and N the number of observations, the cognitive uncertainty in fault diagnosis is modeled as the posterior distribution $p(\omega\mid X,Y)$, where ω denotes the model parameters;

S312, introducing, by means of variational inference, an inferred distribution $q_\theta(\omega)$ to approximate $p(\omega\mid X,Y)$, and using the KL divergence to measure the distance between $q_\theta(\omega)$ and the true posterior distribution $p(\omega\mid X,Y)$:

wherein $W_l$ denotes the model parameters of the l-th layer of the deep neural network;

wherein, for the variational parameters θ:

wherein Bernoulli(·) denotes the Bernoulli distribution;

S316, combining the Monte Carlo sampling method to obtain an analytical expression of $KL(q_\theta(\omega)\,\|\,p(\omega\mid X,Y))$:

S32, quantizing the distribution uncertainty into a network, and respectively constructing KL divergence aiming at the fault monitoring data sets inside and outside the distribution so as to estimate the corresponding probability distribution.

S321, the distribution uncertainty of the input fault monitoring data x is captured by placing a Dirichlet distribution on the output probability mu under the model parameter omega:

S322, for the in-distribution fault monitoring data $x_n$ ($n=1,\ldots,N$) and the out-of-distribution fault monitoring data $x_t$ ($t=1,\ldots,T$), the corresponding predicted Dirichlet distributions are $Dir(\mu|\alpha(x_n,\omega))$ and $Dir(\mu|\alpha(x_t,\omega))$. To give the model the ability to discriminate between in-distribution and out-of-distribution fault monitoring data, a more concentrated Dirichlet distribution $Dir(\mu|\alpha^n)$ and a more dispersed Dirichlet distribution $Dir(\mu|\alpha^t)$ are selected as the target distributions for the in-distribution input $x_n$ and the out-of-distribution input $x_t$, respectively, and the KL divergence is used to measure the distance between the target and the predicted Dirichlet distributions:

for intra-distribution fault monitoring data:

for out-of-distribution fault monitoring data:

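wherein $T$ is the sample size of the out-of-distribution fault monitoring data set. For any two Dirichlet distributions with parameters $\alpha=[\alpha_1,\ldots,\alpha_C]^T$ and $\beta=[\beta_1,\ldots,\beta_C]^T$, the KL divergence has an analytic form, namely:

$$KL\big(Dir(\mu|\alpha)\,\|\,Dir(\mu|\beta)\big)=\ln\Gamma(\alpha_0)-\sum_{c=1}^{C}\ln\Gamma(\alpha_c)-\ln\Gamma(\beta_0)+\sum_{c=1}^{C}\ln\Gamma(\beta_c)+\sum_{c=1}^{C}(\alpha_c-\beta_c)\big(\psi(\alpha_c)-\psi(\alpha_0)\big)$$

where $\psi(\cdot)$ is the digamma function, $\alpha_0=\sum_{c=1}^{C}\alpha_c$ and $\beta_0=\sum_{c=1}^{C}\beta_c$.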
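The closed-form KL divergence between two Dirichlet distributions can be checked numerically. The sketch below uses only the Python standard library; since `math` has no digamma function, a small helper `digamma` (an assumption of this example, based on the usual recurrence and asymptotic series) is included, and the parameter vectors at the end are made-up values.

```python
import math

def digamma(x):
    """Approximate digamma psi(x) for x > 0 (recurrence + asymptotic series)."""
    result = 0.0
    # Use psi(x) = psi(x + 1) - 1/x to shift x into the accurate range.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def dirichlet_kl(alpha, beta):
    """KL( Dir(mu|alpha) || Dir(mu|beta) ) in closed form."""
    a0, b0 = sum(alpha), sum(beta)
    kl = math.lgamma(a0) - math.lgamma(b0)
    for a, b in zip(alpha, beta):
        kl += math.lgamma(b) - math.lgamma(a)
        kl += (a - b) * (digamma(a) - digamma(a0))
    return kl

# Made-up example: a concentrated target versus a flat prediction.
target = [98.0, 1.0, 1.0]
predicted = [1.0, 1.0, 1.0]
distance = dirichlet_kl(target, predicted)
```

The divergence is zero when both parameter vectors coincide and grows as the predicted distribution moves away from the target, which is the property a training loss built on it relies on.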

S33, the quantification of inherent uncertainty is incorporated into the network to obtain the KL divergence of the Bayesian neural network.

S331, the inherent uncertainty in the fault diagnosis process is captured by predicting a categorical distribution over the categories:

S332, calculating a likelihood function:

wherein,

Step S3 is a key inventive point of the invention: a Bayesian deep learning network is constructed to model the inherent, cognitive and distribution uncertainty in fault diagnosis, and the method for estimating its distributions is determined, which provides an important basis for quantifying the uncertainty.

S4, establishing a calibration loss: the adaptive calibration error is selected to evaluate the calibration loss. For the PU data set and the IMS data set, the number of intervals $M$ is set to 10 and 20, and the number of classes $C$ is 7 and 3, respectively.

S41, defining the calibration of uncertainty in the fault diagnosis process: calibration means that the predicted fault category probability reflects the accuracy of the prediction. When the fault diagnosis model is calibrated, among a set of samples whose predicted fault category probability is $\kappa$, a fraction $\kappa$ should be correctly classified; thus, for each category $c\in\{1,\ldots,C\}$, the following is satisfied:

wherein Y | X ~ P, and

S42, evaluating the uncertainty calibration error in fault diagnosis: to evaluate the calibration error quantitatively, the adaptive calibration error is selected as the evaluation index. For each category, the adaptive calibration error adjusts the probability intervals according to the number of samples so that each interval contains an equal number of samples, and the accuracy and the average prediction probability are calculated in each interval; if the two are approximately equal, the model is considered calibrated. For category $c$, if the number of probability intervals is $M$, the probability interval of the $m$-th interval is denoted $U_{c,m}$; it is obtained by sorting the predicted probabilities of category $c$ in order, and the corresponding sample set is obtained accordingly.

The accuracy within this interval is then calculated:

and average prediction probability:

S43, constructing the uncertainty calibration loss in fault diagnosis: to improve the accuracy of uncertainty quantification during model training, a calibration loss, evaluated by a calibration error, is incorporated into the overall loss function. Because the Bayesian deep neural network is trained with a mini-batch gradient descent algorithm and a single mini-batch may not contain enough fault monitoring data samples to evaluate the calibration loss, the adaptive calibration error is selected to evaluate the calibration loss:

wherein $CL(\cdot)$ represents the uncertainty calibration loss in fault diagnosis;
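The equal-count binning behind the adaptive calibration error can be sketched as follows. This is a plain-Python evaluation of the metric under the common definition (mean absolute gap between accuracy and average predicted probability over all categories and intervals); it is not a differentiable training loss, and the function name and the toy inputs are assumptions of this example.

```python
def adaptive_calibration_error(probs, labels, num_bins):
    """probs: per-sample lists of class probabilities; labels: true class indices."""
    num_samples = len(probs)
    num_classes = len(probs[0])
    total_gap = 0.0
    for c in range(num_classes):
        # Sort sample indices by the predicted probability of class c.
        order = sorted(range(num_samples), key=lambda i: probs[i][c])
        for m in range(num_bins):
            # Equal-count interval boundaries over the sorted samples.
            start = m * num_samples // num_bins
            end = (m + 1) * num_samples // num_bins
            members = order[start:end]
            if not members:
                continue  # happens only when num_bins > num_samples
            accuracy = sum(1 for i in members if labels[i] == c) / len(members)
            confidence = sum(probs[i][c] for i in members) / len(members)
            total_gap += abs(accuracy - confidence)
    return total_gap / (num_classes * num_bins)

# Toy example with two classes and four samples (values are illustrative only).
toy_probs = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.4, 0.6]]
toy_labels = [0, 1, 0, 1]
ace = adaptive_calibration_error(toy_probs, toy_labels, num_bins=2)
```

Because every interval holds the same number of samples, the estimate stays usable on small batches, where fixed-width intervals would often be empty.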

Step S4 is a key inventive point of the invention: an uncertainty calibration loss for fault diagnosis is proposed on the basis of the adaptive calibration error, which provides an important basis for the accurate quantification of uncertainty in fault diagnosis.

S5, determining a loss function: a loss function of the fault diagnosis model is constructed.

S51, determining the loss function on the in-distribution fault monitoring data set.

S511, for the in-distribution fault monitoring data, the cognitive uncertainty and the inherent uncertainty in fault diagnosis are modeled by minimizing $KL(q_\theta(\omega)\,\|\,p(\omega|X,Y))$, and the distribution uncertainty in fault diagnosis is modeled by minimizing the KL divergence between the target Dirichlet distribution $Dir(\mu|\alpha^n)$ and the predicted Dirichlet distribution $Dir(\mu|\alpha(x_n,\omega))$. Accordingly, the loss function corresponding to uncertainty quantification in the fault diagnosis of electromechanical equipment based on Bayesian deep learning is as follows:

is treated as a constant independent of the input. However, this would cause most of the Dirichlet distribution parameters in $\alpha^n$ to take the value 0, which makes the corresponding KL divergence difficult to optimize. This problem is avoided by assigning a small probability to the regions where the probability density of the Dirichlet distribution is 0;

wherein $\xi$ is a smoothing factor with a small value, on the basis of which the values of all parameters of the Dirichlet distribution are obtained;
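One possible way to realize such smoothing is sketched below. It is an assumption of this example, not the expression used in the method: a small value is placed on every incorrect category, the remaining probability mass goes to the true category, and the result is scaled by a constant precision. The function name `smoothed_target_alpha` and all numeric values are hypothetical.

```python
def smoothed_target_alpha(label, num_classes, precision, xi):
    """Hypothetical smoothed target parameters: no entry is exactly 0."""
    if not 0.0 < xi < 1.0 / num_classes:
        raise ValueError("xi must lie in (0, 1/num_classes)")
    # Smoothed target mean: xi on each wrong category, the rest on the true one.
    mean = [xi] * num_classes
    mean[label] = 1.0 - (num_classes - 1) * xi
    # Scale the mean by the constant precision (the sum of all parameters).
    return [precision * m for m in mean]

# Illustrative values only: 3 categories, true category index 2.
alpha_n = smoothed_target_alpha(label=2, num_classes=3, precision=100.0, xi=0.01)
```

With every parameter strictly positive, the log-gamma and digamma terms of the Dirichlet KL divergence stay finite, which is what makes the divergence optimizable.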

S512, the uncertainty calibration loss in fault diagnosis is integrated into the overall loss so that uncertainty calibration is achieved while the fault diagnosis model is trained; the loss function corresponding to the in-distribution fault monitoring data is written as:

wherein $\gamma$ is a hyper-parameter controlling the contribution of the calibration loss to the total loss;

further, step S52 specifically includes the following steps:

takes a smaller value, and the model outputs for out-of-distribution samples should be equally probable to indicate that the prediction is not credible; namely, all distribution parameters in $\alpha^t$ are set to 1;

After repeated training and validation, the selected hyper-parameters of the loss function, such as the control parameters of each layer and the hyper-parameters controlling the contribution of each loss term, are shown in Table 3.

TABLE 3

S6, determining the hyper-parameters of the fault diagnosis model: for the PU data set and the IMS data set, the selected sequence lengths are 30 and 20, respectively, and the initial dropout probability of the network is set to 0.2 in both cases. The Adam algorithm is selected to optimize the model on the in-distribution and out-of-distribution fault monitoring data sets; various hyper-parameter combinations are tried by combining a trial-and-error strategy with grid search, and the optimal combination, selected through training and validation, is shown in Table 4.

TABLE 4

S7, training the fault diagnosis model of the electromechanical equipment: the model is trained alternately on the in-distribution and out-of-distribution fault monitoring data sets. After each cycle, the loss on the validation set for that training round is saved; if the minimum validation loss does not decrease for two consecutive cycles, the model is considered to have converged, the loop is terminated, and the final model is saved;
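The alternating schedule and its stopping rule can be sketched as follows. The callables `train_in`, `train_out` and `validate` are hypothetical placeholders for one training pass on the in-distribution data, one pass on the out-of-distribution data, and one validation pass; the reading of the stopping rule as a patience of two cycles is an assumption of this example.

```python
def has_converged(val_losses, patience=2):
    """True when the best validation loss has not improved for `patience` cycles."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return recent_best >= best_before

def train_alternating(train_in, train_out, validate, max_cycles=100, patience=2):
    """Alternate in-distribution and out-of-distribution passes until convergence."""
    val_losses = []
    for _ in range(max_cycles):
        train_in()                      # one pass on in-distribution data
        train_out()                     # one pass on out-of-distribution data
        val_losses.append(validate())   # record this cycle's validation loss
        if has_converged(val_losses, patience):
            break
    return val_losses

# Dummy run: the validation losses are scripted, no real model is trained.
scripted = iter([1.0, 0.8, 0.7, 0.75, 0.72, 0.6])
history = train_alternating(lambda: None, lambda: None, lambda: next(scripted))
```

In this dummy run the loop stops after the fifth cycle, because neither of the last two losses improves on the earlier minimum of 0.7.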

S8, judging whether the fault diagnosis model of the electromechanical equipment has converged: step S9 is executed when the change in the parameters of the optimal model before and after training is smaller than a specified threshold;

S9, outputting the fault diagnosis result and the uncertainty quantification results of the electromechanical equipment: the trained model outputs the fault diagnosis result together with the quantification results of its inherent, cognitive, distribution and prediction uncertainty.

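S91, uncertainty is measured by entropy, and the prediction uncertainty is described as:

$$H[p(y|x,X,Y)]=-\sum_{c=1}^{C}p(y=c|x,X,Y)\ln p(y=c|x,X,Y)$$

and the prediction uncertainty is further decomposed into:

$$H[p(y|x,X,Y)]=E_{\mu|x,X,Y}\big[H[p(y|\mu)]\big]+I[y,\mu|x,X,Y]$$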

wherein,

wherein $E_{\mu|x,X,Y}[H[p(y|\mu)]]$ measures the inherent uncertainty, and the mutual information $I[y,\mu|x,X,Y]$ measures the cognitive and distribution uncertainty. In addition, the distribution uncertainty alone is measured by the expected differential entropy:

S92, for test data $x_*$, Monte Carlo dropout is executed $S$ times in the testing stage to obtain a set of sampled values:

S93, the prediction probability is calculated, and the final prediction class and the uncertainty quantification results are obtained:

wherein the expression of the prediction probability is:

the expression of the distribution uncertainty is:

given a threshold $\varepsilon$, if the distribution uncertainty exceeds $\varepsilon$, then $x_*$ is an out-of-distribution sample, and in this case the final prediction class and the other uncertainty quantification results need not be computed. Otherwise, $x_*$ is an in-distribution sample, and the expression of its final prediction class $c^*$ is:

the expression of the inherent uncertainty is:

the expression of the prediction uncertainty is:

the cognitive uncertainty is approximated from $I[y,\mu|x,X,Y]$, and its expression is:

In the testing stage, Monte Carlo dropout is repeated 1000 times to obtain 1000 sets of sampled values of the Dirichlet distribution, from which the fault diagnosis and uncertainty quantification results are calculated. Table 5 gives the mean uncertainty, accuracy and adaptive calibration error for each class of in-distribution test samples. Figs. 3 and 4 show the statistics of the distribution uncertainty for the in-distribution and out-of-distribution test samples.
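How entropy-based uncertainty measures can be computed from Monte Carlo dropout samples is sketched below. The sketch assumes each sample is a vector of Dirichlet parameters and uses standard closed forms for the expected categorical entropy and the differential entropy of a Dirichlet distribution. It reports the mutual information as a whole rather than the method's own expression for the cognitive uncertainty, and the helper names and sample values are assumptions of this example.

```python
import math

def digamma(x):
    """Approximate digamma psi(x) for x > 0 (recurrence + asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    result += math.log(x) - 0.5 * inv
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result

def entropy(p):
    """Shannon entropy of a categorical distribution (natural log)."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def dirichlet_expected_entropy(alpha):
    """E over mu ~ Dir(alpha) of the entropy of Cat(mu)."""
    a0 = sum(alpha)
    return -sum(a / a0 * (digamma(a + 1.0) - digamma(a0 + 1.0)) for a in alpha)

def dirichlet_differential_entropy(alpha):
    """Differential entropy of Dir(alpha)."""
    a0 = sum(alpha)
    log_beta = sum(math.lgamma(a) for a in alpha) - math.lgamma(a0)
    return (log_beta + (a0 - len(alpha)) * digamma(a0)
            - sum((a - 1.0) * digamma(a) for a in alpha))

def quantify_uncertainty(alpha_samples):
    """Average the per-sample quantities over all Monte Carlo dropout samples."""
    num = len(alpha_samples)
    num_classes = len(alpha_samples[0])
    mean_prob = [sum(a[c] / sum(a) for a in alpha_samples) / num
                 for c in range(num_classes)]
    prediction = entropy(mean_prob)
    inherent = sum(dirichlet_expected_entropy(a) for a in alpha_samples) / num
    distribution = sum(dirichlet_differential_entropy(a) for a in alpha_samples) / num
    return {
        "pred_class": max(range(num_classes), key=lambda c: mean_prob[c]),
        "prediction": prediction,
        "inherent": inherent,
        "mutual_information": prediction - inherent,
        "distribution": distribution,
    }

# Three made-up dropout samples for a 3-class problem.
result = quantify_uncertainty([[50.0, 1.0, 1.0], [40.0, 2.0, 1.0], [60.0, 1.0, 2.0]])
```

Comparing the `distribution` entry against the threshold $\varepsilon$ then flags out-of-distribution inputs: a flat Dirichlet distribution has a much higher differential entropy than a concentrated one.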

TABLE 5

Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that modifications and equivalent substitutions may be made without departing from the spirit and scope of the invention, and all such modifications are intended to be covered by the appended claims.

Claims

1. A method for quantitatively calibrating uncertainty in fault diagnosis of electromechanical equipment based on deep learning is characterized by comprising the following steps:

S52, determining a loss function on the out-of-distribution fault monitoring data set;

2. The method for quantifying and calibrating the uncertainty in the deep learning based fault diagnosis of the electromechanical device according to claim 1, wherein the step S31 specifically comprises the following steps:

The fault monitoring data set consists of $X=\{x_n\}_{n=1}^{N}$ and $Y=\{y_n\}_{n=1}^{N}$, wherein $X$ is the fault observation data and $Y$ is the fault category label; for a classification task with $C$ classes, $y_n\in\{1,2,\ldots,C\}$; $x_n$ represents a single fault observation and $N$ represents the number of observations. The cognitive uncertainty in fault diagnosis is modeled as the posterior distribution $p(\omega|X,Y)$, wherein $\omega$ represents the model parameters;

S312, by means of variational inference, an inferred distribution $q_\theta(\omega)$ is introduced to approximate $p(\omega|X,Y)$, and the KL divergence is used to measure the distance between $q_\theta(\omega)$ and the true posterior distribution $p(\omega|X,Y)$:

wherein, for the variational parameter $\theta$, there are:

wherein $M_l$ represents the mean weight matrix of dimension $K_{l+1}\times K_l$, $p_l$ is the dropout probability of the $l$-th layer, and the inferred distribution $q_\theta(\omega)$ of each layer is represented by:

wherein Bernoulli (·) represents a Bernoulli distribution function;

The step S32 specifically includes the following steps:

wherein $\Gamma(\cdot)$ represents the gamma function, $Dir(\cdot)$ represents the Dirichlet distribution, and $\mu=[\mu_1,\ldots,\mu_C]^T=[p(y=1),\ldots,p(y=C)]^T$ represents the class probabilities output by the model, where $\mu_1$ is the output probability of class 1, $\mu_C$ is the output probability of class $C$, and so on. $\alpha(x,\omega)=[\alpha_1(x,\omega),\ldots,\alpha_C(x,\omega)]^T>0$ are the parameters of the Dirichlet distribution, and $\alpha_0(x,\omega)=\sum_{c=1}^{C}\alpha_c(x,\omega)$ is the precision parameter of the Dirichlet distribution, which reflects its degree of dispersion: the larger its value, the more concentrated the distribution;

S322, for the in-distribution fault monitoring data $x_n$ ($n=1,\ldots,N$) and the out-of-distribution fault monitoring data $x_t$ ($t=1,\ldots,T$), the corresponding predicted Dirichlet distributions are $Dir(\mu|\alpha(x_n,\omega))$ and $Dir(\mu|\alpha(x_t,\omega))$; to give the model the ability to discriminate between in-distribution and out-of-distribution fault monitoring data, a more concentrated Dirichlet distribution $Dir(\mu|\alpha^n)$ and a more dispersed Dirichlet distribution $Dir(\mu|\alpha^t)$ are selected as the target distributions for the in-distribution input $x_n$ and the out-of-distribution input $x_t$, respectively, and the KL divergence is used to measure the distance between the target and the predicted Dirichlet distributions:

for intra-distribution fault monitoring data:

for out-of-distribution fault monitoring data:

where $\psi(\cdot)$ is the digamma function,

and

the step S33 specifically includes the following steps:

wherein $I\{\cdot\}$ represents the indicator function, and $Cat(\cdot)$ represents the categorical distribution;

S332, calculating a likelihood function:

wherein,

3. The method for quantitatively calibrating the uncertainty in the fault diagnosis of the electromechanical device based on the deep learning of claim 1, wherein the step S4 specifically comprises the following steps:

wherein Y | X ~ P, and

S42, evaluating the uncertainty calibration error in fault diagnosis: to evaluate the calibration error quantitatively, the adaptive calibration error is selected as the evaluation index. For each category, the adaptive calibration error adjusts the probability intervals according to the number of samples so that each interval contains an equal number of samples, and the accuracy and the average prediction probability are calculated in each interval; if the two are approximately equal, the model is considered calibrated. For category $c$, if the number of probability intervals is $M$, the probability interval of the $m$-th interval is denoted $U_{c,m}$; it can be obtained by sorting the predicted probabilities of category $c$ in order, and the corresponding sample set is obtained accordingly.

The accuracy within this interval can then be calculated:

and average prediction probability:

where CL (·) represents the uncertainty calibration loss in fault diagnosis.

4. The method for quantitatively calibrating the uncertainty in the fault diagnosis of the electromechanical device based on the deep learning of claim 1, wherein the step S51 specifically comprises the following steps:

can be considered a constant independent of the input. However, this would cause most of the Dirichlet distribution parameters in $\alpha^n$ to take the value 0, which makes the corresponding KL divergence difficult to optimize. This problem is avoided by assigning a small probability to the regions where the probability density of the Dirichlet distribution is 0;

the step S52 specifically includes the following steps:

the predicted Dirichlet distribution of the out-of-distribution fault monitoring data is relatively dispersed, and all parameters of the corresponding target distribution $Dir(\mu|\alpha^t)$ are set to 1.

5. The method for quantitatively calibrating the uncertainty in the fault diagnosis of the electromechanical device based on the deep learning of claim 1, wherein the step S9 specifically comprises the following steps:

the prediction uncertainty is further broken down into:

wherein $E_{\mu|x,X,Y}[H[p(y|\mu)]]$ measures the inherent uncertainty in fault diagnosis, the mutual information $I[y,\mu|x,X,Y]$ measures the cognitive and distribution uncertainty in fault diagnosis, and the distribution uncertainty alone is measured by the expected differential entropy: