CN114580472B - Large-scale equipment fault prediction method with repeated cause and effect and attention in industrial internet - Google Patents

Info

Publication number
CN114580472B
Authority
CN
China
Prior art keywords
sequence
fault
attention
formula
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210187119.9A
Other languages
Chinese (zh)
Other versions
CN114580472A (en
Inventor
尹小燕
南鑫
刘长友
龚志敏
王禹
田苗
崔瑾
陈晓江
房鼎益
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest University
Original Assignee
Northwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwest University
Priority to CN202210187119.9A
Publication of CN114580472A
Application granted
Publication of CN114580472B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The invention provides a fault prediction method for large-scale equipment in the industrial Internet that places equal emphasis on causality and attention. The method uses supervised learning: it collects fault samples, extracts fault features, constructs a causal analysis model by analyzing the potential causal relationships between the features and the faults, and predicts equipment faults by combining causal analysis with a temporal attention mechanism. Based on causal analysis, the method explores the potential relationship between the features and the fault prediction accuracy, provides a feasible approach to feature selection for large-scale equipment fault prediction models, and assigns larger weights to features that are major causes of a fault and smaller weights to features that are minor causes.

Description

Fault prediction method for large-scale equipment in the industrial Internet with equal emphasis on causality and attention
Technical Field
The invention belongs to the technical field of the industrial Internet, relates to fault prediction methods for large-scale equipment, and in particular relates to a fault prediction method for large-scale equipment in the industrial Internet that places equal emphasis on causality and attention.
Background
The long-term stable operation of large-scale equipment in the industrial Internet is of great significance for safe production. The sensor nodes deployed on large-scale equipment for round-the-clock monitoring generate massive amounts of data. Based on features extracted from fault samples, real-time analysis of monitoring data over the full life cycle can accurately track the operating state of the equipment and predict possible faults, so that an equipment fault emergency plan can be started, the equipment can be maintained in time, and industrial safety accidents can be avoided. Research on operating state monitoring and fault prediction for large-scale equipment in the industrial Internet is therefore urgently needed. On the other hand, existing digital, networked, and intelligent fault analysis technologies for large-scale equipment are inadequate and cannot meet the real-time processing and analysis requirements of massive monitoring data. A big data analysis framework for the industrial Internet is therefore indispensable, and advanced data models and big data analysis capabilities for the industrial Internet are urgently needed.
At present, several methods for predicting equipment faults have been proposed:
(A) Signal analysis methods: numerical transformation analysis is carried out on the signal changes monitored by the sensors deployed on the large-scale equipment, and equipment state detection and fault prediction are carried out based on domain knowledge and experience.
(B) Methods based on linear discriminants: the signal features monitored by the sensors in the fault state are collected, fault features are extracted using principal component analysis, and the extracted important features are input to a linear discriminator for classification.
(C) Methods based on convolutional neural networks: information over a period of time is collected, the extracted time-domain signal is converted into a frequency-domain signal using convolutional layers, and fully connected layers are then trained to obtain the fault result.
(D) Methods based on data fusion: the data of all sensors are combined, and equipment faults are predicted through a fusion algorithm.
These methods can monitor the operating state and predict the faults of large-scale equipment under specific conditions. However, traditional signal analysis methods require deep professional knowledge, and linear discriminators and data fusion depend on the quantity and feature dimensions of the sensor data. Deep learning can compute high-dimensional features through end-to-end learning, so the sensor data of the equipment can be used directly. However, deep learning is a black-box process, and how the features selected by the learning algorithm influence the experimental result remains an open question.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a fault prediction method for large-scale equipment in the industrial Internet that places equal emphasis on causality and attention, so as to solve the technical problem that the accuracy of prior-art prediction methods needs further improvement.
In order to solve the above technical problem, the invention adopts the following technical solution:
A fault prediction method for large-scale equipment in the industrial Internet with equal emphasis on causality and attention, characterized by comprising the following steps:
step 1, collecting fault data of large-scale equipment, and taking the fault data as training samples;
step 2, preprocessing the training samples of large-scale equipment faults sorted and classified in step 1: obtaining the time-domain features of the samples using a signal time-domain analysis method, encoding the time-domain features, and normalizing the numerical features to obtain the preprocessed sample data sequence of the large-scale equipment sensors;
step 3, performing causal analysis on the sample data sequence obtained in step 2, and quantifying the degree of influence of each feature on the prediction result based on a set causal analysis objective function;
step 4, combining the time information with the result of step 3, and obtaining the hidden layer data and attention scores of the large-scale equipment based on a model with a temporal attention mechanism;
step 5, predicting the fault of the large-scale equipment using the hidden layer data and the attention scores obtained in step 4.
Compared with the prior art, the invention has the following technical effects:
(I) The prediction method is based on causal analysis. It explores the potential relationship between the features and the fault prediction accuracy, provides a feasible approach to feature selection for large-scale equipment fault prediction models, and assigns larger weights to features that are major causes of a fault and smaller weights to features that are minor causes.
(II) Based on the attention mechanism, the method analyzes fine-grained changes of large-scale equipment fault samples in the time dimension and searches for key time points, which improves the accuracy of fault prediction, so that an equipment fault emergency plan can be started and the equipment overhauled in time, avoiding industrial safety accidents.
Drawings
FIG. 1 is a diagram of an early stable-stage signal and a mid-fault-stage signal.
FIG. 2 is a diagram of the selected feature sequence.
FIG. 3 is a diagram showing the internal structure of the Transformer.
The present invention is explained in further detail below with reference to an embodiment.
Detailed Description
In recent years, researchers have made various attempts at, and extensions to, causal analysis, and have achieved some research results in the field of neural networks. The attention mechanism is widely applied to translation tasks, medical tasks, and the like; through sequence modeling it can capture how values change in the time dimension. Recent studies have demonstrated the effectiveness of attention mechanisms, but they have received little attention in the field of the industrial Internet, because an attention mechanism usually learns high-dimensional features on discrete data to capture the relationship between the data and the task. The invention therefore explores a method for operating state monitoring and fault prediction of large-scale equipment in the industrial Internet based on causal analysis and a temporal attention mechanism.
The invention provides a fault prediction method for large-scale equipment in the industrial Internet that places equal emphasis on causality and attention, namely a causality-aware method for monitoring the operating state and predicting the faults of large-scale equipment in the industrial Internet based on an attention mechanism.
The prediction method uses supervised learning: it collects fault samples, extracts fault features, constructs a causal analysis model by analyzing the potential causal relationships between the features and the faults, and predicts equipment faults by combining causal analysis with a temporal attention mechanism.
It should be noted that, unless otherwise specified, all algorithms in the present invention are algorithms known in the art.
In the present invention, it is noted that:
The SVM algorithm refers to the support vector machine algorithm.
The RF algorithm refers to the random forest algorithm.
The LR algorithm refers to the logistic regression algorithm.
The LSTM algorithm refers to the long short-term memory network algorithm.
The GRU algorithm refers to the gated recurrent unit algorithm.
The DFC-CNN algorithm refers to the Deep Fully Convolutional Neural Network algorithm.
The DA-RNN algorithm refers to the Dual-stage Attention-based Recurrent Neural Network algorithm.
The DW-AE algorithm refers to the Deep Wavelet Auto-Encoder algorithm.
A Transformer refers to a deep self-attention network.
AUC refers to the area under the receiver operating characteristic curve.
The Softmax function refers to the normalized exponential function.
The present invention is not limited to the following embodiments, and all equivalent changes based on the technical solutions of the present invention fall within the protection scope of the present invention.
Embodiment:
This embodiment provides a fault prediction method for large-scale equipment in the industrial Internet with equal emphasis on causality and attention, comprising the following steps:
step 1, collecting fault data of large-scale equipment, and taking the fault data as training samples;
step 1 comprises the following substeps:
step 1.1, classifying the fault data based on the collected large-scale equipment fault samples, and labeling each type of fault data;
Specifically, in this embodiment, the experimental data come from the full-life-cycle bearing vibration signal data set produced by Xi'an Jiaotong University and Sumyoung Technology (the XJTU-SY bearing data set). The experimental platform comprises a speed-controlled motor, a rotating shaft, support bearings, a hydraulic loading system, the test bearing, and so on. The data set covers 3 operating conditions with 5 bearings per condition, giving full-life-cycle signal samples for 15 bearings in total. In the test, the sampling frequency is 25.6 kHz, the sampling interval is 1 min, and each sampling lasts 1.28 s, so each sensor collects 25600 × 1.28 = 32768 signal points per minute. Because the data set contains few operating conditions, the data are first preprocessed by data enhancement. The essence of data enhancement is to enlarge the data set through reasonable operations so that learning results can be obtained. The time required to collect 2000 data points is taken as the unit time τ. There are 15 devices of the same type, and each device has two acceleration sensors that record the data changes of the bearing.
Step 1.2, sampling a large amount of the labeled fault data of each type of large-scale equipment, recording the time of each sampling point during sampling, and arranging the acquired signal segments according to the label of the large-scale equipment they belong to, to obtain the training samples;
the specific arrangement is as follows: the number of large-scale devices is G, and each device carries I monitoring sensors; the number of fault types of the large-scale equipment is Q; a unit time τ is defined, and the data acquired by the equipment during each interval τ form one data slice; the time from start-up to failure of the equipment is T; the time of the k-th data slice is τ × k; the data collected by the equipment over each period T comprise K data slices arranged in time order.
In this embodiment, sampling is used. For each device, 2000 data points are taken as one slice, which yields the slices of the full-life-cycle signal. After a fault has occurred, the signal characteristics are obvious and predicting the fault is meaningless, so the fault should be predicted as early as possible. In the early stage, when no fault has occurred, the signal is periodic and a large amount of data carries very little information, so the signal in this stage is considered as little as possible. When the signal becomes abnormal before the fault occurs, the signal changes in this stage are attended to as much as possible, which matches the general inspection procedure for equipment. Therefore, when building the data set, taking real industrial equipment inspection and the size of the sequence used in each sequence analysis into account, as shown in FIG. 1, 20 slices are randomly acquired from the early stable stage of the signal and 40 slices are randomly acquired from the mid-fault stage and combined into one slice sequence, so that K = 60. With this method, 1000 slice sequences are acquired for each operating condition.
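The slicing and sampling described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `fault_onset` boundary between the early stable stage and the mid-fault stage, and the choice to keep the sampled slices in time order are all assumptions, since the text does not say how the stage boundary is determined.

```python
import random

def make_slices(signal, slice_len=2000):
    """Split a 1-D signal into consecutive, non-overlapping slices of slice_len points."""
    n = len(signal) // slice_len
    return [signal[i * slice_len:(i + 1) * slice_len] for i in range(n)]

def sample_slice_sequence(slices, fault_onset, n_early=20, n_mid=40, rng=None):
    """Pick n_early slices before fault_onset and n_mid slices from fault_onset on.

    fault_onset is a hypothetical slice index separating the early stable stage
    from the mid-fault stage. Indices are sorted so the sequence stays in time order.
    """
    rng = rng or random.Random()
    early = sorted(rng.sample(range(fault_onset), n_early))
    mid = sorted(rng.sample(range(fault_onset, len(slices)), n_mid))
    idx = early + mid
    return idx, [slices[k] for k in idx]

# Toy run-to-failure signal long enough for 200 slices of 2000 points each.
signal = [0.0] * (2000 * 200)
slices = make_slices(signal)
idx, seq = sample_slice_sequence(slices, fault_onset=120, rng=random.Random(0))
print(len(slices), len(seq))  # 200 60
```

Repeating the sampling call with different random states would give the many slice sequences per operating condition that the embodiment mentions.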
Advantages of step 1: the above steps yield a sequence data set that carries both feature information and time information and satisfies the data volume required by the model. The enhanced data set is taken entirely from the original data set, and no additional information is introduced.
This example uses 3 fault types for the experimental demonstration of the new method.
Step 2, preprocessing the training samples of large-scale equipment faults sorted and classified in step 1: obtaining the time-domain features of the samples using a signal time-domain analysis method, encoding the time-domain features, and normalizing the numerical features to obtain the preprocessed sample data sequence of the large-scale equipment sensors;
step 2 comprises the following substeps:
step 2.1, for the arranged data slices, obtaining the time-domain features of each data slice using a signal time-domain analysis method, and forming the time-domain features of the corresponding sample into a time-domain feature slice;
A signal time-domain analysis method is applied to the data of each slice to obtain its own time-domain feature parameters, which are divided into dimensional feature parameters and dimensionless feature parameters. Examples include the variance, the root mean square, and the mean.
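As an illustration of step 2.1, the sketch below computes the seven time-domain features that the embodiment lists later (variance, root mean square, mean, kurtosis, skewness, crest factor, margin factor) for one slice. The textbook definitions used here are an assumption; the source does not give the exact formulas, and the function name is hypothetical.

```python
import math

def time_domain_features(x):
    """Return seven common time-domain features of one data slice x.

    Assumes x is non-constant; a constant slice has zero variance and would
    make the dimensionless ratios undefined.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n            # population variance
    rms = math.sqrt(sum(v * v for v in x) / n)           # root mean square
    peak = max(abs(v) for v in x)
    m3 = sum((v - mean) ** 3 for v in x) / n             # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n             # fourth central moment
    sqrt_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    return {
        # dimensional parameters
        "variance": var,
        "rms": rms,
        "mean": mean,
        # dimensionless parameters
        "kurtosis": m4 / var ** 2,
        "skewness": m3 / var ** 1.5,
        "crest_factor": peak / rms,
        "margin_factor": peak / sqrt_amp,
    }

feats = time_domain_features([1.0, -1.0, 1.0, -1.0])
print(feats)
```

Applying this function to each 2000-point slice gives one feature vector per slice, which is the time-domain feature slice that the later steps normalize and encode.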
Step 2.2, standardizing the extracted time-domain feature slices to generate a uniform feature code, where the standardization is defined by Formula 1:
[Formula 1: equation image not reproduced in the source text]
in the formula:
$f_{ijk}$ is the time-domain feature slice of the k-th sequence of the j-th feature of the i-th sensor;
$s_{ijk}$ is the feature code of the k-th sequence of the j-th feature of the i-th sensor;
$\mathrm{Max}(f_{ijk})$ is the maximum value of the j-th feature of the i-th sensor;
$\mathrm{Min}(f_{ijk})$ is the minimum value of the j-th feature of the i-th sensor;
$\gamma$ is a coefficient controlling the size of the feature-code space;
J is the number of features of the sensor;
i is the index of the sensor;
j is the index of the feature of the sensor;
k is the index of the sequence;
the purpose of this step is to normalize all time-domain features and convert them into feature codes.
Step 2.3, converting the feature codes obtained in step 2.2 into binary inputs:
$$n_k = [b_{11k}, b_{12k}, b_{13k}, \ldots, b_{IJk}] \quad \text{(Formula 2)}$$
obtaining the sample data sequence of the large-scale equipment sensors, $N = [n_1, n_2, n_3, \ldots, n_K]$;
Each feature code $s_{ijk}$ is converted into a binary input $b_{ijk} \in \{0,1\}^{o}$, where $o = \gamma \times I \times J$; the fractional part of the feature code $s_{ijk}$ is rounded down.
In the formula:
$b_{IJk}$ is the binary input of the feature code $s_{IJk}$;
N is the sample data sequence of the large-scale equipment sensors;
$n_k$ is the binary input of the feature codes of the k-th sequence;
$n_K$ is the binary input of the feature codes of the K-th sequence.
In this embodiment, when analyzing the signal values, the time-domain features of every 2000 data points are computed. The time-domain signal itself contains a great deal of information, and selecting suitable time-domain analysis indicators is important for analyzing the bearing state. Seven features are selected to describe each slice: the variance, root mean square value, mean, kurtosis, skewness, crest factor, and margin factor. Because these features have different properties, they take values on different scales. To account for the meaning of each feature in the model's input space, the sample data need to be normalized before fault detection: the data are mapped to a specific interval, which eliminates the effect of scale differences between features, speeds up model convergence, and can improve model accuracy. The value ranges of the features are unified by normalization and then encoded. Specifically, the sequence points of all bearings are normalized, and min-max (dispersion) normalization is applied to each feature on the existing data set. The space size coefficient assigned to each feature is 1400, so all features are mapped onto a sparse space of total size 9800. As shown in FIG. 2, each sample of the input sequence is defined as an embedded vector of size 7 × 1 × 60, where 7 is the number of rows of the sequence feature, 1 is the number of columns of the sequence feature, and 60 is the number of slices.
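The normalization and encoding described above can be sketched as follows. Because the image of Formula 1 is not available, the exact mapping is an assumption: each feature is min-max normalized, scaled by the coefficient 1400 and rounded down to an integer code, and feature j is then given its own block of 1400 positions in a sparse binary vector of size 7 × 1400 = 9800. The function names and the block layout are hypothetical.

```python
import math

GAMMA = 1400  # space size coefficient per feature (value given in the embodiment)

def feature_code(f, f_min, f_max, gamma=GAMMA):
    """Min-max normalize f, scale by gamma, and round down to an integer code.

    The code is clamped to gamma - 1 so that f == f_max stays inside the block.
    """
    scaled = (f - f_min) / (f_max - f_min) * gamma
    return min(math.floor(scaled), gamma - 1)

def encode_slice(values, mins, maxs, gamma=GAMMA):
    """Assumed layout: feature j owns positions [j*gamma, (j+1)*gamma) of the vector."""
    vec = [0] * (gamma * len(values))
    for j, (f, lo, hi) in enumerate(zip(values, mins, maxs)):
        vec[j * gamma + feature_code(f, lo, hi, gamma)] = 1
    return vec

# Seven illustrative feature values, each with an assumed range of [0, 1].
values = [0.0, 0.5, 1.0, 0.25, 0.75, 0.1, 0.9]
vec = encode_slice(values, [0.0] * 7, [1.0] * 7)
print(len(vec), sum(vec))  # 9800 7
```

Under this layout every slice becomes a vector with exactly one active position per feature, which matches the stated total space size of 9800 for 7 features.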
As shown in FIG. 3, the advantages of step 2 are: by numerically analyzing the feature signals, the feature dimensions that are as useful as possible for the detection result are selected, and the influence of the subsequent causal analysis on the features can be judged intuitively, which improves the understanding of each feature's influence.
Step 3, performing causal analysis on the sample data sequence of the large-scale equipment sensors obtained in step 2, and quantifying the degree of influence of each feature on the prediction result based on a set causal analysis objective function;
step 3 comprises the following substeps:
step 3.1, performing causal analysis using the preprocessed sample data sequence of the large-scale equipment sensors obtained in step 2;
a sample of the large-scale equipment contains I sensors, each with J features; when all features are used in the calculation, the objective function that quantifies the influence of a feature on the prediction result is defined as:
$$\Delta_{\varepsilon,f_{ij}} = \varepsilon_{F\setminus\{f_{ij}\}} - \varepsilon_{F} \quad \text{(Formula 3)}$$
in the formula:
$\Delta_{\varepsilon,f_{ij}}$ is the influence of feature $f_{ij}$ on the fault prediction;
$f_{ij}$ is the j-th feature of the i-th sensor;
$\varepsilon_{F\setminus\{f_{ij}\}}$ is the fault prediction error without feature $f_{ij}$;
$\varepsilon_{F}$ is the fault prediction error with all features;
step 3.2, according to Formula 3, measuring the influence of a feature on the prediction result requires calculating the model error $\varepsilon_{F}$ with the complete feature set and the model error $\varepsilon_{F\setminus\{f_{ij}\}}$ without feature $f_{ij}$.
A single-layer Transformer based on the attention mechanism is used as the model for calculating the errors; based on the sample data sequence N of the large-scale equipment sensors, the embedded sequence M of the Transformer and the embedded sequence $M\setminus\{f_{ij}\}$ without feature $f_{ij}$ are generated from Formula 4, with $M = [m_1, m_2, m_3, \ldots, m_K]$ and $M\setminus\{f_{ij}\} = [m'_1, m'_2, m'_3, \ldots, m'_K]$;
$$m_k = w_m n_k + b_m \quad \text{(Formula 4)}$$
$$\hat{y} = \mathrm{TF}(M) \quad \text{(Formula 5)}$$
$$\hat{y}_{\setminus f_{ij}} = \mathrm{TF}(M\setminus\{f_{ij}\}) \quad \text{(Formula 6)}$$
in the formula:
$\mathrm{TF}(\cdot)$ is the Transformer;
$\hat{y}$ is the predicted label;
M is the embedded sequence of the Transformer;
$M\setminus\{f_{ij}\}$ is the embedded sequence of the Transformer without feature $f_{ij}$;
$m_K$ is the embedded data of the K-th sequence of the embedded sequence M;
$n_k$ is the binary input of the feature codes of the k-th sequence;
$m'_k$ is the embedded data of the k-th sequence of the embedded sequence $M\setminus\{f_{ij}\}$;
$w_m$ is the initialized weight matrix of the embedded sequence, $w_m \in \mathbb{R}^{v \times \sigma}$;
$b_m$ is the initialized bias matrix of the embedded sequence, $b_m \in \mathbb{R}^{v}$;
v is the dimension of $w_m$ and $b_m$;
$\sigma$ is the spatial dimension of $n_k$, $\sigma = \gamma \times I \times J$;
$\mathbb{R}$ is the set of real numbers;
step 3.3, denoting the true label by e, the cross-entropy loss function $\mathcal{L}(\cdot)$ is used to represent the error of the prediction result after Transformer learning, and the fault prediction errors of Formula 3 are expressed as:
$$\varepsilon_{F} = \mathcal{L}(e, \hat{y}) \quad \text{(Formula 7)}$$
$$\varepsilon_{F\setminus\{f_{ij}\}} = \mathcal{L}(e, \hat{y}_{\setminus f_{ij}}) \quad \text{(Formula 8)}$$
the causal contribution of a feature can then be calculated from Formulas 3 to 8 as the difference between the loss of the model without feature $f_{ij}$ and the loss of the model with all features;
step 3.4, calculating the model errors of the input features using Formulas 7 and 8 to obtain the causal influence of each feature on the model; each input feature is assigned a weight according to its causal influence, and the weight assignment is given by Formula 9:
[Formula 9: equation image not reproduced in the source text]
deriving the causal impact weights $W_F$ [equation image not reproduced in the source text].
In the formula:
[symbol image not reproduced in the source text] is the weight of the j-th input feature of the i-th sensor;
$W_F$ is the causal impact weights.
Specifically, in this embodiment, for step 3.1, owing to the limitations of the data set, the acceleration sensor signal is selected for numerical analysis, and 7 features that are as closely related to the fault signal as possible are selected from the time-domain signal analysis, i.e. I = 1 and J = 7. During pre-training, each feature is removed in turn before Transformer learning, giving a loss value without that feature and a loss value with all features; the difference between the two reflects the degree of influence of the feature on the result. Using this difference as a reference, a weight is assigned to each feature, and the product of the weight and the feature value is used as the formal input of the Transformer.
For steps 3.2, 3.3 and 3.4, the influence of each feature on the final result is calculated in the causal analysis module using Formulas 3 to 8, and the result is used to generate a causal weight that is applied to each feature value for the input of the perceptron. In the large-scale equipment data set used, 7 important features are obtained by numerical analysis. Causal calculation is carried out separately for the two signals to determine which features have the greater influence on fault detection.
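The leave-one-feature-out procedure of step 3 can be sketched as below. Two things here are assumptions: `toy_error` is a hypothetical stand-in for training the single-layer Transformer and measuring its cross-entropy loss, and the weights are obtained by simple proportional normalization of the contributions, because the image of Formula 9 is not available in the source.

```python
def causal_contributions(error_fn, features):
    """Contribution of each feature: error without the feature minus error with all features."""
    all_features = frozenset(features)
    full_error = error_fn(all_features)
    return {f: error_fn(all_features - {f}) - full_error for f in features}

def normalize(contrib):
    """Assumed weighting: each contribution divided by the sum of all contributions.

    This requires the sum of contributions to be positive.
    """
    total = sum(contrib.values())
    return {f: d / total for f, d in contrib.items()}

# Hypothetical stand-in for "train the model on this feature subset and return its loss":
# each feature lowers the error by a fixed, made-up amount.
GAIN = {"rms": 0.5, "kurtosis": 0.3, "mean": 0.05}

def toy_error(subset):
    return 1.0 - sum(GAIN[f] for f in subset)

contrib = causal_contributions(toy_error, GAIN)
weights = normalize(contrib)
print(sorted(weights, key=weights.get, reverse=True))  # ['rms', 'kurtosis', 'mean']
```

In this toy setting the feature whose removal raises the error most receives the largest weight, which is the behavior the text describes for features that are major causes of a fault.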
Advantages of step 3: the causal analysis module analyzes the features and suppresses unimportant ones. Existing algorithms treat all features equally, whereas in fact different signal features have different meanings at different stages of the equipment's life, and screening out the features with an important influence in the early stage of a fault has a large effect on the final detection result.
Step 4, combining the time information with the result of step 3, and obtaining the hidden layer data and attention scores of the large-scale equipment based on a model with a temporal attention mechanism.
Step 4 comprises the following substeps:
step 4.1, using the causal influence weights obtained in step 3.4 together with Formula 1 to recalculate the feature codes, as shown in Formula 10:
[Formula 10: equation image not reproduced in the source text]
in the formula:
$s^{W}_{ijk}$ is the recalculated feature code of the k-th sequence of the j-th feature of the i-th sensor;
based on Formula 10, the same operation as in step 2.3 is applied to obtain the sample data sequence combined with the causal influence weights, $N^{W} = [n^{W}_1, n^{W}_2, n^{W}_3, \ldots, n^{W}_K, n^{W}_T]$.
In the formula:
$N^{W}$ is the sample data sequence combined with the causal influence weights;
$n^{W}_k$ is the binary input of the feature codes of the k-th sequence combined with the causal influence weights;
$n^{W}_T$ is the feature representation, combined with the causal influence weights, of the fault state of the same type of large-scale equipment; when predicting a fault, $n^{W}_T$ is the same for all sample data;
step 4.2, processing the time information and embedding it into a sequence with the same dimension as the features, as given by Formula 11:
[Formula 11: equation image not reproduced in the source text]
in the formula:
$z_k$ is the time-information embedding of the k-th sequence;
tanh is the hyperbolic tangent function;
T is the time from the start-up of the large-scale equipment to the occurrence of the fault;
$p_k$ is the time difference between the fault and the acquisition of the slice, $p_k = T - \tau \times k$;
$\tau$ is the unit time; the data acquired during each interval $\tau$ form one data slice;
$w_z$ is the initialized weight matrix of the time-information embedding sequence, $w_z \in \mathbb{R}^{v}$;
$b_z$ is the initialized bias matrix of the time-information embedding sequence, $b_z \in \mathbb{R}^{v}$;
v is the dimension of $w_z$ and $b_z$;
as described above, the time information is initialized as a vector with the same dimension as the feature embedding; the closer a slice is in time to the fault, the more likely its data are abnormal, and the more attention it should receive.
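The time difference $p_k = T - \tau \times k$ defined above can be illustrated with a short sketch. Only `time_gaps` follows the text directly. The embedding `time_embedding` is one plausible form built from the listed symbols (tanh, T, $p_k$, a weight vector, and a bias vector); it is an assumption, because the image of Formula 11 is not available, and the numeric values are made up.

```python
import math

def time_gaps(T, tau, slice_indices):
    """p_k = T - tau * k: time remaining between slice k and the fault."""
    return [T - tau * k for k in slice_indices]

def time_embedding(p, T, w_z, b_z):
    """Assumed form only: z = tanh(w_z * (p / T) + b_z), element-wise over v dimensions."""
    return [math.tanh(w * (p / T) + b) for w, b in zip(w_z, b_z)]

T, tau = 100.0, 1.0                      # illustrative life span and unit time
p = time_gaps(T, tau, [10, 50, 90])
print(p)  # [90.0, 50.0, 10.0]

# Hypothetical 2-dimensional weights and biases (v = 2).
z = time_embedding(p[0], T, w_z=[0.5, -0.5], b_z=[0.0, 0.1])
print(len(z))  # 2
```

Slices acquired later have smaller $p_k$, so a model that conditions on this embedding can learn to weight slices close to the fault more heavily.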
Step 4.3, generating embedded data by combining the time information with the sample data after the causal influence weights are applied:
[Formula 12: equation image not reproduced in the source text]
from Formula 12, the combined embedded sequence $C = [c_1, c_2, c_3, \ldots, c_K, c_T]$ is obtained;
In the formula:
$c_k$ is the combined embedded data of the k-th sequence;
$c_K$ is the combined embedded data of the K-th sequence;
$n^{W}_k$ is the binary input of the feature codes of the k-th sequence after combination;
$w_c$ and $b_c$ are the initialized weight matrix and bias matrix of the combined embedded sequence [dimension specifications not reproduced in the source text];
C is the combined embedded sequence;
$c_T$ is the embedded data, combined with the causal influence weights, of the fault state of the same type of large-scale equipment; when predicting a fault, $c_T$ is the same for all sample data;
step 4.4, based on the combined embedded sequence, using a single-layer Transformer to learn the relation between the embedded sequence containing time information and the large equipment fault:
$[h_1, h_2, h_3, \ldots, h_K, h_T] = \mathrm{TF}([c_1, c_2, c_3, \ldots, c_K, c_T])$ (formula 13);
in the formula:
$\mathrm{TF}(\cdot)$ is the Transformer;
$h_K$ is the hidden layer data learned from $c_K$ by the Transformer;
$h_T$ is the hidden layer data learned from $c_T$ by the Transformer, that is, the hidden layer representation of the fault state;
step 4.5, calculating the local attention score of the combined embedding sequence, and generating local feature attention weight after obtaining the local attention score;
Figure GDA0003934155400000146
in the formula:
$u_k$ is the local attention score of the k-th sequence of the combined embedded sequence;
$h_k$ is the hidden layer data learned by the Transformer;
Figure GDA0003934155400000141
is the initialized weight matrix for local attention,
Figure GDA0003934155400000142
$b_u$ is the initialized bias matrix for local attention,
Figure GDA0003934155400000143
$l$ is the dimension of the hidden layer data $h_k$;
$p$ is the dimension of $b_u$;
after obtaining the local attention scores, the local feature attention weights are generated with the Softmax function, namely:
$W_{local} = \mathrm{Softmax}([u_1, u_2, u_3, \ldots, u_K]) = [l_1, l_2, l_3, \ldots, l_K]$ (formula 15);
in the formula:
$W_{local}$ is the local feature attention weight;
$u_K$ is the local attention score of the K-th sequence of the combined embedded sequence;
$l_K$ is the local feature attention weight of the K-th sequence of the combined embedded sequence;
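The local attention of step 4.5 can be illustrated with a short sketch. Formula 14 is available only as an image, so the scalar score $u_k = w_u \cdot h_k + b_u$ used here is an assumption; the Softmax normalization follows formula 15. The function names and the sample hidden vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable Softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def local_attention(H, w_u, b_u):
    """Assumed score u_k = w_u . h_k + b_u, followed by Softmax (formula 15)."""
    u = [sum(w * h for w, h in zip(w_u, h_k)) + b_u for h_k in H]
    return softmax(u)

# Hypothetical hidden layer data: K = 3 slices, hidden dimension l = 2.
H = [[0.1, 0.3], [0.5, -0.2], [0.9, 0.4]]
W_local = local_attention(H, w_u=[1.0, 0.5], b_u=0.0)
```

The result is one weight per slice; the weights are positive and sum to 1, so they can be compared directly with the global time attention weights later on.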
step 4.6, using an attention mechanism to judge the influence of the sample time on fault prediction; first, the fault state hidden layer representation $h_T$ obtained in step 4.4 is converted into the query vector of the attention mechanism:
$x = \mathrm{ReLU}(W_x h_T + b_x)$ (formula 16);
in the formula:
$x$ is the query vector of the attention mechanism;
$\mathrm{ReLU}(\cdot)$ is the rectified linear unit activation function;
$h_T$ is the hidden layer representation of the fault state;
$W_x$ is the initialized weight matrix for the query vector,
Figure GDA0003934155400000144
$b_x$ is the initialized bias matrix for the query vector,
Figure GDA0003934155400000145
$l$ is the dimension of the hidden layer data $h_k$;
$q$ is the dimension of $b_x$;
step 4.7, using the time difference $p_k$ between fault occurrence and data slice acquisition as the key vector of the attention mechanism, as shown in formula 17:
Figure GDA0003934155400000151
to obtain $E = [e_1, e_2, e_3, \ldots, e_K]$;
In the formula:
$e_k$ is the key vector of the k-th sequence;
$e_K$ is the key vector of the K-th sequence;
$E$ is the set of time key vectors of the attention mechanism;
$w_e$ is the initialized weight matrix for the time key vector,
Figure GDA0003934155400000152
$b_e$ is the initialized bias matrix for the time key vector,
Figure GDA0003934155400000153
$q$ is the dimension of $w_e$ and $b_e$;
step 4.8, based on the query vector $x$ and the key vectors $e_k$ obtained in steps 4.6 and 4.7, a global time attention score is obtained with the attention mechanism, as shown in formulas 18 and 19:
Figure GDA0003934155400000154
in the formula:
$r_k$ is the global time attention score of the k-th sequence;
$x^T$ is the transpose of the query vector $x$;
$\delta$ is the dimension of the time key vector;
applying a Softmax layer to normalize the global time attention scores, the global time attention weight can be expressed as:
$W_{global} = \mathrm{Softmax}([r_1, r_2, r_3, \ldots, r_K]) = [g_1, g_2, g_3, \ldots, g_K]$ (formula 19);
in the formula:
$W_{global}$ is the global time attention weight;
$r_K$ is the global time attention score of the K-th sequence;
$g_K$ is the global time attention weight of the K-th sequence;
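Steps 4.6 to 4.8 can be sketched as a query-key attention over the time key vectors. Formula 18 is available only as an image; the definitions of $x^T$ and $\delta$ suggest a scaled dot product $r_k = x^T e_k / \sqrt{\delta}$, which is what this sketch assumes. The function names and sample vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable Softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def global_time_attention(x, E):
    """Assumed formula 18: r_k = x^T e_k / sqrt(delta); then Softmax (formula 19)."""
    delta = len(x)  # dimension of the time key vectors
    r = [sum(a * b for a, b in zip(x, e_k)) / math.sqrt(delta) for e_k in E]
    return softmax(r)

# Hypothetical query vector (q = 4) and K = 3 time key vectors.
x = [1.0, 0.0, 2.0, 0.0]
E = [[0.1, 0.2, 0.3, 0.4],
     [0.5, 0.1, 0.5, 0.2],
     [0.9, 0.0, 0.8, 0.1]]
W_global = global_time_attention(x, E)
```

Dividing by $\sqrt{\delta}$ keeps the scores in a moderate range as the key dimension grows, so the Softmax does not saturate.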
step 4.9, combining the local attention score of step 4.5 with the global temporal attention score of step 4.8;
first, the $h_T$ embedding is used to assign weights to the local feature information and the time information, normalized by Softmax, as shown in formula 20:
$V = \mathrm{Softmax}(W_v h_T + b_v) = [a_{local}, a_{global}]$ (formula 20);
in the formula:
$h_T$ is the hidden layer representation of the fault state;
$W_v$ is the initialized weight matrix for assigning the integrated information,
Figure GDA0003934155400000161
$b_v$ is the initialized bias matrix for assigning the integrated information,
Figure GDA0003934155400000162
$l$ is the dimension of the hidden layer data $h_K$;
obtaining a fused attention weight according to the local feature attention weight and the global time attention weight, as shown in formula 21;
Figure GDA0003934155400000163
in the formula:
Figure GDA0003934155400000164
is the fused attention weight;
$a_{local}$ is the weight assigned to $W_{local}$ by the fault state hidden layer representation $h_T$;
$l_K$ is the local feature attention weight of the K-th sequence of the combined embedded sequence;
$a_{global}$ is the weight assigned to $W_{global}$ by the fault state hidden layer representation $h_T$;
$g_K$ is the global time attention weight of the K-th sequence;
step 4.10, normalizing the fused attention weights to obtain the attention score of the embedded sequence
Figure GDA0003934155400000171
as shown in formula 22:
Figure GDA0003934155400000172
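The fusion in steps 4.9 and 4.10 can be illustrated as follows. Formulas 21 and 22 are available only as images, so this sketch assumes the natural reading of the definitions: each slice receives the fused weight $a_{local} l_k + a_{global} g_k$, and the fused weights are then divided by their sum. The function name and all numbers are hypothetical.

```python
def fuse_attention(W_local, W_global, a_local, a_global):
    """Assumed formulas 21-22: per-slice weighted sum, renormalized to sum to 1."""
    fused = [a_local * l + a_global * g for l, g in zip(W_local, W_global)]
    total = sum(fused)
    return [f / total for f in fused]

# Hypothetical attention weights for K = 3 slices.
W_local = [0.2, 0.3, 0.5]
W_global = [0.1, 0.3, 0.6]
# Hypothetical output of formula 20; a_local + a_global = 1 after Softmax.
beta = fuse_attention(W_local, W_global, a_local=0.4, a_global=0.6)
```

When both input weight vectors and the pair $(a_{local}, a_{global})$ each sum to 1, the fused weights already sum to 1 and the normalization leaves them unchanged; it matters only when that is not the case.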
In this embodiment, steps 4.1 to 4.5 form the local information attention module, which analyzes the feature information of each collected signal. Step 4.1 uses the features obtained from the preceding causal analysis and preprocessing to build an embedded sequence that meets the input requirements of the Transformer; steps 4.2 and 4.3 integrate the time information into the embedded sequence; steps 4.4 and 4.5 learn the dependencies within the embedded sequence, obtain the hidden vectors, and compute the local attention scores from them. Steps 4.6 to 4.8 form the global time attention module, which analyzes the importance of the time information for the signal as a whole. Step 4.6 converts the hidden variable $h_T$, obtained by the local attention module and describing the overall condition of the equipment, into a query vector, and step 4.7 emphasizes the time information. The two modules judge the current state of the equipment from different angles and need to be considered together. Steps 4.9 to 4.10 therefore provide an attention fusion mechanism that captures the relevant information of the signal representation and the time representation under different conditions, and yields a composite score by fusing the local attention score and the global time attention score.
Advantages of step 4: the analysis proceeds from two aspects; the features are obtained by numerical analysis of the full-period signals of the data set, while the time information comes from the original data set. The main concern of this work is health condition detection for large equipment, so periodic signal time information is introduced, signal features and time information are fused, and the time intervals between the collected information sequences are used to analyze how the long-period signal changes. Fusion also allows the contribution of each feature to the detection result to be allocated more appropriately.
Step 5, predicting the fault of the large equipment by using the hidden layer data and the attention score obtained in step 4.
Step 5 comprises the following substeps:
step 5.1, obtaining the fault prediction score of the large equipment from the hidden layer data of step 4.4 and the attention score of step 4.10:
Figure GDA0003934155400000181
in the formula:
Figure GDA0003934155400000182
is the fault prediction score of the large equipment;
Figure GDA0003934155400000183
is the attention score of the embedded sequence;
$h_k$ is the hidden layer data learned from $c_k$ by the Transformer;
step 5.2, applying the Softmax function to the fault prediction score obtained in step 5.1 to obtain the fault prediction probability of the large equipment;
Figure GDA0003934155400000184
in the formula:
$w_d$ is the initialized weight matrix for the fault prediction probability,
Figure GDA0003934155400000185
$b_d$ is the initialized bias matrix for the fault prediction probability,
Figure GDA0003934155400000186
$l$ is the dimension of the hidden layer data $h_K$;
the likelihood that a given fault type is about to occur in the large equipment is judged from the fault prediction probability.
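Step 5 can be sketched as an attention-weighted sum of the hidden layer data followed by a Softmax classifier. Formulas 23 and 24 are available only as images, so the exact form used here (a weighted sum of $h_k$, then $\mathrm{Softmax}(w_d \bar{h} + b_d)$) is an assumption; the function names, weights, and sample data are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable Softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_fault(H, beta, W_d, b_d):
    """Assumed step 5: h_bar = sum_k beta_k * h_k, then Softmax(W_d h_bar + b_d)."""
    l = len(H[0])
    h_bar = [sum(b * h_k[i] for b, h_k in zip(beta, H)) for i in range(l)]
    logits = [sum(w * h for w, h in zip(row, h_bar)) + b
              for row, b in zip(W_d, b_d)]
    return softmax(logits)

# Hypothetical inputs: K = 3 slices, l = 2, two fault classes.
H = [[0.1, 0.3], [0.5, -0.2], [0.9, 0.4]]
beta = [0.14, 0.30, 0.56]            # attention scores from step 4.10
W_d = [[1.0, 0.0], [0.0, 1.0]]
b_d = [0.0, 0.0]
probs = predict_fault(H, beta, W_d, b_d)
```

Each entry of `probs` is the estimated probability of one fault type; the largest entry indicates the fault judged most likely.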
Specifically, in this embodiment, step 5.1 uses the composite attention score obtained in step 4.10 and the hidden layer data obtained in step 4.4. In step 5.2, the fault prediction score obtained in step 5.1 is passed through a Softmax function to obtain the probability of the corresponding fault.
In this embodiment, the trained model is evaluated on the test samples to verify its accuracy. Specifically, let the model parameters be $\psi$; the cross-entropy loss function is used to measure the difference between the predicted value
Figure GDA0003934155400000188
and the actual value $d$, and the objective is to minimize the average loss function, as shown in formula 25:
Figure GDA0003934155400000187
in the formula:
Figure GDA0003934155400000191
is the average loss function to be minimized;
d is the actual value;
Figure GDA0003934155400000192
is a predicted value;
g is the total number of large-scale equipment.
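The training objective can be illustrated with a short sketch. Formula 25 is available only as an image; this example assumes the common binary cross-entropy form, averaged over all equipment samples, and the function name and sample values are hypothetical.

```python
import math

def average_cross_entropy(d_true, d_pred, eps=1e-12):
    """Assumed formula 25: mean of -(d*log(d_hat) + (1-d)*log(1-d_hat))."""
    total = 0.0
    for d, d_hat in zip(d_true, d_pred):
        d_hat = min(max(d_hat, eps), 1.0 - eps)   # clamp to avoid log(0)
        total += -(d * math.log(d_hat) + (1 - d) * math.log(1.0 - d_hat))
    return total / len(d_true)

# Hypothetical labels and predicted fault probabilities for three devices.
d_true = [1, 0, 1]
loss_rough = average_cross_entropy(d_true, [0.9, 0.2, 0.6])
loss_better = average_cross_entropy(d_true, [0.99, 0.01, 0.99])
```

Predictions closer to the actual labels give a smaller average loss, which is the quantity the training procedure drives down.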
The performance analysis of the method of the invention:
The full-life-cycle bearing vibration signal data set produced by Xi'an Jiaotong University and Changxing Sumyoung Technology (the XJTU-SY bearing data set) is used to demonstrate the effectiveness of the method of the invention.
The data set covers 3 operating conditions with 5 bearings each, for a total of 15 full-life-cycle bearing signal samples. In the test, the sampling frequency is 25.6 kHz, the sampling interval is 1 min, and each sampling lasts 1.28 s, so each sensor records 32768 data points per sampling (25.6 kHz × 1.28 s).
When mechanical equipment fails, the fault manifests itself to different degrees in the time domain, the frequency domain, and the time-frequency domain. Taking bearing 1_1 as an example, the outer ring of the bearing had failed by the end of the test; because the load is applied in the horizontal direction, the horizontal vibration signal contains more degradation information. The horizontal-direction data collected for bearing 1_1 are shown in fig. 1.
The best results would be obtained with data from the whole life cycle, but in practice the service life of a bearing can reach tens of thousands of hours, and most of the data a sensor collects over that time carries very little value. The important data are usually concentrated in the second half of the bearing's life cycle. Therefore, a large amount of data need not be collected during normal operation; instead, a signal sequence composed of sensor information at several time points is selected to represent the normal state of the bearing.
To verify the effectiveness of the algorithm, three sample data are selected under the working condition of a rotating speed of 2100 r/min and a radial force of 12 kN, where the bearing fault is an outer ring crack. To make model training effective, the signal of each bearing is divided into sequences, and feature values for 2000 data points are obtained by numerical analysis, giving the sequence points of the full-period signal. In the late stage of a fault the signal features are already obvious and fault detection on the bearing is no longer meaningful, so the aim is to detect the fault in time at its early stage, as shown in fig. 2. In the early stage of operation, or before a fault occurs, the signal is periodic and a large amount of data carries very little information, so the signal in this period should receive as little attention as possible; once the signal becomes abnormal in the early or middle stage of a fault, the signal changes in this period should receive as much attention as possible, which matches the general equipment inspection process. Therefore, with real industrial equipment inspection in mind, each signal feature sequence is built by randomly collecting 20 sequence points from the early stage, when the signal is stable, and 40 points from the middle stage of the fault; 1000 signal feature sequences are collected for each working condition. The corresponding data set is described in table 1.
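The sampling scheme described above can be sketched as follows. The counts (20 early points, 40 mid-fault points, 1000 sequences) come from the text; the boundary between the stable stage and the fault stage (`fault_onset`), the total number of sequence points, the random seed, and the function name are hypothetical and chosen only for illustration.

```python
import random

def build_sequences(n_points, fault_onset, n_early=20, n_mid=40,
                    n_sequences=1000, seed=0):
    """Draw index sequences: n_early points before fault_onset, n_mid after it."""
    rng = random.Random(seed)
    early = range(0, fault_onset)          # stable stage (assumed boundary)
    mid = range(fault_onset, n_points)     # fault stage (assumed boundary)
    sequences = []
    for _ in range(n_sequences):
        idx = sorted(rng.sample(early, n_early)) + sorted(rng.sample(mid, n_mid))
        sequences.append(idx)
    return sequences

# Hypothetical: 2000 sequence points, with the signal turning abnormal at point 1500.
seqs = build_sequences(n_points=2000, fault_onset=1500)
```

Each sequence keeps its points in time order, so the model still sees how the signal evolves while most of the uninformative stable-stage data is left out.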
Table 1 Data set description
Data set          Bearing 1_1   Bearing 1_4   Bearing 1_5   Bearing 2_1   Bearing 2_5
Training set      600           600           600           600           600
Validation set    200           200           200           200           200
Test set          200           200           200           200           200
Because the data are incomplete to some degree, the bearing data sets are mixed in pairs to measure the detection accuracy for different fault types. As shown in tables 2 and 3, in the mixed bearing 1_1 and 1_4 data set, 1_1 is the fault to be detected; in the mixed bearing 1_1 and 1_5 data set, 1_1 is the fault to be detected; in the mixed bearing 2_1 and 2_5 data set, 2_1 is the fault to be detected.
TABLE 2 hybrid bearing test accuracy results
Figure GDA0003934155400000201
Figure GDA0003934155400000211
TABLE 3 comparison of Experimental results for bearing 2_1 and 2_5 data sets
Figure GDA0003934155400000212
As shown in tables 2 and 3, the mixed data of bearings 1_1 and 1_4 are fully recognized by all algorithms, because the data feature information of the two fault types differs greatly. For distinguishing the other fault types, the proposed algorithm also achieves good results and improves on the baseline algorithms.
The experiments also verify the average performance of the proposed model and the baseline models over the data set. Next, how the causal objective affects the model results is analyzed, as shown in table 4; it allows the model to learn the features most strongly correlated with the target being detected. Dimensionless parameters are insensitive to bearing load and rotating speed, require no comparison with relative standard values or earlier data, and are sensitive to early-stage faults, but their interference resistance is poor for severe faults, which easily leads to misjudgment. Parameters such as the peak value, crest factor, and kurtosis are sensitive to impact faults, but once a fault enters its severe development stage these parameters saturate and lose their diagnostic capability. Different fault types can also cause different factors to follow different trends, which in turn leads the causal analysis to focus on different features. The attention mechanism can force the model to focus on signal features that contain important risk factors while reducing the influence of other features on the detection result. Through causal analysis, the contribution of each feature to the final performance of the model can be identified clearly, and the approach can be extended to other models.
TABLE 4 bearing 2_1 and 2_5 data set causal analysis results
Figure GDA0003934155400000221
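The causal analysis discussed above compares the model error with the complete feature set against the error when one feature is left out. A minimal sketch of that comparison is given below. The difference of errors follows the description; the Softmax used to turn impacts into weights is an assumption (formula 9 is available only as an image), and the feature names and error values are hypothetical.

```python
import math

def causal_impacts(error_full, errors_without):
    """Impact of each feature: error without the feature minus full-model error."""
    return {f: e - error_full for f, e in errors_without.items()}

def impact_weights(impacts):
    """Assumed weighting: Softmax over the impacts (not the patent's formula 9)."""
    m = max(impacts.values())
    exps = {f: math.exp(v - m) for f, v in impacts.items()}
    total = sum(exps.values())
    return {f: e / total for f, e in exps.items()}

# Hypothetical cross-entropy errors of a model trained with and without each feature.
error_full = 0.20
errors_without = {"peak": 0.35, "kurtosis": 0.50, "mean": 0.21}

impacts = causal_impacts(error_full, errors_without)
weights = impact_weights(impacts)
```

A feature whose removal raises the error most receives the largest weight, which is how the analysis steers attention toward the features most relevant to the fault being detected.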

Claims (5)

1. A fault prediction method for large equipment in the industrial Internet that emphasizes causality and attention, characterized by comprising the following steps:
step 1, collecting fault data of large equipment, and taking the fault data as a training sample;
step 2, preprocessing the data of the training samples of the large-scale equipment faults classified in the step 1, obtaining time domain characteristics of the samples by using a signal time domain analysis method, coding the time domain characteristics, and normalizing the numerical characteristics to obtain a sample data sequence of the preprocessed large-scale equipment sensor;
step 3, performing causal analysis on the sample data sequence of the large-scale equipment sensor obtained in the step 2, and quantifying the influence degree of each characteristic on a prediction result based on a set causal analysis objective function;
step 4, combining the time information and the result of the step 3, and obtaining hidden layer data and attention scores of the large-scale equipment based on a model of a time attention mechanism;
step 5, predicting the fault of the large equipment by using the hidden layer data and the attention score obtained in the step 4;
wherein, step 3 comprises the following substeps:
step 3.1, performing causal analysis by using the preprocessed sample data sequence of the large-scale equipment sensor obtained in the step 2;
a sample of the large equipment contains I sensors, each with J features; with all features taken into account, the objective function quantifying the impact of a feature on the prediction result is defined as:
Figure FDA0003934155390000011
in the formula:
$\Delta\varepsilon_{f_{ij}}$ is the impact of feature $f_{ij}$ on fault prediction;
$f_{ij}$ is the j-th feature of the i-th sensor;
Figure FDA0003934155390000021
is the fault prediction error without feature $f_{ij}$;
$\varepsilon_F$ is the fault prediction error with the complete features;
step 3.2, according to formula 3, measuring the impact of one feature on the prediction result requires calculating the model error $\varepsilon_F$ with the complete features and the model error without feature $f_{ij}$,
Figure FDA0003934155390000022
a single-layer Transformer based on the attention mechanism is used as the model for calculating the errors; based on the sample data sequence $N$ of the large equipment sensors, the embedded sequence $M$ of the Transformer and the embedded sequence $M \setminus \{f_{ij}\}$ without feature $f_{ij}$ are generated by formula 4, with $M = [m_1, m_2, m_3, \ldots, m_K]$ and $M \setminus \{f_{ij}\} = [m'_1, m'_2, m'_3, \ldots, m'_K]$;
$m_k = w_m n_k + b_m$ (formula 4);
Figure FDA0003934155390000023
Figure FDA0003934155390000024
in the formula:
$\mathrm{TF}(\cdot)$ is the Transformer;
Figure FDA0003934155390000027
is the label of the prediction result;
$M$ is the embedded sequence of the Transformer;
$M \setminus \{f_{ij}\}$ is the embedded sequence of the Transformer without feature $f_{ij}$;
$m_K$ is the embedded data of the K-th sequence of the embedded sequence $M$;
$n_k$ is the binary input of the feature code of the k-th sequence;
$m'_k$ is the embedded data of the k-th sequence of the embedded sequence $M \setminus \{f_{ij}\}$;
$w_m$ is the initialized weight matrix of the embedded sequence,
Figure FDA0003934155390000025
$b_m$ is the initialized bias matrix of the embedded sequence,
Figure FDA0003934155390000026
$v$ is the dimension of $w_m$ and $b_m$;
$\sigma$ is the dimension of $n_k$, $\sigma = \gamma \times I \times J$;
$\gamma$ is a coefficient controlling the size of the feature code space;
$I$ is the number of sensors;
$J$ is the number of features per sensor;
Figure FDA0003934155390000031
is a real number set;
step 3.3, with $e$ denoting the true label, the cross-entropy loss function
Figure FDA0003934155390000032
is used to represent the error of the prediction result after Transformer learning, and the fault prediction errors in formula 3 are expressed as:
Figure FDA0003934155390000033
Figure FDA0003934155390000034
using formulas 3 to 8, the causal contribution of a feature is calculated as the difference in loss between the fault prediction error of the model with the complete features and the fault prediction error without feature $f_{ij}$;
step 3.4, using formulas 7 and 8 to calculate the model errors for the input features, thereby obtaining the causal impact of each feature on the model; a weight is assigned to each input feature according to its causal impact, as shown in formula 9:
Figure FDA0003934155390000035
deriving causal impact weights
Figure FDA0003934155390000036
In the formula:
Figure FDA0003934155390000037
is the weight of the j-th input feature of the i-th sensor;
$W_F$ is the causal impact weight.
2. The fault prediction method for large equipment in the industrial Internet that emphasizes causality and attention as claimed in claim 1, wherein step 1 comprises the following sub-steps:
step 1.1, classifying fault data based on collected large-scale equipment fault samples, and marking each type of fault data;
and step 1.2, carrying out mass sampling on the marked fault data of each type of large equipment, recording the time of each sampling point in the sampling process, and arranging the acquired signal segments according to the labels of the large equipment where the signal segments are located to obtain training samples.
3. The fault prediction method for large equipment in the industrial Internet that emphasizes causality and attention as claimed in claim 1, wherein step 2 comprises the following sub-steps:
step 2.1, applying a signal time domain analysis method to the arranged data slices to obtain their time domain features, and making the time domain features of the corresponding sample into time domain feature slices;
step 2.2, standardizing the extracted time domain feature slices to generate a uniform feature code, wherein the standardized formula is defined as:
Figure FDA0003934155390000041
in the formula:
$f_{ijk}$ is the time domain feature slice of the k-th sequence of the j-th feature of the i-th sensor;
$s_{ijk}$ is the feature code of the k-th sequence of the j-th feature of the i-th sensor;
$\mathrm{Max}(f_{ijk})$ is the maximum value of the j-th feature of the i-th sensor;
$\mathrm{Min}(f_{ijk})$ is the minimum value of the j-th feature of the i-th sensor;
$\gamma$ is a coefficient controlling the size of the feature code space;
$J$ is the number of features per sensor;
$i$ indexes the sensor;
$j$ indexes the feature of the sensor;
$k$ indexes the sequence;
step 2.3, converting the feature codes obtained in step 2.2 into binary input:
$n_k = [b_{11k}, b_{12k}, b_{13k}, \ldots, b_{IJk}]$ (formula 2);
obtaining the sample data sequence $N = [n_1, n_2, n_3, \ldots, n_K]$ of the large equipment sensors;
In the formula:
$b_{IJk}$ is the binary input of the feature code $s_{IJk}$;
$N$ is the sample data sequence of the large equipment sensors;
$n_k$ is the binary input of the feature code of the k-th sequence;
$n_K$ is the binary input of the feature code of the K-th sequence.
4. The fault prediction method for large equipment in the industrial Internet that emphasizes causality and attention as claimed in claim 1, wherein step 4 comprises the following sub-steps:
step 4.1, the causal influence weight obtained in step 3.4 is used to combine with formula 1 to recalculate the feature code, and the formula is shown in formula 10:
Figure FDA0003934155390000051
in the formula:
Figure FDA0003934155390000052
is the recalculated feature code of the k-th sequence of the j-th feature of the i-th sensor;
based on formula 10, the same operation as in step 2.3 is applied to obtain the sample data sequence combined with causal influence weights
Figure FDA0003934155390000053
In the formula:
N W is a sequence of sample data incorporating causal impact weights;
Figure FDA0003934155390000054
is the binary input of the feature code of the K-th sequence combined with causal influence weights;
Figure FDA0003934155390000055
is the feature representation, combined with causal influence weights, of the fault state for large equipment of the same type; in fault prediction, the
Figure FDA0003934155390000056
appended after all sample data is identical;
step 4.2, processing the time information into a sequence with the same dimension as the feature embedding, according to the following formula:
Figure FDA0003934155390000057
in the formula:
$z_k$ is the time information embedding sequence;
tanh is the hyperbolic tangent function;
$T$ is the time from start-up of the large equipment to the occurrence of the fault;
$p_k$ is the time difference between the fault and the acquisition of the k-th slice, $p_k = T - \tau \times k$;
$\tau$ is the unit time; one data slice is made from the data acquired by the large equipment during each interval $\tau$;
$w_z$ is the initialized weight matrix of the time information embedding sequence,
Figure FDA0003934155390000061
$b_z$ is the initialized bias matrix of the time information embedding sequence,
Figure FDA0003934155390000062
$v$ is the dimension of $w_z$ and $b_z$;
step 4.3, combining the time information with the causally weighted sample data to generate the embedded data:
Figure FDA0003934155390000063
the combined embedded sequence $C = [c_1, c_2, c_3, \ldots, c_K, c_T]$ can be obtained from formula 12;
In the formula:
$c_k$ is the combined embedded data of the k-th sequence;
$c_K$ is the combined embedded data of the K-th sequence;
Figure FDA0003934155390000064
is the binary input of the combined feature code of the k-th sequence;
$w_c$ and $b_c$ are the initialized weight matrix and bias matrix of the combined embedding sequence, wherein
Figure FDA0003934155390000065
Figure FDA0003934155390000066
$C$ is the combined embedded sequence;
$c_T$ is the embedded data, combined with causal influence weights, of the fault state for large equipment of the same type; in fault prediction, the $c_T$ appended after all sample data is identical;
step 4.4, based on the combined embedded sequence, using a single-layer Transformer to learn the relation between the embedded sequence containing time information and the large equipment fault:
$[h_1, h_2, h_3, \ldots, h_K, h_T] = \mathrm{TF}([c_1, c_2, c_3, \ldots, c_K, c_T])$ (formula 13);
in the formula:
$\mathrm{TF}(\cdot)$ is the Transformer;
$h_K$ is the hidden layer data learned from $c_K$ by the Transformer;
$h_T$ is the hidden layer data learned from $c_T$ by the Transformer, that is, the hidden layer representation of the fault state;
step 4.5, calculating the local attention score of the combined embedding sequence, and generating local feature attention weight after obtaining the local attention score;
Figure FDA0003934155390000071
in the formula:
$u_k$ is the local attention score of the k-th sequence of the combined embedded sequence;
$h_k$ is the hidden layer data learned by the Transformer;
Figure FDA0003934155390000072
is the initialized weight matrix for local attention,
Figure FDA0003934155390000073
$b_u$ is the initialized bias matrix for local attention,
Figure FDA0003934155390000074
$l$ is the dimension of the hidden layer data $h_k$;
$p$ is the dimension of $b_u$;
after obtaining the local attention scores, the local feature attention weights are generated with the Softmax function, namely:
$W_{local} = \mathrm{Softmax}([u_1, u_2, u_3, \ldots, u_K]) = [l_1, l_2, l_3, \ldots, l_K]$ (formula 15);
in the formula:
$W_{local}$ is the local feature attention weight;
$u_K$ is the local attention score of the K-th sequence of the combined embedded sequence;
$l_K$ is the local feature attention weight of the K-th sequence of the combined embedded sequence;
step 4.6, using an attention mechanism to judge the influence of the sample time on fault prediction; first, the fault state hidden layer representation $h_T$ obtained in step 4.4 is converted into the query vector of the attention mechanism:
$x = \mathrm{ReLU}(W_x h_T + b_x)$ (formula 16);
in the formula:
$x$ is the query vector of the attention mechanism;
$\mathrm{ReLU}(\cdot)$ is the rectified linear unit activation function;
$h_T$ is the hidden layer representation of the fault state;
$W_x$ is the initialized weight matrix for the query vector,
Figure FDA0003934155390000081
$b_x$ is the initialized bias matrix for the query vector,
Figure FDA0003934155390000082
$l$ is the dimension of the hidden layer data $h_k$;
$q$ is the dimension of $b_x$;
step 4.7, using the time difference $p_k$ between fault occurrence and data slice acquisition as the key vector of the attention mechanism, as shown in formula 17:
Figure FDA0003934155390000083
to obtain $E = [e_1, e_2, e_3, \ldots, e_K]$;
In the formula:
$e_k$ is the key vector of the k-th sequence;
$e_K$ is the key vector of the K-th sequence;
$E$ is the set of time key vectors of the attention mechanism;
$w_e$ is the initialized weight matrix for the time key vector,
Figure FDA0003934155390000084
$b_e$ is the initialized bias matrix for the time key vector,
Figure FDA0003934155390000085
$q$ is the dimension of $w_e$ and $b_e$;
step 4.8, based on the query vector $x$ and the key vectors $e_k$ obtained in steps 4.6 and 4.7, a global time attention score is obtained with the attention mechanism, as shown in formulas 18 and 19:
Figure FDA0003934155390000086
in the formula:
$r_k$ is the global time attention score of the k-th sequence;
$x^T$ is the transpose of the query vector $x$;
$\delta$ is the dimension of the time key vector;
applying a Softmax layer to normalize the global time attention scores, the global time attention weight can be expressed as:
$W_{global} = \mathrm{Softmax}([r_1, r_2, r_3, \ldots, r_K]) = [g_1, g_2, g_3, \ldots, g_K]$ (formula 19);
in the formula:
$W_{global}$ is the global time attention weight;
$r_K$ is the global time attention score of the K-th sequence;
$g_K$ is the global time attention weight of the K-th sequence;
step 4.9, combining the local attention score of step 4.5 with the global temporal attention score of step 4.8;
first, the $h_T$ embedding is used to assign weights to the local feature information and the time information, normalized by Softmax, as shown in formula 20:
$V = \mathrm{Softmax}(W_v h_T + b_v) = [a_{local}, a_{global}]$ (formula 20);
in the formula:
$h_T$ is the hidden layer representation of the fault state;
$W_v$ is the initialized weight matrix for assigning the integrated information,
Figure FDA0003934155390000091
$b_v$ is the initialized bias matrix for assigning the integrated information,
Figure FDA0003934155390000092
$l$ is the dimension of the hidden layer data $h_K$;
obtaining a fused attention weight according to the local feature attention weight and the global time attention weight, as shown in formula 21;
Figure FDA0003934155390000093
in the formula:
Figure FDA0003934155390000101
is the fused attention weight;
$a_{local}$ is the weight assigned to $W_{local}$ by the fault state hidden layer representation $h_T$;
$l_K$ is the local feature attention weight of the K-th sequence of the combined embedded sequence;
$a_{global}$ is the weight assigned to $W_{global}$ by the fault state hidden layer representation $h_T$;
$g_K$ is the global time attention weight of the K-th sequence;
step 4.10, normalizing the fused attention weights to obtain the attention score of the embedded sequence
Figure FDA0003934155390000109
as shown in formula 22:
Figure FDA0003934155390000102
5. The fault prediction method for large equipment in the industrial Internet that emphasizes causality and attention as claimed in claim 1, wherein step 5 comprises the following sub-steps:
step 5.1, obtaining the fault prediction score of the large equipment from the hidden layer data of step 4.4 and the attention score of step 4.10:
Figure FDA0003934155390000103
in the formula:
Figure FDA0003934155390000104
predicting a score for a fault of the large scale equipment;
Figure FDA0003934155390000105
to be embedded into(ii) an attention score of the in-sequence;
h k is c k Hidden layer data learned through a Transformer;
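The formula of step 5.1 is likewise available only as an image reference. A common way to combine attention scores with hidden layer data is an attention-weighted sum of the hidden vectors, and the Python sketch below assumes that form. The function name prediction_score and the sample values are hypothetical.

```python
def prediction_score(scores, hidden):
    """Assumed form of step 5.1: sum over k of (attention score_k * h_k).

    scores: one attention score per embedded sequence.
    hidden: one hidden vector h_k (length l) per embedded sequence.
    """
    dim = len(hidden[0])
    return [sum(s * h[j] for s, h in zip(scores, hidden))
            for j in range(dim)]

# Illustrative values (three sequences, l = 2); not from the patent.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.5, 0.25, 0.25]
y = prediction_score(scores, hidden)
```

The result has the same dimension l as each h_k, which is consistent with the later definition of l as the dimension of the hidden layer data.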
step 5.2, applying a Softmax function to the fault prediction score obtained in step 5.1 to obtain the fault prediction probability of the large-scale equipment:
Figure FDA0003934155390000106
in the formula:
W_d is the weight matrix initialized for the fault prediction probability,
Figure FDA0003934155390000107
b_d is the bias matrix initialized for the fault prediction probability,
Figure FDA0003934155390000108
l is the dimension of the hidden layer data h_K;
and judging, from the fault prediction probability of the large-scale equipment, how likely the equipment is to experience a given fault type.
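Step 5.2 can be sketched in Python as a linear layer followed by Softmax. The form Softmax(W_d y + b_d) is an assumption based on the listed symbols W_d and b_d, since the formula itself is an image reference. The fault type labels, parameter values, and the function name fault_probabilities are hypothetical.

```python
import math

def fault_probabilities(W_d, b_d, y):
    """Assumed form of step 5.2: Softmax(W_d y + b_d), one value per fault type."""
    logits = [sum(w * v for w, v in zip(row, y)) + b
              for row, b in zip(W_d, b_d)]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical labels and parameters for illustration; not from the patent.
fault_types = ["type A", "type B", "type C"]
W_d = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b_d = [0.0, 0.0, 0.0]
y = [0.75, 0.5]  # fault prediction score from step 5.1 (l = 2)
probs = fault_probabilities(W_d, b_d, y)
most_likely = fault_types[probs.index(max(probs))]
```

Reading off the largest probability gives the fault type the equipment is judged most likely to experience; the full probability vector can also be compared against a threshold per fault type.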
CN202210187119.9A 2022-02-28 2022-02-28 Large-scale equipment fault prediction method with repeated cause and effect and attention in industrial internet Active CN114580472B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210187119.9A CN114580472B (en) 2022-02-28 2022-02-28 Large-scale equipment fault prediction method with repeated cause and effect and attention in industrial internet

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210187119.9A CN114580472B (en) 2022-02-28 2022-02-28 Large-scale equipment fault prediction method with repeated cause and effect and attention in industrial internet

Publications (2)

Publication Number Publication Date
CN114580472A CN114580472A (en) 2022-06-03
CN114580472B true CN114580472B (en) 2022-12-23

Family

ID=81777013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210187119.9A Active CN114580472B (en) 2022-02-28 2022-02-28 Large-scale equipment fault prediction method with repeated cause and effect and attention in industrial internet

Country Status (1)

Country Link
CN (1) CN114580472B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116383096B (en) * 2023-06-06 2023-08-18 安徽思高智能科技有限公司 Micro-service system anomaly detection method and device based on multi-index time sequence prediction

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112910695A (en) * 2021-01-22 2021-06-04 湖北工业大学 Network fault prediction method based on global attention time domain convolutional network
CN113283631A (en) * 2021-04-13 2021-08-20 中国石油大学(华东) Industrial equipment fault prediction method based on self-attention mechanism and time sequence convolution network
CN113987834A (en) * 2021-11-15 2022-01-28 华东交通大学 CAN-LSTM-based railway train bearing residual life prediction method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109084980B (en) * 2018-10-10 2019-11-05 北京交通大学 Bearing fault prediction technique and device based on equalization segmentation
WO2020112337A1 (en) * 2018-11-26 2020-06-04 Exxonmobil Research And Engineering Company Predictive maintenance
CN109828549A (en) * 2019-01-28 2019-05-31 中国石油大学(华东) A kind of industry internet equipment fault prediction technique based on deep learning
CN111460728B (en) * 2020-03-09 2022-08-12 华南理工大学 Method and device for predicting residual life of industrial equipment, storage medium and equipment
CN111856958A (en) * 2020-07-27 2020-10-30 西北大学 Intelligent household control system, control method, computer equipment and storage medium
CN112862209B (en) * 2021-03-05 2023-08-29 重庆大学 Industrial equipment monitoring data prediction method

Also Published As

Publication number Publication date
CN114580472A (en) 2022-06-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant