CN114580472A - Large-scale equipment fault prediction method with repeated cause and effect and attention in industrial internet - Google Patents

Info

Publication number
CN114580472A
Authority
CN
China
Prior art keywords
sequence
fault
attention
formula
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210187119.9A
Other languages
Chinese (zh)
Other versions
CN114580472B (en)
Inventor
尹小燕
南鑫
刘长友
龚志敏
王禹
田苗
崔瑾
陈晓江
房鼎益
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest University
Original Assignee
Northwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwest University
Priority to CN202210187119.9A
Publication of CN114580472A
Application granted
Publication of CN114580472B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction
    • G06F2218/12 Classification; Matching
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/25 Fusion techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a fault prediction method for large-scale equipment in the industrial internet that places equal emphasis on causality and attention. The method uses supervised learning: it collects fault samples, extracts fault features, builds a causal analysis model by analyzing the potential causal relationships between the features and the faults, and predicts equipment faults by combining causal analysis with a temporal attention mechanism. Because the method is based on causal analysis, it explores the potential relationship between each feature and fault prediction accuracy, which provides a feasible approach to feature selection for large-scale equipment fault prediction models; features that are major factors of a fault are then assigned larger weights, and features that are minor factors are assigned smaller weights.

Description

Large-scale equipment fault prediction method with equal emphasis on causality and attention in the industrial internet
Technical Field
The invention belongs to the technical field of the industrial internet, relates to a fault prediction method for large-scale equipment, and particularly relates to a large-scale equipment fault prediction method with equal emphasis on causality and attention in the industrial internet.
Background
The long-term stable operation of large-scale equipment in the industrial internet is of great significance for safe production. The sensor nodes deployed for round-the-clock monitoring of large equipment generate massive amounts of data. Based on feature extraction from fault samples, real-time analysis of the monitoring data over the whole life cycle can accurately track the running state of the equipment and predict possible faults, so that an equipment fault emergency plan can be started, the equipment can be overhauled in time, and industrial safety accidents can be avoided. Research on operation state monitoring and fault prediction for large-scale industrial internet equipment is therefore urgently needed. On the other hand, traditional digital, networked and intelligent fault analysis technology for large-scale equipment is insufficient and cannot meet the requirement for real-time processing and analysis of massive monitoring data. Constructing a big data analysis framework for the industrial internet is indispensable, and high-level data models and big data analysis capabilities oriented to the industrial internet are urgently needed.
At present, several methods of predicting equipment faults have been proposed:
(A) Signal analysis method: numerical transformation analysis is performed on the signal changes monitored by the sensors deployed on the large-scale equipment, and equipment state detection and fault prediction are performed based on professional domain knowledge and experience.
(B) Linear discriminant method: the signal features monitored by the sensors in the fault state are collected, fault features are extracted by principal component analysis, and the extracted important features are input to a linear discriminator for classification.
(C) Convolutional neural network method: information over a period of time is collected, the extracted time domain signal is converted into a frequency domain signal by the convolutional layers, and the fully connected layers are then trained to obtain a fault result.
(D) Data fusion method: the data of all sensors are combined, and equipment faults are predicted by a fusion algorithm.
These methods can monitor the operation state and predict faults of large-scale equipment under specific conditions. However, the traditional signal analysis method requires deep professional knowledge, and the linear discriminator and data fusion methods depend on the quantity and feature dimension of the sensor data. Deep learning can compute high-dimensional features through end-to-end learning, so the sensor data of the equipment can be used directly. Deep learning, however, is a black-box process, and how the features selected by the learning algorithm influence the result remains to be solved.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a large-scale equipment fault prediction method with equal emphasis on causality and attention in the industrial internet, so as to solve the technical problem that the accuracy of prior-art prediction methods needs to be further improved.
To solve the above technical problem, the invention adopts the following technical scheme:
A large-scale equipment fault prediction method with equal emphasis on causality and attention in the industrial internet, comprising the following steps:
step 1, collecting fault data of large equipment, and taking the fault data as training samples;
step 2, preprocessing the training samples of large-scale equipment faults arranged and classified in step 1, obtaining the time domain features of the samples by signal time domain analysis, encoding the time domain features, and normalizing the numerical features to obtain the preprocessed sample data sequence of the large-scale equipment sensors;
step 3, performing causal analysis on the sample data sequence of the large-scale equipment sensors obtained in step 2, and quantifying the degree of influence of each feature on the prediction result based on a set causal analysis objective function;
step 4, combining the time information with the result of step 3, and obtaining the hidden layer data and attention scores of the large-scale equipment based on a model with a temporal attention mechanism;
step 5, predicting the fault of the large equipment using the hidden layer data and attention scores obtained in step 4.
Compared with the prior art, the invention has the following technical effects:
(I) The prediction method is based on causal analysis and explores the potential relationship between the features and fault prediction accuracy, which provides a feasible approach to feature selection for large-scale equipment fault prediction models; features that are major factors of a fault are then assigned larger weights, and features that are minor factors are assigned smaller weights.
(II) Based on the attention mechanism, the fine-grained changes of large-scale equipment fault samples in the time dimension are analyzed and key time points are located, which improves the accuracy of fault prediction, so that an equipment fault emergency plan can be started and the equipment overhauled in time, avoiding industrial safety accidents.
Drawings
FIG. 1 is a diagram of an early stationary phase of a signal and a mid-fault phase of the signal.
FIG. 2 is a selected feature sequence chart.
FIG. 3 is a diagram showing the internal structure of the Transformer.
The present invention will be explained in further detail with reference to examples.
Detailed Description
In recent years, researchers have made various attempts at and extensions of causal analysis and have achieved some research results in the field of neural networks. The attention mechanism is widely applied to translation tasks, medical tasks and the like; through sequence modeling, it can capture how values change along the time dimension. Recent studies have demonstrated the effectiveness of attention mechanisms, but they have received little attention in the field of the industrial internet, because attention mechanisms usually perform high-dimensional feature learning on discrete data to capture the relationship between data and tasks. Therefore, the invention explores an operation state monitoring and fault prediction method for industrial internet large-scale equipment based on causal analysis and a temporal attention mechanism.
The invention provides a large-scale equipment fault prediction method with equal emphasis on causality and attention in the industrial internet, namely a causality-aware method, based on an attention mechanism, for monitoring the operation state and predicting the faults of large-scale equipment in the industrial internet.
The prediction method provided by the invention adopts a supervised learning mode, collects fault samples, extracts fault characteristics, constructs a causal analysis model by analyzing potential causal relationship between the characteristics and faults, and realizes the prediction of equipment faults by combining causal analysis and a time attention mechanism.
It should be noted that, unless otherwise specified, all algorithms in the invention are algorithms known in the art.
In the invention, it should be noted that:
SVM refers to the support vector machine algorithm.
RF refers to the random forest algorithm.
LR refers to the logistic regression algorithm.
LSTM refers to the long short-term memory network algorithm.
GRU refers to the gated recurrent unit algorithm.
DFC-CNN refers to the Deep Fully Convolutional Neural Network algorithm.
DA-RNN refers to the Dual-stage Attention-based Recurrent Neural Network algorithm.
DW-AE refers to the Deep Wavelet Auto-Encoder algorithm.
Transformer refers to a deep self-attention network.
AUC refers to the area under the receiver operating characteristic (ROC) curve.
The Softmax function refers to the normalized exponential function.
The present invention is not limited to the following embodiments, and all equivalent changes based on the technical solutions of the present invention fall within the protection scope of the present invention.
Example:
This embodiment provides a large-scale equipment fault prediction method with equal emphasis on causality and attention in the industrial internet, comprising the following steps:
step 1, collecting fault data of large equipment, and taking the fault data as a training sample;
step 1 comprises the following substeps:
step 1.1, classifying fault data based on collected large-scale equipment fault samples, and marking each type of fault data;
specifically, in this embodiment, the experimental data come from the XJTU-SY bearing full-life-cycle vibration signal data set released by Xi'an Jiaotong University and Changxing Sumyoung Technology. The experimental platform comprises a speed-controlled motor, a rotating shaft, support bearings, a hydraulic loading system, test bearings and the like. The data set covers 3 working conditions with 5 bearings per condition, giving full-life-cycle signal samples for 15 bearings in total. In the test, the sampling frequency is 25.6 kHz, the sampling interval is 1 min, and each sampling lasts 1.28 s, so each sensor collects 32768 data points per minute (25600 × 1.28). Because the data set contains few working conditions, the data are first preprocessed by data enhancement. The essence of data enhancement is to enlarge the data set through reasonable operations so that the model can learn from it. The time required to collect 2000 data points is taken as the unit time τ. There are 15 devices of the same type, and each device has two acceleration sensors that record the data changes of the bearing.
Step 1.2, sampling the labeled fault data of each type of large equipment in large quantities, recording the time of each sampling point during sampling, and arranging the acquired signal segments according to the label of the large equipment they belong to, to obtain the training samples;
the specific arrangement process is as follows: the number of large-scale devices is G, and each device is equipped with I monitoring sensors; the number of fault types of the large-scale equipment is Q; a unit time τ is defined, and the data acquired each time the equipment runs for τ are made into one data slice; the time from startup of the equipment to the occurrence of the fault is T; the time of the kth data slice is τ × k; the data collected by the equipment during each period T comprise K data slices arranged in chronological order.
In this embodiment, sampling is used. For each device, every 2000 data points are taken as one slice, which yields the slices of the full-life-cycle signal. After a fault has occurred, the signal features are obvious and predicting the fault is meaningless; the fault should be predicted as early as possible in its initial stage. When the fault is at a very early stage or has not occurred, the signal is periodic and a large amount of data carries very little information, so the signal in this period is considered as little as possible. When the signal becomes abnormal before the fault occurs, the signal changes in this period are given as much attention as possible, which is consistent with the general inspection process for equipment. Therefore, taking into account how real industrial equipment is inspected and the size of the sequence used in each sequence analysis, the data set is acquired as shown in fig. 1: 20 slices are randomly taken from the early stable stage of the signal and 40 slices are randomly taken from the mid-fault stage, and they are combined into one slice sequence with K = 60. With this method, 1000 slice sequences are acquired for each working condition.
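The slicing and sampling described in this embodiment can be sketched as follows. This is a minimal illustration on a synthetic signal, not the patent's implementation: the function names and the `fault_onset` index (the slice at which the mid-fault stage is assumed to begin) are hypothetical, since the text does not state how the boundary between the stable stage and the mid-fault stage is determined.

```python
import numpy as np

def make_slices(signal, slice_len=2000):
    """Cut a 1-D signal into consecutive, non-overlapping slices of slice_len points."""
    n = len(signal) // slice_len
    return signal[: n * slice_len].reshape(n, slice_len)

def sample_slice_sequence(slices, fault_onset, n_early=20, n_mid=40, rng=None):
    """Randomly draw n_early slices before fault_onset and n_mid slices from
    fault_onset onward, then restore chronological order (K = n_early + n_mid)."""
    rng = np.random.default_rng() if rng is None else rng
    early = rng.choice(fault_onset, size=n_early, replace=False)
    mid = rng.choice(np.arange(fault_onset, len(slices)), size=n_mid, replace=False)
    idx = np.sort(np.concatenate([early, mid]))
    return idx, slices[idx]

rng = np.random.default_rng(0)
signal = rng.normal(size=2000 * 300)      # synthetic stand-in for one sensor signal
slices = make_slices(signal)              # 300 slices of 2000 points each
idx, seq = sample_slice_sequence(slices, fault_onset=200, rng=rng)
print(seq.shape)                          # one slice sequence with K = 60
```

Calling `sample_slice_sequence` repeatedly with different random states yields many slice sequences from the same recording, which is how the data set can be enlarged without introducing data from outside the original signal.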
Advantages of step 1: the above steps yield a sequence data set carrying feature information and time information, which meets the data volume requirement of the model. The enhanced data set is taken entirely from the original data set, and no additional information is introduced.
There are 3 fault types in this example, which are used for experimental demonstration of the new method.
Step 2, preprocessing the training samples of large-scale equipment faults classified in step 1, obtaining the time domain features of the samples by signal time domain analysis, encoding the time domain features, and normalizing the numerical features to obtain the preprocessed sample data sequence of the large-scale equipment sensors;
step 2 comprises the following substeps:
step 2.1, for the arranged data slices, obtaining the time domain features of each data slice by signal time domain analysis, and making the time domain features of the corresponding sample into a time domain feature slice;
the time domain feature parameters of each data slice are obtained by signal time domain analysis; they are divided into dimensional and dimensionless feature parameters, for example variance, root mean square and mean.
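As an illustration of step 2.1, the sketch below computes the seven time domain features named later in this embodiment (variance, root mean square, mean, kurtosis, skewness, peak factor and margin factor) for one slice. The patent does not give the formulas; the definitions used here are the common textbook ones (non-excess kurtosis, peak factor as peak over RMS, margin factor as peak over the squared mean of the square-rooted amplitude) and should be read as assumptions.

```python
import numpy as np

def time_domain_features(x):
    """Return [variance, rms, mean, kurtosis, skewness, peak factor, margin factor]
    for one data slice. A constant slice would divide by zero and is not handled."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    var = x.var()
    std = np.sqrt(var)
    rms = np.sqrt(np.mean(x ** 2))
    centered = x - mean
    kurtosis = np.mean(centered ** 4) / std ** 4   # dimensionless, 3.0 for a Gaussian
    skewness = np.mean(centered ** 3) / std ** 3   # dimensionless
    peak = np.max(np.abs(x))
    peak_factor = peak / rms                       # also called crest factor
    margin_factor = peak / np.mean(np.sqrt(np.abs(x))) ** 2
    return np.array([var, rms, mean, kurtosis, skewness, peak_factor, margin_factor])

# Example: one 2000-point slice of a pure sine wave (20 full periods).
t = np.arange(2000)
features = time_domain_features(np.sin(2 * np.pi * t / 100))
print(features)
```

Variance, RMS and mean are dimensional parameters; kurtosis, skewness, peak factor and margin factor are dimensionless, which matches the split described in step 2.1.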
Step 2.2, standardizing the extracted time domain feature slices to generate unified feature codes, where the standardization formula is defined as:
$s_{ijk} = \gamma \cdot \dfrac{f_{ijk} - \mathrm{Min}(f_{ijk})}{\mathrm{Max}(f_{ijk}) - \mathrm{Min}(f_{ijk})}$    Formula 1;
in the formula:
$f_{ijk}$ is the time domain feature slice of the kth sequence of the jth feature of the ith sensor;
$s_{ijk}$ is the feature code of the kth sequence of the jth feature of the ith sensor;
$\mathrm{Max}(f_{ijk})$ is the maximum value of the jth feature of the ith sensor;
$\mathrm{Min}(f_{ijk})$ is the minimum value of the jth feature of the ith sensor;
$\gamma$ is a coefficient controlling the size of the feature code space;
$J$ is the number of features of the sensor;
$i$ is the ith sensor;
$j$ is the jth feature of the sensor;
$k$ is the kth sequence;
the purpose of this step is to normalize all time domain features and convert them into feature codes.
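The normalization in step 2.2 can be sketched as below, assuming Formula 1 is dispersion (min-max) normalization scaled by γ, which is what the definitions of Max, Min and γ and the embodiment's mention of dispersion normalization suggest. The function name and the array layout (one row per slice, one column per feature) are illustrative choices, and the floor operation follows the later statement that the fractional part of the feature code is rounded down.

```python
import numpy as np

def feature_codes(f, gamma=1400):
    """f: array of shape (K, J) with the J time domain features of K slices of one sensor.
    Returns integer feature codes of the same shape, each in the range 0..gamma."""
    f = np.asarray(f, dtype=float)
    f_min = f.min(axis=0)
    f_max = f.max(axis=0)
    # Guard against a constant feature column (zero range).
    span = np.where(f_max > f_min, f_max - f_min, 1.0)
    s = gamma * (f - f_min) / span
    return np.floor(s).astype(int)

codes = feature_codes([[0.0, 10.0],
                       [5.0, 20.0],
                       [10.0, 30.0]])
print(codes)
```

Note that the maximum of each feature maps to γ itself, so an encoder that reserves exactly γ positions per feature has to fold that value into the last position.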
Step 2.3, converting the feature codes obtained in step 2.2 into binary input:
$n_k = [b_{11k}, b_{12k}, b_{13k}, \ldots, b_{IJk}]$    Formula 2;
obtaining the sample data sequence of the large-scale equipment sensors $N = [n_1, n_2, n_3, \ldots, n_K]$;
Each feature code $s_{ijk}$ can be converted into a binary input $b_{ijk}$; $n_k$ is represented in $\{0,1\}^o$ with $o = \gamma \times I \times J$, and the fractional part of the feature code $s_{ijk}$ is rounded down.
In the formula:
$b_{ijk}$ is the binary input of the feature code $s_{ijk}$;
$N$ is the sample data sequence of the large-scale equipment sensors;
$n_k$ is the binary input of the features $f_{ijk}$ of the kth sequence;
$n_K$ is the binary input of the features $f_{ijk}$ of the Kth sequence.
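One way to read Formula 2 is that each feature code is one-hot encoded in its own block of γ positions, and the I × J blocks are concatenated into a vector in {0,1}^o with o = γ × I × J. The patent does not spell out the encoding, so the one-hot reading, the function name and the clipping of the top code are assumptions made for this sketch.

```python
import numpy as np

def binary_input(codes, gamma=1400):
    """codes: integer feature codes of one slice, shape (I, J).
    Returns n_k as a 0/1 vector of length o = gamma * I * J."""
    codes = np.asarray(codes).reshape(-1)
    # Assumption: a code equal to gamma (the feature maximum) is folded into the last bin.
    codes = np.clip(codes, 0, gamma - 1)
    n_k = np.zeros(gamma * codes.size, dtype=np.uint8)
    n_k[np.arange(codes.size) * gamma + codes] = 1
    return n_k

# One sensor (I = 1) with seven features (J = 7), as in the embodiment.
n_k = binary_input([[0, 700, 1400, 3, 5, 7, 9]])
print(n_k.shape, int(n_k.sum()))
```

With γ = 1400 and seven features this gives a sparse vector of length 9800 with exactly seven ones, which agrees with the sparse space size stated in the embodiment.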
In this embodiment, the time domain features of every 2000 data points are computed during signal analysis. The time domain signal itself contains a large amount of information, so selecting suitable time domain indices is important for analyzing the bearing state. Seven features are selected to describe the time domain of each slice: variance, root mean square, mean, kurtosis, skewness, peak factor and margin factor. Because these features differ in nature, their values lie on different scales. To account for the meaning of each feature in the model's input space, the sample data must be normalized before fault detection. Mapping the data to a specific interval eliminates the influence of value differences caused by the different natures of the features, which can speed up model convergence and improve model accuracy. All feature value intervals are first unified by normalization and then encoded. Specifically, the sequence points of all bearings are normalized; on the existing data set, dispersion (min-max) normalization is applied to each feature. The space size coefficient assigned to each feature is 1400, so all features are mapped onto a sparse space of total size 9800. As shown in fig. 2, each sample of the input sequence is defined as an embedded vector of size 7 × 1 × 60, where 7 is the number of rows of the sequence features, 1 is the number of columns of the sequence features, and 60 is the number of slices.
As shown in fig. 3, the advantages of step 2 are: by analyzing the feature signals numerically, the feature dimensions that are as useful as possible for the detection result are selected, so that the influence of the subsequent causal analysis on the features can be judged intuitively, which improves understanding of the impact of the features.
Step 3, performing causal analysis on the sample data sequence of the large-scale equipment sensor obtained in the step 2, and quantifying the influence degree of each characteristic on a prediction result based on a set causal analysis objective function;
step 3 comprises the following substeps:
step 3.1, performing causal analysis using the preprocessed sample data sequence of the large-scale equipment sensors obtained in step 2;
in a sample of the large-scale equipment sensors there are I sensors, each with J features; when all features are used in the calculation, the objective function quantifying the influence of a feature on the prediction result is defined as:
$\Delta\varepsilon_{f_{ij}} = \varepsilon_{F \setminus \{f_{ij}\}} - \varepsilon_F$    Formula 3;
in the formula:
$\Delta\varepsilon_{f_{ij}}$ is the impact of feature $f_{ij}$ on fault prediction;
$f_{ij}$ is the jth feature of the ith sensor;
$\varepsilon_{F \setminus \{f_{ij}\}}$ is the fault prediction error without feature $f_{ij}$;
$\varepsilon_F$ is the fault prediction error with the complete features;
step 3.2, according to Formula 3, measuring the influence of one feature on the prediction result requires calculating the model error $\varepsilon_F$ with the complete features and the model error $\varepsilon_{F \setminus \{f_{ij}\}}$ without feature $f_{ij}$.
A single-layer Transformer based on the attention mechanism is used as the model for calculating the errors; based on the sample data sequence $N$ of the large-scale equipment sensors, the embedded sequence $M$ of the Transformer and the embedded sequence $M \setminus \{f_{ij}\}$ without feature $f_{ij}$ are generated from Formula 4:
$M = [m_1, m_2, m_3, \ldots, m_K]$, $M \setminus \{f_{ij}\} = [m'_1, m'_2, m'_3, \ldots, m'_K]$;
$m_k = w_m n_k + b_m$    Formula 4;
$\hat{y}_F = \mathrm{TF}(M)$    Formula 5;
$\hat{y}_{F \setminus \{f_{ij}\}} = \mathrm{TF}(M \setminus \{f_{ij}\})$    Formula 6;
in the formula:
$\mathrm{TF}(\cdot)$ is the Transformer;
$\hat{y}$ is the label of the prediction result;
$M$ is the embedded sequence of the Transformer;
$M \setminus \{f_{ij}\}$ is the embedded sequence of the Transformer without feature $f_{ij}$;
$m_k$ is the embedded data of the kth sequence of the embedded sequence $M$;
$n_k$ is the binary input of the features $f_{ijk}$ of the kth sequence;
$m'_k$ is the embedded data of the kth sequence of the embedded sequence $M \setminus \{f_{ij}\}$;
$w_m$ is the initialized weight matrix of the embedded sequence, $w_m \in \mathbb{R}^{v \times o}$;
$b_m$ is the initialized bias matrix of the embedded sequence, $b_m \in \mathbb{R}^{v}$;
$v$ is the dimension of $w_m$ and $b_m$;
$o$ is the dimension of $n_k$, $o = \gamma \times I \times J$;
$\mathbb{R}$ is the set of real numbers;
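The linear embedding of Formula 4 can be sketched in NumPy as follows. The embedding dimension v = 64, the random initialization and the way a feature is removed (zeroing its block of γ positions in every n_k) are assumptions made for illustration; the text only states that an embedded sequence without feature f_ij is generated. Indices i and j are zero-based in the code.

```python
import numpy as np

def embed(N, w_m, b_m):
    """Apply m_k = w_m n_k + b_m to every row n_k of N (shape (K, o))."""
    return N @ w_m.T + b_m

def drop_feature(N, i, j, J, gamma):
    """Return a copy of N with the gamma-wide block of feature j of sensor i zeroed."""
    start = (i * J + j) * gamma
    N_wo = N.copy()
    N_wo[:, start:start + gamma] = 0
    return N_wo

rng = np.random.default_rng(0)
K, I, J, gamma, v = 60, 1, 7, 1400, 64      # v = 64 is an illustrative choice
o = gamma * I * J

# Synthetic binary inputs: one active position per feature block.
codes = rng.integers(0, gamma, size=(K, I * J))
N = np.zeros((K, o))
N[np.arange(K)[:, None], np.arange(I * J) * gamma + codes] = 1.0

w_m = rng.normal(scale=0.02, size=(v, o))   # w_m in R^{v x o}
b_m = np.zeros(v)                           # b_m in R^{v}

M = embed(N, w_m, b_m)                                   # embedded sequence M
M_wo = embed(drop_feature(N, 0, 2, J, gamma), w_m, b_m)  # embedded sequence without one feature
print(M.shape, M_wo.shape)
```

Because each n_k is sparse, the product w_m n_k simply sums the columns of w_m selected by the active positions, so the embedding behaves like a lookup table with one entry per feature code.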
step 3.3, denoting the real label by $e$ and using the cross-entropy loss function $\mathcal{L}(e, \hat{y})$ to express the error of the prediction result after Transformer learning, the fault prediction errors in Formula 3 can be expressed as:
$\varepsilon_F = \mathcal{L}(e, \hat{y}_F)$    Formula 7;
$\varepsilon_{F \setminus \{f_{ij}\}} = \mathcal{L}(e, \hat{y}_{F \setminus \{f_{ij}\}})$    Formula 8;
where $\hat{y}_F$ and $\hat{y}_{F \setminus \{f_{ij}\}}$ are the predictions obtained with the complete features and without feature $f_{ij}$, respectively;
using Formulas 3 to 8, the causal contribution of a feature can be calculated as the difference in the loss function between the fault prediction error of the model without feature $f_{ij}$ and that of the model with the complete features;
step 3.4, calculating the model errors of the input features using Formulas 7 and 8 to obtain the causal influence of each feature on the model; each input feature is assigned a weight according to its causal influence, as shown in Formula 9:
Figure BDA0003524005890000111
which gives the causal influence weights
Figure BDA0003524005890000112
In the formula:
Figure BDA0003524005890000113
is the weight of the jth input feature of the ith sensor;
$W_F$ is the set of causal influence weights.
Specifically, in this embodiment, for step 3.1, owing to the limitations of the data set, the acceleration sensor signal is selected for numerical analysis, and 7 features that are related to the fault signal as closely as possible are selected from the time domain signal analysis, i.e., I = 1 and J = 7. In the pre-training process, each feature is removed in turn before Transformer learning, giving a loss value without that feature and a loss value with all features; the difference between the two values reflects the degree to which the feature influences the result. Using this difference as a reference, a weight is assigned to each feature, and the product of the weight and the feature value is used as the formal input of the Transformer.
For steps 3.2, 3.3 and 3.4, the influence of each feature on the final result is calculated in the causal analysis module using Formulas 3 to 8; the result is used to generate causal weight values, which are applied to the respective feature values for the input of the perceptron. In the large-scale equipment data set used, 7 important features are analyzed numerically. Causal calculation is carried out separately on the two signals to determine which features have a larger influence on fault detection.
Advantages of step 3: the features are analyzed by the causal analysis module, and unimportant features are suppressed. Existing algorithms treat all features equally; in fact, different signal features carry different meanings at different stages of equipment operation, so the features that have a larger influence on the final detection result are screened out at the early stage of a fault.
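The leave-one-feature-out comparison of step 3 can be illustrated with a small runnable sketch. To keep it short, a logistic regression trained by gradient descent stands in for the single-layer Transformer, the data are synthetic, and a feature is removed by zeroing its column; all of these are simplifications, and the function names are hypothetical. The quantity computed for each feature is the difference between the cross-entropy without that feature and the cross-entropy with all features, evaluated on the training data.

```python
import numpy as np

def cross_entropy(p, y):
    """Binary cross-entropy between predicted probabilities p and labels y."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fit_and_score(X, y, steps=500, lr=0.5):
    """Train a logistic-regression stand-in model and return its cross-entropy on (X, y)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return cross_entropy(p, y)

def causal_impact(X, y):
    """Return (error with all features, per-feature error increase when removed)."""
    eps_full = fit_and_score(X, y)
    delta = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_wo = X.copy()
        X_wo[:, j] = 0.0                      # remove feature j
        delta[j] = fit_and_score(X_wo, y) - eps_full
    return eps_full, delta

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = (X[:, 0] > 0).astype(float)               # only feature 0 carries the label
eps_full, delta = causal_impact(X, y)
print(eps_full, delta)
```

In this toy setting the informative feature produces by far the largest error increase when removed, while the noise features change the error very little, which is the behavior the causal weights are meant to capture.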
Step 4, combining the time information with the result of step 3, and obtaining the hidden layer data and attention scores of the large-scale equipment based on a model with a temporal attention mechanism.
Step 4 comprises the following substeps:
step 4.1, recalculating the feature codes by combining the causal influence weights obtained in step 3.4 with Formula 1, as shown in Formula 10:
Figure BDA0003524005890000121
in the formula:
Figure BDA0003524005890000122
is the recalculated feature code of the kth sequence of the jth feature of the ith sensor;
based on Formula 10, the same operation as in step 2.3 is applied to obtain the sample data sequence combined with the causal influence weights
Figure BDA0003524005890000123
In the formula:
$N_W$ is the sample data sequence combined with the causal influence weights;
Figure BDA0003524005890000124
is the binary input of the features $f_{ijk}$ of the Kth sequence combined with the causal influence weights;
Figure BDA0003524005890000125
is the feature representation, combined with the causal influence weights, of the fault state of the same type of large equipment; for a given fault being predicted, the
Figure BDA0003524005890000126
appended after all sample data are the same;
step 4.2, processing the time information and embedding the characteristics into a uniform dimensional sequence, wherein the formula is as follows:
Figure BDA0003524005890000127
in the formula:
$z_k$ is the time information embedding of the k-th sequence;
tanh is the hyperbolic tangent function;
T is the time taken by the large-scale equipment from startup to failure;
$p_k$ is the time difference between the fault occurrence and the acquisition of the k-th slice, $p_k = T - t \times k$;
$w_z$ is the initialized weight matrix of the time information embedding sequence,
Figure BDA0003524005890000128
$b_z$ is the initialized bias matrix of the time information embedding sequence,
Figure BDA0003524005890000129
v is the dimension of $w_z$ and $b_z$;
In this way, the time information is initialized as a vector with the same dimension as the feature embedding. The closer a slice is acquired to the fault, the more likely its data are abnormal, and the more attention it should receive.
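The time gap $p_k = T - t \times k$ and its embedding can be sketched as follows. The sketch reads t as the interval between consecutive slices, which is an assumption, and the embedding `tanh(w * p_k / T + b)` per dimension is only an illustrative stand-in, since formula 11 is not reproduced here; all function and parameter names are hypothetical.

```python
import math


def time_gaps(total_time, slice_interval, num_slices):
    """Return p_k = T - t * k for k = 1..K."""
    return [total_time - slice_interval * k for k in range(1, num_slices + 1)]


def embed_time(gap, total_time, weights, biases):
    """Map one time gap to a v-dimensional vector (assumed form, see lead-in)."""
    scaled = gap / total_time  # keep the input in [0, 1]
    return [math.tanh(w * scaled + b) for w, b in zip(weights, biases)]


# Illustrative values: T = 100 time units, one slice every 10 units, K = 5.
gaps = time_gaps(total_time=100.0, slice_interval=10.0, num_slices=5)
z = [embed_time(p, 100.0, weights=[1.0, -0.5], biases=[0.0, 0.1]) for p in gaps]
```

The gaps shrink as k grows, so later slices sit closer to the fault, which is the property the time attention in the following steps relies on.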
step 4.3, generating the combined embedded data from the time information and the sample data combined with the causal influence weights:
Figure BDA0003524005890000131
the combined embedded sequence $C = [c_1, c_2, c_3, \ldots, c_K, c_T]$ is obtained from formula 12;
In the formula:
$c_k$ is the combined embedded data of the k-th sequence;
$c_K$ is the combined embedded data of the K-th sequence;
Figure BDA0003524005890000132
is the binary input of the features $f_{ijk}$ of the k-th sequence combined with causal influence weights;
$w_c$ and $b_c$ are the initialized weight matrix and bias matrix of the combined embedded sequence, where
Figure BDA0003524005890000133
Figure BDA0003524005890000134
C is the combined embedded sequence;
$c_T$ is the embedded data, combined with the causal influence weights, for the fault state of the same type of large-scale equipment; when predicting a fault, the $c_T$ appended after the sample data is the same for all samples;
step 4.4, using a single-layer Transformer to learn, from the combined embedded sequence, the relation between each embedded sequence containing time information and the large-scale equipment fault:
$[h_1, h_2, h_3, \ldots, h_K, h_T] = \mathrm{TF}([c_1, c_2, c_3, \ldots, c_K, c_T])$  (Formula 13);
in the formula:
TF(·) is the Transformer;
$h_K$ is the hidden layer data learned from $c_K$ by the Transformer;
$h_T$ is the hidden layer data learned from $c_T$ by the Transformer, namely the hidden layer representation of the fault state;
step 4.5, calculating the local score of each element of the combined embedded sequence, and generating the local feature attention weights from these attention scores;
Figure BDA0003524005890000141
in the formula:
$u_k$ is the local score of the k-th sequence of the combined embedded sequence;
$h_k$ is the hidden layer data learned by the Transformer;
Figure BDA0003524005890000142
is the initialized weight matrix of the local attention,
Figure BDA0003524005890000143
$b_u$ is the initialized bias matrix of the local attention,
Figure BDA0003524005890000144
l is the dimension of the hidden layer data $h_k$;
p is the dimension of $w_u$ and $b_u$;
after the local attention scores are obtained, the local feature attention weights are generated using the Softmax function, i.e.:
$W_{local} = \mathrm{Softmax}([u_1, u_2, u_3, \ldots, u_K]) = [l_1, l_2, l_3, \ldots, l_K]$  (Formula 15);
in the formula:
$W_{local}$ is the vector of local feature attention weights;
$u_K$ is the local feature score of the K-th sequence of the combined embedded sequence;
$l_K$ is the local feature attention weight of the K-th sequence of the combined embedded sequence;
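Formula 15 is an ordinary Softmax over the K local scores. A minimal sketch, using illustrative score values rather than values from the patent:

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


local_scores = [0.2, 1.5, -0.3, 2.0]  # u_1..u_K, illustrative values
w_local = softmax(local_scores)       # l_1..l_K
```

The resulting weights are positive and sum to 1, and the slice with the largest local score receives the largest weight.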
step 4.6, judging the influence of the sample time on the fault prediction by using an attention mechanism; first, the hidden layer representation $h_T$ of the fault state obtained in step 4.4 is converted into a query vector of the attention mechanism;
$x = \mathrm{ReLU}(w_x h_T + b_x)$  (Formula 16);
in the formula:
x is the query vector of the attention mechanism;
ReLU(·) is the rectified linear unit activation function;
$h_T$ is the hidden layer representation of the fault state;
$w_x$ is the initialized weight matrix of the query vector,
Figure BDA0003524005890000145
$b_x$ is the initialized bias matrix of the query vector,
Figure BDA0003524005890000146
l is the dimension of the hidden layer data $h_k$;
q is the dimension of $w_x$ and $b_x$;
step 4.7, converting the time difference $p_k$ between the fault occurrence and the acquisition of the data slice into the key vector of the attention mechanism, as shown in formula 17:
Figure BDA0003524005890000151
to obtain $E = [e_1, e_2, e_3, \ldots, e_K]$;
In the formula:
$e_k$ is the key vector of the k-th sequence;
$e_K$ is the key vector of the K-th sequence;
E is the set of time key vectors of the attention mechanism;
$w_e$ is the initialized weight matrix of the time key vector,
Figure BDA0003524005890000152
$b_e$ is the initialized bias matrix of the time key vector,
Figure BDA0003524005890000153
q is the dimension of $w_e$ and $b_e$;
step 4.8, based on the query vector x and the key vectors $e_k$ obtained in steps 4.6 and 4.7, the global time attention score is obtained using the attention mechanism, as shown in formulas 18 and 19:
Figure BDA0003524005890000154
in the formula:
$r_k$ is the global time attention score of the k-th sequence;
$x^T$ is the transpose of the query vector x;
δ is the dimension of the time key vector;
applying a Softmax layer to normalize the attention scores, the global time attention weights can be expressed as:
$W_{global} = \mathrm{Softmax}([r_1, r_2, r_3, \ldots, r_K]) = [g_1, g_2, g_3, \ldots, g_K]$  (Formula 19);
in the formula:
$W_{global}$ is the vector of global time attention weights;
$r_K$ is the global time attention score of the K-th sequence;
$g_K$ is the global time attention weight of the K-th sequence;
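The global time attention of steps 4.6 to 4.8 can be sketched as below. The sketch assumes that formula 18 is the usual scaled dot product $r_k = x^T e_k / \sqrt{\delta}$, which is what the definitions of $x^T$ and δ suggest but which is not confirmed by the text; the query and key values are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_time_weights(query, keys):
    """Score each time key against the query, then normalize (formula 19)."""
    delta = len(query)  # dimension of the key vectors
    scores = [
        sum(q * e for q, e in zip(query, key)) / math.sqrt(delta)
        for key in keys
    ]
    return softmax(scores)


x = [1.0, 0.0]                            # query vector, illustrative
E = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]  # time key vectors e_1..e_K
w_global = global_time_weights(x, E)      # g_1..g_K
```

The key most aligned with the query (here the second one) receives the largest global time weight.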
step 4.9, combining the local feature scores of step 4.5 with the global time scores of step 4.8;
first, the embedding $h_T$ is used to assign weights to the local feature information and the time information, and these weights are normalized by Softmax, as shown in formula 20:
$V = \mathrm{Softmax}(w_v h_T + b_v) = [a_{local}, a_{global}]$  (Formula 20);
in the formula:
$h_T$ is the hidden layer representation of the fault state;
$w_v$ is the initialized weight matrix for allocating the integrated information,
Figure BDA0003524005890000161
$b_v$ is the initialized bias matrix for allocating the integrated information,
Figure BDA0003524005890000162
l is the dimension of the hidden layer data $h_K$;
obtaining the fused attention weight from the local feature attention weight and the global time attention weight, as shown in formula 21;
Figure BDA0003524005890000163
in the formula:
Figure BDA0003524005890000164
is the fused attention weight;
$a_{local}$ is the weight assigned to $W_{local}$ by the fault representation $h_T$;
$l_K$ is the local feature attention weight of the K-th sequence of the combined embedded sequence;
$a_{global}$ is the weight assigned to $W_{global}$ by the fault representation $h_T$;
$g_K$ is the global time attention weight of the K-th sequence;
step 4.10, normalizing the fused attention weights to obtain the attention score of the embedded sequence
Figure BDA0003524005890000165
As shown in equation 22:
Figure BDA0003524005890000171
Specifically, in this embodiment, steps 4.1 to 4.5 form the local information attention module, which analyzes the feature information of each collected signal. Step 4.1 uses the features from the preceding causal analysis and preprocessing to build an embedded sequence that meets the input requirements of the Transformer, and steps 4.2 and 4.3 integrate the time information into this embedded sequence. Steps 4.4 and 4.5 learn the dependencies within the embedded sequence and produce hidden vectors, from which the local attention scores are obtained. Steps 4.6 to 4.8 form the global time attention module, which analyzes the importance of the time information for the overall signal. In step 4.6, the hidden variable $h_T$ obtained by the local attention module, which describes the overall condition of the equipment, is converted into a query vector. Step 4.7 turns the time information into key vectors, and step 4.8 scores each slice against the query. The two modules judge the current state of the equipment from different angles and need to be considered together. Steps 4.9 and 4.10 therefore provide an attention fusion mechanism that captures the relevant information of the signal representation and the time representation under different conditions, and gives a composite score after fusing the attention scores of the local features with those of the global time information.
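The fusion of steps 4.9 and 4.10 can be sketched as below. The sketch assumes that formula 21 is the weighted sum $a_{local} l_k + a_{global} g_k$ and that formula 22 divides each fused weight by the sum over all K slices; both forms follow from the symbol definitions but are not reproduced in the text, and the input values are illustrative.

```python
def fuse_attention(local_w, global_w, a_local, a_global):
    """Blend local and global attention weights, then renormalize."""
    # Assumed formula 21: per-slice weighted sum of the two attention weights.
    fused = [a_local * l + a_global * g for l, g in zip(local_w, global_w)]
    # Assumed formula 22: normalize so the fused scores sum to 1.
    total = sum(fused)
    return [f / total for f in fused]


w_local = [0.1, 0.2, 0.7]   # l_1..l_K, illustrative
w_global = [0.3, 0.3, 0.4]  # g_1..g_K, illustrative
alpha = fuse_attention(w_local, w_global, a_local=0.6, a_global=0.4)
```

When $a_{local}$ and $a_{global}$ come from a Softmax and both input vectors already sum to 1, the fused weights sum to 1 as well, so the final normalization leaves them unchanged; it matters only if the inputs are not normalized.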
Advantages of step 4: the analysis is carried out from two aspects. The features are obtained by numerical analysis of the full-period signals of the data set, and the time information comes from the original data set. The main concern of this work is detecting the health condition of large-scale equipment, so the time information of the periodic signal is introduced, the signal features and the time information are fused, and the change of the long-period signal is analyzed using the time intervals between the collected information sequences. Fusion also allows the contribution of each feature to the detection result to be allocated more appropriately.
Step 5, predicting the fault of the large-scale equipment by using the hidden layer data and the attention scores obtained in step 4.
Step 5 comprises the following substeps:
step 5.1, obtaining the fault prediction score of the large-scale equipment from the hidden layer data of step 4.4 and the attention scores of step 4.10:
Figure BDA0003524005890000172
in the formula:
Figure BDA0003524005890000181
is the fault prediction score of the large-scale equipment;
Figure BDA0003524005890000182
is the attention score of the embedded sequence;
$h_k$ is the hidden layer data learned from $c_k$ by the Transformer;
step 5.2, applying the Softmax function to the fault prediction score obtained in step 5.1 to obtain the fault prediction probability of the large-scale equipment;
Figure BDA0003524005890000183
in the formula:
$w_d$ is the initialized weight matrix of the fault prediction probability,
Figure BDA0003524005890000184
$b_d$ is the initialized bias matrix of the fault prediction probability,
Figure BDA0003524005890000185
l is the dimension of the hidden layer data $h_K$;
the likelihood that the large-scale equipment is about to suffer a given fault type is judged from the fault prediction probability.
Specifically, in this embodiment, step 5.1 uses the composite attention scores obtained in step 4.10 and the hidden layer data obtained in step 4.4. Step 5.2 takes the fault prediction score obtained in step 5.1 and passes it through a Softmax function to obtain the probability of the corresponding fault.
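Step 5 can be sketched as an attention-weighted sum of the hidden vectors followed by a linear layer and Softmax. This is an assumed reading of formulas 23 and 24, which are not reproduced in the text; the function name, the weights and the inputs are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_fault(hidden_states, attention, w_d, b_d):
    """Pool hidden vectors with attention scores, then classify."""
    dim = len(hidden_states[0])
    # Assumed formula 23: weighted sum of h_k with the fused attention scores.
    pooled = [
        sum(a * h[i] for a, h in zip(attention, hidden_states))
        for i in range(dim)
    ]
    # Assumed formula 24: linear layer followed by Softmax.
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(w_d, b_d)]
    return softmax(logits)


probs = predict_fault(
    hidden_states=[[1.0, 0.0], [0.0, 1.0]],  # h_1, h_2
    attention=[0.25, 0.75],                  # fused attention scores
    w_d=[[1.0, 0.0], [0.0, 1.0]],            # identity weights for clarity
    b_d=[0.0, 0.0],
)
```

Each entry of `probs` is the predicted probability of one fault class; here the second class dominates because the second hidden vector carries most of the attention.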
In this embodiment, the trained model is evaluated on the test samples to verify its accuracy. Specifically, let the parameters of the model be ψ, and use the cross-entropy loss function to measure the difference between the predicted value
Figure BDA0003524005890000186
and the actual value d. The objective is to minimize the average loss function, as shown in formula 25:
Figure BDA0003524005890000188
in the formula:
Figure BDA0003524005890000187
is the average loss function to be minimized;
d is the actual value;
Figure BDA0003524005890000191
is a predicted value;
g is the total number of large-scale equipment.
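The average cross-entropy loss can be sketched as below, assuming one-hot actual values and the standard multi-class form; formula 25 is not reproduced in the text and may use the binary form instead. The function name and the sample values are illustrative.

```python
import math


def average_cross_entropy(actual, predicted, eps=1e-12):
    """Mean cross-entropy between one-hot labels and predicted probabilities."""
    total = 0.0
    for d, d_hat in zip(actual, predicted):
        # eps guards against log(0) when a predicted probability is exactly 0.
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(d, d_hat))
    return total / len(actual)


actual = [[1, 0], [0, 1]]              # one-hot actual values d
predicted = [[0.9, 0.1], [0.2, 0.8]]   # predicted probabilities
loss = average_cross_entropy(actual, predicted)
```

Only the probability assigned to the true class contributes to each term, so the loss falls toward zero as those probabilities approach 1.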
Performance analysis of the method of the invention:
The full-life-cycle bearing vibration signal data set produced by Xi'an Jiaotong University and Changxing Sumyoung Technology (the XJTU-SY bearing data set) is used to demonstrate the effectiveness of the method of the invention.
The data set covers 3 working conditions with 5 bearings each, giving 15 full-life-cycle bearing signal samples in total. In the test, the sampling frequency is 25.6 kHz, the sampling interval is 1 min, and each sampling lasts 1.28 s, so each sensor records 32768 points per sampling (25.6 kHz × 1.28 s).
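The number of points per recording follows directly from the sampling frequency and the recording length quoted above; a short check (constant names are illustrative):

```python
SAMPLING_RATE_HZ = 25_600   # 25.6 kHz
RECORD_SECONDS = 1.28       # length of one recording
RECORDS_PER_HOUR = 60       # one recording per minute

# round() absorbs floating-point error in the product.
samples_per_record = round(SAMPLING_RATE_HZ * RECORD_SECONDS)
samples_per_hour = samples_per_record * RECORDS_PER_HOUR
```

Each recording therefore holds 32768 points per sensor, which is $2^{15}$.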
When mechanical equipment fails, the failure may manifest to different degrees in the time domain, the frequency domain and the time-frequency domain. Taking bearing 1_1 as an example, its outer ring had failed by the end of the test; because the load is applied in the horizontal direction, the horizontal vibration signal contains more degradation information. The data collected in the horizontal direction for bearing 1_1 are shown in fig. 1.
The best results would be obtained with data from the whole life cycle, but in a real scenario the service life of a bearing can reach tens of thousands of hours, and most of the large volume of data collected by the sensors has very little value. The important data are usually concentrated in the second half of the bearing's life cycle. Therefore a large amount of data should not be collected during normal operation; instead, a signal sequence composed of sensor information at several time points is selected to represent the normal state of the bearing.
To verify the effectiveness of the algorithm, three samples are selected under the working condition of 2100 r/min rotating speed and 12 kN radial force, where the bearing fault is an outer ring crack. To ensure effective model training, the signal of each bearing is divided into sequences, and feature values for 2000 data points are obtained by numerical analysis, giving the sequence points of the full-period signal. In the late stage of a fault the signal features are already obvious and fault detection has lost its value for the bearing, so the aim is to detect the fault in time at its early stage, as shown in fig. 2. Before the fault occurs or at its very beginning, the signal is periodic and a large amount of data carries very little information, so the signal in this period is sampled as sparsely as possible; when the signal becomes abnormal in the early or middle stage of the fault, the signal changes in this period are followed as closely as possible. This matches the general inspection process for equipment. Therefore, when building the data set, and with real industrial equipment inspection in mind, each signal feature sequence is formed by randomly collecting 20 sequence points from the early stage, when the signal is stable, and 40 sequence points from the middle stage of the fault; 1000 signal feature sequences are collected for each working condition. The corresponding data set description is shown in table 1.
Table 1 Data set description
Bearing           Bearing 1_1   Bearing 1_4   Bearing 1_5   Bearing 2_1   Bearing 2_5
Training set      600           600           600           600           600
Validation set    200           200           200           200           200
Test set          200           200           200           200           200
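The construction of the signal feature sequences and the 600/200/200 split of table 1 can be sketched as follows. The sketch assumes that the 2000 feature points of a bearing are divided into an early stage and a middle stage (the 1200/800 division below is hypothetical), that sampled points keep their time order, and that the split is taken in order; none of these details is stated in the text.

```python
import random


def build_feature_sequence(early_points, mid_points, rng, n_early=20, n_mid=40):
    """Draw 20 early-stage and 40 mid-stage points, keeping time order."""
    early_idx = sorted(rng.sample(range(len(early_points)), n_early))
    mid_idx = sorted(rng.sample(range(len(mid_points)), n_mid))
    return [early_points[i] for i in early_idx] + [mid_points[i] for i in mid_idx]


def split_sequences(sequences, n_train=600, n_val=200):
    """Split into training, validation and test sets (remaining items)."""
    train = sequences[:n_train]
    val = sequences[n_train:n_train + n_val]
    test_set = sequences[n_train + n_val:]
    return train, val, test_set


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Placeholder feature points; real entries would be time-domain feature vectors.
early_stage = [("early", i) for i in range(1200)]
mid_stage = [("mid", i) for i in range(800)]
sequences = [build_feature_sequence(early_stage, mid_stage, rng) for _ in range(1000)]
train, val, test_set = split_sequences(sequences)
```

Each sequence holds 60 points, and the 1000 sequences divide into the 600, 200 and 200 items listed in table 1.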
Because some data are missing, the bearing data sets are mixed in pairs to test the detection accuracy for different fault types. As shown in tables 2 and 3, in the mixed bearing 1_1 and 1_4 data set, 1_1 is the fault to be detected; in the mixed bearing 1_1 and 1_5 data set, 1_1 is the fault to be detected; in the mixed bearing 2_1 and 2_5 data set, 2_1 is the fault to be detected.
Table 2 Test accuracy results on the mixed bearing data sets
Figure BDA0003524005890000201
Table 3 Comparison of experimental results on the bearing 2_1 and 2_5 data sets
Figure BDA0003524005890000211
As shown in tables 2 and 3, on the mixed data of bearings 1_1 and 1_4 every algorithm achieves complete recognition, because the feature information of these two fault types differs greatly. In distinguishing the other fault types, the proposed algorithm also achieves good results and improves on the baseline algorithms.
The experiments also verify the average performance of the proposed model and the other baseline models across the data sets. Next, it is necessary to analyze how the causal objective affects the model results; as shown in table 4, it lets the model learn the features most strongly correlated with the detection target. Dimensionless parameters are insensitive to bearing load and rotating speed, require no comparison against reference values or historical data, and are more sensitive in the early stage of a fault, but they have poor resistance to interference and easily lead to misjudgment. Parameters such as peak value, crest factor and kurtosis are sensitive to impact faults, but once the fault enters a severe stage they saturate and lose their diagnostic ability. Moreover, different fault types produce different trends in different factors, which leads the causal analysis to focus on different features. The attention mechanism can make the model focus on the signal features that carry significant risk factors while reducing the influence of the other features on the detection result. Causal analysis makes the contribution of each feature to the final performance explicit, and the approach can be extended to other models.
Table 4 Causal analysis results on the bearing 2_1 and 2_5 data sets
Figure BDA0003524005890000221
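The time-domain indicators named in the discussion above (peak value, crest factor, kurtosis) have standard definitions, sketched below on a short illustrative signal. Which 7 features the invention actually uses is not listed in this section, so the selection here is only an example, and the function name is hypothetical.

```python
import math


def time_domain_features(signal):
    """Compute a few standard time-domain indicators of a vibration signal."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    variance = sum((x - mean) ** 2 for x in signal) / n
    # Kurtosis: fourth central moment divided by the squared variance.
    kurtosis = sum((x - mean) ** 4 for x in signal) / (n * variance ** 2)
    return {
        "peak": peak,
        "rms": rms,
        "crest_factor": peak / rms,  # dimensionless: peak relative to RMS
        "kurtosis": kurtosis,        # dimensionless: sensitive to impacts
    }


# Illustrative signal with one impact-like spike.
features = time_domain_features([0.0, 1.0, 0.0, -1.0, 0.0, 4.0, 0.0, -1.0])
```

Crest factor and kurtosis are ratios, so they do not scale with signal amplitude, which is consistent with the remark that dimensionless parameters are insensitive to load and speed.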

Claims (5)

1. A method for predicting the fault of large-scale equipment with the characteristics of causality and attention in the industrial internet, characterized by comprising the following steps:
step 1, collecting fault data of large equipment, and taking the fault data as a training sample;
step 2, preprocessing the data of the training samples of the large equipment faults sorted and classified in the step 1, obtaining time domain characteristics of the samples by using a signal time domain analysis method, coding the time domain characteristics, and normalizing the numerical characteristics to obtain a preprocessed sample data sequence of the large equipment sensor;
step 3, performing causal analysis on the sample data sequence of the large-scale equipment sensor obtained in the step 2, and quantifying the influence degree of each characteristic on a prediction result based on a set causal analysis objective function;
step 4, combining the time information and the result of the step 3, and obtaining hidden layer data and attention scores of the large-scale equipment based on a model of a time attention mechanism;
step 5, predicting the fault of the large equipment by using the hidden layer data and the attention score obtained in the step 4;
wherein, step 3 comprises the following substeps:
step 3.1, performing causal analysis by using the preprocessed sample data sequence of the large-scale equipment sensor obtained in the step 2;
in a sample of the large-scale equipment sensors there are I sensors, each with J features; when all features are included in the calculation, the objective function quantifying the influence of a feature on the prediction result is defined as:
Figure FDA0003524005880000011
in the formula:
$\Delta_{\varepsilon, f_{ij}}$ is the impact of feature $f_{ij}$ on fault prediction;
$f_{ij}$ is the j-th feature of the i-th sensor;
Figure FDA0003524005880000027
is the fault prediction error without feature $f_{ij}$;
$\varepsilon_F$ is the fault prediction error with the complete features;
step 3.2, according to formula 3, measuring the influence of a feature on the prediction result requires calculating the model error $\varepsilon_F$ with the complete features and the model error without feature $f_{ij}$
Figure FDA0003524005880000028
A single Transformer layer based on the attention mechanism is used as the model for calculating the errors; based on the sample data sequence N of the large-scale equipment sensor, the Transformer embedded sequence M and the embedded sequence $M \setminus \{f_{ij}\}$ without feature $f_{ij}$ are generated by formula 4, $M = [m_1, m_2, m_3, \ldots, m_K]$, $M \setminus \{f_{ij}\} = [m'_1, m'_2, m'_3, \ldots, m'_K]$;
$m_k = w_m n_k + b_m$  (Formula 4);
Figure FDA0003524005880000021
Figure FDA0003524005880000022
in the formula:
TF(·) is the Transformer;
Figure FDA0003524005880000023
is the label of the prediction result;
M is the embedded sequence of the Transformer;
$M \setminus \{f_{ij}\}$ is the embedded sequence of the Transformer without feature $f_{ij}$;
$m_k$ is the embedded data of the k-th sequence of the embedded sequence M;
$n_k$ is the binary input of the features $f_{ijk}$ of the k-th sequence;
$m'_k$ is the embedded data of the k-th sequence of the embedded sequence $M \setminus \{f_{ij}\}$;
$w_m$ is the initialized weight matrix of the embedded sequence,
Figure FDA0003524005880000024
$b_m$ is the initialized bias matrix of the embedded sequence,
Figure FDA0003524005880000025
v is the dimension of $w_m$ and $b_m$;
o is the dimension of $n_k$, $o = \gamma \times I \times J$;
Figure FDA0003524005880000026
is a real number set;
step 3.3, denoting the true label by e, the cross-entropy loss function
Figure FDA0003524005880000031
is used to express the error of the prediction result after Transformer learning; the fault prediction errors in formula 3 can then be expressed as:
Figure FDA0003524005880000032
Figure FDA0003524005880000033
using formulas 3 to 8, the causal contribution of a feature can be calculated as the difference in the loss function between the fault prediction error of the model without feature $f_{ij}$ and that of the model with the complete features;
step 3.4, calculating the model error for each input feature using formulas 7 and 8 to obtain the causal influence of the feature on the model; each input feature is assigned a weight according to its causal influence, as shown in formula 9:
Figure FDA0003524005880000034
deriving causal impact weights
Figure FDA0003524005880000035
In the formula:
Figure FDA0003524005880000036
is the weight of the j-th input feature of the i-th sensor;
$W_F$ are the causal influence weights.
2. The method for predicting the fault of the large-scale equipment with the characteristics of causality and attention in the industrial internet as claimed in claim 1, wherein the step 1 comprises the following sub-steps:
step 1.1, classifying fault data based on collected large-scale equipment fault samples, and marking each type of fault data;
step 1.2, sampling the labeled fault data of each type of large-scale equipment in large quantities, recording the time of each sampling point during sampling, and arranging the acquired signal segments by the label of the large-scale equipment they come from to obtain the training samples.
3. The method for predicting the fault of the large-scale equipment with the characteristics of causality and attention in the industrial internet as claimed in claim 1, wherein the step 2 comprises the following sub-steps:
step 2.1, applying a signal time-domain analysis method to the arranged data slices to obtain their time-domain features, and forming the time-domain features of each sample into a time-domain feature slice;
step 2.2, normalizing the extracted time-domain feature slices to generate a uniform feature code, where the normalization formula is defined as:
Figure FDA0003524005880000041
in the formula:
$f_{ijk}$ is the time-domain feature slice of the k-th sequence of the j-th feature of the i-th sensor;
$s_{ijk}$ is the feature code of the k-th sequence of the j-th feature of the i-th sensor;
$\mathrm{Max}(f_{ijk})$ is the maximum value of the j-th feature of the i-th sensor;
$\mathrm{Min}(f_{ijk})$ is the minimum value of the j-th feature of the i-th sensor;
γ is a coefficient controlling the size of the feature code space;
J is the number of features of a sensor;
i denotes the i-th sensor;
j denotes the j-th feature of the sensor;
k denotes the k-th sequence;
step 2.3, converting the feature code obtained in step 2.2 into a binary input:
$n_k = [b_{11k}, b_{12k}, b_{13k}, \ldots, b_{IJk}]$  (Formula 2);
obtaining the sample data sequence of the large-scale equipment sensor, $N = [n_1, n_2, n_3, \ldots, n_K]$;
In the formula:
$b_{ijk}$ is the binary input of the feature code $s_{ijk}$;
N is the sample data sequence of the large-scale equipment sensor;
$n_k$ is the binary input of the features $f_{ijk}$ of the k-th sequence;
$n_K$ is the binary input of the features $f_{ijK}$ of the K-th sequence.
4. The method for predicting the fault of the large-scale equipment with the characteristics of causality and attention in the industrial internet as claimed in claim 1, wherein the step 4 comprises the following sub-steps:
step 4.1, recalculating the feature code by combining the causal influence weight obtained in step 3.4 with formula 1, as shown in formula 10:
Figure FDA0003524005880000051
in the formula:
Figure FDA0003524005880000052
is the recalculated feature code of the k-th sequence of the j-th feature of the i-th sensor;
based on formula 10, the same operation as in step 2.3 is applied to obtain the sample data sequence combined with the causal influence weights
Figure FDA0003524005880000053
In the formula:
$N^W$ is the sample data sequence combined with causal influence weights;
Figure FDA0003524005880000054
is the binary input of the features $f_{ijk}$ of the K-th sequence combined with causal influence weights;
Figure FDA0003524005880000055
is the feature representation, combined with causal influence weights, of the fault state of the same type of large-scale equipment; when predicting a fault, the
Figure FDA0003524005880000056
appended after the sample data is the same for all samples;
step 4.2, processing the time information so that it is embedded into a sequence with the same dimension as the features, using the following formula:
Figure FDA0003524005880000057
in the formula:
$z_k$ is the time information embedding of the k-th sequence;
tanh is the hyperbolic tangent function;
T is the time taken by the large-scale equipment from startup to failure;
$p_k$ is the time difference between the fault occurrence and the acquisition of the k-th slice, $p_k = T - t \times k$;
$w_z$ is the initialized weight matrix of the time information embedding sequence,
Figure FDA0003524005880000058
$b_z$ is the initialized bias matrix of the time information embedding sequence,
Figure FDA0003524005880000059
v is the dimension of $w_z$ and $b_z$;
step 4.3, generating the combined embedded data from the time information and the sample data combined with the causal influence weights:
Figure FDA0003524005880000061
the combined embedded sequence $C = [c_1, c_2, c_3, \ldots, c_K, c_T]$ is obtained from formula 12;
In the formula:
$c_k$ is the combined embedded data of the k-th sequence;
$c_K$ is the combined embedded data of the K-th sequence;
Figure FDA0003524005880000062
is the binary input of the features $f_{ijk}$ of the k-th sequence combined with causal influence weights;
$w_c$ and $b_c$ are the initialized weight matrix and bias matrix of the combined embedded sequence, where
Figure FDA0003524005880000063
Figure FDA0003524005880000064
C is the combined embedded sequence;
$c_T$ is the embedded data, combined with the causal influence weights, for the fault state of the same type of large-scale equipment; when predicting a fault, the $c_T$ appended after the sample data is the same for all samples;
step 4.4, using a single-layer Transformer to learn, from the combined embedded sequence, the relation between each embedded sequence containing time information and the large-scale equipment fault:
$[h_1, h_2, h_3, \ldots, h_K, h_T] = \mathrm{TF}([c_1, c_2, c_3, \ldots, c_K, c_T])$  (Formula 13);
in the formula:
TF(·) is the Transformer;
$h_K$ is the hidden layer data learned from $c_K$ by the Transformer;
$h_T$ is the hidden layer data learned from $c_T$ by the Transformer, namely the hidden layer representation of the fault state;
step 4.5, calculating the local score of each element of the combined embedded sequence, and generating the local feature attention weights from these attention scores;
Figure FDA0003524005880000076
in the formula:
$u_k$ is the local score of the k-th sequence of the combined embedded sequence;
$h_k$ is the hidden layer data learned by the Transformer;
Figure FDA0003524005880000071
is the initialized weight matrix of the local attention,
Figure FDA0003524005880000072
$b_u$ is the initialized bias matrix of the local attention,
Figure FDA0003524005880000073
l is the dimension of the hidden layer data $h_k$;
p is the dimension of $w_u$ and $b_u$;
after the local attention scores are obtained, the local feature attention weights are generated using the Softmax function, namely:
$W_{local} = \mathrm{Softmax}([u_1, u_2, u_3, \ldots, u_K]) = [l_1, l_2, l_3, \ldots, l_K]$  (Formula 15);
in the formula:
$W_{local}$ is the vector of local feature attention weights;
$u_K$ is the local feature score of the K-th sequence of the combined embedded sequence;
$l_K$ is the local feature attention weight of the K-th sequence of the combined embedded sequence;
step 4.6, judging the influence of the sample time on the fault prediction by using an attention mechanism; first, the hidden layer representation $h_T$ of the fault state obtained in step 4.4 is converted into a query vector of the attention mechanism;
$x = \mathrm{ReLU}(w_x h_T + b_x)$  (Formula 16);
in the formula:
x is the query vector of the attention mechanism;
ReLU(·) is the rectified linear unit activation function;
$h_T$ is the hidden layer representation of the fault state;
$w_x$ is the initialized weight matrix of the query vector,
Figure FDA0003524005880000074
$b_x$ is the initialized bias matrix of the query vector,
Figure FDA0003524005880000075
l is the dimension of the hidden layer data $h_k$;
q is the dimension of $w_x$ and $b_x$;
step 4.7, converting the time difference $p_k$ between the fault occurrence and the acquisition of the data slice into the key vector of the attention mechanism, as shown in formula 17:
Figure FDA0003524005880000081
to obtain $E = [e_1, e_2, e_3, \ldots, e_K]$;
In the formula:
$e_k$ is the key vector of the k-th sequence;
$e_K$ is the key vector of the K-th sequence;
E is the set of time key vectors of the attention mechanism;
$w_e$ is the initialized weight matrix of the time key vector,
Figure FDA0003524005880000082
$b_e$ is the initialized bias matrix of the time key vector,
Figure FDA0003524005880000083
q is the dimension of $w_e$ and $b_e$;
step 4.8, based on the query vector x and the key vectors $e_k$ obtained in steps 4.6 and 4.7, the global time attention score is obtained using the attention mechanism, as shown in formulas 18 and 19:
Figure FDA0003524005880000084
in the formula:
$r_k$ is the global time attention score of the k-th sequence;
Figure FDA0003524005880000085
is the transpose of the query vector x;
δ is the dimension of the time key vector;
applying a Softmax layer to normalize the attention scores, the global time attention weights can be expressed as:
$W_{global} = \mathrm{Softmax}([r_1, r_2, r_3, \ldots, r_K]) = [g_1, g_2, g_3, \ldots, g_K]$  (Formula 19);
in the formula:
$W_{global}$ is the vector of global time attention weights;
$r_K$ is the global time attention score of the K-th sequence;
$g_K$ is the global time attention weight of the K-th sequence;
step 4.9, combining the local feature scores of step 4.5 with the global time scores of step 4.8;
first, the embedding $h_T$ is used to assign weights to the local feature information and the time information, and these weights are normalized by Softmax, as shown in formula 20:
$y = \mathrm{Softmax}(w_v h_T + b_v) = [a_{local}, a_{global}]$  (Formula 20);
in the formula:
$h_T$ is the hidden layer representation of the fault state;
$w_v$ is the initialized weight matrix for allocating the integrated information,
Figure FDA0003524005880000091
$b_v$ is the initialized bias matrix for allocating the integrated information,
Figure FDA0003524005880000092
l is the dimension of the hidden layer data $h_K$;
obtaining the fused attention weight from the local feature attention weight and the global time attention weight, as shown in formula 21;
Figure FDA0003524005880000093
in the formula:
Figure FDA0003524005880000094
is the fused attention weight;
$a_{local}$ is the weight assigned to $W_{local}$ by the fault representation $h_T$;
$l_K$ is the local feature attention weight of the K-th sequence of the combined embedded sequence;
$a_{global}$ is the weight assigned to $W_{global}$ by the fault representation $h_T$;
$g_K$ is the global time attention weight of the K-th sequence;
step 4.10, normalizing the fused attention weights to obtain the attention score of the embedded sequence
Figure FDA0003524005880000095
As shown in equation 22:
Figure FDA0003524005880000101
5. The method for predicting the fault of the large-scale equipment with the characteristics of causality and attention in the industrial internet as claimed in claim 1, wherein the step 5 comprises the following sub-steps:
step 5.1, obtaining the fault prediction score of the large-scale equipment from the hidden layer data of step 4.4 and the attention scores of step 4.10:
Figure FDA0003524005880000102
in the formula:
Figure FDA0003524005880000103
is the fault prediction score of the large-scale equipment;
Figure FDA0003524005880000104
is the attention score of the embedded sequence;
$h_k$ is the hidden layer data learned from $c_k$ by the Transformer;
step 5.2, applying the Softmax function to the fault prediction score obtained in step 5.1 to obtain the fault prediction probability of the large-scale equipment;
Figure FDA0003524005880000105
in the formula:
$W_d$ is the initialized weight matrix for the fault prediction probability,
Figure FDA0003524005880000106
$b_d$ is the initialized bias matrix for the fault prediction probability,
Figure FDA0003524005880000107
$l$ is the dimension of the hidden layer data $h_k$;
and judging, from the fault prediction probability of the large-scale equipment, how likely the equipment is to develop a given fault type.
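Step 5.2 can be illustrated with a short Python sketch. The formula itself is an image in this text, so the form Softmax($W_d$ s + $b_d$), with one row of $W_d$ per fault type, is an assumption drawn from the symbol definitions. The names `fault_probabilities` and `most_likely`, the choice of three fault types, and all numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fault_probabilities(W_d, b_d, score):
    """Assumed form of step 5.2: Softmax(W_d * score + b_d).

    Assumption: W_d has one row per fault type and l columns.
    """
    logits = [
        sum(w * s for w, s in zip(row, score)) + b
        for row, b in zip(W_d, b_d)
    ]
    return softmax(logits)


# Hypothetical toy values: l = 2 and three fault types.
score = [1.5, 1.5]
W_d = [[0.2, -0.1], [0.0, 0.3], [-0.4, 0.1]]
b_d = [0.0, 0.0, 0.1]

probs = fault_probabilities(W_d, b_d, score)
# Index of the fault type with the highest predicted probability.
most_likely = max(range(len(probs)), key=probs.__getitem__)
print(probs, most_likely)
```

Taking the index of the largest probability is one simple way to carry out the final judging step. A deployment could instead compare each probability against a per-type threshold, which the claim does not rule out.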
CN202210187119.9A 2022-02-28 2022-02-28 Large-scale equipment fault prediction method with repeated cause and effect and attention in industrial internet Active CN114580472B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210187119.9A CN114580472B (en) 2022-02-28 2022-02-28 Large-scale equipment fault prediction method with repeated cause and effect and attention in industrial internet

Publications (2)

Publication Number Publication Date
CN114580472A (en) 2022-06-03
CN114580472B (en) 2022-12-23

Family

ID=81777013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210187119.9A Active CN114580472B (en) 2022-02-28 2022-02-28 Large-scale equipment fault prediction method with repeated cause and effect and attention in industrial internet

Country Status (1)

Country Link
CN (1) CN114580472B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116383096A (en) * 2023-06-06 2023-07-04 安徽思高智能科技有限公司 Micro-service system anomaly detection method and device based on multi-index time sequence prediction

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109084980A (en) * 2018-10-10 2018-12-25 北京交通大学 Bearing fault prediction technique and device based on equalization segmentation
CN109828549A (en) * 2019-01-28 2019-05-31 中国石油大学(华东) A kind of industry internet equipment fault prediction technique based on deep learning
WO2020112337A1 (en) * 2018-11-26 2020-06-04 Exxonmobil Research And Engineering Company Predictive maintenance
CN111460728A (en) * 2020-03-09 2020-07-28 华南理工大学 Method and device for predicting residual life of industrial equipment, storage medium and equipment
CN111856958A (en) * 2020-07-27 2020-10-30 西北大学 Intelligent household control system, control method, computer equipment and storage medium
CN112862209A (en) * 2021-03-05 2021-05-28 重庆大学 Industrial equipment monitoring data prediction method
CN112910695A (en) * 2021-01-22 2021-06-04 湖北工业大学 Network fault prediction method based on global attention time domain convolutional network
CN113283631A (en) * 2021-04-13 2021-08-20 中国石油大学(华东) Industrial equipment fault prediction method based on self-attention mechanism and time sequence convolution network
CN113987834A (en) * 2021-11-15 2022-01-28 华东交通大学 CAN-LSTM-based railway train bearing residual life prediction method

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
HUI WANGA et al.: "Intelligent Bearing Fault Diagnosis Using Multi-Head Attention-Based CNN", 《PROCEDIA MANUFACTURING》 *
WEN, YUXIN et al.: "Recent advances and trends of predictive maintenance from data-driven machine prognostics perspective", 《MEASUREMENT》 *
刘新: "基于TCN的时间序列数据特征提取研究与应用", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
王禹: "域间路由系统独立监控技术", 《中国博士学位论文全文数据库 信息科技辑》 *
苏小盟: "基于神经网络和注意力机制的变压器状态预测和故障诊断研究", 《中国优秀硕士学位论文全文数据库 工程科技II辑》 *
范子豪: "数据驱动的LTE-R网络故障诊断与预测研究", 《中国优秀硕士学位论文全文数据库 工程科技Ⅱ辑》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116383096A (en) * 2023-06-06 2023-07-04 安徽思高智能科技有限公司 Micro-service system anomaly detection method and device based on multi-index time sequence prediction
CN116383096B (en) * 2023-06-06 2023-08-18 安徽思高智能科技有限公司 Micro-service system anomaly detection method and device based on multi-index time sequence prediction

Also Published As

Publication number Publication date
CN114580472B (en) 2022-12-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant