CN114298200A - Abnormal data diagnosis method based on deep parallel time sequence relation network - Google Patents


Info

Publication number
CN114298200A
Authority
CN
China
Prior art keywords
vector
data
feature
module
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111589040.0A
Other languages
Chinese (zh)
Inventor
凡时财
杨淳
邹见效
徐红兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Higher Research Institute Of University Of Electronic Science And Technology Shenzhen
Original Assignee
Higher Research Institute Of University Of Electronic Science And Technology Shenzhen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Higher Research Institute Of University Of Electronic Science And Technology Shenzhen filed Critical Higher Research Institute Of University Of Electronic Science And Technology Shenzhen
Priority to CN202111589040.0A priority Critical patent/CN114298200A/en
Publication of CN114298200A publication Critical patent/CN114298200A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an abnormal data diagnosis method based on a deep parallel time sequence relation network, which comprises the steps of collecting characteristic data under various abnormal working conditions of an industrial production system and standardizing the characteristic data to obtain a training data matrix, then extracting to obtain a characteristic vector sequence, taking the characteristic vector sequence as input and taking the corresponding abnormal working condition serial number as output to form a training sample, and constructing a DPTRN model. The invention can improve the processing speed of time sequence data and ensure the detection performance of abnormal data.

Description

Abnormal data diagnosis method based on deep parallel time sequence relation network
Technical Field
The invention belongs to the technical field of diagnosis of abnormal data in an industrial process, and particularly relates to an abnormal data diagnosis method based on a deep parallel time sequence relation network.
Background
With the continuous development of modern industrial technology, modern industrial systems are growing in scale and complexity. If abnormal data occurring in an industrial process are not identified and handled in time, they cause economic losses and, in serious cases, threaten the safety of personnel. It is therefore essential to monitor industrial processes with robust and reliable abnormal data diagnosis techniques.
The traditional abnormal data diagnosis method based on modeling cannot adapt to the increasingly modern industrial system due to the characteristics of high complexity, poor maintainability, low robustness and the like, so that the method based on data driving is concerned more and more widely. The data driving-based method analyzes the potential rule of a data mode according to historical data acquired in an industrial process, and obtains a data model with both robustness and accuracy, so that abnormal data detection or fault diagnosis can be realized on new data.
Deep learning, the most popular data-driven approach in recent years, has achieved considerable practical success in industrial fault detection and diagnosis. Compared with traditional machine learning methods, deep learning avoids a large amount of manual feature engineering, automatically learns latent high-dimensional representations of the data, and shows clear advantages on various evaluation metrics.
The time-series data is data recorded in time series. Industrial processes generally have characteristics that evolve over time, and the characteristics of the evolution of the industrial process cannot be fully considered with only a single point-in-time characteristic. The abnormal data diagnosis method based on the time series data can more fully utilize the historical information, learn the change characteristics of the industrial process along with the evolution of time, and have strong characteristic extraction capability and abnormal data diagnosis capability.
Deep learning on time series data generally employs neural networks based on the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), or Gated Recurrent Unit (GRU). However, these structures extract features by processing the time series serially, which limits data processing speed and cannot meet the requirement for rapid, real-time diagnosis in industrial processes.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides an abnormal data diagnosis method based on a Deep Parallel Time sequence relation Network, which combines the relation characteristics among characteristic data at all times by adopting a Multilayer Perceptron (MLP), and provides a Deep Parallel Time sequence relation Network (DPTRN) model on the basis, thereby realizing the Parallel processing of Time sequence data, greatly improving the processing speed of the Time sequence data and ensuring the detection performance of abnormal data.
In order to achieve the above object, the method for diagnosing abnormal data based on a deep parallel time series relationship network of the present invention comprises the following steps:
S1: Under D abnormal working conditions of the industrial production system, a plurality of preset sensors collect working data for each abnormal working condition, and the feature vector at each sampling moment has dimension M. Denote the feature vector obtained at the t-th sampling moment under the d-th abnormal working condition as $x_d(t)$, $d = 1, 2, \dots, D$, $t = 1, 2, \dots, N_d$, where $N_d$ is the number of sampling moments under the d-th abnormal working condition. Taking the feature vectors $x_d(t)$ as row vectors and arranging them in ascending order of sampling time gives the original training data matrix $X_d$:
$$X_d = \begin{bmatrix} x_d(1) \\ x_d(2) \\ \vdots \\ x_d(N_d) \end{bmatrix}$$
S2: Normalize the original data matrix $X_d$ to obtain the normalized training data matrix $\hat{X}_d$:
$$\hat{X}_d = \begin{bmatrix} \hat{x}_d(1) \\ \hat{x}_d(2) \\ \vdots \\ \hat{x}_d(N_d) \end{bmatrix}$$
S3: Divide the feature vectors $\hat{x}_d(t)$ of the training data matrix $\hat{X}_d$ into $Q_d$ feature vector sequences $Z_d^q$ according to a preset time sequence length K:
$$Z_d^q = [\hat{x}_d((q-1)K+1), \hat{x}_d((q-1)K+2), \dots, \hat{x}_d(qK)]$$
where $q = 1, 2, \dots, Q_d$, $Q_d$ is the number of feature vector sequences obtained by dividing the training data matrix $\hat{X}_d$, $Q_d = \lfloor N_d / K \rfloor$, and $\lfloor \cdot \rfloor$ denotes rounding down;
S4: Take each feature vector sequence $Z_d^q$ obtained in step S3 as the input of a training sample and the corresponding abnormal working condition serial number d as the expected output, thereby forming a training sample;
s5: the method comprises the following steps of constructing a DPTRN model, wherein the DPTRN model comprises a relation module, a decoupling position vector calculation module, a relation weight calculation module, a historical information vector calculation module, a vector splicing module and a multilayer perceptron, wherein:
the relation module is used for extracting an input feature vector sequence to obtain a preliminary relation weight vector and sending the preliminary relation weight vector to the relation weight calculation module, and the specific method is as follows:
the input feature vector sequence is denoted as
$$Z = [z(1), z(2), \dots, z(K)]$$
where z(k) denotes the k-th M-dimensional feature vector in the feature vector sequence Z;
the relation module comprises K-1 relation units, wherein the k′-th relation unit computes the preliminary relation weight $rw_{pre}(k')$ between the feature vector z(k′) and the feature vector z(K), k′ = 1, 2, …, K-1, thereby obtaining the preliminary relation weight vector
$$RW_{pre} = [rw_{pre}(1), rw_{pre}(2), \dots, rw_{pre}(K-1)]$$
Each relation unit comprises a vector splicing unit and a multilayer perceptron unit, wherein:
the vector splicing unit splices the feature vector z(k′) and the feature vector z(K) to obtain a spliced feature vector C(k′) and sends it to the multilayer perceptron unit; the expression of C(k′) is:
$$C(k') = \mathrm{concat}\big(z(k'),\ z(K),\ z(K) - z(k'),\ z(K) + z(k')\big)$$
where concat(·) denotes vector splicing;
the multilayer perceptron unit receives the spliced feature vector C(k′) and processes it to obtain the preliminary relation weight $rw_{pre}(k')$ between the feature vector z(k′) and the feature vector z(K);
The decoupling position vector calculation module extracts the decoupled position vectors of the historical moments and the current moment from the input feature vector sequence Z and sends them to the relation weight calculation module; the decoupling position vector calculation module comprises a position coding module and a position vector decoupling module, wherein:
the position coding module generates a corresponding M-dimensional position code PE(k) for each feature vector z(k) in the feature vector sequence Z and sends it to the position vector decoupling module;
the position vector decoupling module calculates, from the position codes PE(k) of the feature vectors z(k), the decoupled position code dpe(k′) of the feature vector z(k′) at each of the first K-1 historical moments, thereby obtaining the decoupled position vector DPE = [dpe(1), dpe(2), …, dpe(K-1)] of the first K-1 historical moments; the decoupled position code dpe(k′) is calculated as:
dpe(k′)=[PE(k′)·Pos_query]⊙[PE(K)·Pos_key]
where ⊙ denotes the inner product, Pos_query denotes the position vector query matrix, and Pos_key denotes the position vector key matrix;
the relation weight calculation module calculates the final relation weight vector RW = [rw(1), rw(2), …, rw(K-1)] from the preliminary relation weight vector $RW_{pre}$ and the decoupled position vector DPE = [dpe(1), dpe(2), …, dpe(K-1)], wherein
Figure BDA0003428529830000044
and then sends the relation weight vector RW to the historical information vector calculation module;
the historical information vector calculation module processes the first K-1 feature vectors z(k′) in the feature vector sequence Z according to the relation weight vector RW to obtain the feature vectors
$$\tilde{z}(k') = rw(k') \otimes z(k')$$
where ⊗ denotes the vector outer product, then applies sum pooling to the K-1 feature vectors $\tilde{z}(k')$ to obtain the historical information vector HI, and sends the historical information vector HI to the vector splicing module;
the vector splicing module is used for splicing the historical information vector HI and the characteristic vector Z (K) in the characteristic vector sequence Z and sending the obtained splicing vector Con to the multilayer perceptron;
the multilayer perceptron is used for processing the splicing vector Con to obtain an abnormal working condition serial number corresponding to the input characteristic vector sequence;
S6: Train the DPTRN model constructed in step S5 using the training samples obtained in step S4 to obtain a trained DPTRN model;
S7: When abnormal data diagnosis needs to be performed on the industrial production system, use the same working data acquisition method as in step S1 to obtain the M-dimensional feature vectors x(T-K+1), …, x(T) at the current moment T and the previous K-1 moments, forming the data matrix $X_T$:
$$X_T = \begin{bmatrix} x(T-K+1) \\ \vdots \\ x(T-1) \\ x(T) \end{bmatrix}$$
Standardize the data matrix $X_T$ using the same method as in step S2 to obtain the standardized data matrix $\hat{X}_T$, and input $\hat{X}_T$ into the DPTRN model trained in step S6 to obtain the abnormal data diagnosis result.
The invention relates to an abnormal data diagnosis method based on a deep parallel time sequence relation network, which comprises the steps of collecting characteristic data under various abnormal working conditions of an industrial production system and standardizing the characteristic data to obtain a training data matrix, then extracting to obtain a characteristic vector sequence, taking the characteristic vector sequence as input and taking the corresponding abnormal working condition serial number as output to form a training sample, and constructing a DPTRN model.
The DPTRN model is constructed based on the multilayer perceptron, and can capture the relation between each historical moment and the current moment in time sequence data, so that the DPTRN model has the capability of processing data in parallel. Compared with the traditional method of extracting the time sequence data characteristics in a serial mode by using models such as RNN, LSTM and GRU, the data processing efficiency of the method is greatly improved. In addition, by means of technologies such as decoupling position coding and relation weight, the abnormal data diagnosis capability of the method is guaranteed.
Drawings
FIG. 1 is a flow chart of an embodiment of an abnormal data diagnosis method based on a deep parallel time series relationship network according to the present invention;
FIG. 2 is a block diagram of a DPTRN model according to the present invention;
FIG. 3 is a block diagram of a relationship unit in the present invention;
FIG. 4 is a block diagram of the decoupled position vector calculation module of the present invention.
Detailed Description
Embodiments of the present invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note that in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the subject matter of the present invention.
Examples
FIG. 1 is a flowchart of an embodiment of an abnormal data diagnosis method based on a deep parallel time series relationship network according to the present invention. As shown in fig. 1, the method for diagnosing abnormal data based on a deep parallel time series relationship network of the present invention specifically includes the steps of:
S101: Collecting training data:
Under D abnormal working conditions of the industrial production system, a plurality of preset sensors collect working data for each abnormal working condition, and the feature vector at each sampling moment has dimension M, i.e., the number of feature data at each sampling moment is M. Denote the feature vector obtained at the t-th sampling moment under the d-th abnormal working condition as $x_d(t)$, $d = 1, 2, \dots, D$, $t = 1, 2, \dots, N_d$, where $N_d$ is the number of sampling moments under the d-th abnormal working condition. Taking the feature vectors $x_d(t)$ as row vectors and arranging them in ascending order of sampling time gives the original training data matrix $X_d$:
$$X_d = \begin{bmatrix} x_d(1) \\ x_d(2) \\ \vdots \\ x_d(N_d) \end{bmatrix}$$
S102: training data normalization:
To facilitate subsequent data processing, the original data matrix $X_d$ is normalized to obtain the normalized training data matrix $\hat{X}_d$:
$$\hat{X}_d = \begin{bmatrix} \hat{x}_d(1) \\ \hat{x}_d(2) \\ \vdots \\ \hat{x}_d(N_d) \end{bmatrix}$$
The standardization formula in this embodiment is:
$$\hat{x}_d(t)(m) = \frac{x_d(t)(m) - \mathrm{mean}(X_d(m))}{\mathrm{std}(X_d(m))}$$
where $x_d(t)(m)$ denotes the m-th feature datum of the feature vector $x_d(t)$, m = 1, 2, …, M, $\hat{x}_d(t)(m)$ denotes the standardized value of $x_d(t)(m)$, $\mathrm{mean}(X_d(m))$ denotes the mean of the m-th feature datum over all feature vectors of the original data matrix $X_d$, and $\mathrm{std}(X_d(m))$ denotes the standard deviation of the m-th feature datum over all feature vectors of $X_d$.
With this formula, each feature is rescaled to have mean 0 and variance 1.
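The per-feature standardization of step S102 can be sketched as follows. This is a minimal illustration in plain Python rather than the patent's implementation: the function name `standardize_columns`, the use of the population standard deviation, and the handling of constant columns are assumptions that the source does not specify.

```python
import statistics

def standardize_columns(X):
    """Z-score each column of a row-major matrix (a list of row vectors).

    Returns the standardized matrix together with the per-column means and
    standard deviations, so the same statistics can be reused later.
    """
    columns = list(zip(*X))
    means = [statistics.fmean(col) for col in columns]
    # Assumption: population standard deviation. A zero value is replaced by
    # 1.0 so that a constant column maps to 0 instead of dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    X_hat = [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)] for row in X]
    return X_hat, means, stds

# Toy matrix with N_d = 3 sampling moments and M = 2 features.
X_d = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
X_hat, means, stds = standardize_columns(X_d)
```

Returning `means` and `stds` alongside the matrix is a design choice that makes it possible to apply identical scaling to data collected later.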
S103: data time sequencing:
To improve the accuracy of abnormal data diagnosis, the invention uses time series data as the input of the abnormal data diagnosis model, so the training data matrix $\hat{X}_d$ must be divided into time series. To avoid data leakage between samples, the invention uses non-overlapping time series windows, as follows:
Divide the feature vectors $\hat{x}_d(t)$ of the training data matrix $\hat{X}_d$ into $Q_d$ feature vector sequences $Z_d^q$ according to a preset time sequence length K:
$$Z_d^q = [\hat{x}_d((q-1)K+1), \hat{x}_d((q-1)K+2), \dots, \hat{x}_d(qK)]$$
where $q = 1, 2, \dots, Q_d$, $Q_d$ is the number of feature vector sequences obtained by dividing the training data matrix $\hat{X}_d$, $Q_d = \lfloor N_d / K \rfloor$, and $\lfloor \cdot \rfloor$ denotes rounding down.
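The non-overlapping windowing of step S103, together with the labelling described in step S104, can be sketched as below. The helper names `split_into_sequences` and `build_samples` are hypothetical, and the toy data are made up for illustration.

```python
def split_into_sequences(X_hat, K):
    """Cut a list of feature vectors into floor(N / K) non-overlapping windows."""
    if K <= 0:
        raise ValueError("K must be positive")
    Q = len(X_hat) // K  # rounding down discards the incomplete tail
    return [X_hat[q * K:(q + 1) * K] for q in range(Q)]

def build_samples(X_hat, K, d):
    """Pair every window with the serial number d of its working condition."""
    return [(seq, d) for seq in split_into_sequences(X_hat, K)]

# N_d = 7 one-dimensional feature vectors, window length K = 3.
rows = [[float(t)] for t in range(1, 8)]
sequences = split_into_sequences(rows, K=3)
samples = build_samples(rows, K=3, d=2)
```

Because consecutive windows share no rows, no sampling moment appears in two training samples, which is the leakage the text aims to avoid.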
S104: obtaining a training sample:
Each feature vector sequence $Z_d^q$ obtained in step S103 is taken as the input of a training sample and the corresponding abnormal working condition serial number d as the expected output, thereby forming a training sample.
S105: constructing a DPTRN model:
To realize abnormal data diagnosis, a DPTRN model is constructed in the invention. Fig. 2 is a structural diagram of the DPTRN model. As shown in Fig. 2, the DPTRN model comprises a relation module, a decoupling position vector calculation module, a relation weight calculation module, a historical information vector calculation module, a vector splicing module and a multilayer perceptron; each module is described in detail below.
The relation module extracts a preliminary relation weight vector from the input feature vector sequence and sends it to the relation weight calculation module, as follows:
The input feature vector sequence is denoted as
$$Z = [z(1), z(2), \dots, z(K)]$$
where z(k) denotes the k-th M-dimensional feature vector in the feature vector sequence Z.
The relation module comprises K-1 relation units, wherein the k′-th relation unit computes the preliminary relation weight $rw_{pre}(k')$ between the feature vector z(k′) and the feature vector z(K), k′ = 1, 2, …, K-1, thereby obtaining the preliminary relation weight vector
$$RW_{pre} = [rw_{pre}(1), rw_{pre}(2), \dots, rw_{pre}(K-1)]$$
Fig. 3 is a structural diagram of a relation unit in the present invention. As shown in Fig. 3, each relation unit comprises a vector splicing unit and a multilayer perceptron unit, wherein:
the vector splicing unit splices the feature vector z(k′) and the feature vector z(K) to obtain a spliced feature vector C(k′) and sends it to the multilayer perceptron unit; the expression of C(k′) is:
$$C(k') = \mathrm{concat}\big(z(k'),\ z(K),\ z(K) - z(k'),\ z(K) + z(k')\big)$$
where concat(·) denotes vector splicing.
In other words, the feature vector at the current moment K and the feature vector at the historical moment k′ are subtracted and added, and the results are spliced with the two original feature vectors, which improves the model's ability to mine relations in the data.
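The splicing performed by one relation unit can be illustrated with a short sketch. The ordering of the four parts and the sign of the difference (current minus historical) are assumptions made for this example; the text only states that a difference and a sum are spliced with the two original vectors. The name `relation_input` is hypothetical.

```python
def relation_input(z_hist, z_cur):
    """Build the spliced vector fed to the multilayer perceptron unit.

    Assumed layout: [z(k'), z(K), z(K) - z(k'), z(K) + z(k')].
    """
    if len(z_hist) != len(z_cur):
        raise ValueError("both feature vectors must have dimension M")
    diff = [c - h for h, c in zip(z_hist, z_cur)]
    total = [c + h for h, c in zip(z_hist, z_cur)]
    return list(z_hist) + list(z_cur) + diff + total

# M = 2, so the spliced vector has 4 * M = 8 entries.
C = relation_input([1.0, 2.0], [3.0, 5.0])
```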
The multilayer perceptron unit receives the spliced feature vector C(k′) and processes it to obtain the preliminary relation weight $rw_{pre}(k')$ between the feature vector z(k′) and the feature vector z(K). The multilayer perceptron (MLP) is a common neural network, and its principle and processing are not described here.
In this embodiment, in order to reduce the size of the DPTRN model and the difficulty of model training, the multi-layer perceptron units in the K-1 relationship units in this embodiment share the weight parameters, that is, share the weight for all historical moments.
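A sketch of how K-1 relation units can share one set of multilayer perceptron parameters is given below. Everything specific here is an assumption for illustration: a single hidden layer with ReLU, random initial weights, a scalar output, and the helper names `make_mlp`, `mlp_forward` and `preliminary_weights`. The point is only that each historical moment is scored against the current moment independently, with the same parameters and without Softmax.

```python
import random

def make_mlp(in_dim, hidden_dim, seed=0):
    """Create one set of parameters that every relation unit will reuse."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
    return W1, b1, w2, 0.0

def mlp_forward(params, x):
    """One hidden ReLU layer followed by a scalar linear output."""
    W1, b1, w2, b2 = params
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def preliminary_weights(Z, params):
    """Score each historical vector z(k') against the current vector z(K)."""
    z_cur = Z[-1]
    weights = []
    for z_hist in Z[:-1]:  # iterations are independent, so they can run in parallel
        diff = [c - h for h, c in zip(z_hist, z_cur)]
        total = [c + h for h, c in zip(z_hist, z_cur)]
        weights.append(mlp_forward(params, list(z_hist) + list(z_cur) + diff + total))
    return weights  # deliberately not normalized

# K = 4 moments, M = 2 features; the first two historical vectors are identical.
Z = [[1.0, 2.0], [1.0, 2.0], [0.0, 1.0], [3.0, 5.0]]
params = make_mlp(in_dim=4 * 2, hidden_dim=3)
rw_pre = preliminary_weights(Z, params)
```

With shared parameters, identical historical vectors receive identical preliminary weights, and the parameter count does not grow with K.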
Note that, to highlight the importance of specific historical moments and prevent important historical moments from being smoothed out, the relation module does not apply Softmax normalization to the final $RW_{pre}$; the preliminary relation weights over the historical moments therefore need not sum to 1.
The decoupling position vector calculation module extracts the decoupled position vectors of the historical moments and the current moment from the input feature vector sequence Z and sends them to the relation weight calculation module. The decoupled position vector is introduced so that the DPTRN model can take the relation between time nodes into account while processing time series data in parallel. Fig. 4 is a block diagram of the decoupling position vector calculation module of the present invention. As shown in Fig. 4, the decoupling position vector calculation module comprises a position coding module and a position vector decoupling module, wherein:
the position coding module generates a corresponding M-dimensional position code PE(k) for each feature vector z(k) in the feature vector sequence Z and sends it to the position vector decoupling module. In this embodiment, the position coding module adopts the absolute position coding method used in the BERT model.
The position vector decoupling module calculates, from the position codes PE(k) of the feature vectors z(k), the decoupled position code dpe(k′) of the feature vector z(k′) at each of the first K-1 historical moments, thereby obtaining the decoupled position vector DPE = [dpe(1), dpe(2), …, dpe(K-1)] of the first K-1 historical moments; the decoupled position code dpe(k′) is calculated as:
dpe(k′)=[PE(k′)·Pos_query]⊙[PE(K)·Pos_key]
where ⊙ denotes the inner product, Pos_query denotes a trainable position vector query matrix, and Pos_key denotes a trainable position vector key matrix.
Conventional methods usually add the position codes directly to the feature vectors, which easily introduces noise. The invention therefore uses a position vector query matrix and a position vector key matrix, which avoids the noise caused by the position codes, further learns the relation between each historical moment and the current moment, and improves the convergence efficiency and robustness of the network.
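The decoupled position code formula dpe(k′)=[PE(k′)·Pos_query]⊙[PE(K)·Pos_key] can be traced with a small worked sketch. The M×M shape of the two matrices, the fixed example values (in the model they are trainable), and the helper names are assumptions for illustration only.

```python
def vec_mat(v, A):
    """Multiply a row vector by a matrix given as a list of rows."""
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def decoupled_position_vector(PE, pos_query, pos_key):
    """Return [dpe(1), ..., dpe(K-1)] for position codes PE(1..K)."""
    key = vec_mat(PE[-1], pos_key)  # projection of the current moment K
    return [dot(vec_mat(pe, pos_query), key) for pe in PE[:-1]]

# K = 3 positions with M = 2 dimensional codes; the values are made up.
PE = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pos_query = [[1.0, 2.0], [0.0, 1.0]]
pos_key = [[1.0, 0.0], [0.0, 1.0]]
dpe = decoupled_position_vector(PE, pos_query, pos_key)  # [3.0, 1.0]
```

The position information enters only through these scalars, so the feature vectors themselves are never modified by the position codes.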
The relation weight calculation module calculates the final relation weight vector RW = [rw(1), rw(2), …, rw(K-1)] from the preliminary relation weight vector $RW_{pre}$ and the decoupled position vector DPE = [dpe(1), dpe(2), …, dpe(K-1)], wherein
Figure BDA0003428529830000094
The relation weight vector RW is then sent to the historical information vector calculation module.
The historical information vector calculation module processes the first K-1 feature vectors z(k′) in the feature vector sequence Z according to the relation weight vector RW to obtain the feature vectors
$$\tilde{z}(k') = rw(k') \otimes z(k')$$
where ⊗ denotes the vector outer product. Sum pooling is then applied to the K-1 feature vectors $\tilde{z}(k')$ to obtain the historical information vector HI, which is sent to the vector splicing module.
The vector splicing module is used for splicing the historical information vector HI and the feature vector Z (K) in the feature vector sequence Z and sending the obtained splicing vector Con to the multilayer perceptron.
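The weighting, sum pooling and splicing described above can be sketched as follows. The sketch assumes each relation weight rw(k′) is a scalar, so that weighting reduces to scaling z(k′); the source does not state the shape of rw(k′) explicitly, and the function names are hypothetical.

```python
def history_information(Z, RW):
    """Weight the K-1 historical vectors by RW and sum them into HI."""
    history = Z[:-1]
    if len(history) != len(RW):
        raise ValueError("RW must hold one weight per historical moment")
    HI = [0.0] * len(Z[0])
    for rw, z in zip(RW, history):
        for m, value in enumerate(z):
            HI[m] += rw * value  # sum pooling over the historical moments
    return HI

def splice_with_current(HI, Z):
    """Concatenate HI with the current feature vector z(K)."""
    return list(HI) + list(Z[-1])

# K = 3, M = 2; the weights are made-up values.
Z = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
RW = [0.5, 2.0]
HI = history_information(Z, RW)    # [6.5, 9.0]
Con = splice_with_current(HI, Z)   # [6.5, 9.0, 5.0, 6.0]
```

Sum pooling yields a fixed-length vector regardless of K, so the final multilayer perceptron always receives an input of the same dimension.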
The multilayer perceptron is used for processing the splicing vector Con to obtain an abnormal working condition serial number corresponding to the input characteristic vector sequence.
S106: training a DPTRN model:
The DPTRN model constructed in step S105 is trained using the training samples obtained in step S104 to obtain a trained DPTRN model.
To improve convergence speed and stability, this embodiment uses the Adam optimization strategy to train the DPTRN model; Adam offers high computational efficiency, low memory requirements and stable gradient propagation. In addition, to improve the robustness and generalization of the model, L2 regularization and Dropout are used during training to reduce the tendency to overfit.
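For readers unfamiliar with the optimizer, one Adam update with an L2 penalty can be written out as below. This is a generic textbook sketch, not the training code of the embodiment: the hyperparameter defaults are the commonly used ones, the L2 term is folded into the gradient, and the function name `adam_step` is hypothetical. Dropout is not shown.

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, l2=0.0):
    """Apply one Adam update to a flat parameter list; t is the 1-based step."""
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        g = g + l2 * p                            # gradient of (l2 / 2) * p ** 2
        m_i = beta1 * m_i + (1 - beta1) * g       # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g   # second-moment estimate
        m_hat = m_i / (1 - beta1 ** t)            # bias correction
        v_hat = v_i / (1 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v

theta1, m1, v1 = adam_step([1.0], [0.5], [0.0], [0.0], t=1)
decayed, _, _ = adam_step([1.0], [0.0], [0.0], [0.0], t=1, l2=0.1)
```

On the first step the bias-corrected update has magnitude close to `lr` whatever the gradient scale, which is one reason Adam tends to behave stably early in training.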
S107: Abnormal data diagnosis:
When abnormal data diagnosis needs to be performed on the industrial production system, the same working data acquisition method as in step S101 is used to obtain the M-dimensional feature vectors x(T-K+1), …, x(T) at the current moment T and the previous K-1 moments, forming the data matrix $X_T$:
$$X_T = \begin{bmatrix} x(T-K+1) \\ \vdots \\ x(T-1) \\ x(T) \end{bmatrix}$$
The data matrix $X_T$ is standardized using the same method as in step S102 to obtain the standardized data matrix $\hat{X}_T$, and $\hat{X}_T$ is input into the DPTRN model trained in step S106 to obtain the abnormal data diagnosis result.
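Preparing the model input at diagnosis time can be sketched as follows. Reusing the means and standard deviations computed on the training data is an assumption of this sketch (the text only says that the same standardization method is applied), and the names `latest_window` and `standardize_with` are hypothetical.

```python
def latest_window(stream, K):
    """Return the K most recent feature vectors, oldest first."""
    if len(stream) < K:
        raise ValueError("need at least K sampling moments")
    return stream[-K:]

def standardize_with(X, means, stds):
    """Apply previously computed per-feature statistics to a data matrix."""
    return [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)] for row in X]

# Five sampling moments of a one-dimensional feature; K = 3.
stream = [[1.0], [2.0], [3.0], [4.0], [5.0]]
X_T = latest_window(stream, K=3)                       # [[3.0], [4.0], [5.0]]
X_T_hat = standardize_with(X_T, means=[2.0], stds=[2.0])
```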
To better illustrate the technical effects of the invention, the invention is verified experimentally using specific examples. In this embodiment, two data sets are used: the TE chemical process data set and the KDDCUP99 data set.
The TE (Tennessee Eastman) chemical process is based on a real chemical process. It comprises five main units: a reactor, a condenser, a compressor, a separator and a stripper. Because its internal mechanism is relatively complex, the TE process is widely used to validate abnormal data diagnosis methods. The whole TE chemical process mainly comprises 22 continuous process measurement variables, 19 composition measurement variables and 12 manipulated variables, and can simulate the normal working condition and 20 abnormal working conditions.
KDDCUP is an annual competition in machine learning and data mining organized by ACM SIGKDD, and KDDCUP99 is the data set used for the 1999 competition. The data set contains 9 weeks of network connection data collected from a simulated U.S. Air Force local area network and has two classes: normal and attack. It has 41 features, 9 of which are discrete and the rest continuous. Because the data were collected in strict chronological order, the data set is widely used in research on time series methods.
Abnormal data diagnosis methods based on 4 models are used for comparison in the experiment: MLP, LSTM+MLP, Bi_LSTM+MLP and 1DCNN+MLP. To ensure that the experimental results are affected only by the feature extraction part, the classification layer of every network structure uses an MLP with the same structure. Table 1 lists the structural information of the 4 models in this experiment.
Figure BDA0003428529830000111
TABLE 1
In addition, two variants of the DPTRN model adopted by the invention are provided as comparison methods: DPTRN_a, which uses no position vector, and DPTRN_b, which uses a position vector without decoupling. In terms of performance parameters, training time, inference time, recall rate, accuracy and F1 value of a single sample are used as evaluation parameters in the experiment.
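The classification metrics named above can be computed as in the following sketch. It reports recall, precision and F1 for one class treated as positive; reading the translated term "accuracy" as precision is an assumption here, and the labels are made-up values rather than experimental results.

```python
def recall_precision_f1(y_true, y_pred, positive):
    """Compute recall, precision and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, precision, f1

y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
recall, precision, f1 = recall_precision_f1(y_true, y_pred, positive=1)
```

For a multi-class task such as the TE data set, one common choice is to compute these values per class and average them, although the source does not say which averaging was used.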
Table 2 is a table of performance parameters for abnormal data diagnosis of the TE chemical process data set using the present invention and 6 comparison methods.
Figure BDA0003428529830000112
TABLE 2
Table 3 is a table of abnormal data diagnostic performance parameters for KDDCUP99 data sets using the present invention and 6 comparison methods.
Figure BDA0003428529830000121
TABLE 3
As shown in Tables 2 and 3, with a comparable number of parameters, the method of the invention exploits parallel computation to achieve better abnormal data diagnosis performance while keeping the training time and inference time low.
Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand the invention, it should be understood that the invention is not limited to the scope of these embodiments. Various changes will be apparent to those skilled in the art as long as they fall within the spirit and scope of the invention as defined and determined by the appended claims, and all inventions that make use of the inventive concept are protected.

Claims (4)

1. An abnormal data diagnosis method based on a deep parallel time sequence relation network is characterized by comprising the following steps:
S1: Under D abnormal working conditions of the industrial production system, a plurality of preset sensors collect working data for each abnormal working condition, and the feature vector at each sampling moment has dimension M. Denote the feature vector obtained at the t-th sampling moment under the d-th abnormal working condition as $x_d(t)$, $d = 1, 2, \dots, D$, $t = 1, 2, \dots, N_d$, where $N_d$ is the number of sampling moments under the d-th abnormal working condition. Taking the feature vectors $x_d(t)$ as row vectors and arranging them in ascending order of sampling time gives the original training data matrix $X_d$:
$$X_d = \begin{bmatrix} x_d(1) \\ x_d(2) \\ \vdots \\ x_d(N_d) \end{bmatrix}$$
S2: Normalize the original data matrix $X_d$ to obtain the normalized training data matrix $\hat{X}_d$:
$$\hat{X}_d = \begin{bmatrix} \hat{x}_d(1) \\ \hat{x}_d(2) \\ \vdots \\ \hat{x}_d(N_d) \end{bmatrix}$$
S3: Divide the feature vectors $\hat{x}_d(t)$ of the training data matrix $\hat{X}_d$ into $Q_d$ feature vector sequences $Z_d^q$ according to a preset time sequence length K:
$$Z_d^q = [\hat{x}_d((q-1)K+1), \hat{x}_d((q-1)K+2), \dots, \hat{x}_d(qK)]$$
where $q = 1, 2, \dots, Q_d$, $Q_d$ is the number of feature vector sequences obtained by dividing the training data matrix $\hat{X}_d$, $Q_d = \lfloor N_d / K \rfloor$, and $\lfloor \cdot \rfloor$ denotes rounding down;
S4: Take each feature vector sequence $Z_d^q$ obtained in step S3 as the input of a training sample and the corresponding abnormal working condition serial number d as the expected output, thereby forming a training sample;
s5: the method comprises the following steps of constructing a DPTRN model, wherein the DPTRN model comprises a relation module, a decoupling position vector calculation module, a relation weight calculation module, a historical information vector calculation module, a vector splicing module and a multilayer perceptron, wherein:
the relation module is used for extracting an input feature vector sequence to obtain a preliminary relation weight vector and sending the preliminary relation weight vector to the relation weight calculation module, and the specific method is as follows:
the input feature vector sequence is denoted Z = [z(1), z(2), …, z(K)], where z(k) represents the k-th M-dimensional feature vector in the feature vector sequence Z;
the relation module comprises K-1 relation units, the k′-th relation unit being used to calculate the preliminary relation weight between the feature vector z(k′) and the feature vector z(K)
Figure FDA0003428529820000022
thereby obtaining a preliminary relation weight vector
Figure FDA0003428529820000023
each relation unit comprises a vector splicing unit and a multilayer perceptron unit, wherein:
the vector splicing unit is used to splice the feature vector z(k′) with the feature vector z(K) to obtain a spliced feature vector C(k′) and send it to the multilayer perceptron unit, the spliced feature vector C(k′) being expressed as follows:
C(k′) = concat(z(k′), z(K))
where concat() represents vector splicing;
the multilayer perceptron unit receives the spliced feature vector C(k′) and processes it to obtain the preliminary relation weight between the feature vector z(k′) and the feature vector z(K)
Figure FDA0003428529820000025
the decoupling position vector calculation module is used to extract, from the input feature vector sequence Z, the decoupling position vectors of the historical moments and the current moment and send them to the relation weight calculation module; the decoupling position vector calculation module comprises a position coding module and a position vector decoupling module, wherein:
the position coding module is used to generate a corresponding M-dimensional position code PE(k) for each feature vector z(k) in the feature vector sequence Z and send the position codes to the position vector decoupling module;
the position vector decoupling module calculates, from the position code PE(k) of each feature vector z(k), the decoupled position code dpe(k′) of the feature vector z(k′) at each of the previous K-1 historical moments, thereby obtaining the decoupling position vector DPE = [dpe(1), dpe(2), …, dpe(K-1)] of the previous K-1 historical moments, the decoupled position code dpe(k′) being calculated as follows:
dpe(k′) = [PE(k′)·Pos_query] ⊙ [PE(K)·Pos_key]
where ⊙ denotes the inner product,
Figure FDA0003428529820000026
represents the position vector query matrix, and
Figure FDA0003428529820000027
represents the position vector key matrix;
the relation weight calculation module is used to calculate, from the preliminary relation weight vector
Figure FDA0003428529820000031
and the decoupling position vector DPE = [dpe(1), dpe(2), …, dpe(K-1)], the final relation weight vector RW = [rw(1), rw(2), …, rw(K-1)], where
Figure FDA0003428529820000032
and then to send the relation weight vector RW to the historical information vector calculation module;
the historical information vector calculation module is used to process the first K-1 feature vectors z(k′) in the feature vector sequence Z according to the relation weight vector RW to obtain feature vectors
Figure FDA0003428529820000033
where
Figure FDA0003428529820000034
represents the vector outer product, then to apply sum pooling to the K-1 feature vectors
Figure FDA0003428529820000035
to obtain a historical information vector HI, and to send the historical information vector HI to the vector splicing module;
the vector splicing module is used to splice the historical information vector HI with the feature vector z(K) in the feature vector sequence Z and send the resulting spliced vector Con to the multilayer perceptron;
the multilayer perceptron is used to process the spliced vector Con to obtain the abnormal working condition serial number corresponding to the input feature vector sequence;
S6: Training the DPTRN model constructed in step S5 with the training samples obtained in step S4 to obtain a trained DPTRN model;
S7: When abnormal data diagnosis is required for the industrial production system, the same working data acquisition method as in step S1 is used to obtain the M-dimensional feature vectors x(T-k) at the current moment T and the previous K-1 moments, which form a data matrix X_T
Figure FDA0003428529820000036
the data matrix X_T is normalized by the same method as in step S2 to obtain a normalized data matrix
Figure FDA0003428529820000037
and the data matrix
Figure FDA0003428529820000038
is input into the DPTRN model trained in step S6 to obtain the abnormal data diagnosis result.
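The division of step S3 and the sample construction of step S4 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes consecutive, non-overlapping windows (which is what the rounded-down count of sequences suggests), and the function names `split_into_sequences` and `build_training_samples` are hypothetical.

```python
import numpy as np

def split_into_sequences(X_norm, K):
    """Split an (N_d, M) normalized matrix into floor(N_d / K) sequences of K rows."""
    N_d, M = X_norm.shape
    Q_d = N_d // K  # rounding down, as in step S3
    # Assumption: windows are consecutive and non-overlapping; leftover rows are dropped.
    return X_norm[:Q_d * K].reshape(Q_d, K, M)

def build_training_samples(matrices, K):
    """matrices[d-1] holds the normalized matrix of condition d (1-based serial number)."""
    inputs, labels = [], []
    for d, X_norm in enumerate(matrices, start=1):
        seqs = split_into_sequences(X_norm, K)
        inputs.append(seqs)
        labels.append(np.full(len(seqs), d))  # expected output is the serial number d
    return np.concatenate(inputs), np.concatenate(labels)

# Toy data: two conditions with N_1 = 7 and N_2 = 5 sampling moments, M = 2.
X1 = np.arange(14, dtype=float).reshape(7, 2)
X2 = np.arange(10, dtype=float).reshape(5, 2)
inputs, labels = build_training_samples([X1, X2], K=3)
print(inputs.shape, labels)
```

With K = 3 the first condition yields two sequences and the second yields one, so three training samples are formed in total.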
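The last three modules of step S5 (historical information vector, vector splicing, multilayer perceptron) can be sketched as below, taking the relation weight vector RW as given. Several points are assumptions rather than statements of the claim: each rw(k′) is treated as a scalar so that weighting reduces to scaling z(k′), the perceptron has one hidden layer with ReLU, and the sizes H and D and all parameter names are hypothetical.

```python
import numpy as np

def history_information(Z, RW):
    """Weight the first K-1 feature vectors by RW and sum-pool them into HI."""
    # Z: (K, M); RW: (K-1,). Assumption: rw(k') is a scalar weight on z(k').
    weighted = RW[:, None] * Z[:-1]
    return weighted.sum(axis=0)  # sum pooling over the K-1 historical moments

def classify(Z, RW, W1, b1, W2, b2):
    """Splice HI with z(K) and map the result to a 1-based condition serial number."""
    HI = history_information(Z, RW)
    con = np.concatenate([HI, Z[-1]])        # spliced vector Con, length 2M
    hidden = np.maximum(0.0, con @ W1 + b1)  # hidden layer with ReLU (assumed)
    logits = hidden @ W2 + b2                # one score per abnormal condition
    return int(np.argmax(logits)) + 1

rng = np.random.default_rng(0)
K, M, H, D = 4, 3, 8, 5                      # illustrative sizes only
Z = rng.normal(size=(K, M))
RW = np.array([0.2, 0.3, 0.5])               # stand-in relation weights
W1, b1 = rng.normal(size=(2 * M, H)), np.zeros(H)
W2, b2 = rng.normal(size=(H, D)), np.zeros(D)

HI = history_information(Z, RW)
pred = classify(Z, RW, W1, b1, W2, b2)
```

Because the weighting and pooling act on all K-1 historical vectors at once, no step depends on the result of a previous time step, which is consistent with the parallel processing the method aims for.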
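The online diagnosis of step S7 amounts to keeping the most recent K feature vectors, normalizing them, and passing the window to the trained model. The sketch below is one possible arrangement under stated assumptions: the per-feature mean and standard deviation are stored and reused (the claim only says the same normalization method is applied), `OnlineDiagnoser` is a hypothetical name, and the model is a stand-in callable, not a trained DPTRN.

```python
from collections import deque
import numpy as np

class OnlineDiagnoser:
    """Buffers the latest K feature vectors and diagnoses once the window is full."""

    def __init__(self, model, mean, std, K):
        self.model = model            # callable: (K, M) array -> serial number
        self.mean, self.std = mean, std
        self.buffer = deque(maxlen=K) # oldest sample is discarded automatically

    def push(self, x):
        self.buffer.append(np.asarray(x, dtype=float))
        if len(self.buffer) < self.buffer.maxlen:
            return None               # fewer than K samples collected so far
        X_T = np.stack(list(self.buffer))         # rows in ascending time order
        X_T_norm = (X_T - self.mean) / self.std   # assumed: reuse stored statistics
        return self.model(X_T_norm)

# Stand-in model for illustration: thresholds the newest sample's first feature.
stand_in = lambda Z: 1 if Z[-1, 0] > 2.5 else 2
diag = OnlineDiagnoser(stand_in, mean=np.zeros(2), std=np.ones(2), K=3)
results = [diag.push([t, -t]) for t in range(5)]
print(results)
```

The first K-1 calls return `None` because the data matrix X_T cannot yet be formed.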
2. The abnormal data diagnosis method according to claim 1, wherein the normalization in step S2 is calculated as:
Figure FDA0003428529820000039
where x_d(t)(m) represents the m-th feature datum of the feature vector x_d(t), m = 1, 2, …, M,
Figure FDA0003428529820000041
represents the normalized value of the feature datum x_d(t)(m), mean(X_d(m)) represents the mean of the m-th feature datum over all feature vectors of the original data matrix X_d, and std(X_d(m)) represents the standard deviation of the m-th feature datum over all feature vectors of the original data matrix X_d.
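The normalization described in claim 2 is a per-feature subtraction of the mean followed by division by std(·), read here as the standard deviation. A minimal sketch follows; whether the population or the sample standard deviation is intended is not stated, so the population form (`ddof=0`) is an assumption, as is the function name `standardize`.

```python
import numpy as np

def standardize(X_d):
    """Normalize each column (feature) of an (N_d, M) matrix to zero mean, unit std."""
    mean = X_d.mean(axis=0)   # mean of the m-th feature over all sampling moments
    std = X_d.std(axis=0)     # population standard deviation (ddof=0, assumed)
    # Note: a constant feature would give std = 0; handling that is left out here.
    return (X_d - mean) / std, mean, std

X_d = np.array([[1.0, 10.0],
                [2.0, 20.0],
                [3.0, 30.0]])
X_norm, mean, std = standardize(X_d)
print(X_norm)
```

Returning the statistics alongside the normalized matrix makes it possible to apply the same transformation to data collected later.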
3. The abnormal data diagnosis method according to claim 1, wherein the multilayer perceptron units of the K-1 relation units in step S5 share weight parameters.
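One consequence of the weight sharing in claim 3 is that the K-1 relation units can be evaluated as a single batched operation instead of one after another. The sketch below compares the two forms; the one-hidden-layer perceptron with ReLU, the scalar output, the hidden size H and all names are assumptions made for illustration.

```python
import numpy as np

def mlp_unit(c, W1, b1, w2, b2):
    """Shared perceptron: maps spliced vector(s) of length 2M to a scalar weight each."""
    hidden = np.maximum(0.0, c @ W1 + b1)  # ReLU hidden layer (assumed)
    return hidden @ w2 + b2

def preliminary_weights_loop(Z, params):
    """One relation unit per historical vector z(k'), evaluated sequentially."""
    z_K = Z[-1]
    return np.array([mlp_unit(np.concatenate([z, z_K]), *params) for z in Z[:-1]])

def preliminary_weights_batched(Z, params):
    """All K-1 units at once, possible because they share the same parameters."""
    K = Z.shape[0]
    C = np.concatenate([Z[:-1], np.tile(Z[-1], (K - 1, 1))], axis=1)  # (K-1, 2M)
    return mlp_unit(C, *params)

rng = np.random.default_rng(1)
K, M, H = 6, 4, 8                           # illustrative sizes only
Z = rng.normal(size=(K, M))
params = (rng.normal(size=(2 * M, H)), rng.normal(size=H),
          rng.normal(size=H), 0.1)          # W1, b1, w2, b2

loop_out = preliminary_weights_loop(Z, params)
batched_out = preliminary_weights_batched(Z, params)
```

Both functions produce the same K-1 preliminary weights; the batched form simply replaces K-1 small matrix products with one larger one.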
4. The abnormal data diagnosis method according to claim 1, wherein the position coding module in step S5 adopts the absolute position encoding method of the BERT model.
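BERT's absolute position encoding is a learned lookup table with one embedding per position. The sketch below combines such a table with the decoupled position code formula of claim 1, dpe(k′) = [PE(k′)·Pos_query] ⊙ [PE(K)·Pos_key]. The table is filled with random values in place of learned ones, and the M×M shape of Pos_query and Pos_key is an assumption, since their dimensions are not recoverable from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
K, M = 5, 4                                   # illustrative sizes only

# One M-dimensional embedding per position 1..K; learned during training in
# practice, random here for illustration.
position_table = rng.normal(scale=0.02, size=(K, M))
Pos_query = rng.normal(size=(M, M))           # shape assumed
Pos_key = rng.normal(size=(M, M))             # shape assumed

def PE(k):
    """Position code of the k-th feature vector (k is 1-based, as in the claims)."""
    return position_table[k - 1]

def dpe(k_prime):
    """Inner product of the projected historical and current position codes."""
    query = PE(k_prime) @ Pos_query
    key = PE(K) @ Pos_key
    return float(query @ key)

# Decoupling position vector for the K-1 historical moments.
DPE = np.array([dpe(k) for k in range(1, K)])

# Equivalent batched form: project all historical codes at once.
DPE_batched = (position_table[:-1] @ Pos_query) @ (position_table[-1] @ Pos_key)
```

Since the position codes depend only on the index k and not on the sensor data, DPE could in principle be computed once per model rather than once per input sequence.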
CN202111589040.0A 2021-12-23 2021-12-23 Abnormal data diagnosis method based on deep parallel time sequence relation network Pending CN114298200A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111589040.0A CN114298200A (en) 2021-12-23 2021-12-23 Abnormal data diagnosis method based on deep parallel time sequence relation network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111589040.0A CN114298200A (en) 2021-12-23 2021-12-23 Abnormal data diagnosis method based on deep parallel time sequence relation network

Publications (1)

Publication Number Publication Date
CN114298200A (en) 2022-04-08

Family

ID=80968964

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111589040.0A Pending CN114298200A (en) 2021-12-23 2021-12-23 Abnormal data diagnosis method based on deep parallel time sequence relation network

Country Status (1)

Country Link
CN (1) CN114298200A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination