CN110751169A - Time sequence classification method based on relation change among multivariate variables
- Publication number: CN110751169A
- Application number: CN201910833290.0A
- Authority: CN (China)
- Prior art keywords: variables, correlation coefficient, time, partial correlation, relation
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Complex Calculations (AREA)
Abstract
The invention provides a time series classification method based on changes in the relationships among multiple variables, comprising the following steps: acquiring sample data from an observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix; encoding the partial correlation coefficient matrix with a convolutional neural network to obtain a corresponding feature map; flattening each feature map into a feature vector and feeding the vectors sequentially into a long short-term memory (LSTM) neural network, thereby obtaining a hidden state that captures the change pattern of the relationships among the variables; and inputting the hidden state into a label classifier, which outputs the corresponding sample class, completing the classification of the time series. The method takes full account of the relationships among different variables in time series data, classifies on the basis of how those relationships change, is more robust to noisy input values, and achieves high classification accuracy.
Description
Technical Field
The invention relates to the technical field of data mining, and in particular to a time series classification method based on changes in the relationships among multiple variables.
Background
Time series data is now increasingly common in industrial systems, information systems, medical and health care, financial markets, and other fields. Time series classification, for example for anomaly detection, has therefore become an important and valuable research topic. Conventional similarity-based time series classification methods include K-nearest neighbor (KNN) and Dynamic Time Warping (DTW). However, methods of this type are sensitive only to the values of the variables and do not take the relationships between different variables into account.
Another currently popular class of methods applies a series of feature transformations to time series data in order to mine patterns for classification; examples include multi-layer perceptrons (MLP), long short-term memory neural networks (LSTM), and convolutional neural networks (CNN). Although such methods implicitly capture the relationships between different variables in feature space, they have difficulty characterizing how those relationships change over time. In time series classification, a particular pattern of change in the relationships between variables often corresponds to a class. For example, in an information system it is common for an increase in the "CPU usage" of a server to raise the "CPU temperature", and for the rise in "CPU temperature" to raise the "fan speed", so that the "CPU temperature" stays relatively stable even as "CPU usage" continues to increase. Over this period, the relationship between "CPU usage" and "CPU temperature" changes from dependent to independent. However, when the server's fan fails, "CPU temperature" and "fan speed" may become unrelated, and an increase in "CPU usage" causes the "CPU temperature" to keep rising, possibly until the server goes down. In that case, the relationship between "CPU usage" and "CPU temperature" never becomes independent during the period.
The relationships between the variables differ between these two classes, but existing methods cannot express this difference well or classify on the basis of it.
Disclosure of Invention
The invention provides a time series classification method based on changes in the relationships among multiple variables, aiming to overcome the technical defect that existing time series classification methods cannot effectively express the change pattern of the relationships among variables and classify accordingly.
In order to solve the technical problems, the technical scheme of the invention is as follows:
A time series classification method based on changes in the relationships among multiple variables comprises the following steps:
S1: acquiring a labeled observation data set;
S2: acquiring sample data from the observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix, thereby obtaining a partial correlation coefficient matrix at each moment;
S3: inputting the partial correlation coefficient matrix at each moment into a convolutional neural network (CNN), which encodes the matrix to obtain a corresponding feature map;
S4: flattening each feature map into a feature vector and feeding the vectors sequentially into a long short-term memory (LSTM) neural network, thereby obtaining a hidden state that captures the change pattern of the relationships among the variables;
S5: inputting the hidden state into a label classifier, which outputs the corresponding sample class, completing the classification of the time series.
Wherein step S1 specifically comprises:
sampling at fixed intervals using the data acquisition device of an industrial system or an information system; a set of index values is obtained at each sampling moment, the system state at that moment is represented by a label variable, and an observation data set is collected after the system has run for a period of time, wherein:
the observation data set is characterized as $X = [x_1, x_2, \ldots, x_m]$, where $m$ is the number of samples; the sample data at time $t$ is $x_t \in \mathbb{R}^n$, i.e. it contains $n$ variables, and each sample corresponds to a label variable $y_t$, where $y_t \in \mathbb{R}$.
Wherein step S2 specifically comprises:
S21: obtaining from the observation data set sample data of time length $w$, $X_t = [x_{t-w+1}, x_{t-w+2}, \ldots, x_t]$, where $X_t$ is a time slice of $X$ and is used to calculate the partial correlation coefficient matrix $P_t \in \mathbb{R}^{n \times n}$, which serves as the relation matrix between the variables at time $t$;
S22: letting the time series of two variables $i$ and $j$ over this period be given [symbols lost in extraction], the corresponding coefficient of the partial correlation coefficient matrix is calculated as follows: [formula lost in extraction]
wherein [symbol lost in extraction] is an element of the inverse of the covariance matrix $\Sigma_t$, and each element of the covariance matrix $\Sigma_t$ is calculated as follows: [formula lost in extraction]
S23: obtaining the partial correlation coefficient matrix $P_t$ at each moment according to the calculation of step S22, which represents the relationships between the different variables at each moment.
Wherein, the step S3 specifically includes: a partial correlation coefficient matrix P with a time length of lt-l+1,Pt-l+2,ΛPtInputting the convolution neural network, and coding the partial correlation coefficient matrix by the convolution neural network to obtain corresponding l characteristic graphs and corresponding labels y at each momentt。
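The encoding of step S3, together with the flattening that step S4 applies to each feature map, can be sketched as below. The source does not specify the depth of the network, the kernel size, or the activation, so a single convolutional layer with ReLU is assumed here; `conv2d_valid` and `encode_sequence` are hypothetical names and the kernels are supplied by the caller rather than learned.

```python
import numpy as np


def conv2d_valid(matrix, kernels, bias):
    """One convolutional layer (valid cross-correlation) followed by ReLU.

    matrix:  (n, n) partial correlation coefficient matrix.
    kernels: (k, f, f) filter bank; bias: (k,).
    Returns feature maps of shape (k, n - f + 1, n - f + 1).
    """
    f = kernels.shape[1]
    # All f x f patches of the matrix, shape (out, out, f, f).
    patches = np.lib.stride_tricks.sliding_window_view(matrix, (f, f))
    # Contract every patch with every kernel.
    maps = np.einsum("xyab,kab->kxy", patches, kernels) + bias[:, None, None]
    return np.maximum(maps, 0.0)


def encode_sequence(P_seq, kernels, bias):
    """Apply the same layer to each of the l matrices.

    P_seq: (l, n, n). Returns (l, k, out, out).
    """
    return np.stack([conv2d_valid(P, kernels, bias) for P in P_seq])


def flatten_feature_maps(feature_maps):
    """Stretch each time step's feature maps into one vector: (l, k * out * out)."""
    return feature_maps.reshape(feature_maps.shape[0], -1)
```

Sharing one set of kernels across all $l$ time steps is a design assumption; it keeps the feature vectors of different moments comparable before they reach the recurrent network.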
Wherein, in step S4, the hidden state $h_t$ captures the change pattern of the relationships among the variables across the $l$ time steps.
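How a hidden state is produced from the $l$ feature vectors can be illustrated with a minimal forward pass of a standard LSTM cell. This is a sketch, not the patent's network: the gate layout, the zero initial state, and the names `lstm_forward`, `W`, `U`, and `b` are assumptions, and the weights are passed in rather than trained.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_forward(inputs, W, U, b):
    """Run a standard LSTM cell over a sequence and return the last hidden state.

    inputs: (l, d) feature vectors, one per time step.
    W: (4 * h, d) input weights, U: (4 * h, h) recurrent weights, b: (4 * h,).
    The four blocks are the input, forget, candidate and output gates.
    """
    hidden_size = U.shape[1]
    h = np.zeros(hidden_size)   # hidden state
    c = np.zeros(hidden_size)   # cell state
    for x in inputs:
        gates = W @ x + U @ h + b
        i, f, g, o = np.split(gates, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g       # keep part of the old memory, add new content
        h = o * np.tanh(c)      # expose a gated view of the memory
    return h
```

Because the cell state is carried from one step to the next, the final vector depends on the order in which the relation matrices occur, which is what allows it to reflect a change pattern rather than a single snapshot.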
In step S5, the label classifier uses a fully connected layer and outputs the predicted sample class.
Wherein the method further comprises step S6: using the cross entropy of the output sample class as the loss function, repeating steps S3 to S5 with a gradient descent method so as to improve the classification accuracy.
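Steps S5 and S6 can be sketched for the classifier alone. The example below trains only a fully connected softmax layer on fixed hidden states by plain gradient descent on the cross-entropy loss; in the method as described, the gradient would also flow back through the LSTM and the CNN, which is omitted here. Integer class indices are assumed for the labels, and `train_classifier`, `lr`, and `steps` are hypothetical names and settings.

```python
import numpy as np


def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


def cross_entropy(probs, labels):
    """Mean negative log-probability of the true class."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))


def train_classifier(H, labels, num_classes, lr=0.5, steps=200, seed=0):
    """Fit a fully connected softmax layer on hidden states by gradient descent.

    H: (N, h) hidden states; labels: (N,) integer class indices (assumption).
    Returns the weights, the bias and the loss recorded at every step.
    """
    rng = np.random.default_rng(seed)
    W = 0.01 * rng.standard_normal((H.shape[1], num_classes))
    b = np.zeros(num_classes)
    losses = []
    for _ in range(steps):
        probs = softmax(H @ W + b)
        losses.append(cross_entropy(probs, labels))
        # Gradient of the mean cross entropy with respect to the logits.
        grad_logits = probs.copy()
        grad_logits[np.arange(len(labels)), labels] -= 1.0
        grad_logits /= len(labels)
        W -= lr * (H.T @ grad_logits)
        b -= lr * grad_logits.sum(axis=0)
    return W, b, losses
```

The predicted class for a hidden state is the index of the largest softmax probability, for example `np.argmax(softmax(H @ W + b), axis=1)`.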
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
The time series classification method based on changes in the relationships among multiple variables takes full account of the relationships among different variables in time series data and classifies on the basis of how those relationships change. It remedies the defects of existing methods, fully expresses the change pattern of the relationships among different variables in time series data, is more robust to noisy input values, achieves high classification accuracy, and can be applied to time series classification problems in industrial systems, information systems, medical and health care, financial markets, and other fields.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in FIG. 1, a time series classification method based on changes in the relationships among multiple variables comprises the following steps:
S1: acquiring a labeled observation data set;
S2: acquiring sample data from the observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix, thereby obtaining a partial correlation coefficient matrix at each moment;
S3: inputting the partial correlation coefficient matrix at each moment into a convolutional neural network (CNN), which encodes the matrix to obtain a corresponding feature map;
S4: flattening each feature map into a feature vector and feeding the vectors sequentially into a long short-term memory (LSTM) neural network, thereby obtaining a hidden state that captures the change pattern of the relationships among the variables;
S5: inputting the hidden state into a label classifier, which outputs the corresponding sample class, completing the classification of the time series.
Wherein step S1 specifically comprises:
sampling at fixed intervals using the data acquisition device of an industrial system or an information system; a set of index values is obtained at each sampling moment, the system state at that moment is represented by a label variable, and an observation data set is collected after the system has run for a period of time, wherein:
the observation data set is characterized as $X = [x_1, x_2, \ldots, x_m]$, where $m$ is the number of samples; the sample data at time $t$ is $x_t \in \mathbb{R}^n$, i.e. it contains $n$ variables, and each sample corresponds to a label variable $y_t$, where $y_t \in \mathbb{R}$.
More specifically, step S2 comprises:
S21: obtaining from the observation data set sample data of time length $w$, $X_t = [x_{t-w+1}, x_{t-w+2}, \ldots, x_t]$, where $X_t$ is a time slice of $X$ and is used to calculate the partial correlation coefficient matrix $P_t \in \mathbb{R}^{n \times n}$, which serves as the relation matrix between the variables at time $t$;
S22: letting the time series of two variables $i$ and $j$ over this period be given [symbols lost in extraction], the corresponding coefficient of the partial correlation coefficient matrix is calculated as follows: [formula lost in extraction]
wherein [symbol lost in extraction] is an element of the inverse of the covariance matrix $\Sigma_t$, and each element of the covariance matrix $\Sigma_t$ is calculated as follows: [formula lost in extraction]
S23: obtaining the partial correlation coefficient matrix $P_t$ at each moment according to the calculation of step S22, which represents the relationships between the different variables at each moment.
More specifically, step S3 comprises: inputting the sequence of partial correlation coefficient matrices of time length $l$, $P_{t-l+1}, P_{t-l+2}, \ldots, P_t$, into the convolutional neural network, which encodes the matrices to obtain $l$ corresponding feature maps, with the label $y_t$ corresponding to each moment.
More specifically, in step S4, the hidden state $h_t$ captures the change pattern of the relationships among the variables across the $l$ time steps.
More specifically, in step S5, the label classifier uses a fully connected layer and outputs the predicted sample class.
More specifically, the method further comprises step S6: using the cross entropy of the output sample class as the loss function, repeating steps S3 to S5 with a gradient descent method so as to improve the classification accuracy.
In a specific implementation, the time series classification method based on changes in the relationships among multiple variables takes full account of the relationships among different variables in time series data and classifies on the basis of how those relationships change. It thereby remedies the defects of existing methods, fully expresses the change pattern of the relationships among different variables in time series data, is more robust to noisy input values, achieves high classification accuracy, and can be applied to time series classification problems in industrial systems, information systems, medical and health care, financial markets, and other fields.
It should be understood that the above embodiments of the present invention are merely examples given to illustrate the invention clearly and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (7)
1. A time series classification method based on changes in the relationships among multiple variables, characterized by comprising the following steps:
S1: acquiring a labeled observation data set;
S2: acquiring sample data from the observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix, thereby obtaining a partial correlation coefficient matrix at each moment;
S3: inputting the partial correlation coefficient matrix at each moment into a convolutional neural network, which encodes the matrix to obtain a corresponding feature map;
S4: flattening each feature map into a feature vector and feeding the vectors sequentially into a long short-term memory neural network, thereby obtaining a hidden state that captures the change pattern of the relationships among the variables;
S5: inputting the hidden state into a label classifier, which outputs the corresponding sample class, completing the classification of the time series.
2. The method according to claim 1, wherein step S1 specifically comprises:
sampling at fixed intervals using the data acquisition device of an industrial system or an information system; a set of index values is obtained at each sampling moment, the system state at that moment is represented by a label variable, and an observation data set is collected after the system has run for a period of time, wherein:
the observation data set is characterized as $X = [x_1, x_2, \ldots, x_m]$, where $m$ is the number of samples; the sample data at time $t$ is $x_t \in \mathbb{R}^n$, i.e. it contains $n$ variables, and each sample corresponds to a label variable $y_t$, where $y_t \in \mathbb{R}$.
3. The method according to claim 2, wherein step S2 specifically comprises:
S21: obtaining from the observation data set sample data of time length $w$, $X_t = [x_{t-w+1}, x_{t-w+2}, \ldots, x_t]$, where $X_t$ is a time slice of $X$ and is used to calculate the partial correlation coefficient matrix $P_t \in \mathbb{R}^{n \times n}$, which serves as the relation matrix between the variables at time $t$;
S22: letting the time series of two variables $i$ and $j$ over this period be given [symbols lost in extraction], the corresponding coefficient of the partial correlation coefficient matrix is calculated as follows: [formula lost in extraction]
wherein [symbol lost in extraction] is an element of the inverse of the covariance matrix $\Sigma_t$, and each element of the covariance matrix $\Sigma_t$ is calculated as follows: [formula lost in extraction]
S23: obtaining the partial correlation coefficient matrix $P_t$ at each moment according to the calculation of step S22, which represents the relationships between the different variables at each moment.
4. The method according to claim 3, wherein step S3 specifically comprises: inputting the sequence of partial correlation coefficient matrices of time length $l$, $P_{t-l+1}, P_{t-l+2}, \ldots, P_t$, into the convolutional neural network, which encodes the matrices to obtain $l$ corresponding feature maps, with the label $y_t$ corresponding to each moment.
5. The method according to claim 4, wherein in step S4 the hidden state $h_t$ captures the change pattern of the relationships among the variables across the $l$ time steps.
7. The time series classification method based on changes in the relationships among multiple variables according to any one of claims 1-6, further comprising step S6: using the cross entropy of the output sample class as the loss function, repeating steps S3 to S5 with a gradient descent method so as to improve the classification accuracy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910833290.0A CN110751169B (en) | 2019-09-04 | 2019-09-04 | Time sequence classification method based on relation change among multiple variables |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910833290.0A CN110751169B (en) | 2019-09-04 | 2019-09-04 | Time sequence classification method based on relation change among multiple variables |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110751169A (en) | 2020-02-04 |
CN110751169B (en) | 2023-09-29 |
Family
ID=69276116
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910833290.0A Active CN110751169B (en) | Time sequence classification method based on relation change among multiple variables | 2019-09-04 | 2019-09-04 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110751169B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108182259A (en) * | 2018-01-03 | 2018-06-19 | South China University of Technology | Method for classifying multivariate time series based on deep long short-term memory neural networks |
CN108491886A (en) * | 2018-03-29 | 2018-09-04 | Chongqing University | Classification method for multivariate time series data based on convolutional neural networks |
Also Published As
Publication number | Publication date |
---|---|
CN110751169B (en) | 2023-09-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111915437B (en) | Training method, device, equipment and medium of anti-money-laundering model based on RNN | |
Yıldız et al. | A thermal-based defect classification method in textile fabrics with K-nearest neighbor algorithm | |
Zhang et al. | Fault detection and recognition of multivariate process based on feature learning of one-dimensional convolutional neural network and stacked denoised autoencoder | |
CN104966105A (en) | Robust machine error retrieving method and system | |
CN116204266A (en) | Remote assisted information creation operation and maintenance system and method thereof | |
CN117041017B (en) | Intelligent operation and maintenance management method and system for data center | |
CN117034123B (en) | Fault monitoring system and method for fitness equipment | |
US20240185582A1 (en) | Annotation-efficient image anomaly detection | |
CN116451139B (en) | Live broadcast data rapid analysis method based on artificial intelligence | |
CN117578715A (en) | Intelligent monitoring and early warning method, system and storage medium for power operation and maintenance | |
CN111898704B (en) | Method and device for clustering content samples | |
Li et al. | Semi-supervised process fault classification based on convolutional ladder network with local and global feature fusion | |
Zheng et al. | A group lasso based sparse KNN classifier | |
Jaiswal et al. | Deep learned cumulative attribute regression | |
Tian et al. | High-performance fault classification based on feature importance ranking-XgBoost approach with feature selection of redundant sensor data | |
Cheng et al. | Learning Transferable Time Series Classifier with Cross-Domain Pre-training from Language Model | |
CN117333717A (en) | Security monitoring method and system based on network information technology | |
CN116561814B (en) | Textile chemical fiber supply chain information tamper-proof method and system thereof | |
CN112154453A (en) | Apparatus and method for clustering input data | |
Zhang et al. | Jointly learning dictionaries and subspace structure for video-based face recognition | |
Zhu et al. | Auto-starting semisupervised-learning-based identification of synchrophasor data anomalies | |
Wang et al. | Utilizing VQ-VAE for end-to-end health indicator generation in predicting rolling bearing RUL | |
Zeng et al. | A fault diagnosis method for motor vibration signals incorporating Swin transformer with locally sensitive hash attention | |
CN110751169B (en) | Time sequence classification method based on relation change among multiple variables | |
Wang et al. | Nonlinear feature selection neural network via structured sparse regularization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||