CN110751169B - Time sequence classification method based on relation change among multiple variables - Google Patents

Info

Publication number
CN110751169B
Authority
CN
China
Prior art keywords
variables
correlation coefficient
time
time sequence
partial correlation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910833290.0A
Other languages
Chinese (zh)
Other versions
CN110751169A (en)
Inventor
蔡瑞初
陈嘉伟
温雯
郝志峰
陈炳丰
李梓健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-09-04
Filing date
2019-09-04
Publication date
2023-09-29
Application filed by Guangdong University of Technology
Priority to CN201910833290.0A
Publication of CN110751169A
Application granted
Publication of CN110751169B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Complex Calculations (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a time series classification method based on relation change among multiple variables, which comprises the following steps: obtaining sample data from an observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix; encoding the partial correlation coefficient matrix with a convolutional neural network (CNN) to obtain a corresponding feature map; flattening each feature map into a feature vector and feeding the feature vectors sequentially into a long short-term memory (LSTM) neural network to obtain a hidden state that captures the pattern of change in the relationships between variables; and inputting the hidden state into a label classifier, which outputs the corresponding sample class and thereby completes the classification of the time series. The method takes full account of the relationships between different variables in time series data, classifies the data on the basis of how those relationships change over time, is more robust to noisy input values, and achieves high classification accuracy.

Description

Time sequence classification method based on relation change among multiple variables
Technical Field
The invention relates to the technical field of data mining, in particular to a time sequence classification method based on relation change among multiple variables.
Background
Time series data are becoming increasingly common in industrial systems, information systems, medical care, financial markets and other fields. Time series classification, for example for anomaly detection, has therefore become an important and valuable research topic. Traditional time series classification methods are based on similarity, such as K-nearest neighbor (KNN) and Dynamic Time Warping (DTW). However, such methods are sensitive only to the values of the variables and do not take into account the relationships between different variables.
Another currently popular type of method applies a series of feature transformations to the time series data and mines the resulting patterns for classification, using models such as the multi-layer perceptron (MLP), the long short-term memory neural network (LSTM) and the convolutional neural network (CNN). Although such methods implicitly capture relationships between different variables in feature space, they have difficulty characterizing how the relationships between variables change. In time series classification, a particular pattern of change in the relationships between variables often corresponds to a particular class. For example, in an information system, an increase in the "CPU usage" of a server normally causes an increase in "CPU temperature", and an increase in "CPU temperature" causes an increase in "fan speed", so that "CPU temperature" remains relatively stable while "CPU usage" continues to increase. The relationship between "CPU usage" and "CPU temperature" therefore changes from dependent to independent during this period. However, when the fan of the server fails, "CPU temperature" and "fan speed" may become unrelated, and an increase in "CPU usage" leads to a continuous increase in "CPU temperature" and even to downtime of the server. In that case the relationship between "CPU usage" and "CPU temperature" remains dependent throughout the period.
The relationships between the variables thus change differently in the two classes, but current methods cannot express this change well or classify on the basis of it.
Disclosure of Invention
The invention provides a time series classification method based on relation change among multiple variables, which aims to overcome the technical defect that existing time series classification methods cannot effectively express how the relationships between variables change.
In order to solve the above technical problem, the technical scheme of the invention is as follows:
a time series classification method based on relation change among multiple variables, comprising the following steps:
S1: acquiring a labeled observation data set;
S2: obtaining sample data from the observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix, so as to obtain a partial correlation coefficient matrix at each moment;
S3: inputting the partial correlation coefficient matrix at each moment into a convolutional neural network (CNN), which encodes the matrix to obtain a corresponding feature map;
S4: flattening each feature map into a feature vector and feeding the feature vectors sequentially into a long short-term memory (LSTM) neural network, thereby obtaining a hidden state that captures the pattern of change in the relationships between variables;
S5: inputting the hidden state into a label classifier, which outputs the corresponding sample class, completing the classification of the time series.
Step S1 specifically comprises:
sampling at fixed time intervals with a data acquisition device of an industrial system or an information system; obtaining the values of the different indices at each sampling moment, while a label variable records the system state at that moment; and obtaining the observation data set after the system has run for a period of time, wherein:
the observation data set is denoted $X = [x_1, x_2, \ldots, x_m]$, where $m$ is the number of samples; the sample at time $t$ is $x_t \in \mathbb{R}^n$, i.e. it has $n$ variables, and each sample has one label variable $y_t$, where $y_t \in \mathbb{R}$.
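As a minimal sketch of the data layout described in step S1, the following NumPy snippet builds an observation data set with one row per sampling moment and one label per row. The values are synthetic, and the function name `make_observation_dataset` and the 0/1 label encoding are assumptions made for illustration; they do not come from the patent.

```python
import numpy as np

def make_observation_dataset(m, n, seed=0):
    """Return (X, y): X has shape (m, n), one row x_t per sampling moment;
    y has shape (m,), one label y_t per sampling moment."""
    rng = np.random.default_rng(seed)
    # Synthetic index values standing in for real measurements (assumption).
    X = rng.normal(size=(m, n))
    # Hypothetical state label: 0 for the first half of the run, 1 afterwards.
    y = (np.arange(m) >= m // 2).astype(float)
    return X, y

X, y = make_observation_dataset(m=200, n=3)
print(X.shape, y.shape)
```

Storing the samples row-wise keeps a time slice of length $w$ a simple contiguous block of rows.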
Step S2 specifically comprises:
S21: taking sample data of length $w$ from the observation data set, $X_t = [x_{t-w+1}, x_{t-w+2}, \ldots, x_t]$, where $X_t$ is a time slice of $X$ used to calculate the partial correlation coefficient matrix $P_t \in \mathbb{R}^{n \times n}$, which serves as the relation matrix between the variables at time $t$;
S22: let the time series of two variables $i$ and $j$ over this window be $x^i_{t-w+1:t}$ and $x^j_{t-w+1:t}$, respectively; then the coefficient $p^t_{ij}$ of the partial correlation coefficient matrix is calculated as:
$$p^t_{ij} = -\frac{\omega^t_{ij}}{\sqrt{\omega^t_{ii}\,\omega^t_{jj}}}$$
where $\omega^t_{ij}$ is an element of the inverse of the covariance matrix $\Sigma_t$, and the element $\sigma^t_{ij}$ of the covariance matrix $\Sigma_t$ is calculated as:
$$\sigma^t_{ij} = \frac{1}{w-1}\sum_{k=t-w+1}^{t}\left(x^i_k - \bar{x}^i\right)\left(x^j_k - \bar{x}^j\right)$$
where $\bar{x}^i$ and $\bar{x}^j$ are the means of the two variables over the window.
S23: obtaining the partial correlation coefficient matrix $P_t$ at each moment according to the calculation in step S22, to represent the relationships between the different variables at each moment.
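The calculation in step S22 can be sketched in NumPy as follows. This is an illustrative sketch rather than the patented implementation: the function name `partial_correlation_matrix` is hypothetical, the covariance uses NumPy's default sample normalization, and the pseudo-inverse is an added safeguard (an assumption) for windows whose covariance matrix is singular.

```python
import numpy as np

def partial_correlation_matrix(window):
    """window: array of shape (w, n), the last w samples of n variables.
    Returns the n x n partial correlation matrix of that window."""
    sigma = np.cov(window, rowvar=False)   # n x n sample covariance matrix
    omega = np.linalg.pinv(sigma)          # inverse covariance (pseudo-inverse as a safeguard)
    d = np.sqrt(np.diag(omega))
    P = -omega / np.outer(d, d)            # p_ij = -omega_ij / sqrt(omega_ii * omega_jj)
    np.fill_diagonal(P, 1.0)               # by convention each variable has coefficient 1 with itself
    return P

# Synthetic chain x -> y -> z: x and z are correlated only through y.
rng = np.random.default_rng(0)
w = 5000
x = rng.normal(size=w)
y = x + rng.normal(size=w)
z = y + rng.normal(size=w)
P_t = partial_correlation_matrix(np.column_stack([x, y, z]))
print(np.round(P_t, 2))
```

In this synthetic chain the ordinary correlation between `x` and `z` is about 0.58, yet their partial correlation given `y` is close to zero, which is the kind of conditional relationship the matrix $P_t$ is meant to expose.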
Step S3 specifically comprises: inputting the $l$ consecutive partial correlation coefficient matrices $P_{t-l+1}, P_{t-l+2}, \ldots, P_t$ into the convolutional neural network, which encodes them to obtain the corresponding $l$ feature maps, together with the label $y_t$ corresponding to each moment.
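The patent does not specify the CNN architecture, so the following is only a sketch of the encoding idea in step S3: a single convolution layer with ReLU, written in plain NumPy with randomly initialized, untrained filters. The function name `conv2d_relu`, the 3×3 kernel size and the number of filters are all assumptions.

```python
import numpy as np

def conv2d_relu(P, kernels):
    """P: (n, n) relation matrix; kernels: (c, k, k) filters.
    Returns c feature maps of shape (n-k+1, n-k+1) after a ReLU."""
    k = kernels.shape[1]
    # All k x k patches of P, shape (n-k+1, n-k+1, k, k).
    patches = np.lib.stride_tricks.sliding_window_view(P, (k, k))
    # Cross-correlate every patch with every filter (no padding, stride 1).
    maps = np.einsum("abij,cij->cab", patches, kernels)
    return np.maximum(maps, 0.0)

rng = np.random.default_rng(0)
l, n = 4, 5
P_seq = np.stack([np.eye(n) for _ in range(l)])    # placeholder relation matrices
kernels = rng.normal(scale=0.1, size=(2, 3, 3))    # 2 untrained 3x3 filters
feature_maps = np.stack([conv2d_relu(P, kernels) for P in P_seq])
print(feature_maps.shape)  # (4, 2, 3, 3)
```

The same filters are applied to every matrix in the sequence, so differences between consecutive feature maps reflect changes in the relation matrices rather than changes in the encoder.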
In step S4, the hidden state $h_t$ captures the pattern of change across the $l$ variable-relationship matrices.
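Step S4 can be sketched as a forward pass of a standard LSTM cell over the flattened feature maps. The code below is a plain NumPy illustration with untrained random weights; the function name `lstm_last_hidden`, the gate ordering inside the stacked weight matrices and the hidden size are assumptions, not details taken from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_last_hidden(feature_maps, W, U, b):
    """feature_maps: (l, ...) feature maps for l consecutive moments.
    W: (4*d, f), U: (4*d, d), b: (4*d,), where f is the flattened feature
    size and d the hidden size. Returns the final hidden state, shape (d,)."""
    l = feature_maps.shape[0]
    vectors = feature_maps.reshape(l, -1)   # flatten each moment's feature maps
    d = U.shape[1]
    h = np.zeros(d)
    c = np.zeros(d)
    for v in vectors:                       # feed the vectors one moment at a time
        z = W @ v + U @ h + b
        i, f, o, g = np.split(z, 4)         # input, forget, output gates and candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                   # update the cell state
        h = o * np.tanh(c)                  # update the hidden state
    return h

rng = np.random.default_rng(0)
l, c_maps, s, d = 4, 2, 3, 8
feature_maps = rng.normal(size=(l, c_maps, s, s))
f_size = c_maps * s * s
W = rng.normal(scale=0.1, size=(4 * d, f_size))
U = rng.normal(scale=0.1, size=(4 * d, d))
b = np.zeros(4 * d)
h_t = lstm_last_hidden(feature_maps, W, U, b)
print(h_t.shape)  # (8,)
```

Only the last hidden state is returned because it summarizes the whole sequence of $l$ relation matrices, which is what the classifier in step S5 consumes.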
In step S5, the label classifier is a fully connected layer whose output is the predicted sample class.
The method further comprises step S6: using the cross entropy of the output sample class as the loss function, and repeating steps S3 to S5 with gradient descent so as to improve the classification accuracy.
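The classifier and loss of steps S5 and S6 can be sketched as a fully connected softmax layer trained with cross entropy and gradient descent. For brevity this sketch updates only the fully connected layer on fixed, synthetic hidden states, whereas the method described here repeats steps S3 to S5, which implies training the CNN and LSTM as well. Integer class labels, the function name `train_classifier`, the learning rate and the step count are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)    # subtract the row maximum for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def train_classifier(H, labels, num_classes, lr=0.1, steps=300):
    """H: (batch, d) hidden states; labels: (batch,) integer classes.
    Trains a fully connected softmax layer by gradient descent."""
    W = np.zeros((H.shape[1], num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    losses = []
    for _ in range(steps):
        probs = softmax(H @ W + b)
        losses.append(cross_entropy(probs, labels))
        grad = (probs - onehot) / len(labels)   # gradient of the mean cross entropy w.r.t. the logits
        W -= lr * (H.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b, losses

# Synthetic hidden states: two well-separated clusters, one per class.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
H = rng.normal(scale=0.3, size=(100, 4)) + np.where(labels[:, None] == 1, 1.0, -1.0)
W, b, losses = train_classifier(H, labels, num_classes=2)
accuracy = np.mean(np.argmax(H @ W + b, axis=1) == labels)
print(losses[0], losses[-1], accuracy)
```

With zero initial weights the first loss equals $\ln 2$ for two classes, which gives a convenient sanity check that the loss is wired up correctly before training begins.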
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
The time series classification method based on relation change among multiple variables takes full account of the relationships between different variables in time series data and classifies the time series on the basis of how those relationships change.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the present patent;
for the purpose of better illustrating the embodiments, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the actual product dimensions;
it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in FIG. 1, a time series classification method based on relation change among multiple variables comprises the following steps:
S1: acquiring a labeled observation data set;
S2: obtaining sample data from the observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix, so as to obtain a partial correlation coefficient matrix at each moment;
S3: inputting the partial correlation coefficient matrix at each moment into a convolutional neural network (CNN), which encodes the matrix to obtain a corresponding feature map;
S4: flattening each feature map into a feature vector and feeding the feature vectors sequentially into a long short-term memory (LSTM) neural network, thereby obtaining a hidden state that captures the pattern of change in the relationships between variables;
S5: inputting the hidden state into a label classifier, which outputs the corresponding sample class, completing the classification of the time series.
Step S1 specifically comprises:
sampling at fixed time intervals with a data acquisition device of an industrial system or an information system; obtaining the values of the different indices at each sampling moment, while a label variable records the system state at that moment; and obtaining the observation data set after the system has run for a period of time, wherein:
the observation data set is denoted $X = [x_1, x_2, \ldots, x_m]$, where $m$ is the number of samples; the sample at time $t$ is $x_t \in \mathbb{R}^n$, i.e. it has $n$ variables, and each sample has one label variable $y_t$, where $y_t \in \mathbb{R}$.
More specifically, step S2 comprises:
S21: taking sample data of length $w$ from the observation data set, $X_t = [x_{t-w+1}, x_{t-w+2}, \ldots, x_t]$, where $X_t$ is a time slice of $X$ used to calculate the partial correlation coefficient matrix $P_t \in \mathbb{R}^{n \times n}$, which serves as the relation matrix between the variables at time $t$;
S22: let the time series of two variables $i$ and $j$ over this window be $x^i_{t-w+1:t}$ and $x^j_{t-w+1:t}$, respectively; then the coefficient $p^t_{ij}$ of the partial correlation coefficient matrix is calculated as:
$$p^t_{ij} = -\frac{\omega^t_{ij}}{\sqrt{\omega^t_{ii}\,\omega^t_{jj}}}$$
where $\omega^t_{ij}$ is an element of the inverse of the covariance matrix $\Sigma_t$, and the element $\sigma^t_{ij}$ of the covariance matrix $\Sigma_t$ is calculated as:
$$\sigma^t_{ij} = \frac{1}{w-1}\sum_{k=t-w+1}^{t}\left(x^i_k - \bar{x}^i\right)\left(x^j_k - \bar{x}^j\right)$$
where $\bar{x}^i$ and $\bar{x}^j$ are the means of the two variables over the window.
S23: obtaining the partial correlation coefficient matrix $P_t$ at each moment according to the calculation in step S22, to represent the relationships between the different variables at each moment.
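For the embodiment, the windowing in steps S21 to S23 can be combined with the grouping of $l$ consecutive matrices used later in step S3. The sketch below slides a window of length $w$ over the data, computes one relation matrix per moment, and stacks every run of $l$ matrices with the label of its last moment. The helper names `partial_corr` and `relation_sequences`, the 0-based indexing and the random input data are assumptions for illustration.

```python
import numpy as np

def partial_corr(window):
    """Partial correlation matrix of a (w, n) window via the inverse covariance."""
    omega = np.linalg.pinv(np.cov(window, rowvar=False))
    d = np.sqrt(np.diag(omega))
    P = -omega / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P

def relation_sequences(X, y, w, l):
    """X: (m, n) observations, y: (m,) labels.
    Returns sequences of shape (num, l, n, n) and labels of shape (num,)."""
    m = X.shape[0]
    # P_t exists once a full window of w samples is available (t >= w - 1, 0-based).
    Ps = np.stack([partial_corr(X[t - w + 1 : t + 1]) for t in range(w - 1, m)])
    # Each training sample is l consecutive matrices P_{t-l+1}, ..., P_t.
    seqs = np.stack([Ps[s : s + l] for s in range(len(Ps) - l + 1)])
    # The label of a sequence is the label y_t of its last moment.
    labels = y[w + l - 2 :]
    return seqs, labels

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = (np.arange(60) >= 30).astype(int)
seqs, seq_labels = relation_sequences(X, y, w=20, l=5)
print(seqs.shape, seq_labels.shape)  # (37, 5, 3, 3) (37,)
```

With $m$ observations this yields $m - w - l + 2$ labeled sequences, so $w$ and $l$ trade off the stability of each relation estimate against the number of training samples.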
More specifically, step S3 comprises: inputting the $l$ consecutive partial correlation coefficient matrices $P_{t-l+1}, P_{t-l+2}, \ldots, P_t$ into the convolutional neural network, which encodes them to obtain the corresponding $l$ feature maps, together with the label $y_t$ corresponding to each moment.
More specifically, in step S4, the hidden state $h_t$ captures the pattern of change across the $l$ variable-relationship matrices.
More specifically, in step S5, the label classifier is a fully connected layer whose output is the predicted sample class.
More specifically, the method further comprises step S6: using the cross entropy of the output sample class as the loss function, and repeating steps S3 to S5 with gradient descent so as to improve the classification accuracy.
In a specific implementation, the time series classification method based on relation change among multiple variables takes full account of the relationships between different variables in time series data and classifies on the basis of how those relationships change.
It is to be understood that the above examples are provided only to illustrate the invention clearly and do not limit its embodiments. Other variations or modifications based on the above description will be apparent to those of ordinary skill in the art. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the claims.

Claims (3)

1. A time series classification method based on relation change among multiple variables, comprising the following steps:
S1: acquiring a labeled observation data set;
step S1 specifically comprises:
sampling at fixed time intervals with a data acquisition device of an industrial system or an information system; obtaining the values of the different indices at each sampling moment, while a label variable records the system state at that moment; and obtaining the observation data set after the system has run for a period of time, wherein:
the observation data set is denoted $X = [x_1, x_2, \ldots, x_m]$, where $m$ is the number of samples; the sample at time $t$ is $x_t \in \mathbb{R}^n$, i.e. it has $n$ variables, and each sample has one label variable $y_t$, where $y_t \in \mathbb{R}$;
S2: obtaining sample data from the observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix, so as to obtain a partial correlation coefficient matrix at each moment;
step S2 specifically comprises:
S21: taking sample data of length $w$ from the observation data set, $X_t = [x_{t-w+1}, x_{t-w+2}, \ldots, x_t]$, where $X_t$ is a time slice of $X$ used to calculate the partial correlation coefficient matrix $P_t \in \mathbb{R}^{n \times n}$, which serves as the relation matrix between the variables at time $t$;
S22: let the time series of two variables $i$ and $j$ over this window be $x^i_{t-w+1:t}$ and $x^j_{t-w+1:t}$, respectively; then the coefficient $p^t_{ij}$ of the partial correlation coefficient matrix is calculated as:
$$p^t_{ij} = -\frac{\omega^t_{ij}}{\sqrt{\omega^t_{ii}\,\omega^t_{jj}}}$$
where $\omega^t_{ij}$ is an element of the inverse of the covariance matrix $\Sigma_t$, and the element $\sigma^t_{ij}$ of the covariance matrix $\Sigma_t$ is calculated as:
$$\sigma^t_{ij} = \frac{1}{w-1}\sum_{k=t-w+1}^{t}\left(x^i_k - \bar{x}^i\right)\left(x^j_k - \bar{x}^j\right)$$
where $\bar{x}^i$ and $\bar{x}^j$ are the means of the two variables over the window;
S23: obtaining the partial correlation coefficient matrix $P_t$ at each moment according to the calculation in step S22, to represent the relationships between the different variables at each moment;
S3: inputting the partial correlation coefficient matrix at each moment into a convolutional neural network, which encodes the matrix to obtain a corresponding feature map;
step S3 specifically comprises: inputting the $l$ consecutive partial correlation coefficient matrices $P_{t-l+1}, P_{t-l+2}, \ldots, P_t$ into the convolutional neural network, which encodes them to obtain the corresponding $l$ feature maps, together with the label $y_t$ corresponding to each moment;
S4: flattening each feature map into a feature vector and feeding the feature vectors sequentially into a long short-term memory neural network to obtain a hidden state that captures the pattern of change in the relationships between variables; in step S4, the hidden state $h_t$ is used to capture the pattern of change among the variable relationships;
S5: inputting the hidden state into a label classifier, which outputs the corresponding sample class, completing the classification of the time series.
2. The time series classification method based on relation change among multiple variables according to claim 1, wherein in step S5 the label classifier is a fully connected layer whose output is the predicted sample class.
3. The time series classification method based on relation change among multiple variables according to any one of claims 1-2, further comprising step S6: using the cross entropy of the output sample class as the loss function, and repeating steps S3 to S5 with gradient descent so as to improve the classification accuracy.
CN201910833290.0A 2019-09-04 2019-09-04 Time sequence classification method based on relation change among multiple variables Active CN110751169B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910833290.0A CN110751169B (en) 2019-09-04 2019-09-04 Time sequence classification method based on relation change among multiple variables

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910833290.0A CN110751169B (en) 2019-09-04 2019-09-04 Time sequence classification method based on relation change among multiple variables

Publications (2)

Publication Number Publication Date
CN110751169A (en) 2020-02-04
CN110751169B (en) 2023-09-29

Family

ID=69276116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910833290.0A Active CN110751169B (en) 2019-09-04 2019-09-04 Time sequence classification method based on relation change among multiple variables

Country Status (1)

Country Link
CN (1) CN110751169B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182259A (en) * 2018-01-03 2018-06-19 South China University of Technology Method for classifying multivariate time series based on a deep long short-term memory neural network
CN108491886A (en) * 2018-03-29 2018-09-04 Chongqing University Classification method for multivariate time series data based on convolutional neural networks

Also Published As

Publication number Publication date
CN110751169A (en) 2020-02-04

Similar Documents

Publication Publication Date Title
Wang et al. A deformable CNN-DLSTM based transfer learning method for fault diagnosis of rolling bearing under multiple working conditions
Li et al. MAD-GAN: Multivariate anomaly detection for time series data with generative adversarial networks
Hsu et al. Multiple time-series convolutional neural network for fault detection and diagnosis and empirical study in semiconductor manufacturing
Jakubowski et al. Anomaly detection in asset degradation process using variational autoencoder and explanations
CN116204266A (en) Remote assisted information creation operation and maintenance system and method thereof
Liu et al. Supervised learning via unsupervised sparse autoencoder
Song et al. Unsupervised fault diagnosis method based on iterative multi‐manifold spectral clustering
Dornaika et al. Efficient dynamic graph construction for inductive semi-supervised learning
CN116703642A (en) Intelligent management system of product manufacturing production line based on digital twin technology
CN111898704B (en) Method and device for clustering content samples
CN110766042A (en) Multi-mark feature selection method and device based on maximum correlation minimum redundancy
Maggipinto et al. A deep learning-based approach to anomaly detection with 2-dimensional data in manufacturing
CN116597377A (en) Intelligent monitoring management method and system for cattle breeding
Kaupp et al. Outlier detection in temporal spatial log data using autoencoder for industry 4.0
CN110717602B (en) Noise data-based machine learning model robustness assessment method
Mohammadian et al. SiamixFormer: A fully-transformer Siamese network with temporal Fusion for accurate building detection and change detection in bi-temporal remote sensing images
CN116451139B (en) Live broadcast data rapid analysis method based on artificial intelligence
WO2022162427A1 (en) Annotation-efficient image anomaly detection
Gao et al. Sim: Open-world multi-task stream classifier with integral similarity metrics
CN110751169B (en) Time sequence classification method based on relation change among multiple variables
Zhang et al. Jointly learning dictionaries and subspace structure for video-based face recognition
Farady et al. Hierarchical Image Transformation and Multi-Level Features for Anomaly Defect Detection
US20230237371A1 (en) Systems and methods for providing predictions with supervised and unsupervised data in industrial systems
CN112154453A (en) Apparatus and method for clustering input data
Nurmamatovich et al. Neural network clustering methods

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant