CN110751169A - Time sequence classification method based on relation change among multivariate variables - Google Patents

Info

Publication number
CN110751169A
Authority
CN
China
Prior art keywords
variables
correlation coefficient
time
partial correlation
relation
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910833290.0A
Other languages
Chinese (zh)
Other versions
CN110751169B (en)
Inventor
蔡瑞初
陈嘉伟
温雯
郝志峰
陈炳丰
李梓健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-09-04
Filing date
2019-09-04
Publication date
2020-02-04
Application filed by Guangdong University of Technology
Priority to CN201910833290.0A
Publication of CN110751169A
Application granted
Publication of CN110751169B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Complex Calculations (AREA)

Abstract

The invention provides a time series classification method based on relation change among multivariate variables, comprising the following steps: acquiring sample data from an observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix; encoding the partial correlation coefficient matrix with a convolutional neural network to obtain a corresponding feature map; flattening each feature map into a feature vector and feeding the vectors sequentially into a long short-term memory (LSTM) neural network, thereby obtaining a hidden state that captures the change pattern of the relations among variables; and inputting the hidden state into a label classifier, which outputs the corresponding sample class and completes the classification of the time series. The method fully considers the relations among different variables in time series data, classifies on the basis of the relation patterns of the variables, fully expresses how the relations among different variables change over time, is more robust to noisy input values, and achieves high classification accuracy.

Description

Time sequence classification method based on relation change among multivariate variables
Technical Field
The invention relates to the technical field of data mining, in particular to a time sequence classification method based on relation change among multivariate variables.
Background
Time series data are increasingly common in industrial systems, information systems, medical health, financial markets and other fields. Time series classification, for example for anomaly detection, has therefore become an important and valuable research topic. Conventional similarity-based time series classification methods include K-nearest neighbor (KNN) and dynamic time warping (DTW). However, methods of this type are sensitive only to the values of the variables and do not take the relations between different variables into account.
Another currently popular type of method applies a series of feature transformations to the time series data in order to mine patterns for classification; examples include multi-layer perceptrons (MLP), long short-term memory (LSTM) neural networks and convolutional neural networks (CNN). Although such methods implicitly capture the relations between different variables in the feature space, they have difficulty characterizing how those relations change. In time series classification problems, a particular pattern of change in the relations between variables often indicates a class. For example, in an information system, an increase in the "CPU usage" of a server typically raises the "CPU temperature", and the rise in "CPU temperature" in turn raises the "fan speed", so that the "CPU temperature" stays relatively stable even as "CPU usage" continues to increase. The relation between "CPU usage" and "CPU temperature" thus changes from dependent to independent over this period. When the server's fan fails, however, "CPU temperature" and "fan speed" may become unrelated, and an increase in "CPU usage" causes the "CPU temperature" to keep rising, possibly even bringing the server down. In that case the relation between "CPU usage" and "CPU temperature" remains dependent throughout the period.
The relations between the variables differ between these two classes, but current methods cannot express this change well or classify on the basis of it.
Disclosure of Invention
The invention provides a time series classification method based on relation change among multivariate variables, aiming to overcome the technical defect that existing time series classification methods cannot effectively express the change pattern of the relations among variables.
In order to solve the technical problems, the technical scheme of the invention is as follows:
A time series classification method based on relation change among multivariate variables comprises the following steps:
S1: acquiring a labeled observation data set;
S2: acquiring sample data from the observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix, thereby obtaining a partial correlation coefficient matrix at each moment;
S3: feeding the partial correlation coefficient matrix at each moment into a convolutional neural network (CNN), which encodes the matrix to obtain a corresponding feature map;
S4: flattening each feature map into a feature vector and feeding the vectors sequentially into a long short-term memory (LSTM) neural network, thereby obtaining a hidden state that captures the change pattern of the relations among variables;
S5: inputting the hidden state into a label classifier and outputting the corresponding sample class, thereby completing the classification of the time series.
Wherein, the step S1 specifically includes:
sampling at fixed intervals using the data acquisition device of an industrial system or an information system; different index values are obtained at each sampling moment, and the system state at that moment is represented by a label variable; after the system has run for a period of time, an observation data set is obtained, wherein:
the observation data set is denoted $X = [x_1, x_2, \ldots, x_m]$, where m is the number of samples; the sample data at time t is $x_t \in \mathbb{R}^n$, i.e. it contains n variables, and each sample corresponds to a label variable $y_t$, where $y_t \in \mathbb{R}$.
Wherein, the step S2 specifically includes:
S21: obtaining from the observation data set the sample data of time length w, $X_t = [x_{t-w+1}, x_{t-w+2}, \ldots, x_t]$, where $X_t$ is a time slice of X and is used to calculate the partial correlation coefficient matrix $P_t \in \mathbb{R}^{n \times n}$, which serves as the relation matrix between the variables at time t;
S22: let the time series of two variables i and j over this window be $x^i = (x^i_{t-w+1}, \ldots, x^i_t)$ and $x^j = (x^j_{t-w+1}, \ldots, x^j_t)$ respectively. The coefficient $P_t^{ij}$ of the partial correlation coefficient matrix is then calculated as:
$$P_t^{ij} = -\frac{\omega_t^{ij}}{\sqrt{\omega_t^{ii}\,\omega_t^{jj}}}$$
where $\omega_t^{ij}$ is the element in row i, column j of the inverse of the covariance matrix $\Sigma_t$, and the element $\sigma_t^{ij}$ of the covariance matrix $\Sigma_t$ is calculated as:
$$\sigma_t^{ij} = \frac{1}{w-1}\sum_{k=t-w+1}^{t}\left(x_k^i - \bar{x}^i\right)\left(x_k^j - \bar{x}^j\right)$$
where $\bar{x}^i$ and $\bar{x}^j$ are the means of the two variables over the window.
S23: obtaining the partial correlation coefficient matrix $P_t$ at each moment according to the calculation in step S22; $P_t$ represents the relations between the different variables at that moment.
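Step S2 can be sketched in a few lines of NumPy. This is an illustrative sketch, not the patent's implementation: the function names, the use of `np.cov` (which normalizes by w - 1) and the small `ridge` term added before inversion are assumptions made here. The scaling of the covariance does not affect the result, because the partial correlation is a ratio of precision-matrix elements.

```python
import numpy as np

def partial_correlation_matrix(window, ridge=1e-6):
    """Partial correlations between the n columns of a (w, n) window.

    `ridge` is an assumption of this sketch: a small diagonal term that
    keeps the covariance matrix invertible when variables are collinear.
    """
    cov = np.cov(window, rowvar=False)            # (n, n) sample covariance
    cov = cov + ridge * np.eye(cov.shape[0])
    prec = np.linalg.inv(cov)                     # inverse covariance matrix
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)                # -w_ij / sqrt(w_ii * w_jj)
    np.fill_diagonal(pcorr, 1.0)                  # convention: unit diagonal
    return pcorr

def rolling_partial_correlations(X, w):
    """Stack P_t for every t with a full window; X has shape (m, n)."""
    m = X.shape[0]
    return np.stack([
        partial_correlation_matrix(X[t - w + 1 : t + 1])
        for t in range(w - 1, m)
    ])
```

For an observation matrix `X` of shape (m, n), `rolling_partial_correlations(X, w)` returns an array of shape (m - w + 1, n, n), one relation matrix per moment that has a full window behind it.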
Wherein, the step S3 specifically includes: the partial correlation coefficient matrices over a time length of l, $P_{t-l+1}, P_{t-l+2}, \ldots, P_t$, are input into the convolutional neural network, which encodes them to obtain the l corresponding feature maps, each moment having its corresponding label $y_t$.
Wherein, in the step S4, the hidden state $h_t$ is used to capture the change pattern of the relations among variables over the l moments.
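The data flow of steps S3 and S4 can be illustrated with a small NumPy forward pass. The text does not specify the network architecture, so everything here is an assumption for illustration: a single convolution filter with ReLU stands in for the CNN, the LSTM uses the common input/forget/output/candidate gate layout, and the weights `W`, `U`, `b` and the helper names are hypothetical. A practical implementation would more likely use a deep learning framework with trained weights.

```python
import numpy as np

def conv2d_valid(mat, kernel):
    """Single-channel 'valid' cross-correlation followed by ReLU."""
    k = kernel.shape[0]
    n = mat.shape[0]
    out = np.empty((n - k + 1, n - k + 1))
    for r in range(n - k + 1):
        for c in range(n - k + 1):
            out[r, c] = np.sum(mat[r:r + k, c:c + k] * kernel)
    return np.maximum(out, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_last_hidden(inputs, W, U, b):
    """Run an LSTM over inputs of shape (l, d); return the final hidden state.

    W: (4h, d), U: (4h, h), b: (4h,). The four blocks are taken to be the
    input, forget, output and candidate gates, in that order (assumed).
    """
    h_dim = U.shape[1]
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    for x in inputs:
        z = W @ x + U @ h + b
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g          # update the cell state
        h = o * np.tanh(c)         # expose the gated hidden state
    return h

def encode_sequence(P_seq, kernel, W, U, b):
    """S3 + S4: encode each matrix, flatten it, feed the LSTM in time order."""
    feats = np.stack([conv2d_valid(P, kernel).ravel() for P in P_seq])
    return lstm_last_hidden(feats, W, U, b)
```

With l matrices of size n x n and a k x k kernel, each flattened feature vector has length (n - k + 1)^2, which fixes the input dimension d of the LSTM weights.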
In step S5, the label classifier is a fully connected layer, which outputs the predicted sample class $\hat{y}_t$.
Wherein the method further comprises step S6: taking the cross entropy of the output sample class as the loss function, steps S3 to S5 are repeated using gradient descent to improve the classification accuracy.
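The training idea of steps S5 and S6 can be sketched for the classifier layer alone. This is a simplified illustration under stated assumptions: only a fully connected softmax layer is trained on fixed hidden states `H`, whereas the method described here repeats S3 to S5, which implies that the CNN and LSTM are updated as well. The function names, learning rate and epoch count are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true classes."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked + 1e-12))

def train_classifier(H, y, num_classes, lr=0.5, epochs=200):
    """Fit a fully connected softmax layer by batch gradient descent.

    H: (N, h) hidden states, y: (N,) integer class labels.
    Returns the weights, the bias and the loss recorded at each epoch.
    """
    rng = np.random.default_rng(0)
    W = 0.01 * rng.standard_normal((H.shape[1], num_classes))
    b = np.zeros(num_classes)
    losses = []
    for _ in range(epochs):
        probs = softmax(H @ W + b)
        losses.append(cross_entropy(probs, y))
        grad = probs.copy()
        grad[np.arange(len(y)), y] -= 1.0     # dLoss/dlogits = probs - onehot
        grad /= len(y)
        W -= lr * (H.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b, losses
```

The predicted class for new hidden states is then `np.argmax(H_new @ W + b, axis=1)`.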
Compared with the prior art, the technical scheme of the invention has the beneficial effects that:
The time series classification method based on relation change among multivariate variables fully considers the relations among different variables in time series data and classifies on the basis of the relation patterns of the variables. It remedies the shortcomings of existing methods, fully expresses how the relations among different variables change over time, is more robust to noisy input values, achieves high classification accuracy, and can be applied to time series classification problems in industrial systems, information systems, medical health, financial markets and other fields.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
As shown in FIG. 1, a time series classification method based on relation change among multivariate variables comprises the following steps:
S1: acquiring a labeled observation data set;
S2: acquiring sample data from the observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix, thereby obtaining a partial correlation coefficient matrix at each moment;
S3: feeding the partial correlation coefficient matrix at each moment into a convolutional neural network (CNN), which encodes the matrix to obtain a corresponding feature map;
S4: flattening each feature map into a feature vector and feeding the vectors sequentially into a long short-term memory (LSTM) neural network, thereby obtaining a hidden state that captures the change pattern of the relations among variables;
S5: inputting the hidden state into a label classifier and outputting the corresponding sample class, thereby completing the classification of the time series.
Wherein, the step S1 specifically includes:
sampling at fixed intervals using the data acquisition device of an industrial system or an information system; different index values are obtained at each sampling moment, and the system state at that moment is represented by a label variable; after the system has run for a period of time, an observation data set is obtained, wherein:
the observation data set is denoted $X = [x_1, x_2, \ldots, x_m]$, where m is the number of samples; the sample data at time t is $x_t \in \mathbb{R}^n$, i.e. it contains n variables, and each sample corresponds to a label variable $y_t$, where $y_t \in \mathbb{R}$.
More specifically, the step S2 specifically includes:
S21: obtaining from the observation data set the sample data of time length w, $X_t = [x_{t-w+1}, x_{t-w+2}, \ldots, x_t]$, where $X_t$ is a time slice of X and is used to calculate the partial correlation coefficient matrix $P_t \in \mathbb{R}^{n \times n}$, which serves as the relation matrix between the variables at time t;
S22: let the time series of two variables i and j over this window be $x^i = (x^i_{t-w+1}, \ldots, x^i_t)$ and $x^j = (x^j_{t-w+1}, \ldots, x^j_t)$ respectively. The coefficient $P_t^{ij}$ of the partial correlation coefficient matrix is then calculated as:
$$P_t^{ij} = -\frac{\omega_t^{ij}}{\sqrt{\omega_t^{ii}\,\omega_t^{jj}}}$$
where $\omega_t^{ij}$ is the element in row i, column j of the inverse of the covariance matrix $\Sigma_t$, and the element $\sigma_t^{ij}$ of the covariance matrix $\Sigma_t$ is calculated as:
$$\sigma_t^{ij} = \frac{1}{w-1}\sum_{k=t-w+1}^{t}\left(x_k^i - \bar{x}^i\right)\left(x_k^j - \bar{x}^j\right)$$
where $\bar{x}^i$ and $\bar{x}^j$ are the means of the two variables over the window.
S23: obtaining the partial correlation coefficient matrix $P_t$ at each moment according to the calculation in step S22; $P_t$ represents the relations between the different variables at that moment.
More specifically, the step S3 specifically includes: the partial correlation coefficient matrices over a time length of l, $P_{t-l+1}, P_{t-l+2}, \ldots, P_t$, are input into the convolutional neural network, which encodes them to obtain the l corresponding feature maps, each moment having its corresponding label $y_t$.
More specifically, in the step S4, the hidden state $h_t$ is used to capture the change pattern of the relations among variables over the l moments.
More specifically, in step S5, the label classifier is a fully connected layer, which outputs the predicted sample class $\hat{y}_t$.
More specifically, the method further includes step S6: taking the cross entropy of the output sample class as the loss function, steps S3 to S5 are repeated using gradient descent to improve the classification accuracy.
In a specific implementation, the time series classification method based on relation change among multivariate variables fully considers the relations among different variables in time series data and classifies on the basis of the relation patterns of the variables. It remedies the shortcomings of existing methods, fully expresses how the relations among different variables change over time, is more robust to noisy input values, achieves high classification accuracy, and can be applied to time series classification problems in industrial systems, information systems, medical health, financial markets and other fields.
It should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the invention clearly and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (7)

1. A time series classification method based on relation change among multivariate variables, characterized by comprising the following steps:
S1: acquiring a labeled observation data set;
S2: acquiring sample data from the observation data set, calculating the partial correlation coefficient between every pair of variables in the sample data, and constructing a partial correlation coefficient matrix, thereby obtaining a partial correlation coefficient matrix at each moment;
S3: feeding the partial correlation coefficient matrix at each moment into a convolutional neural network, which encodes the matrix to obtain a corresponding feature map;
S4: flattening each feature map into a feature vector and feeding the vectors sequentially into a long short-term memory neural network, thereby obtaining a hidden state that captures the change pattern of the relations among variables;
S5: inputting the hidden state into a label classifier and outputting the corresponding sample class, thereby completing the classification of the time series.
2. The method according to claim 1, wherein the step S1 specifically comprises:
sampling at fixed intervals using the data acquisition device of an industrial system or an information system; different index values are obtained at each sampling moment, and the system state at that moment is represented by a label variable; after the system has run for a period of time, an observation data set is obtained, wherein:
the observation data set is denoted $X = [x_1, x_2, \ldots, x_m]$, where m is the number of samples; the sample data at time t is $x_t \in \mathbb{R}^n$, i.e. it contains n variables, and each sample corresponds to a label variable $y_t$, where $y_t \in \mathbb{R}$.
3. The method according to claim 2, wherein the step S2 specifically comprises:
S21: obtaining from the observation data set the sample data of time length w, $X_t = [x_{t-w+1}, x_{t-w+2}, \ldots, x_t]$, where $X_t$ is a time slice of X and is used to calculate the partial correlation coefficient matrix $P_t \in \mathbb{R}^{n \times n}$, which serves as the relation matrix between the variables at time t;
S22: let the time series of two variables i and j over this window be $x^i = (x^i_{t-w+1}, \ldots, x^i_t)$ and $x^j = (x^j_{t-w+1}, \ldots, x^j_t)$ respectively. The coefficient $P_t^{ij}$ of the partial correlation coefficient matrix is then calculated as:
$$P_t^{ij} = -\frac{\omega_t^{ij}}{\sqrt{\omega_t^{ii}\,\omega_t^{jj}}}$$
where $\omega_t^{ij}$ is the element in row i, column j of the inverse of the covariance matrix $\Sigma_t$, and the element $\sigma_t^{ij}$ of the covariance matrix $\Sigma_t$ is calculated as:
$$\sigma_t^{ij} = \frac{1}{w-1}\sum_{k=t-w+1}^{t}\left(x_k^i - \bar{x}^i\right)\left(x_k^j - \bar{x}^j\right)$$
where $\bar{x}^i$ and $\bar{x}^j$ are the means of the two variables over the window.
S23: obtaining the partial correlation coefficient matrix $P_t$ at each moment according to the calculation in step S22; $P_t$ represents the relations between the different variables at that moment.
4. The method according to claim 3, wherein the step S3 specifically comprises: the partial correlation coefficient matrices over a time length of l, $P_{t-l+1}, P_{t-l+2}, \ldots, P_t$, are input into the convolutional neural network, which encodes them to obtain the l corresponding feature maps, each moment having its corresponding label $y_t$.
5. The method according to claim 4, wherein in step S4, the hidden state $h_t$ is used to capture the change pattern of the relations among variables over the l moments.
6. The method according to claim 5, wherein in step S5, the label classifier is a fully connected layer, which outputs the predicted sample class $\hat{y}_t$.
7. The time series classification method based on relation change among multivariate variables according to any one of claims 1-6, further comprising step S6: taking the cross entropy of the output sample class as the loss function, steps S3 to S5 are repeated using gradient descent to improve the classification accuracy.
CN201910833290.0A 2019-09-04 2019-09-04 Time sequence classification method based on relation change among multiple variables Active CN110751169B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910833290.0A CN110751169B (en) 2019-09-04 2019-09-04 Time sequence classification method based on relation change among multiple variables

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910833290.0A CN110751169B (en) 2019-09-04 2019-09-04 Time sequence classification method based on relation change among multiple variables

Publications (2)

Publication Number Publication Date
CN110751169A (en) 2020-02-04
CN110751169B (en) 2023-09-29

Family

ID=69276116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910833290.0A Active CN110751169B (en) 2019-09-04 2019-09-04 Time sequence classification method based on relation change among multiple variables

Country Status (1)

Country Link
CN (1) CN110751169B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182259A (en) * 2018-01-03 2018-06-19 华南理工大学 Method for classifying multivariate time series based on deep long short-term memory neural networks
CN108491886A (en) * 2018-03-29 2018-09-04 重庆大学 Classification method for multivariate time series data based on convolutional neural networks

Also Published As

Publication number Publication date
CN110751169B (en) 2023-09-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant