CN112016041B

CN112016041B - Real-time time series classification method based on Gramian angular summation field imaging and Shortcut-CNN

Info

Publication number: CN112016041B
Application number: CN202010879648.6A
Authority: CN
Inventors: 刘然; 崔珊珊; 易琳; 吴立翔; 刘亚琼; 赵洋; 陈希; 王斐斐; 陈丹
Original assignee: Chongqing University; Chongqing University Cancer Hospital
Current assignee: Chongqing University; Chongqing University Cancer Hospital
Priority date: 2020-08-27
Filing date: 2020-08-27
Publication date: 2023-08-04
Anticipated expiration: 2040-08-27
Also published as: CN112016041A

Abstract

The invention discloses a real-time classification method for time series data based on Gramian angular summation field (GASF) imaging and Shortcut-CNN, comprising the following steps: 1) collecting time series data; 2) normalizing the acquired time series data to obtain scaled data; 3) representing the data in polar coordinates

Description

Real-time time series classification method based on Gramian angular summation field imaging and Shortcut-CNN

Technical Field

The invention relates to the technical field of data classification, in particular to a real-time classification method of time series data.

Background

Time series data are ubiquitous: both human activity and the natural world generate them continuously. Electroencephalogram (EEG) data, weather data, heartbeat and blood pressure monitoring data, and data produced by sensors during operation are all time series data. One of the main tasks of time series data processing is classification. Time series classification is widely applicable in fields such as finance, industry and medicine, and has significant social and economic value. In recent years, as time series data have become more readily available, the demand for classifying them has grown explosively.

As research on time series data continues, the performance of time series classification methods keeps improving. The time series classification methods in common use can be divided into distance-based methods and feature-based methods. Distance-based methods include the nearest neighbor classifier and dynamic time warping (DTW); feature-based methods include logistic regression and support vector machines. To improve on these common methods, classification methods based on ensembles of some of the above classifiers or on probabilistic voting have been proposed. For example, COTE (Collective of Transformation-based Ensembles), proposed by Bagnall et al., is an ensemble-based approach that combines 35 different classifiers to achieve higher classification accuracy. COTE is distinctive in that it builds its classifiers on 4 different data representations (shapelet, autocorrelation-based transform, etc.) and then combines them.

Although the above methods can accomplish the task of classifying time series data, two drawbacks remain. (1) Classification performance needs further improvement. The classifiers used in these methods are all common classifiers with limited performance. Two-dimensional convolutional neural networks, which achieve excellent classification performance in computer vision, are expected to further improve time series classifiers. However, applying a two-dimensional convolutional neural network to time series classification requires solving two problems: how to convert time series data into images that can be input to such a network, and how to construct a high-performance two-dimensional convolutional neural network model. Two-dimensional convolutional neural networks are good at learning image features, but time series data are not image data; they must be converted, in an appropriate way, into images that highlight their features so that the subsequent classification benefits, and a suitable network model must be constructed to classify these special images. (2) Real-time classification is difficult to achieve. Most time series classification methods are designed for the case where one label corresponds to a segment of data covering a period of time. In real-time classification, the time series data at a single moment correspond to one label; the classifier can then use less data than in the former case, which places higher demands on it.

Disclosure of Invention

In view of the above, the present invention aims to provide a real-time classification method for time series data based on Gramian angular summation field (GASF) imaging and Shortcut-CNN, so as to solve the technical problems that conventional classification methods cannot directly use a two-dimensional convolutional neural network to classify time series data with high performance, and that they have difficulty classifying in real time.

The invention discloses a real-time time series classification method based on Gramian angular summation field imaging and Shortcut-CNN, which comprises the following steps:

1) Collecting time sequence data;

define $X = \{x_1, x_2, x_3, \ldots, x_T\}$ as a univariate time series, i.e. all time series data on one sensor; $x_i$ represents the data corresponding to moment i, where 1 ≤ i ≤ T and T represents the length of the whole time series; an M-variate time series is then expressed as $X = \{X^1, X^2, \ldots, X^M\}$, where $X^j$ represents the j-th univariate time series, 1 ≤ j ≤ M, and $x_i^j$ represents the data of the j-th time series at moment i; the M-variate time series X is represented by a matrix as follows:

$$X = \begin{bmatrix} x_1^1 & x_1^2 & \cdots & x_1^M \\ x_2^1 & x_2^2 & \cdots & x_2^M \\ \vdots & \vdots & \ddots & \vdots \\ x_T^1 & x_T^2 & \cdots & x_T^M \end{bmatrix} \tag{1}$$

with the intercepted time series length t = 1, the time series sample $X_i$ is represented by the matrix:

$$X_i = \begin{bmatrix} x_i^1 & x_i^2 & \cdots & x_i^M \end{bmatrix} \tag{2}$$

2) Using formula (3), scale the collected time series sample data $x_i^j$ to the range [0, 1]; the scaled data are denoted $\tilde{x}_i^j$;

where 1 ≤ j ≤ M, M being the number of univariate time series in the M-variate time series, and i refers to the time series sample at moment i currently being processed;

3) Represent the data $\tilde{x}_i^j$ in polar coordinates using formula (4);

where $\phi$ represents the angle in polar coordinates and r represents the radius in polar coordinates; $t_j$ represents the spatial order inside the time series sample; N is a preset positive integer that controls the length of the radius r in polar coordinates;

4) Convert the data processed in step 3) into a Gram matrix of shape M × M through formula (5), and save the Gram matrix as a grayscale image;

5) Input the grayscale image obtained in step 4) into the Shortcut-CNN model to obtain the classification result, wherein the Shortcut-CNN model comprises an input layer, 3 Blocks in series, a global average pooling layer and a fully connected layer; an Add layer and a ReLU layer are arranged between adjacent Blocks, and an Add layer and a ReLU layer are also arranged between the last Block and the global average pooling layer;

the 3 Blocks have the same internal structure, and each Block comprises 3 groups of identical data processing layer structures; each group of data processing layer structures in a Block comprises 2 parallel two-dimensional convolution layers (an upper one and a lower one) and 1 Add layer; the outputs of the upper and lower parallel two-dimensional convolution layers are the inputs of the Add layer, and the sum computed by the Add layer is its output and also the output of that group; within each Block, the output of the 1st group is the input of the 2nd group, the output of the 2nd group is the input of the 3rd group, and the output of the 3rd group is the output of the Block;

the input data of each Block are fed into the upper and lower parallel two-dimensional convolution layers of the 1st group in the Block and are also connected across layers to the Add layer after the Block; that Add layer combines the input data of the Block with the output data of the Block, and the result is output after activation by a ReLU function.

Further, the time series data collected in the step 1) is EEG data.

The invention has the beneficial effects that:

1. In the real-time classification method for time series data based on Gramian angular summation field imaging and Shortcut-CNN, the invention first combines the data of the M-variate time series at the same moment i into one time series sample $X_i$. Compared with a general time series classification task, the time series sample obtained in this way has one dimension fewer: it contains no temporal information, only spatial information. The invention can therefore classify the time series in real time, i.e. at each time point of the time series.

2. GASF imaging was originally used to image a univariate time series whose points are related in time, thereby expanding the dimension of the univariate time series. The invention regards the spatial relationship contained in the time series sample $X_i$ as analogous to a temporal relationship, so the GASF imaging method can be applied to the time series sample formed by the data of the M-variate time series at the same moment i. After GASF imaging, the time series sample is expanded into a Gram matrix of shape M × M, and the matrix is saved as a grayscale image, giving an EEG image rich in spatial information. GASF imaging expands the dimension of the time series sample and enhances the original spatial information of the data. Moreover, because the time series data are now in image form, the invention can process them with a two-dimensional convolutional neural network, which is good at learning image features; this solves the problem that conventional classification methods cannot directly use a two-dimensional convolutional neural network for high-performance classification of time series data.

3. For classifying special images such as the obtained EEG images, the Shortcut-CNN model adopted in the invention has the following characteristics: the 3 serial Blocks contain only two-dimensional convolution layers and Add layers, with no pooling layer, so the spatial information in the image can be fully extracted; shortcut connections are added among the 3 serial Blocks, which avoids vanishing gradients in the model; and the custom Shortcut-CNN model uses a global average pooling (GAP) layer to average each feature map obtained by convolution, followed by a full connection to the final classification layer. The GAP layer preserves the spatial features that the convolution layers extract from the EEG image while reducing the number of model parameters. With accuracy, kappa and AUC (area under the ROC curve) as evaluation indexes, the Shortcut-CNN model is compared with the VGG16 model and a shallow CNN model. Experiments using EEG motion sickness data as an example show that the Shortcut-CNN model of the invention reaches an average accuracy of 90.4%, an average kappa of 80.6% and an average AUC of 96.4%; it can be used as a classifier to judge the motion sickness state of the EEG data at each moment, and it outperforms the comparison models on all three indexes.

Drawings

Fig. 1 is a time-series data processing flow chart.

Fig. 2 is a polar representation of time-series data.

Fig. 3 is a grayscale image obtained by saving a Gram matrix, wherein (a) is the image corresponding to the non-motion-sickness state and (b) is the image corresponding to the motion sickness state.

FIG. 4 is a block diagram of the Shortcut-CNN model.

Fig. 5 is a structural diagram of VGG16 model.

Fig. 6 is a diagram of the shallow CNN model structure.

FIG. 7 shows the training accuracy and loss curves of each model for binary classification, wherein (a) is the accuracy curve of Shortcut-CNN, (b) is the loss curve of Shortcut-CNN, (c) is the accuracy curve of the shallow CNN, (d) is the loss curve of the shallow CNN, (e) is the accuracy curve of the VGG16-based model, and (f) is the loss curve of the VGG16-based model.

FIG. 8 shows the accuracy curves of the VGG16-based model after increasing the number of training epochs, where (a) is the accuracy for 500 epochs and (b) is the accuracy for 800 epochs.

Detailed Description

The invention is further described below with reference to the drawings and examples.

The real-time classification method for time series data based on Gramian angular summation field (GASF) imaging and Shortcut-CNN in this embodiment comprises the following steps:

1) Time series data is collected.

Define $X = \{x_1, x_2, x_3, \ldots, x_T\}$ as a univariate time series, i.e. all time series data on one sensor; $x_i$ represents the data corresponding to moment i, where 1 ≤ i ≤ T and T represents the length of the whole time series; an M-variate time series is then expressed as $X = \{X^1, X^2, \ldots, X^M\}$, where $X^j$ represents the j-th univariate time series, 1 ≤ j ≤ M, and $x_i^j$ represents the data of the j-th time series at moment i; the M-variate time series X is represented by a matrix as follows:

$$X = \begin{bmatrix} x_1^1 & x_1^2 & \cdots & x_1^M \\ x_2^1 & x_2^2 & \cdots & x_2^M \\ \vdots & \vdots & \ddots & \vdots \\ x_T^1 & x_T^2 & \cdots & x_T^M \end{bmatrix} \tag{1}$$

The time series data acquired in this embodiment are EEG motion sickness data, and the acquisition process is as follows: a VR-based vehicle driving simulator is used to induce motion sickness in the user, and the wearable wireless device Muse™ collects 4-channel EEG motion sickness data; each channel contains 5 frequency bands, so the EEG motion sickness data form a 20-dimensional (5 × 4) feature vector whose components do not share the same value range. The 20 EEG bands can be regarded as 20 univariate time series that together form an M-variate time series, i.e. M = 20; the number of rows of X is the length T of each time series.

The EEG motion sickness data are stored in an Excel file. Each row corresponds to one time series sample, and each time series sample has a label indicating the motion sickness state at the current moment. Label 0 indicates the non-motion-sickness state and label 1 indicates the motion sickness state. In this embodiment, the EEG motion sickness data contain 2463 time series samples.

A general time series classification task intercepts a continuous segment of length t (t > 1) from the whole time series X of length T as the time series sample $X_i$; $X_i$ is then represented by a matrix of shape t × M.

In the real-time classification method of this embodiment, the time series is to be classified in real time, with one label per moment, so the intercepted length is t = 1 and the time series sample $X_i$ is represented by the matrix:

$$X_i = \begin{bmatrix} x_i^1 & x_i^2 & \cdots & x_i^M \end{bmatrix} \tag{3}$$
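The relation between the T × M data matrix and a t = 1 sample can be sketched in a few lines of Python. This is only an illustration: the helper name `sample_at`, the toy values and the toy labels are hypothetical, and the 1-based index follows the convention 1 ≤ i ≤ T used in the text.

```python
from typing import List


def sample_at(X: List[List[float]], i: int) -> List[float]:
    """Return the 1 x M time series sample at moment i (1-based, as in the text)."""
    if not 1 <= i <= len(X):
        raise IndexError("moment i must satisfy 1 <= i <= T")
    return list(X[i - 1])


# Toy M-variate series with T = 3 moments and M = 4 channels (made-up values).
X = [
    [0.1, 0.4, 0.3, 0.9],
    [0.2, 0.5, 0.1, 0.8],
    [0.7, 0.6, 0.2, 0.3],
]
labels = [0, 0, 1]  # one label per moment, as real-time classification requires

x2 = sample_at(X, 2)  # the sample that would be imaged for moment i = 2
```

Each row is classified on its own, which is why a sample carries spatial (cross-channel) information but no temporal information.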

Compared with a general time series classification task, the time series sample of the real-time classification method in this embodiment has one dimension fewer: it contains no temporal information, only spatial information. To increase the dimension of the time series sample so that a two-dimensional convolutional neural network can be used for time series classification, the method applies the GASF imaging technique, originally used for imaging univariate time series, to the time series sample formed by the data of the M-variate time series at the same moment i. The specific process is as follows:

2) Normalize the time series sample $X_i$ acquired in step 1): using formula (4), scale the time series sample data $x_i^j$ to the range [0, 1]; the scaled data are denoted $\tilde{x}_i^j$, where 1 ≤ j ≤ M and i refers to the time series sample at moment i currently being processed;
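The normalization formula itself is not reproduced in this text. The sketch below assumes the min-max rescaling that is usual for GASF, with the minimum and maximum taken over the M components of one sample; the patent's formula may use different extrema (for example per channel). The function name `min_max_scale` and the handling of a constant sample are likewise assumptions.

```python
from typing import List


def min_max_scale(sample: List[float]) -> List[float]:
    """Scale one time series sample to [0, 1] (assumed min-max form)."""
    lo, hi = min(sample), max(sample)
    if hi == lo:
        # Assumption: a constant sample maps to all zeros to avoid dividing by zero.
        return [0.0 for _ in sample]
    span = hi - lo
    return [(value - lo) / span for value in sample]


scaled = min_max_scale([2.0, 4.0, 6.0, 10.0])
```

Keeping every value inside [0, 1] is what later allows an inverse cosine to be applied without leaving its domain.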

3) Represent the data $\tilde{x}_i^j$ in polar coordinates using formula (5);

where $\phi$ represents the angle in polar coordinates and r represents the radius in polar coordinates; $t_j$ represents the spatial order inside the time series sample; N is a preset positive integer that controls the length of the radius r in polar coordinates; in this embodiment N = M = 20.
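The polar-coordinate formula is also missing from this text. A minimal sketch, assuming the usual GASF encoding (the angle is the inverse cosine of the scaled value, and the radius is the position t_j divided by N, with t_j = j), is given below; the function name `to_polar` is hypothetical.

```python
import math
from typing import List, Tuple


def to_polar(scaled: List[float], n: int) -> List[Tuple[float, float]]:
    """Map scaled values in [0, 1] to (angle, radius) pairs (assumed GASF form)."""
    pairs = []
    for j, value in enumerate(scaled, start=1):
        if not 0.0 <= value <= 1.0:
            raise ValueError("scaled values must lie in [0, 1]")
        phi = math.acos(value)  # angle, falls in [0, pi/2]
        r = j / n               # radius grows with the position t_j = j
        pairs.append((phi, r))
    return pairs


polar = to_polar([0.0, 0.5, 1.0], n=3)
```

With N equal to the sample length, as in this embodiment, the last radius is exactly 1.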

4) After the transformation, data originally in the range [0, 1] are mapped to angles in [0, π/2]. The mapping of formula (5) has two important properties. First, the mapping is bijective: for a given $\tilde{x}_i^j$ there is a unique corresponding result in polar coordinates. Second, the polar coordinates preserve the absolute spatial relationship within the time series sample at the same moment. This can be seen in FIG. 2: the angle corresponding to each data point $\tilde{x}_i^j$ bends, while the radius keeps increasing steadily. The absolute spatial relationship is preserved through the radius.

5) Convert the data processed in step 4) into a Gram matrix of shape M × M through formula (6), and save the Gram matrix as a grayscale image;
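The range and the bijectivity claim can be checked directly if the angle is taken to be the inverse cosine of the scaled value, which is the usual GASF form and is assumed here because the formula is not reproduced in this text:

```latex
\phi = \arccos(\tilde{x}), \qquad \tilde{x} \in [0, 1]

\arccos(0) = \frac{\pi}{2}, \qquad \arccos(1) = 0, \qquad
\frac{d\phi}{d\tilde{x}} = -\frac{1}{\sqrt{1 - \tilde{x}^{2}}} < 0
\quad \text{for } 0 \le \tilde{x} < 1
```

Under that assumption the angle is continuous and strictly decreasing on [0, 1], so every scaled value corresponds to exactly one angle in [0, π/2] and vice versa.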
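The Gram matrix formula is not reproduced in this text either. The sketch below assumes the standard GASF definition, in which entry (j, k) is the cosine of the sum of the j-th and k-th angles; the function name `gasf` and the three-element input are illustrative only.

```python
import math
from typing import List


def gasf(phis: List[float]) -> List[List[float]]:
    """Build the M x M matrix with entries cos(phi_j + phi_k) (assumed GASF form)."""
    return [[math.cos(a + b) for b in phis] for a in phis]


# Angles for the scaled values 0.0, 0.5 and 1.0.
phis = [math.acos(v) for v in [0.0, 0.5, 1.0]]
G = gasf(phis)
```

Because the angles lie in [0, π/2], their pairwise sums lie in [0, π], so the entries of the matrix range over [-1, 1] and the matrix is symmetric.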

in this embodiment, one EEG image obtained by GASF is an image corresponding to a time series sample at a certain time. There is no time sequence relation between the pixel points in the image, and the whole image corresponds to a label. The model in this embodiment can classify a single image, that is, time-series samples at a certain moment, so that it can implement real-time classification.

Through normalization, polar coordinate representation and GASF imaging, the time series sample at one moment, originally of shape 1 × M, is converted into a Gram matrix of shape M × M. Saving the Gram matrix as an image gives the grayscale image shown in FIG. 3. In this embodiment, GASF imaging yields a 20 × 20 Gram matrix, which preserves the spatial relationship within the time series sample at one moment while expanding the sample to 20 × 20. The 20 × 20 Gram matrix is saved as a grayscale image (20 × 20 × 3); the image has three RGB components, but all three are identical. Each of the 2463 EEG images obtained has a corresponding label 0 (non-motion-sickness state) or label 1 (motion sickness state), so the EEG images can be processed with image processing techniques, and time series data such as EEG motion sickness data can thereby be classified.
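How a Gram matrix becomes an M × M × 3 grayscale image with identical channels can be sketched as follows. The text does not say how matrix entries are mapped to pixel values, so the linear mapping from [-1, 1] to 8-bit levels is an assumption, and both function names are hypothetical.

```python
from typing import List


def gram_to_gray_rgb(gram: List[List[float]]) -> List[List[List[int]]]:
    """Map entries in [-1, 1] to 8-bit gray levels and repeat each as R, G, B."""
    image = []
    for row in gram:
        pixels = []
        for g in row:
            level = round((g + 1.0) / 2.0 * 255)  # assumed linear scaling
            level = max(0, min(255, level))       # clamp against rounding drift
            pixels.append([level, level, level])  # identical RGB components
        image.append(pixels)
    return image


def first_channel(image: List[List[List[int]]]) -> List[List[int]]:
    """Keep a single component, since the three components carry the same data."""
    return [[px[0] for px in row] for row in image]


image = gram_to_gray_rgb([[1.0, -1.0], [-1.0, 1.0]])
single = first_channel(image)
```

Since the three components are identical, keeping one of them loses nothing, which matches the later remark that only one component is used as model input.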

6) Input the grayscale image obtained in step 5) into the Shortcut-CNN model to obtain the classification result, wherein the Shortcut-CNN model comprises an input layer, 3 Blocks in series, a global average pooling layer and a fully connected layer; an Add layer and a ReLU layer are arranged between adjacent Blocks, and an Add layer and a ReLU layer are also arranged between the last Block and the global average pooling layer;

The purpose of the cross-layer (shortcut) connections in the Shortcut-CNN model is to prevent the model from becoming difficult to train because of vanishing gradients as the number of two-dimensional convolution layers increases; compared with the fully connected layers of existing CNN networks, the global average pooling layer (GAP layer) reduces the number of parameters and thereby the likelihood of overfitting.

Table 1 below shows the hyperparameter settings and dimensional changes of each layer of the Shortcut-CNN model in this embodiment. Because the two-dimensional convolution layers within each Block have the same parameter settings and the same padding mode, the output shape of every two-dimensional convolution layer and Add layer in a Block equals the final output shape of that Block. The internal details of the Blocks are therefore omitted, and the hyperparameter settings and dimensional changes of the Shortcut-CNN model are shown with the Block as the unit. Note that since the RGB components of the grayscale image are identical, this embodiment takes only one of the components when inputting the image.
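The data flow described above (three Blocks, each made of three groups of two parallel convolutions joined by an Add layer, a shortcut Add plus ReLU after every Block, then global average pooling and a fully connected layer) can be sketched as a NumPy forward pass. This is not the patented model: the channel count (8), kernel size (3), absence of activations inside a group, softmax output, random weights, and the broadcasting of a 1-channel input across channels in the first shortcut are all assumptions, since Table 1 is not reproduced in this text.

```python
import numpy as np


def conv2d_same(x, w):
    """Zero-padded 'same' 2D convolution; x: (H, W, Cin), w: (k, k, Cin, Cout), k odd."""
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h, wd = x.shape[:2]
    out = np.zeros((h, wd, w.shape[3]))
    for di in range(k):
        for dj in range(k):
            out += xp[di:di + h, dj:dj + wd, :] @ w[di, dj]
    return out


def group(x, w_up, w_low):
    """One group: upper and lower parallel convolutions summed by an Add layer."""
    return conv2d_same(x, w_up) + conv2d_same(x, w_low)


def block(x, weights):
    """Three groups in series, then Add with the Block input and ReLU."""
    y = x
    for w_up, w_low in weights:
        y = group(y, w_up, w_low)
    # Assumption: a 1-channel input is broadcast over the output channels.
    return np.maximum(x + y, 0.0)


def shortcut_cnn(image, blocks, w_fc, b_fc):
    x = image
    for weights in blocks:
        x = block(x, weights)
    gap = x.mean(axis=(0, 1))            # global average pooling -> (C,)
    logits = gap @ w_fc + b_fc           # fully connected layer
    e = np.exp(logits - logits.max())
    return e / e.sum()                   # softmax (assumed output activation)


def random_blocks(rng, channels=8, k=3, in_channels=1):
    """Hypothetical random weights for 3 Blocks x 3 groups x 2 convolutions."""
    blocks, cin = [], in_channels
    for _ in range(3):
        weights = []
        for _ in range(3):
            weights.append((rng.normal(0, 0.1, (k, k, cin, channels)),
                            rng.normal(0, 0.1, (k, k, cin, channels))))
            cin = channels
        blocks.append(weights)
    return blocks


rng = np.random.default_rng(0)
blocks = random_blocks(rng)
image = rng.random((20, 20, 1))          # one 20 x 20 single-component image
probs = shortcut_cnn(image, blocks, rng.normal(0, 0.1, (8, 2)), np.zeros(2))
```

Because no pooling is used inside the Blocks and the padding keeps the spatial size, every intermediate feature map stays 20 × 20, so the only spatial reduction happens in the global average pooling step.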

Table 1 Hyperparameter settings and dimensional changes of each layer of the Shortcut-CNN model

To demonstrate the effectiveness of the Shortcut-CNN model presented in this embodiment, we compared it with a pre-trained VGG16 model. Specifically, the pre-trained weights of the VGG16 model are loaded, the original convolution layers of VGG16 are used as the convolutional base (Conv-base), and a densely connected classifier is added on top of the convolutional base. The convolutional base is frozen, and only the weights of the added densely connected classifier are updated during training. The structure of the VGG16-based classification model is shown in Fig. 5, and the layer parameter settings are shown in Table 2.

Table 2 Hyperparameter settings and dimensional changes of each layer of the VGG16 model

In this embodiment, the Shortcut-CNN model is compared not only with the pre-trained VGG16 model but also with a shallow CNN model, whose structure is shown in Fig. 6 and whose layer parameter settings are shown in Table 3. Note that since the RGB components of the grayscale image are identical, this embodiment takes only one of the components when inputting the image.

Table 3 Parameter settings and dimensional changes of each layer of the shallow CNN model

In this embodiment, accuracy, kappa and AUC are used as evaluation indexes to confirm the validity of the Shortcut-CNN model. Kappa measures the agreement between the classes predicted by the model and the true classes; AUC is the area under the ROC curve. The experiments were run on an Intel Core i7-4710HQ processor, using the Keras library and the scikit-learn (Sklearn) toolkit under Python 3.6.
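The three evaluation indexes can be illustrated with small standard-library implementations. The experiments themselves used the Sklearn toolkit; the functions below are only a sketch of the textbook definitions (Cohen's kappa, and AUC computed as the fraction of correctly ordered positive/negative pairs), and the labels and scores are made up.

```python
from typing import List


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true: List[int], y_pred: List[int]) -> float:
    """Agreement between predictions and labels, corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    classes = set(y_true) | set(y_pred)
    p_e = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in classes)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]            # made-up predicted probabilities of label 1
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

In this toy case accuracy is 0.75 while kappa is only 0.5, which shows why kappa is reported alongside accuracy: it discounts the agreement that would be expected by chance.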

The experiment divides the 2463 images obtained by GASF imaging into a training set and a test set at a ratio of 9:1, and then sets aside 10% of the training set as a validation set; the ratio of label 0 to label 1 is kept unchanged by each division. The EEG images are then passed as input data to the Shortcut-CNN model of this embodiment. Each model is trained for 200 epochs (epochs=200). Because some model parameters are randomly initialized, a single experimental result is not conclusive; for each model, the average of ten experiments is taken as the final result.
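A label-preserving (stratified) split of this kind can be sketched with the standard library. The function name `stratified_split`, the seeds and the toy label counts are hypothetical; the text does not state which tool or random seed was used for the actual split.

```python
import random
from collections import defaultdict
from typing import List, Tuple


def stratified_split(labels: List[int], fraction: float,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (rest, held_out) index lists with roughly equal label proportions."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    rng = random.Random(seed)
    rest, held_out = [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        n_held = round(len(idxs) * fraction)  # per-class share of the held-out set
        held_out.extend(idxs[:n_held])
        rest.extend(idxs[n_held:])
    return sorted(rest), sorted(held_out)


labels = [0] * 60 + [1] * 40               # made-up label counts
train_val, test = stratified_split(labels, 0.1, seed=0)          # 9:1 split
train_val_labels = [labels[i] for i in train_val]
train_pos, val_pos = stratified_split(train_val_labels, 0.1, seed=1)
train = [train_val[p] for p in train_pos]  # map positions back to original indices
val = [train_val[p] for p in val_pos]
```

Splitting each class separately is what keeps the ratio of label 0 to label 1 the same in every subset.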

Motion sickness state classification results

For the Shortcut-CNN model, the VGG16 model and the shallow CNN model, the choice of optimizer, loss function (cost function) and batch size is the same; only the hyperparameter settings inside each model differ. Taking binary classification as an example, the parameter settings are shown in Table 4.

Table 4 Model hyperparameter settings

For the binary classification experiment, the data were used to train and evaluate three classifier models: the Shortcut-CNN model, the VGG16 model and the shallow CNN model. The experimental results for binary classification are shown in Table 5. The underlined data in the table indicate the best accuracy and kappa.

TABLE 5 results of classification experiments for different classifiers

As can be seen from Table 5, the best performing classifier model was the Shortcut-CNN model we proposed, with an average accuracy (average Acc) of 0.904, and with the average kappa value (average kappa) 0.806 and average AUC (average AUC) of 0.964 being highest. Next is the shallow CNN model with average accuracy and average kappa values of 0.884 and 0.765, respectively. The VGG16 model is the worst performing of the three classification models, whether average accuracy, average kappa value, or average AUC.

FIG. 7 shows the accuracy and loss curves of the three models over 200 training epochs; in each panel the abscissa is the training epoch and the ordinate is the loss or accuracy of the classifier model at that epoch.

Panels (a), (c) and (e) of FIG. 7 are the training accuracy curves of the three models. The training accuracy and validation accuracy of all models improve gradually as the number of epochs increases, but the custom Shortcut-CNN model clearly fits the data better during training, and its validation accuracy is the highest of the three models. Panels (b) and (d) show that the validation loss of the Shortcut-CNN model and of the shallow CNN model falls until about epoch 50, then starts to rise and oscillates within a certain range, stabilizing overall. The rise in loss seen here does not indicate overfitting: the loss curve shows the average loss, whereas model accuracy is affected by the distribution of the loss values rather than by their average.

Panels (e) and (f) show that the VGG16-based model is underfitting, which also explains why the accuracy of the VGG16 model in Table 5 is only 0.803. The number of training epochs of the VGG16 model was therefore increased; the accuracy curves for 500 and 800 epochs are shown in FIG. 8. With 800 epochs the training accuracy is close to 1, but the validation accuracy no longer increases compared with 500 epochs. The test-set accuracies of the models trained for 500 and 800 epochs were 0.86 and 0.846, respectively. Thus, even with more training epochs, the test accuracy of the VGG16-based model stops increasing and remains lower than that of the Shortcut-CNN model.

Experiments show that the Shortcut-CNN model in this embodiment reaches an accuracy of 90.4%, a kappa of 80.6% and an AUC of 96.4%; it can be used as a classifier to judge the motion sickness state of the EEG data at each moment, and it outperforms the other models on all three indexes.
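The point that average loss can rise while accuracy stays the same is easy to reproduce with a toy example. The sketch assumes a binary cross-entropy loss and a 0.5 decision threshold; Table 4 is not reproduced in this text, so the loss function actually used is not confirmed, and all probabilities below are made up.

```python
import math
from typing import List


def mean_bce(y_true: List[int], probs: List[float]) -> float:
    """Mean binary cross-entropy of predicted probabilities for label 1."""
    total = sum(math.log(p if t == 1 else 1.0 - p) for t, p in zip(y_true, probs))
    return -total / len(y_true)


def accuracy(y_true: List[int], probs: List[float], threshold: float = 0.5) -> float:
    return sum((p >= threshold) == (t == 1) for t, p in zip(y_true, probs)) / len(y_true)


y_true = [1, 1, 0, 0]
earlier = [0.70, 0.40, 0.30, 0.30]  # second sample is mildly wrong
later = [0.95, 0.05, 0.05, 0.05]    # same mistake, now made with high confidence

loss_earlier, loss_later = mean_bce(y_true, earlier), mean_bce(y_true, later)
acc_earlier, acc_later = accuracy(y_true, earlier), accuracy(y_true, later)
```

Both prediction sets misclassify the same single sample, so accuracy is unchanged, yet the one confidently wrong prediction dominates the average and pushes the mean loss up.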

Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered by the claims of the present invention.

Claims

1. A real-time classification method for time series data based on Gramian angular summation field imaging and Shortcut-CNN, characterized by comprising the following steps:

1) Collecting time sequence data;

define $X = \{x_1, x_2, x_3, \ldots, x_T\}$ as a univariate time series, i.e. all time series data on one sensor; $x_i$ represents the data corresponding to moment i, where 1 ≤ i ≤ T and T represents the length of the whole time series; an M-variate time series is then expressed as $X = \{X^1, X^2, \ldots, X^M\}$, where $X^j$ represents the j-th univariate time series, 1 ≤ j ≤ M, and $x_i^j$ represents the data of the j-th time series at moment i; the M-variate time series X is represented by a matrix as follows:

$$X = \begin{bmatrix} x_1^1 & x_1^2 & \cdots & x_1^M \\ x_2^1 & x_2^2 & \cdots & x_2^M \\ \vdots & \vdots & \ddots & \vdots \\ x_T^1 & x_T^2 & \cdots & x_T^M \end{bmatrix} \tag{1}$$

3) Represent the data $\tilde{x}_i^j$ in polar coordinates using formula (4);

where $\phi$ represents the angle in polar coordinates and r represents the radius in polar coordinates; $t_j$ represents the spatial order inside the time series sample; N is a preset positive integer that controls the length of the radius r in polar coordinates;

2. The real-time classification method for time series data based on Gramian angular summation field imaging and Shortcut-CNN according to claim 1, wherein:

the time series data collected in the step 1) are EEG data.